-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Analyzer: Optimize resolution of in(LowCardinality, ConstantSet) #64060
base: master
Are you sure you want to change the base?
Conversation
This is an automated comment for commit 8440017 with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page
Successful checks
|
428f851
to
8440017
Compare
When the FunctionIn applies to a LowCardinality and a constant set, its return type is expected to be resolved as LowCardinality also so that its argument of LowCardinality column would not be converted to a full one and much computation cost for iterating the rows in DB::Set::executeImplCase could be saved during the execution phase. This condition is fulfilled when FunctionNode::getArgumentColumns returns a LowCardinality column for FunctionIn's 1st argument,and a const column for the other. However, it's actually unfulfilled as a null column is returned for the 2nd argument instead in the Analyzer. This commit revised FunctionNode::getArgumentColumns to return a ColumnConst(ColumnSet) in such cases in order to turn on the opti- mization of LowCardinality. A significant performance gain of 1.39x is observed in query 3.3 of Star Schema Benchmark on the Intel ICX server with 160 vcpus.
For `k::LowCardinality(UInt8)`, the resolution of `k IN (1, NULL)` results in type LowCardinality(UInt8). This commit converts the return type to LowCardinality(Nullable(UInt8)).
The reported issues (timeout, binary_tidy, etc.) seem not related to this code change. I think this PR is ready for your review. Thanks in advance for your help! |
Hi Nikita, thanks for your interest into this work! Could you kindly review this PR also? Thanks again for your help! |
{ | ||
/// Created but not filled for the analysis during function resolution. | ||
FutureSetPtr empty_set; | ||
argument_column.column = ColumnConst::create(ColumnSet::create(1, empty_set), 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please point out where is it used during analysis? Can't it be constant-folded ? If yes, shouldn't it contain real value instead of empty set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second argument of FunctionIn
is required to be a ColumnConst
to resolve the return type as a LowCardinality
column (IFunctionOverloadResolver::getReturnType), and the ColumnSet
, though not necessary in the function resolution, is used here as a placeholder to denote its data type. And it's implemented as an empty set as its value is found not used in the later phases: after the build of the query tree, the CollectSetsVisitor
would collect the sets in the query to execute IN
function and place them in the planner_context.prepared_sets
, through which the ColumnSet
is created and filled as the argument of the ActionDAG::Node
in PlannerActionsVisitorImpl::makeSetForInFunction
.
And I think containing the real value may not aid the constant-folding optimization, as unlike 1 IN 1
, where both arguments are constant, this pattern has one column argument, which prevent the result of the FuncionIn
being determined at the query's compile time. We also implemented one POC of filling the FutureSet
with the real value in current stage, but found no difference in the query plan and even observed slight performance regressions.
But I'm not sure if there are alternative ways for achieving the constant-fold or the POC follows the suggested approach. Could you further share with me your insights? Thanks!
When the
FunctionIn
applies to aLowCardinality
and a constant set, its return type is expected to be resolved asLowCardinality
also so that its argument ofLowCardinality
column would not be converted to a full one and much computation cost for iterating the rows inDB::Set::executeImplCase
could be saved during the execution phase.This condition is fulfilled when
FunctionNode::getArgumentColumns
returns aLowCardinality
column forFunctionIn
's 1st argument, and aColumnConst
for the other. However, it's actually unfulfilled as a null column is returned for the 2nd argument instead in the Analyzer.This commit revised
FunctionNode::getArgumentColumns
to return aColumnConst(ColumnSet)
in such cases in order to turn on the optimization ofLowCardinality
. A significant performance gain of 1.39x is observed in query 3.3 of Star Schema Benchmark on the Intel ICX server with 160 vcpus.Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
The queries of 3.3, 3.4, 4.1 and 4.2 from Star Schema Benchmark share a common pattern of predicate:
col = a or col = b
wherecol
is aLowCardinality(String)
. And during syntax optimization, this predicate is then transformed equivalently tocol in (a, b)
, which is executed byFunctionIn
withcol
and a constant tuple(a, b)
as its arguments.The return type of
in(LowCardinality, ConstantSet)
determines its execution flow: if it is non-LowCardinality
,FunctionIn
'sLowCardinality
argument would be converted to a full column and the Set operations on the full column would be much more expensive. And this optimization works by rectifying this function's return type.The performance experiments of SSB on the ICX device (Intel Xeon Platinum 8380 CPU, 80 cores, 160 threads) show that this change could bring the improvements of 39.3%, 2.8%, 2.6% and 1.2% to the QPS of the query 3.3, 3.4, 4,1, and 4.2 respectively while having no impact on others.
Modify your CI run
NOTE: If your merge the PR with modified CI you MUST KNOW what you are doing
NOTE: Checked options will be applied if set before CI RunConfig/PrepareRunConfig step
Include tests (required builds will be added automatically):
Exclude tests:
Extra options:
Only specified batches in multi-batch jobs: