Package org.apache.druid.query.groupby
Class GroupingEngine
- java.lang.Object
-
- org.apache.druid.query.groupby.GroupingEngine
-
public class GroupingEngine extends Object
Common code for processing GroupByQuery.
-
-
Field Summary
- static String CTX_KEY_FUDGE_TIMESTAMP
- static String CTX_KEY_OUTERMOST
-
Constructor Summary
- GroupingEngine(DruidProcessingConfig processingConfig, com.google.common.base.Supplier<GroupByQueryConfig> configSupplier, NonBlockingPool<ByteBuffer> bufferPool, GroupByResourcesReservationPool groupByResourcesReservationPool, com.fasterxml.jackson.databind.ObjectMapper jsonMapper, com.fasterxml.jackson.databind.ObjectMapper spillMapper, QueryWatcher queryWatcher)
-
Method Summary
- Sequence<ResultRow> applyPostProcessing(Sequence<ResultRow> results, GroupByQuery query): Apply the GroupByQuery "postProcessingFn", which is responsible for HavingSpec and LimitSpec.
- static void convertRowTypesToOutputTypes(List<DimensionSpec> dimensionSpecs, ResultRow resultRow, int resultRowDimensionStart)
- BinaryOperator<ResultRow> createMergeFn(Query<ResultRow> queryParam): See QueryToolChest.createMergeFn(Query) for details. Allows GroupByQueryQueryToolChest to delegate implementation to the strategy.
- Comparator<ResultRow> createResultComparator(Query<ResultRow> queryParam): See QueryToolChest.createResultComparator(Query). Allows GroupByQueryQueryToolChest to delegate implementation to the strategy.
- static int getCardinalityForArrayAggregation(GroupByQueryConfig querySpecificConfig, GroupByQuery query, StorageAdapter storageAdapter, ByteBuffer buffer): Returns the cardinality of array needed to do array-based aggregation, or -1 if array-based aggregation is impossible.
- Sequence<ResultRow> mergeResults(QueryRunner<ResultRow> baseRunner, GroupByQuery query, ResponseContext responseContext): Runs a provided QueryRunner on a provided GroupByQuery, which is assumed to return rows that are properly sorted (by timestamp and dimensions) but not necessarily fully merged (that is, there may be adjacent rows with the same timestamp and dimensions) and without PostAggregators computed.
- QueryRunner<ResultRow> mergeRunners(QueryProcessingPool queryProcessingPool, Iterable<QueryRunner<ResultRow>> queryRunners): Merges a variety of single-segment query runners into a combined runner.
- GroupByQuery prepareGroupByQuery(GroupByQuery query)
- static GroupByQueryResources prepareResource(GroupByQuery query, BlockingPool<ByteBuffer> mergeBufferPool, boolean usesGroupByMergingQueryRunner, GroupByQueryConfig groupByQueryConfig): Initializes resources required to run GroupByQueryQueryToolChest.mergeResults(QueryRunner) and GroupByMergingQueryRunner for a particular query.
- Sequence<ResultRow> process(GroupByQuery query, StorageAdapter storageAdapter, GroupByQueryMetrics groupByQueryMetrics): Process a groupBy query on a single StorageAdapter.
- Sequence<ResultRow> processSubqueryResult(GroupByQuery subquery, GroupByQuery query, GroupByQueryResources resource, Sequence<ResultRow> subqueryResult, boolean wasQueryPushedDown): Called by GroupByQueryQueryToolChest.mergeResults(QueryRunner) when it needs to process a subquery.
- Sequence<ResultRow> processSubtotalsSpec(GroupByQuery query, GroupByQueryResources resource, Sequence<ResultRow> queryResult): Called by GroupByQueryQueryToolChest.mergeResults(QueryRunner) when it needs to generate subtotals.
- static Sequence<ResultRow> wrapSummaryRowIfNeeded(GroupByQuery query, Sequence<ResultRow> process): Wraps the sequence if this query might need a summary row when the input turns out to be empty.
-
-
-
Field Detail
-
CTX_KEY_FUDGE_TIMESTAMP
public static final String CTX_KEY_FUDGE_TIMESTAMP
- See Also:
- Constant Field Values
-
CTX_KEY_OUTERMOST
public static final String CTX_KEY_OUTERMOST
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
GroupingEngine
@Inject public GroupingEngine(DruidProcessingConfig processingConfig, com.google.common.base.Supplier<GroupByQueryConfig> configSupplier, NonBlockingPool<ByteBuffer> bufferPool, GroupByResourcesReservationPool groupByResourcesReservationPool, com.fasterxml.jackson.databind.ObjectMapper jsonMapper, com.fasterxml.jackson.databind.ObjectMapper spillMapper, QueryWatcher queryWatcher)
-
-
Method Detail
-
prepareResource
public static GroupByQueryResources prepareResource(GroupByQuery query, BlockingPool<ByteBuffer> mergeBufferPool, boolean usesGroupByMergingQueryRunner, GroupByQueryConfig groupByQueryConfig)
Initializes resources required to run GroupByQueryQueryToolChest.mergeResults(QueryRunner) and GroupByMergingQueryRunner for a particular query. The resources should be acquired once for the entire execution of the query, or re-acquired if needed. Callers must ensure that a query already holding resources does not request more during execution, because that can cause deadlocks. This method throws an exception if it cannot allocate the resources the query needs to succeed.
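The acquire-once rule can be illustrated with a minimal sketch that uses only the JDK. It is not Druid's implementation: the class MergeBufferReservationSketch and its methods are hypothetical, a java.util.concurrent.Semaphore stands in for the merge buffer pool, and failing immediately (rather than waiting) when buffers are short is an assumption made for the example.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical stand-in for a merge buffer pool; not Druid's BlockingPool. */
final class MergeBufferReservationSketch {
    private final Semaphore buffers;

    MergeBufferReservationSketch(int poolSize) {
        this.buffers = new Semaphore(poolSize);
    }

    /**
     * Reserves every buffer the query needs in a single step.
     * Throws instead of waiting, so a query never holds some buffers
     * while blocking for more (the deadlock pattern to avoid).
     */
    void reserveAll(int required) {
        if (!buffers.tryAcquire(required)) {
            throw new IllegalStateException("Cannot reserve " + required + " merge buffers");
        }
    }

    /** Returns the buffers a query was holding. */
    void releaseAll(int held) {
        buffers.release(held);
    }

    int available() {
        return buffers.availablePermits();
    }

    public static void main(String[] args) {
        MergeBufferReservationSketch pool = new MergeBufferReservationSketch(2);
        pool.reserveAll(2);                    // one up-front reservation for the whole query
        System.out.println(pool.available());
        pool.releaseAll(2);                    // released once the query finishes
        System.out.println(pool.available());
    }
}
```

Reserving everything in one call means a query either gets all of its buffers or none, so two queries can never each hold part of the pool while waiting on the other.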
-
createResultComparator
public Comparator<ResultRow> createResultComparator(Query<ResultRow> queryParam)
See QueryToolChest.createResultComparator(Query). Allows GroupByQueryQueryToolChest to delegate implementation to the strategy.
-
createMergeFn
public BinaryOperator<ResultRow> createMergeFn(Query<ResultRow> queryParam)
See QueryToolChest.createMergeFn(Query) for details. Allows GroupByQueryQueryToolChest to delegate implementation to the strategy.
-
prepareGroupByQuery
public GroupByQuery prepareGroupByQuery(GroupByQuery query)
-
mergeResults
public Sequence<ResultRow> mergeResults(QueryRunner<ResultRow> baseRunner, GroupByQuery query, ResponseContext responseContext)
Runs a provided QueryRunner on a provided GroupByQuery, which is assumed to return rows that are properly sorted (by timestamp and dimensions) but not necessarily fully merged (that is, there may be adjacent rows with the same timestamp and dimensions) and without PostAggregators computed. This method will fully merge the rows, apply PostAggregators, and return the resulting Sequence. The query will be modified using prepareGroupByQuery(GroupByQuery) before passing it down to the base runner. For example, "having" clauses will be removed and various context parameters will be adjusted. Despite the similar name, this method is much reduced in scope compared to GroupByQueryQueryToolChest.mergeResults(QueryRunner). That method does delegate to this one at some points, but has a truckload of other responsibilities, including computing outer query results (if there are subqueries), computing subtotals (like GROUPING SETS), and computing the havingSpec and limitSpec.
- Parameters:
- baseRunner - base query runner
- query - the groupBy query to run inside the base query runner
- responseContext - the response context to pass to the base query runner
- Returns:
- merged result sequence
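The idea of fully merging sorted but not yet merged rows can be sketched with a comparator and a merge function, the same two roles that createResultComparator and createMergeFn play. This is an illustration under assumptions, not Druid's code: AdjacentRowMergeSketch and mergeAdjacent are hypothetical names, and Map.Entry<String, Long> stands in for ResultRow.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

final class AdjacentRowMergeSketch {
    /**
     * Collapses runs of adjacent rows that the comparator treats as equal.
     * The input must already be sorted by that comparator, so every row of a
     * group is adjacent and one pass is enough.
     */
    static <R> List<R> mergeAdjacent(
            List<R> sortedRows,
            Comparator<? super R> comparator,
            BinaryOperator<R> mergeFn) {
        List<R> merged = new ArrayList<>();
        for (R row : sortedRows) {
            int last = merged.size() - 1;
            if (last >= 0 && comparator.compare(merged.get(last), row) == 0) {
                // Same group as the previous output row: combine in place.
                merged.set(last, mergeFn.apply(merged.get(last), row));
            } else {
                merged.add(row);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical rows: grouping key -> partial sum, sorted by key.
        List<Map.Entry<String, Long>> rows =
            List.of(Map.entry("a", 1L), Map.entry("a", 2L), Map.entry("b", 5L));
        Comparator<Map.Entry<String, Long>> byKey = Map.Entry.comparingByKey();
        List<Map.Entry<String, Long>> merged = mergeAdjacent(
            rows, byKey, (l, r) -> Map.entry(l.getKey(), l.getValue() + r.getValue()));
        System.out.println(merged);
    }
}
```

Because the sort order guarantees adjacency, the merge needs no hash table and can run lazily over a stream of rows.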
-
mergeRunners
public QueryRunner<ResultRow> mergeRunners(QueryProcessingPool queryProcessingPool, Iterable<QueryRunner<ResultRow>> queryRunners)
Merges a variety of single-segment query runners into a combined runner. Used by GroupByQueryRunnerFactory.mergeRunners(QueryProcessingPool, Iterable). In that sense, it is intended to go along with process(GroupByQuery, StorageAdapter, GroupByQueryMetrics) (the runners created by that method will be fed into this method). This is primarily called on the data servers, to merge the results from processing on the segments. This method can also be called on the brokers if the query is operating on local data sources, such as inline datasources. It uses GroupByMergingQueryRunner, which requires the merge buffers to be passed in the responseContext of the query that is run.
- Parameters:
- queryProcessingPool - QueryProcessingPool service used for parallel execution of the query runners
- queryRunners - collection of query runners to merge
- Returns:
- merged query runner
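As a rough picture of running per-segment work in parallel and combining the partial results, consider the following JDK-only sketch. Everything in it is an assumption made for illustration: MergeRunnersSketch and runAndMerge are hypothetical, a plain ExecutorService stands in for QueryProcessingPool, each Callable stands in for a single-segment runner, and the merge-buffer handling that GroupByMergingQueryRunner requires is left out.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class MergeRunnersSketch {
    /** Runs every segment runner on the pool and sums partial results per grouping key. */
    static Map<String, Long> runAndMerge(
            ExecutorService pool,
            List<Callable<Map<String, Long>>> segmentRunners) {
        Map<String, Long> merged = new TreeMap<>();
        try {
            // invokeAll blocks until every runner has completed.
            for (Future<Map<String, Long>> partial : pool.invokeAll(segmentRunners)) {
                partial.get().forEach((key, sum) -> merged.merge(key, sum, Long::sum));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while merging", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A segment runner failed", e);
        }
        return merged;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Two hypothetical segments with overlapping grouping keys.
            List<Callable<Map<String, Long>>> runners = List.of(
                () -> Map.of("a", 1L),
                () -> Map.of("a", 2L, "b", 5L));
            System.out.println(runAndMerge(pool, runners));
        } finally {
            pool.shutdown();
        }
    }
}
```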
-
process
public Sequence<ResultRow> process(GroupByQuery query, StorageAdapter storageAdapter, @Nullable GroupByQueryMetrics groupByQueryMetrics)
Process a groupBy query on a single StorageAdapter. This is used by GroupByQueryRunnerFactory.createRunner(org.apache.druid.segment.Segment) to create per-segment QueryRunners. This method is only called on data servers, like Historicals (not the Broker).
- Parameters:
- query - the groupBy query
- storageAdapter - storage adapter for the segment in question
- Returns:
- result sequence for the storage adapter
-
applyPostProcessing
public Sequence<ResultRow> applyPostProcessing(Sequence<ResultRow> results, GroupByQuery query)
Apply the GroupByQuery "postProcessingFn", which is responsible for HavingSpec and LimitSpec.
- Parameters:
- results - sequence of results
- query - the groupBy query
- Returns:
- post-processed results, with HavingSpec and LimitSpec applied
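A post-processing step of this shape can be sketched as a filter followed by a limit. The sketch is hypothetical and JDK-only: PostProcessingSketch and postProcess are invented names, a Predicate stands in for HavingSpec, a plain row count stands in for LimitSpec, and applying the filter before the limit is an assumption of the example.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class PostProcessingSketch {
    /** Keeps rows that satisfy the "having" predicate, then truncates to at most limit rows. */
    static <R> List<R> postProcess(List<R> mergedRows, Predicate<? super R> having, int limit) {
        return mergedRows.stream()
            .filter(having)   // stand-in for HavingSpec
            .limit(limit)     // stand-in for LimitSpec
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical fully merged rows: grouping key -> aggregated value.
        List<Map.Entry<String, Long>> rows =
            List.of(Map.entry("a", 3L), Map.entry("b", 5L), Map.entry("c", 9L));
        System.out.println(postProcess(rows, row -> row.getValue() > 4, 1));
    }
}
```

Filtering must see fully merged rows: a predicate on an aggregate gives wrong answers when evaluated against partial sums, which fits with "having" clauses being removed before the query reaches the base runner.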
-
processSubqueryResult
public Sequence<ResultRow> processSubqueryResult(GroupByQuery subquery, GroupByQuery query, GroupByQueryResources resource, Sequence<ResultRow> subqueryResult, boolean wasQueryPushedDown)
Called by GroupByQueryQueryToolChest.mergeResults(QueryRunner) when it needs to process a subquery.
- Parameters:
- subquery - inner query
- query - outer query
- resource - resources returned by prepareResource(GroupByQuery, BlockingPool, boolean, GroupByQueryConfig)
- subqueryResult - result rows from the subquery
- wasQueryPushedDown - true if the outer query was pushed down (so we only need to merge the outer query's results, not run it from scratch like a normal outer query)
- Returns:
- results of the outer query
-
processSubtotalsSpec
public Sequence<ResultRow> processSubtotalsSpec(GroupByQuery query, GroupByQueryResources resource, Sequence<ResultRow> queryResult)
Called by GroupByQueryQueryToolChest.mergeResults(QueryRunner) when it needs to generate subtotals.
- Parameters:
- query - query that has a "subtotalsSpec"
- resource - resources returned by prepareResource(GroupByQuery, BlockingPool, boolean, GroupByQueryConfig)
- queryResult - result rows from the main query
- Returns:
- results for each list of subtotals in the query, concatenated together
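One way to picture "results for each list of subtotals, concatenated together" is to regroup the main query's rows once per subtotal list, keeping only the dimensions named in that list. The sketch below is an assumption-laden illustration, not Druid's algorithm: SubtotalsSketch and subtotals are hypothetical, rows are modelled as a map from dimension values to a sum, and no merge buffers are involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SubtotalsSketch {
    /**
     * For each subtotal list, projects every row onto the listed dimensions,
     * sums rows that collapse onto the same projected key, and appends that
     * group of rows to the output.
     */
    static List<Map.Entry<List<String>, Long>> subtotals(
            List<String> dimensions,
            Map<List<String>, Long> mergedRows,
            List<List<String>> subtotalsSpec) {
        List<Map.Entry<List<String>, Long>> out = new ArrayList<>();
        for (List<String> subtotal : subtotalsSpec) {
            Map<List<String>, Long> regrouped = new LinkedHashMap<>();
            for (Map.Entry<List<String>, Long> row : mergedRows.entrySet()) {
                List<String> key = new ArrayList<>();
                for (String dim : subtotal) {
                    // Look up this dimension's value by its position in the full row key.
                    key.add(row.getKey().get(dimensions.indexOf(dim)));
                }
                regrouped.merge(key, row.getValue(), Long::sum);
            }
            out.addAll(regrouped.entrySet());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> dimensions = List.of("country", "device");
        Map<List<String>, Long> rows = new LinkedHashMap<>();
        rows.put(List.of("US", "phone"), 3L);
        rows.put(List.of("US", "desktop"), 4L);
        rows.put(List.of("FR", "phone"), 5L);
        // Full grouping, per-country subtotals, then a grand total.
        List<List<String>> spec = List.of(List.of("country", "device"), List.of("country"), List.of());
        System.out.println(subtotals(dimensions, rows, spec));
    }
}
```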
-
getCardinalityForArrayAggregation
public static int getCardinalityForArrayAggregation(GroupByQueryConfig querySpecificConfig, GroupByQuery query, StorageAdapter storageAdapter, ByteBuffer buffer)
Returns the cardinality of array needed to do array-based aggregation, or -1 if array-based aggregation is impossible.
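The "cardinality or -1" contract can be illustrated with a small JDK-only sketch. The conditions Druid actually checks are not described here, so the two used below (the dimension cardinality is known, and one fixed-size slot per value fits in the buffer) are assumptions, as are the names ArrayAggregationSketch, cardinalityForArrayAggregation, and bytesPerSlot.

```java
import java.nio.ByteBuffer;

final class ArrayAggregationSketch {
    /**
     * Returns the number of array slots needed (one per dimension value),
     * or -1 when array-based aggregation cannot be used under this sketch's
     * assumed conditions.
     */
    static int cardinalityForArrayAggregation(int dimensionCardinality, int bytesPerSlot, ByteBuffer buffer) {
        if (dimensionCardinality < 0) {
            return -1; // cardinality unknown, so slots cannot be preallocated
        }
        long requiredBytes = (long) dimensionCardinality * bytesPerSlot;
        return requiredBytes <= buffer.capacity() ? dimensionCardinality : -1;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        System.out.println(cardinalityForArrayAggregation(4, Long.BYTES, buffer));
        System.out.println(cardinalityForArrayAggregation(100, Long.BYTES, buffer));
    }
}
```

A caller would treat -1 as the signal to fall back to a different aggregation strategy instead of indexing an array by dimension value.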
-
convertRowTypesToOutputTypes
public static void convertRowTypesToOutputTypes(List<DimensionSpec> dimensionSpecs, ResultRow resultRow, int resultRowDimensionStart)
-
wrapSummaryRowIfNeeded
public static Sequence<ResultRow> wrapSummaryRowIfNeeded(GroupByQuery query, Sequence<ResultRow> process)
Wraps the sequence if this query might need a summary row when the input turns out to be empty.
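The wrapping idea, emitting one summary row only when the wrapped input yields nothing, can be sketched lazily with a plain Iterator. This is a hypothetical illustration: SummaryRowSketch and withSummaryIfEmpty are invented names, Iterator stands in for Sequence, and which queries need a summary row and what that row contains are left to the caller.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

final class SummaryRowSketch {
    /** Passes rows through unchanged; yields a single summary row if the input has none. */
    static <T> Iterator<T> withSummaryIfEmpty(Iterator<T> input, Supplier<T> summaryRow) {
        return new Iterator<T>() {
            private boolean sawAnyRow = false;
            private boolean emittedSummary = false;

            @Override
            public boolean hasNext() {
                // After the input is exhausted, one more row exists only if it was empty.
                return input.hasNext() || (!sawAnyRow && !emittedSummary);
            }

            @Override
            public T next() {
                if (input.hasNext()) {
                    sawAnyRow = true;
                    return input.next();
                }
                if (!sawAnyRow && !emittedSummary) {
                    emittedSummary = true;
                    return summaryRow.get();
                }
                throw new NoSuchElementException();
            }
        };
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        withSummaryIfEmpty(Collections.<String>emptyIterator(), () -> "summary").forEachRemaining(out::add);
        withSummaryIfEmpty(List.of("row1", "row2").iterator(), () -> "summary").forEachRemaining(out::add);
        System.out.println(out);
    }
}
```

The decision is deferred until the input is actually consumed, so the wrapper never has to buffer rows to find out whether the input is empty.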
-
-