Class UnifiedIndexerAppenderatorsManager

  • All Implemented Interfaces:
    AppenderatorsManager

    public class UnifiedIndexerAppenderatorsManager
    extends Object
    implements AppenderatorsManager
    Manages Appenderator instances for the CliIndexer task execution service, which runs all tasks in a single process.

    This class keeps a map of UnifiedIndexerAppenderatorsManager.DatasourceBundle objects, keyed by datasource name. Each bundle contains:
      - A per-datasource SinkQuerySegmentWalker (with an associated per-datasource timeline)
      - A map that associates a taskId with a list of Appenderators created for that task

    Access to the datasource bundle map and the task->appenderator maps is synchronized. The methods on this class can be called concurrently from multiple task threads. If there are no remaining appenderators for a given datasource, the corresponding bundle is removed from the bundle map.

    Appenderators created by this class use the shared per-datasource SinkQuerySegmentWalkers. The per-datasource SinkQuerySegmentWalkers share a common queryExecutorService.

    Each task that requests an Appenderator from this AppenderatorsManager receives a heap memory limit equal to WorkerConfig.globalIngestionHeapLimitBytes evenly divided by WorkerConfig.capacity. This assumes that each task ingests to only one Appenderator at a time.

    The Appenderators created by this class share an executor pool for IndexMerger persist and merge operations, with concurrent operations limited to `druid.worker.capacity` divided by 2. This limit is imposed to reduce overall memory usage.
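    The two resource limits described above are plain arithmetic on the worker configuration. The following is a minimal sketch of that arithmetic, not the class's actual code: IndexerResourceMath and its method names are hypothetical, and the floor of 1 on the merge limit is an assumption added so that a capacity of 1 still permits one operation.

```java
public class IndexerResourceMath {
    /**
     * Per-task heap limit: the global ingestion heap limit split evenly
     * across the configured task slots (worker capacity).
     */
    public static long perTaskHeapLimitBytes(long globalIngestionHeapLimitBytes, int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        return globalIngestionHeapLimitBytes / capacity;
    }

    /**
     * Limit on concurrent persist/merge operations: capacity divided by 2.
     * The floor of 1 is an assumption of this sketch.
     */
    public static int maxConcurrentMerges(int capacity) {
        return Math.max(1, capacity / 2);
    }

    public static void main(String[] args) {
        long globalLimit = 6L * 1024 * 1024 * 1024; // 6 GiB, an example value
        int capacity = 4;                           // example worker capacity
        System.out.println(perTaskHeapLimitBytes(globalLimit, capacity)); // 1610612736
        System.out.println(maxConcurrentMerges(capacity));                // 2
    }
}
```

    With these example values, each of the four task slots gets 1.5 GiB of ingestion heap, and at most two persist or merge operations run at once.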
    • Constructor Detail

      • UnifiedIndexerAppenderatorsManager

        @Inject
        public UnifiedIndexerAppenderatorsManager​(org.apache.druid.query.QueryProcessingPool queryProcessingPool,
                                                  org.apache.druid.segment.join.JoinableFactoryWrapper joinableFactoryWrapper,
                                                  WorkerConfig workerConfig,
                                                  Cache cache,
                                                  CacheConfig cacheConfig,
                                                  CachePopulatorStats cachePopulatorStats,
                                                  com.fasterxml.jackson.databind.ObjectMapper objectMapper,
                                                  org.apache.druid.java.util.emitter.service.ServiceEmitter serviceEmitter,
                                                  com.google.inject.Provider<org.apache.druid.query.QueryRunnerFactoryConglomerate> queryRunnerFactoryConglomerateProvider)
    • Method Detail

      • createRealtimeAppenderatorForTask

        public Appenderator createRealtimeAppenderatorForTask​(SegmentLoaderConfig segmentLoaderConfig,
                                                              String taskId,
                                                              DataSchema schema,
                                                              AppenderatorConfig config,
                                                              FireDepartmentMetrics metrics,
                                                              org.apache.druid.segment.loading.DataSegmentPusher dataSegmentPusher,
                                                              com.fasterxml.jackson.databind.ObjectMapper objectMapper,
                                                              org.apache.druid.segment.IndexIO indexIO,
                                                              org.apache.druid.segment.IndexMerger indexMerger,
                                                              org.apache.druid.query.QueryRunnerFactoryConglomerate conglomerate,
                                                              DataSegmentAnnouncer segmentAnnouncer,
                                                              org.apache.druid.java.util.emitter.service.ServiceEmitter emitter,
                                                              org.apache.druid.query.QueryProcessingPool queryProcessingPool,
                                                              org.apache.druid.segment.join.JoinableFactory joinableFactory,
                                                              Cache cache,
                                                              CacheConfig cacheConfig,
                                                              CachePopulatorStats cachePopulatorStats,
                                                              org.apache.druid.segment.incremental.RowIngestionMeters rowIngestionMeters,
                                                              org.apache.druid.segment.incremental.ParseExceptionHandler parseExceptionHandler,
                                                              boolean useMaxMemoryEstimates,
                                                              CentralizedDatasourceSchemaConfig centralizedDatasourceSchemaConfig)
        Description copied from interface: AppenderatorsManager
        Creates an Appenderator suited for realtime ingestion. Note that this method's parameters include objects used for query processing.
        Specified by:
        createRealtimeAppenderatorForTask in interface AppenderatorsManager
      • createOfflineAppenderatorForTask

        public Appenderator createOfflineAppenderatorForTask​(String taskId,
                                                             DataSchema schema,
                                                             AppenderatorConfig config,
                                                             FireDepartmentMetrics metrics,
                                                             org.apache.druid.segment.loading.DataSegmentPusher dataSegmentPusher,
                                                             com.fasterxml.jackson.databind.ObjectMapper objectMapper,
                                                             org.apache.druid.segment.IndexIO indexIO,
                                                             org.apache.druid.segment.IndexMerger indexMerger,
                                                             org.apache.druid.segment.incremental.RowIngestionMeters rowIngestionMeters,
                                                             org.apache.druid.segment.incremental.ParseExceptionHandler parseExceptionHandler,
                                                             boolean useMaxMemoryEstimates)
        Specified by:
        createOfflineAppenderatorForTask in interface AppenderatorsManager
      • createOpenSegmentsOfflineAppenderatorForTask

        public Appenderator createOpenSegmentsOfflineAppenderatorForTask​(String taskId,
                                                                         DataSchema schema,
                                                                         AppenderatorConfig config,
                                                                         FireDepartmentMetrics metrics,
                                                                         org.apache.druid.segment.loading.DataSegmentPusher dataSegmentPusher,
                                                                         com.fasterxml.jackson.databind.ObjectMapper objectMapper,
                                                                         org.apache.druid.segment.IndexIO indexIO,
                                                                         org.apache.druid.segment.IndexMerger indexMerger,
                                                                         org.apache.druid.segment.incremental.RowIngestionMeters rowIngestionMeters,
                                                                         org.apache.druid.segment.incremental.ParseExceptionHandler parseExceptionHandler,
                                                                         boolean useMaxMemoryEstimates)
        Description copied from interface: AppenderatorsManager
        Creates an Appenderator suited for batch ingestion.
        Specified by:
        createOpenSegmentsOfflineAppenderatorForTask in interface AppenderatorsManager
      • createClosedSegmentsOfflineAppenderatorForTask

        public Appenderator createClosedSegmentsOfflineAppenderatorForTask​(String taskId,
                                                                           DataSchema schema,
                                                                           AppenderatorConfig config,
                                                                           FireDepartmentMetrics metrics,
                                                                           org.apache.druid.segment.loading.DataSegmentPusher dataSegmentPusher,
                                                                           com.fasterxml.jackson.databind.ObjectMapper objectMapper,
                                                                           org.apache.druid.segment.IndexIO indexIO,
                                                                           org.apache.druid.segment.IndexMerger indexMerger,
                                                                           org.apache.druid.segment.incremental.RowIngestionMeters rowIngestionMeters,
                                                                           org.apache.druid.segment.incremental.ParseExceptionHandler parseExceptionHandler,
                                                                           boolean useMaxMemoryEstimates)
        Specified by:
        createClosedSegmentsOfflineAppenderatorForTask in interface AppenderatorsManager
      • removeAppenderatorsForTask

        public void removeAppenderatorsForTask​(String taskId,
                                               String dataSource)
        Description copied from interface: AppenderatorsManager
        Removes any internal Appenderator-tracking state associated with the provided taskId. This method should be called when a task is finished using its Appenderators that were previously created by createRealtimeAppenderatorForTask or createOfflineAppenderatorForTask. The method can be called by the entity managing Tasks when the Tasks finish, such as ThreadingTaskRunner.
        Specified by:
        removeAppenderatorsForTask in interface AppenderatorsManager
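        The bookkeeping behind this method can be illustrated with a simplified model. This is a sketch under stated assumptions rather than the real implementation: BundleBookkeepingSketch, Bundle, register, and hasBundle are hypothetical names, and plain strings stand in for Appenderator instances. It shows only the synchronized datasource-to-bundle map, the per-bundle taskId map, and the removal of a bundle once its last task is gone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BundleBookkeepingSketch {
    /** Hypothetical stand-in for a datasource bundle: taskId -> appenderator handles. */
    static final class Bundle {
        final Map<String, List<String>> appenderatorsByTask = new HashMap<>();
    }

    // Datasource name -> bundle. All access goes through synchronized methods.
    private final Map<String, Bundle> bundles = new HashMap<>();

    /** Records an appenderator handle for a task, creating the bundle on first use. */
    public synchronized void register(String dataSource, String taskId, String appenderatorId) {
        bundles.computeIfAbsent(dataSource, ds -> new Bundle())
               .appenderatorsByTask
               .computeIfAbsent(taskId, id -> new ArrayList<>())
               .add(appenderatorId);
    }

    /** Drops all tracking state for the task; removes the bundle if no tasks remain. */
    public synchronized void removeAppenderatorsForTask(String taskId, String dataSource) {
        Bundle bundle = bundles.get(dataSource);
        if (bundle == null) {
            return; // nothing tracked for this datasource
        }
        bundle.appenderatorsByTask.remove(taskId);
        if (bundle.appenderatorsByTask.isEmpty()) {
            bundles.remove(dataSource);
        }
    }

    public synchronized boolean hasBundle(String dataSource) {
        return bundles.containsKey(dataSource);
    }

    public static void main(String[] args) {
        BundleBookkeepingSketch manager = new BundleBookkeepingSketch();
        manager.register("wiki", "task-1", "appenderator-a");
        manager.register("wiki", "task-2", "appenderator-b");
        manager.removeAppenderatorsForTask("task-1", "wiki");
        System.out.println(manager.hasBundle("wiki")); // true: task-2 still tracked
        manager.removeAppenderatorsForTask("task-2", "wiki");
        System.out.println(manager.hasBundle("wiki")); // false: bundle removed
    }
}
```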
      • getQueryRunnerForIntervals

        public <T> org.apache.druid.query.QueryRunner<T> getQueryRunnerForIntervals​(org.apache.druid.query.Query<T> query,
                                                                                    Iterable<org.joda.time.Interval> intervals)
        Description copied from interface: AppenderatorsManager
        Returns a query runner for the given intervals over the Appenderators managed by this AppenderatorsManager.
        Specified by:
        getQueryRunnerForIntervals in interface AppenderatorsManager
      • getQueryRunnerForSegments

        public <T> org.apache.druid.query.QueryRunner<T> getQueryRunnerForSegments​(org.apache.druid.query.Query<T> query,
                                                                                   Iterable<org.apache.druid.query.SegmentDescriptor> specs)
        Description copied from interface: AppenderatorsManager
        Returns a query runner for the given segment specs over the Appenderators managed by this AppenderatorsManager.
        Specified by:
        getQueryRunnerForSegments in interface AppenderatorsManager
      • shouldTaskMakeNodeAnnouncements

        public boolean shouldTaskMakeNodeAnnouncements()
        Description copied from interface: AppenderatorsManager
        As AppenderatorsManager implementations are service-dependent (i.e., Peons and Indexers have different implementations), this method allows Tasks to know whether they should announce themselves as nodes and segment servers to the rest of the cluster. Only Tasks running in Peons (i.e., as separate processes) should make their own individual node announcements.
        Specified by:
        shouldTaskMakeNodeAnnouncements in interface AppenderatorsManager
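        A task can branch on this flag at startup. The sketch below illustrates the pattern only: AnnouncementPolicy, PeonPolicy, IndexerPolicy, and startTask are hypothetical names, not Druid types. The return values follow the rule stated above, that only Tasks running in Peons make their own announcements.

```java
public class NodeAnnouncementSketch {
    /** Hypothetical slice of the manager interface that carries the flag. */
    public interface AnnouncementPolicy {
        boolean shouldTaskMakeNodeAnnouncements();
    }

    /** Peon-style policy: each task is its own process, so it announces itself. */
    public static final class PeonPolicy implements AnnouncementPolicy {
        @Override
        public boolean shouldTaskMakeNodeAnnouncements() {
            return true;
        }
    }

    /** Indexer-style policy: tasks share one process, so they do not announce individually. */
    public static final class IndexerPolicy implements AnnouncementPolicy {
        @Override
        public boolean shouldTaskMakeNodeAnnouncements() {
            return false;
        }
    }

    /** Returns a description of what a task would do under the given policy. */
    public static String startTask(AnnouncementPolicy policy) {
        return policy.shouldTaskMakeNodeAnnouncements()
            ? "task announces itself"
            : "task skips its own announcement";
    }

    public static void main(String[] args) {
        System.out.println(startTask(new PeonPolicy()));    // task announces itself
        System.out.println(startTask(new IndexerPolicy())); // task skips its own announcement
    }
}
```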