@DoFn.UnboundedPerElement public class DetectNewPartitionsDoFn extends org.apache.beam.sdk.transforms.DoFn<PartitionMetadata,PartitionMetadata>
Finds partitions in the PartitionMetadata.State.CREATED state, updates their state to
PartitionMetadata.State.SCHEDULED, and outputs them to the next stage in the pipeline.

Nested classes/interfaces inherited from class org.apache.beam.sdk.transforms.DoFn:
DoFn.AlwaysFetched, DoFn.BoundedPerElement, DoFn.BundleFinalizer, DoFn.Element, DoFn.FieldAccess, DoFn.FinishBundle, DoFn.FinishBundleContext, DoFn.GetInitialRestriction, DoFn.GetInitialWatermarkEstimatorState, DoFn.GetRestrictionCoder, DoFn.GetSize, DoFn.GetWatermarkEstimatorStateCoder, DoFn.Key, DoFn.MultiOutputReceiver, DoFn.NewTracker, DoFn.NewWatermarkEstimator, DoFn.OnTimer, DoFn.OnTimerContext, DoFn.OnTimerFamily, DoFn.OnWindowExpiration, DoFn.OnWindowExpirationContext, DoFn.OutputReceiver<T>, DoFn.ProcessContext, DoFn.ProcessContinuation, DoFn.ProcessElement, DoFn.RequiresStableInput, DoFn.RequiresTimeSortedInput, DoFn.Restriction, DoFn.Setup, DoFn.SideInput, DoFn.SplitRestriction, DoFn.StartBundle, DoFn.StartBundleContext, DoFn.StateId, DoFn.Teardown, DoFn.TimerFamily, DoFn.TimerId, DoFn.Timestamp, DoFn.TruncateRestriction, DoFn.UnboundedPerElement, DoFn.WatermarkEstimatorState, DoFn.WindowedContext

| Constructor and Description |
|---|
| DetectNewPartitionsDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ActionFactory actionFactory, ChangeStreamMetrics metrics): This class needs a DaoFactory to build DAOs to access the partition metadata tables. |

| Modifier and Type | Method and Description |
|---|---|
| org.joda.time.Instant | getInitialWatermarkEstimatorState(PartitionMetadata partition) |
| TimestampRange | initialRestriction(PartitionMetadata partition): Uses a TimestampRange with a max range. |
| org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> | newWatermarkEstimator(org.joda.time.Instant watermarkEstimatorState) |
| org.apache.beam.sdk.transforms.DoFn.ProcessContinuation | processElement(org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<PartitionMetadata> receiver, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator): Main processing function for the DetectNewPartitionsDoFn function. |
| DetectNewPartitionsRangeTracker | restrictionTracker(TimestampRange restriction) |
| void | setup(): Obtains the instance of DetectNewPartitionsAction. |

public DetectNewPartitionsDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ActionFactory actionFactory, ChangeStreamMetrics metrics)
This class needs a DaoFactory to build DAOs to access the partition metadata tables. It
uses mappers to transform database rows into the PartitionMetadata model. It builds the
delegating action class using the ActionFactory. It emits metrics for the partitions
read using the ChangeStreamMetrics. It reschedules the process element function to be
executed according to the default resume interval, DEFAULT_RESUME_DURATION (best effort).
Parameters:
daoFactory - the DaoFactory to construct PartitionMetadataDaos
mapperFactory - the MapperFactory to construct PartitionMetadataMappers
actionFactory - the ActionFactory to construct actions
metrics - the ChangeStreamMetrics to emit partition-related metrics
@DoFn.GetInitialWatermarkEstimatorState public org.joda.time.Instant getInitialWatermarkEstimatorState(@DoFn.Element PartitionMetadata partition)
@DoFn.NewWatermarkEstimator public org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> newWatermarkEstimator(@DoFn.WatermarkEstimatorState org.joda.time.Instant watermarkEstimatorState)
@DoFn.GetInitialRestriction public TimestampRange initialRestriction(@DoFn.Element PartitionMetadata partition)
Uses a TimestampRange with a max range. This is because it does not know beforehand
how many partitions it will schedule.
@DoFn.NewTracker public DetectNewPartitionsRangeTracker restrictionTracker(@DoFn.Restriction TimestampRange restriction)
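The idea behind the max-range restriction in initialRestriction can be pictured with a small stand-alone model. This is only a sketch under assumptions: MaxRangeSketch, withMaxRange, and tryClaim are hypothetical names, timestamps are modeled as plain long values, and the claim rule shown here is not taken from DetectNewPartitionsRangeTracker or TimestampRange.

```java
// Hypothetical model of a restriction with an open-ended ("max") upper bound.
// It is not Beam's TimestampRange and not DetectNewPartitionsRangeTracker.
final class MaxRangeSketch {
  private final long to; // exclusive upper bound of the range
  private long lastClaimed; // most recent successfully claimed position

  MaxRangeSketch(long from, long to) {
    if (from > to) {
      throw new IllegalArgumentException("from must not exceed to");
    }
    this.to = to;
    this.lastClaimed = from;
  }

  // Builds a range whose end is the largest representable value, so the
  // amount of work does not need to be known in advance.
  static MaxRangeSketch withMaxRange(long from) {
    return new MaxRangeSketch(from, Long.MAX_VALUE);
  }

  // Claims a position if it does not move backwards and is inside the range.
  boolean tryClaim(long position) {
    if (position < lastClaimed || position >= to) {
      return false;
    }
    lastClaimed = position;
    return true;
  }

  public static void main(String[] args) {
    MaxRangeSketch bounded = new MaxRangeSketch(0L, 10L);
    System.out.println(bounded.tryClaim(5L)); // true
    System.out.println(bounded.tryClaim(10L)); // false: the end is exclusive

    MaxRangeSketch unbounded = MaxRangeSketch.withMaxRange(0L);
    System.out.println(unbounded.tryClaim(1_000_000_000_000L)); // true
  }
}
```

With a bounded range the caller has to pick an end up front; with the max range every later position remains claimable, which matches the stated reason that the number of partitions to schedule is not known beforehand.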
@DoFn.Setup public void setup()
Obtains the instance of DetectNewPartitionsAction.
@DoFn.ProcessElement public org.apache.beam.sdk.transforms.DoFn.ProcessContinuation processElement(org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<PartitionMetadata> receiver, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator)
Main processing function for the DetectNewPartitionsDoFn function. It will delegate to
the DetectNewPartitionsAction class.
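
The scheduling step described for this class (take partitions in the CREATED state, mark them SCHEDULED, and pass them downstream) can be illustrated without Beam or Spanner. The sketch below is a simplified in-memory model, not the logic of DetectNewPartitionsAction: PartitionSchedulerSketch, its State enum, and all of its methods are hypothetical, and the metadata table is assumed to be a simple map from partition token to state.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the partition metadata table.
// It does not use the Beam or Spanner APIs.
final class PartitionSchedulerSketch {

  enum State { CREATED, SCHEDULED }

  // Maps a partition token to its current state, in insertion order.
  private final Map<String, State> table = new LinkedHashMap<>();

  void addCreated(String token) {
    table.put(token, State.CREATED);
  }

  State stateOf(String token) {
    return table.get(token);
  }

  // One scan: find CREATED partitions, mark them SCHEDULED, and return
  // their tokens so the caller can output them to the next stage.
  List<String> scheduleCreatedPartitions() {
    List<String> scheduled = new ArrayList<>();
    for (Map.Entry<String, State> entry : table.entrySet()) {
      if (entry.getValue() == State.CREATED) {
        entry.setValue(State.SCHEDULED);
        scheduled.add(entry.getKey());
      }
    }
    return scheduled;
  }

  public static void main(String[] args) {
    PartitionSchedulerSketch sketch = new PartitionSchedulerSketch();
    sketch.addCreated("partition-a");
    sketch.addCreated("partition-b");
    System.out.println(sketch.scheduleCreatedPartitions()); // [partition-a, partition-b]
    System.out.println(sketch.scheduleCreatedPartitions()); // []
  }
}
```

The second scan returns nothing because both partitions have already left the CREATED state, so a partition is output at most once in this model even when the scan is repeated at each resume interval.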