@DoFn.UnboundedPerElement public class ReadChangeStreamPartitionDoFn extends org.apache.beam.sdk.transforms.DoFn<PartitionMetadata,DataChangeRecord> implements java.io.Serializable
The processing of a partition is delegated to the QueryChangeStreamAction.
org.apache.beam.sdk.transforms.DoFn.AlwaysFetched, org.apache.beam.sdk.transforms.DoFn.BoundedPerElement, org.apache.beam.sdk.transforms.DoFn.BundleFinalizer, org.apache.beam.sdk.transforms.DoFn.Element, org.apache.beam.sdk.transforms.DoFn.FieldAccess, org.apache.beam.sdk.transforms.DoFn.FinishBundle, org.apache.beam.sdk.transforms.DoFn.FinishBundleContext, org.apache.beam.sdk.transforms.DoFn.GetInitialRestriction, org.apache.beam.sdk.transforms.DoFn.GetInitialWatermarkEstimatorState, org.apache.beam.sdk.transforms.DoFn.GetRestrictionCoder, org.apache.beam.sdk.transforms.DoFn.GetSize, org.apache.beam.sdk.transforms.DoFn.GetWatermarkEstimatorStateCoder, org.apache.beam.sdk.transforms.DoFn.Key, org.apache.beam.sdk.transforms.DoFn.MultiOutputReceiver, org.apache.beam.sdk.transforms.DoFn.NewTracker, org.apache.beam.sdk.transforms.DoFn.NewWatermarkEstimator, org.apache.beam.sdk.transforms.DoFn.OnTimer, org.apache.beam.sdk.transforms.DoFn.OnTimerContext, org.apache.beam.sdk.transforms.DoFn.OnTimerFamily, org.apache.beam.sdk.transforms.DoFn.OnWindowExpiration, org.apache.beam.sdk.transforms.DoFn.OnWindowExpirationContext, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<T>, org.apache.beam.sdk.transforms.DoFn.ProcessContext, org.apache.beam.sdk.transforms.DoFn.ProcessContinuation, org.apache.beam.sdk.transforms.DoFn.ProcessElement, org.apache.beam.sdk.transforms.DoFn.RequiresStableInput, org.apache.beam.sdk.transforms.DoFn.RequiresTimeSortedInput, org.apache.beam.sdk.transforms.DoFn.Restriction, org.apache.beam.sdk.transforms.DoFn.Setup, org.apache.beam.sdk.transforms.DoFn.SideInput, org.apache.beam.sdk.transforms.DoFn.SplitRestriction, org.apache.beam.sdk.transforms.DoFn.StartBundle, org.apache.beam.sdk.transforms.DoFn.StartBundleContext, org.apache.beam.sdk.transforms.DoFn.StateId, org.apache.beam.sdk.transforms.DoFn.Teardown, org.apache.beam.sdk.transforms.DoFn.TimerFamily, org.apache.beam.sdk.transforms.DoFn.TimerId, org.apache.beam.sdk.transforms.DoFn.Timestamp, 
org.apache.beam.sdk.transforms.DoFn.TruncateRestriction, org.apache.beam.sdk.transforms.DoFn.UnboundedPerElement, org.apache.beam.sdk.transforms.DoFn.WatermarkEstimatorState, org.apache.beam.sdk.transforms.DoFn.WindowedContext

| Constructor and Description |
|---|
| ReadChangeStreamPartitionDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ActionFactory actionFactory, ChangeStreamMetrics metrics): This class needs a DaoFactory to build DAOs to access the partition metadata tables and to perform the change streams query. |


| Modifier and Type | Method and Description |
|---|---|
| org.joda.time.Instant | getInitialWatermarkEstimatorState(PartitionMetadata partition) |
| double | getSize(PartitionMetadata partition, TimestampRange range) |
| TimestampRange | initialRestriction(PartitionMetadata partition): The restriction for a partition will be defined from the start and end timestamp to query the partition for. |
| ReadChangeStreamPartitionRangeTracker | newTracker(PartitionMetadata partition, TimestampRange range) |
| org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> | newWatermarkEstimator(org.joda.time.Instant watermarkEstimatorState) |
| org.apache.beam.sdk.transforms.DoFn.ProcessContinuation | processElement(PartitionMetadata partition, org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<DataChangeRecord> receiver, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator, org.apache.beam.sdk.transforms.DoFn.BundleFinalizer bundleFinalizer): Performs a change stream query for a given partition. |
| void | setThroughputEstimator(BytesThroughputEstimator<DataChangeRecord> throughputEstimator): Sets the estimator to calculate the backlog of this function. |
| void | setup(): Constructs instances for the PartitionMetadataDao, ChangeStreamDao, ChangeStreamRecordMapper, PartitionMetadataMapper, DataChangeRecordAction, HeartbeatRecordAction, ChildPartitionsRecordAction and QueryChangeStreamAction. |

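The setup() entry names one action per kind of change stream record (DataChangeRecordAction, HeartbeatRecordAction, ChildPartitionsRecordAction). One plausible reading is that each record is routed to the action matching its type. The sketch below only illustrates that routing idea: the nested record classes and the actionFor helper are hypothetical stand-ins, not connector classes, and the method returns action names rather than invoking real actions.

```java
/** Simplified routing model; the record classes here are hypothetical stand-ins. */
final class RecordDispatchSketch {

  /** Hypothetical marker type for anything read from a change stream. */
  interface StreamRecord {}

  static final class DataChange implements StreamRecord {}

  static final class Heartbeat implements StreamRecord {}

  static final class ChildPartitions implements StreamRecord {}

  /** Returns the name of the action that would handle the given record in this sketch. */
  static String actionFor(StreamRecord record) {
    if (record instanceof DataChange) {
      return "DataChangeRecordAction";
    }
    if (record instanceof Heartbeat) {
      return "HeartbeatRecordAction";
    }
    if (record instanceof ChildPartitions) {
      return "ChildPartitionsRecordAction";
    }
    // Assumption: unknown record types are treated as an error in this sketch.
    throw new IllegalArgumentException("Unknown record type: " + record.getClass().getName());
  }

  public static void main(String[] args) {
    StreamRecord[] records = {new DataChange(), new Heartbeat(), new ChildPartitions()};
    for (StreamRecord record : records) {
      System.out.println(actionFor(record));
    }
  }
}
```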
public ReadChangeStreamPartitionDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ActionFactory actionFactory, ChangeStreamMetrics metrics)
This class needs a DaoFactory to build DAOs to access the partition metadata tables and to perform the change streams query. It uses mappers to transform database rows into the ChangeStreamRecord model. It uses the ActionFactory to construct the action dispatchers, which will perform the change stream query and process each type of record received. It emits metrics for the partition using the ChangeStreamMetrics.
Parameters:
daoFactory - the DaoFactory to construct PartitionMetadataDaos and ChangeStreamDaos
mapperFactory - the MapperFactory to construct ChangeStreamRecordMappers
actionFactory - the ActionFactory to construct actions
metrics - the ChangeStreamMetrics to emit partition-related metrics
@DoFn.GetInitialWatermarkEstimatorState
public org.joda.time.Instant getInitialWatermarkEstimatorState(@DoFn.Element
PartitionMetadata partition)
@DoFn.NewWatermarkEstimator
public org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> newWatermarkEstimator(@DoFn.WatermarkEstimatorState
org.joda.time.Instant watermarkEstimatorState)
@DoFn.GetInitialRestriction public TimestampRange initialRestriction(@DoFn.Element PartitionMetadata partition)
The restriction for a partition will be defined from the start and end timestamp to query the partition for. The TimestampRange restriction represents a closed-open interval, while the start / end timestamps represent a closed-closed interval, so we add 1 nanosecond to the end timestamp to convert it to closed-open.
In this function we also update the partition state to PartitionMetadata.State#RUNNING.
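The interval conversion described above can be sketched with plain JDK types. This is a minimal illustration, assuming java.time.Instant as a stand-in for the connector's timestamp type. ClosedOpenRangeSketch and its methods are hypothetical names, not the connector's TimestampRange API.

```java
import java.time.Instant;

/** Simplified stand-in for a closed-open timestamp range; not the connector's TimestampRange. */
final class ClosedOpenRangeSketch {
  final Instant from; // inclusive lower bound
  final Instant to; // exclusive upper bound

  ClosedOpenRangeSketch(Instant from, Instant to) {
    this.from = from;
    this.to = to;
  }

  /** Builds a closed-open range from closed-closed bounds by adding 1 nanosecond to the end. */
  static ClosedOpenRangeSketch fromClosedClosed(Instant start, Instant end) {
    return new ClosedOpenRangeSketch(start, end.plusNanos(1));
  }

  /** True when from <= t < to. */
  boolean contains(Instant t) {
    return !t.isBefore(from) && t.isBefore(to);
  }

  public static void main(String[] args) {
    Instant start = Instant.parse("2022-01-01T00:00:00Z");
    Instant end = Instant.parse("2022-01-01T00:00:10Z");
    ClosedOpenRangeSketch range = fromClosedClosed(start, end);
    // The inclusive end timestamp is still covered by the closed-open range.
    System.out.println(range.contains(end));
    // The first instant after the end timestamp falls outside the range.
    System.out.println(range.contains(end.plusNanos(1)));
  }
}
```

Without the extra nanosecond, a record stamped exactly at the end timestamp would fall outside the closed-open range.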
Parameters:
partition - the partition to be queried
@DoFn.GetSize
public double getSize(@DoFn.Element
PartitionMetadata partition,
@DoFn.Restriction
TimestampRange range)
throws java.lang.Exception
Throws:
java.lang.Exception
@DoFn.NewTracker public ReadChangeStreamPartitionRangeTracker newTracker(@DoFn.Element PartitionMetadata partition, @DoFn.Restriction TimestampRange range)
@DoFn.Setup public void setup()
Constructs instances for the PartitionMetadataDao, ChangeStreamDao, ChangeStreamRecordMapper, PartitionMetadataMapper, DataChangeRecordAction, HeartbeatRecordAction, ChildPartitionsRecordAction and QueryChangeStreamAction.
@DoFn.ProcessElement
public org.apache.beam.sdk.transforms.DoFn.ProcessContinuation processElement(@DoFn.Element
PartitionMetadata partition,
org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker,
org.apache.beam.sdk.transforms.DoFn.OutputReceiver<DataChangeRecord> receiver,
org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator,
org.apache.beam.sdk.transforms.DoFn.BundleFinalizer bundleFinalizer)
Performs a change stream query for a given partition. The processing of a partition is delegated to the QueryChangeStreamAction.
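processElement is documented to return ProcessContinuation#stop() both when a record timestamp cannot be claimed and when the partition has finished. The sketch below models that control flow in a simplified, self-contained form. RangeTracker, StopReason and process are hypothetical names that use plain long timestamps. They are not Beam or connector classes, and the real tracker may behave differently in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model of claiming record timestamps; none of these types are Beam classes. */
final class ClaimLoopSketch {

  /** Why processing stopped; the documentation maps both cases to ProcessContinuation#stop(). */
  enum StopReason {
    CLAIM_FAILED,
    PARTITION_FINISHED
  }

  /** Hypothetical tracker over the closed-open range [from, to), using long timestamps. */
  static final class RangeTracker {
    private final long from;
    private final long to;

    RangeTracker(long from, long to) {
      this.from = from;
      this.to = to;
    }

    /** Returns true only if the timestamp lies inside [from, to). */
    boolean tryClaim(long timestamp) {
      return timestamp >= from && timestamp < to;
    }
  }

  /** Emits records in order and stops at the first timestamp that cannot be claimed. */
  static StopReason process(RangeTracker tracker, List<Long> recordTimestamps, List<Long> emitted) {
    for (long timestamp : recordTimestamps) {
      if (!tracker.tryClaim(timestamp)) {
        return StopReason.CLAIM_FAILED;
      }
      emitted.add(timestamp);
    }
    // All records were claimed, so this sketch treats the partition as finished.
    return StopReason.PARTITION_FINISHED;
  }

  public static void main(String[] args) {
    List<Long> emitted = new ArrayList<>();
    StopReason reason =
        process(new RangeTracker(0L, 100L), Arrays.asList(10L, 50L, 100L, 120L), emitted);
    System.out.println(reason + " after emitting " + emitted);
  }
}
```

In this model a record is emitted only after its timestamp has been claimed, so nothing at or beyond the exclusive end of the restriction is output.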
Parameters:
partition - the partition to be queried
tracker - an instance of ReadChangeStreamPartitionRangeTracker
receiver - a DataChangeRecord OutputReceiver
watermarkEstimator - a ManualWatermarkEstimator of Instant
bundleFinalizer - the bundle finalizer
Returns:
ProcessContinuation#stop() if a record timestamp could not be claimed or if the partition processing has finished
public void setThroughputEstimator(BytesThroughputEstimator<DataChangeRecord> throughputEstimator)
Sets the estimator to calculate the backlog of this function.
Parameters:
throughputEstimator - an estimator to calculate local throughput.