public class ExecutableStageDoFnOperator<InputT,OutputT> extends DoFnOperator<InputT,OutputT>
This operator is the streaming equivalent of the FlinkExecutableStageFunction. It sends all
received elements to the SDK harness and emits the elements it receives back to the downstream
operators. It also handles side inputs and state.
TODO Integrate support for progress updates and metrics
Nested classes/interfaces inherited from class DoFnOperator:
DoFnOperator.BufferedOutputManager<OutputT>, DoFnOperator.FlinkStepContext, DoFnOperator.MultiOutputOutputManagerFactory<OutputT>

Fields inherited from class DoFnOperator:
additionalOutputTags, bufferingDoFnRunner, doFn, doFnRunner, keyCoder, keyedStateInternals, mainOutputTag, outputManager, outputManagerFactory, pushbackDoFnRunner, serializedOptions, sideInputHandler, sideInputReader, sideInputs, sideInputTagMapping, stepName, timerInternals, timerService, windowingStrategy

| Constructor and Description |
|---|
ExecutableStageDoFnOperator(java.lang.String stepName,
org.apache.beam.sdk.coders.Coder<org.apache.beam.sdk.util.WindowedValue<InputT>> windowedInputCoder,
java.util.Map<org.apache.beam.sdk.values.TupleTag<?>,org.apache.beam.sdk.coders.Coder<?>> outputCoders,
org.apache.beam.sdk.values.TupleTag<OutputT> mainOutputTag,
java.util.List<org.apache.beam.sdk.values.TupleTag<?>> additionalOutputTags,
org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.OutputManagerFactory<OutputT> outputManagerFactory,
java.util.Map<java.lang.Integer,org.apache.beam.sdk.values.PCollectionView<?>> sideInputTagMapping,
java.util.Collection<org.apache.beam.sdk.values.PCollectionView<?>> sideInputs,
java.util.Map<org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.SideInputId,org.apache.beam.sdk.values.PCollectionView<?>> sideInputIds,
org.apache.beam.sdk.options.PipelineOptions options,
org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload payload,
org.apache.beam.runners.fnexecution.provisioning.JobInfo jobInfo,
FlinkExecutableStageContextFactory contextFactory,
java.util.Map<java.lang.String,org.apache.beam.sdk.values.TupleTag<?>> outputMap,
org.apache.beam.sdk.values.WindowingStrategy windowingStrategy,
org.apache.beam.sdk.coders.Coder keyCoder,
org.apache.flink.api.java.functions.KeySelector<org.apache.beam.sdk.util.WindowedValue<InputT>,?> keySelector)
Constructor.
|
| Modifier and Type | Method and Description |
|---|---|
protected void |
addSideInputValue(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.beam.sdk.transforms.join.RawUnionValue> streamRecord)
Add the side input value.
|
long |
applyInputWatermarkHold(long inputWatermark)
Allows applying a hold to the input watermark.
|
long |
applyOutputWatermarkHold(long currentOutputWatermark,
long potentialOutputWatermark)
Allows applying a hold to the output watermark before it is sent out.
|
void |
cleanUp()
Release all of the operator's resources.
|
protected org.apache.beam.runners.core.DoFnRunner<InputT,OutputT> |
createWrappingDoFnRunner(org.apache.beam.runners.core.DoFnRunner<InputT,OutputT> wrappedRunner,
org.apache.beam.runners.core.StepContext stepContext) |
protected void |
fireTimerInternal(java.nio.ByteBuffer key,
org.apache.beam.runners.core.TimerInternals.TimerData timer) |
void |
flushData()
Flush all remaining buffered data.
|
java.nio.ByteBuffer |
getCurrentKey() |
protected java.util.concurrent.locks.Lock |
getLockToAcquireForStateAccessDuringBundles()
Subclasses may provide a lock to ensure that the state backend is not accessed concurrently
during bundle execution.
|
void |
notifyCheckpointComplete(long checkpointId) |
void |
open() |
void |
setCurrentKey(java.lang.Object key)
We don't want to set anything here.
|
void |
setKeyContextElement1(org.apache.flink.streaming.runtime.streamrecord.StreamRecord record)
Note: This is only relevant when we have a stateful DoFn.
|

Methods inherited from class DoFnOperator:
fireTimer, getBundleFinalizer, getCurrentOutputWatermark, getDoFn, getEffectiveInputWatermark, initializeState, invokeFinishBundle, onEventTime, onProcessingTime, prepareSnapshotPreBarrier, processElement, processElement1, processElement2, processWatermark, processWatermark1, processWatermark2, setBundleFinishedCallback, setPreBundleCallback, setup, snapshotState

Methods inherited from class AbstractStreamOperatorCompat:
close, dispose, getTimeServiceManagerCompat

Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator:
getChainingStrategy, getContainingTask, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, numEventTimeTimers, numProcessingTimeTimers, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, reportOrForwardLatencyMarker, setChainingStrategy, setKeyContextElement2, setProcessingTimeService, snapshotState

Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from implemented interfaces:
processLatencyMarker
processLatencyMarker1, processLatencyMarker2

public ExecutableStageDoFnOperator(java.lang.String stepName,
org.apache.beam.sdk.coders.Coder<org.apache.beam.sdk.util.WindowedValue<InputT>> windowedInputCoder,
java.util.Map<org.apache.beam.sdk.values.TupleTag<?>,org.apache.beam.sdk.coders.Coder<?>> outputCoders,
org.apache.beam.sdk.values.TupleTag<OutputT> mainOutputTag,
java.util.List<org.apache.beam.sdk.values.TupleTag<?>> additionalOutputTags,
org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.OutputManagerFactory<OutputT> outputManagerFactory,
java.util.Map<java.lang.Integer,org.apache.beam.sdk.values.PCollectionView<?>> sideInputTagMapping,
java.util.Collection<org.apache.beam.sdk.values.PCollectionView<?>> sideInputs,
java.util.Map<org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.SideInputId,org.apache.beam.sdk.values.PCollectionView<?>> sideInputIds,
org.apache.beam.sdk.options.PipelineOptions options,
org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload payload,
org.apache.beam.runners.fnexecution.provisioning.JobInfo jobInfo,
FlinkExecutableStageContextFactory contextFactory,
java.util.Map<java.lang.String,org.apache.beam.sdk.values.TupleTag<?>> outputMap,
org.apache.beam.sdk.values.WindowingStrategy windowingStrategy,
org.apache.beam.sdk.coders.Coder keyCoder,
org.apache.flink.api.java.functions.KeySelector<org.apache.beam.sdk.util.WindowedValue<InputT>,?> keySelector)

protected java.util.concurrent.locks.Lock getLockToAcquireForStateAccessDuringBundles()
Description copied from class: DoFnOperator
Subclasses may provide a lock to ensure that the state backend is not accessed concurrently during bundle execution.
Overrides: getLockToAcquireForStateAccessDuringBundles in class DoFnOperator<InputT,OutputT>

public void open()
throws java.lang.Exception
Specified by: open in interface org.apache.flink.streaming.api.operators.StreamOperator<org.apache.beam.sdk.util.WindowedValue<OutputT>>
Overrides: open in class DoFnOperator<InputT,OutputT>
Throws: java.lang.Exception

public final void notifyCheckpointComplete(long checkpointId)
throws java.lang.Exception
Specified by: notifyCheckpointComplete in interface org.apache.flink.runtime.state.CheckpointListener
Overrides: notifyCheckpointComplete in class DoFnOperator<InputT,OutputT>
Throws: java.lang.Exception

public void setKeyContextElement1(org.apache.flink.streaming.runtime.streamrecord.StreamRecord record)
Note: This is only relevant when we have a stateful DoFn.
Specified by: setKeyContextElement1 in interface org.apache.flink.streaming.api.operators.StreamOperator<org.apache.beam.sdk.util.WindowedValue<OutputT>>
Overrides: setKeyContextElement1 in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<org.apache.beam.sdk.util.WindowedValue<OutputT>>

public void setCurrentKey(java.lang.Object key)
We don't want to set anything here. The Flink runtime normally sets the current key before calling
processElement, but this does not work when sending elements to the SDK harness, which may
process them at an arbitrary later point in time. State for keys is also accessed asynchronously
via state requests.
We set the key only where it is required: 1) state requests and 2) timers (setting and firing).
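
The following is a minimal sketch of the idea described above, not code taken from Beam or Flink. It assumes a hypothetical class `AsyncKeyedStateSketch` in which `setCurrentKey` is a no-op and the key is instead set explicitly, under a lock, each time a state request arrives. The in-memory counter map and the `incrementAndGet` and `encode` helpers are illustrative stand-ins for a keyed state backend.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model: the key is chosen per state request, not per incoming element. */
final class AsyncKeyedStateSketch {
  // Guards the "state backend" so that requests never interleave.
  private final Lock stateLock = new ReentrantLock();
  // Stand-in for keyed state: one counter per encoded key.
  private final Map<ByteBuffer, Long> counters = new HashMap<>();
  // Key of the most recent state request; null until the first request.
  private ByteBuffer currentKey;

  /** Called by the runtime for each element; deliberately ignored. */
  void setCurrentKey(Object key) {
    // No-op: the element may be processed much later, so this key would be stale.
  }

  /** Handles one asynchronous state request for an explicitly encoded key. */
  long incrementAndGet(ByteBuffer encodedKey) {
    stateLock.lock();
    try {
      currentKey = encodedKey;
      return counters.merge(currentKey, 1L, Long::sum);
    } finally {
      stateLock.unlock();
    }
  }

  ByteBuffer getCurrentKey() {
    return currentKey;
  }

  /** Illustrative key encoding; a real runner would use the key coder. */
  static ByteBuffer encode(String key) {
    return ByteBuffer.wrap(key.getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) {
    AsyncKeyedStateSketch state = new AsyncKeyedStateSketch();
    state.setCurrentKey("ignored");
    System.out.println(state.incrementAndGet(encode("a")));
    System.out.println(state.incrementAndGet(encode("a")));
    System.out.println(state.incrementAndGet(encode("b")));
  }
}
```

Setting the key inside the locked section keeps the key and the state access together, which is the property that a per-element `setCurrentKey` call cannot provide when processing is asynchronous.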
Specified by: setCurrentKey in interface org.apache.flink.streaming.api.operators.KeyContext
Overrides: setCurrentKey in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<org.apache.beam.sdk.util.WindowedValue<OutputT>>

public java.nio.ByteBuffer getCurrentKey()
Specified by: getCurrentKey in interface org.apache.flink.streaming.api.operators.KeyContext
Overrides: getCurrentKey in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<org.apache.beam.sdk.util.WindowedValue<OutputT>>

protected void fireTimerInternal(java.nio.ByteBuffer key,
org.apache.beam.runners.core.TimerInternals.TimerData timer)
Overrides: fireTimerInternal in class DoFnOperator<InputT,OutputT>

public void flushData()
throws java.lang.Exception
Description copied from class: AbstractStreamOperatorCompat
Flush all remaining buffered data.
Throws: java.lang.Exception

public void cleanUp()
throws java.lang.Exception
Description copied from class: AbstractStreamOperatorCompat
Release all of the operator's resources.
Throws: java.lang.Exception

protected void addSideInputValue(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.beam.sdk.transforms.join.RawUnionValue> streamRecord)
Description copied from class: DoFnOperator
Add the side input value. This assumes that the view has already been materialized and is delivered as an Iterable. Subclasses may elect to perform materialization in
state and receive side input incrementally instead.
Overrides: addSideInputValue in class DoFnOperator<InputT,OutputT>

protected org.apache.beam.runners.core.DoFnRunner<InputT,OutputT> createWrappingDoFnRunner(org.apache.beam.runners.core.DoFnRunner<InputT,OutputT> wrappedRunner, org.apache.beam.runners.core.StepContext stepContext)
Overrides: createWrappingDoFnRunner in class DoFnOperator<InputT,OutputT>

public long applyInputWatermarkHold(long inputWatermark)
Description copied from class: DoFnOperator
Allows applying a hold to the input watermark.
Overrides: applyInputWatermarkHold in class DoFnOperator<InputT,OutputT>

public long applyOutputWatermarkHold(long currentOutputWatermark,
long potentialOutputWatermark)
Description copied from class: DoFnOperator
Allows applying a hold to the output watermark before it is sent out.
Overrides: applyOutputWatermarkHold in class DoFnOperator<InputT,OutputT>
Parameters:
currentOutputWatermark - the current output watermark
potentialOutputWatermark - the potential new output watermark, which can be adjusted if
needed. The input watermark hold has already been applied.
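
This page does not say when the operator actually holds the output watermark. One plausible policy, given that elements are processed asynchronously by the SDK harness, is to keep the current output watermark while a bundle is still in progress and release the candidate once the bundle finishes. The sketch below illustrates that policy only; `OutputWatermarkHoldSketch`, `startBundle`, and `finishBundle` are hypothetical names and do not reflect the operator's implementation.

```java
/** Hypothetical model of an output watermark hold tied to bundle boundaries. */
final class OutputWatermarkHoldSketch {
  private boolean bundleInProgress;
  // Largest candidate watermark seen while a bundle was open.
  private long heldBackWatermark = Long.MIN_VALUE;

  void startBundle() {
    bundleInProgress = true;
  }

  /** Returns the watermark that may be emitted downstream right now. */
  long applyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark) {
    if (bundleInProgress) {
      // Results may still be pending, so remember the candidate and keep the old value.
      heldBackWatermark = Math.max(heldBackWatermark, potentialOutputWatermark);
      return currentOutputWatermark;
    }
    return potentialOutputWatermark;
  }

  /** Ends the bundle and returns the held-back watermark, or Long.MIN_VALUE if none. */
  long finishBundle() {
    bundleInProgress = false;
    long released = heldBackWatermark;
    heldBackWatermark = Long.MIN_VALUE;
    return released;
  }

  public static void main(String[] args) {
    OutputWatermarkHoldSketch hold = new OutputWatermarkHoldSketch();
    System.out.println(hold.applyOutputWatermarkHold(10L, 20L)); // no bundle open
    hold.startBundle();
    System.out.println(hold.applyOutputWatermarkHold(20L, 30L)); // bundle open
    System.out.println(hold.finishBundle());
  }
}
```

Holding the watermark this way avoids advancing event time past elements whose results have not yet come back, at the cost of delaying downstream progress until the bundle completes.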