Class MultiDoFnFunction<InputT,​OutputT>

  • Type Parameters:
    InputT - Input element type of the wrapped DoFn.
    OutputT - Main output element type of the wrapped DoFn.
    All Implemented Interfaces:
    java.io.Serializable, org.apache.spark.api.java.function.PairFlatMapFunction<java.util.Iterator<org.apache.beam.sdk.util.WindowedValue<InputT>>,​org.apache.beam.sdk.values.TupleTag<?>,​org.apache.beam.sdk.util.WindowedValue<?>>

    public class MultiDoFnFunction<InputT,​OutputT>
    extends java.lang.Object
    implements org.apache.spark.api.java.function.PairFlatMapFunction<java.util.Iterator<org.apache.beam.sdk.util.WindowedValue<InputT>>,​org.apache.beam.sdk.values.TupleTag<?>,​org.apache.beam.sdk.util.WindowedValue<?>>
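    A plain DoFn function ignores any output that is not the main output. MultiDoFnFunction also handles additional outputs: it pairs every emitted WindowedValue with the TupleTag of the output it belongs to, so that the main and additional outputs can be told apart afterwards.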
    See Also:
    Serialized Form
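
The tagging idea described above can be sketched in plain Java without Beam or Spark on the classpath. This is only an illustration under stated assumptions, not the class's actual implementation: the name TaggedOutputSketch, the method tagAll, and the use of String labels in place of TupleTag and Integer in place of WindowedValue are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class TaggedOutputSketch {
    // Hypothetical stand-ins for TupleTags: plain labels.
    static final String EVEN = "even"; // plays the role of the main output tag
    static final String ODD = "odd";   // plays the role of an additional output tag

    // Mirrors the shape of call(): one partition's elements in, (tag, value) pairs out.
    static Iterator<Map.Entry<String, Integer>> tagAll(Iterator<Integer> partition) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        while (partition.hasNext()) {
            int value = partition.next();
            // Each element is paired with the tag of the output it belongs to.
            out.add(Map.entry(value % 2 == 0 ? EVEN : ODD, value));
        }
        return out.iterator();
    }

    public static void main(String[] args) {
        Iterator<Map.Entry<String, Integer>> tagged = tagAll(List.of(1, 2, 3, 4).iterator());
        while (tagged.hasNext()) {
            Map.Entry<String, Integer> e = tagged.next();
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Because every element carries its tag, a single stream can hold all outputs at once, and nothing emitted to an additional output is lost.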
    • Constructor Summary

      Constructors 
      Constructor Description
      MultiDoFnFunction​(MetricsContainerStepMapAccumulator metricsAccum, java.lang.String stepName, org.apache.beam.sdk.transforms.DoFn<InputT,​OutputT> doFn, org.apache.beam.runners.core.construction.SerializablePipelineOptions options, org.apache.beam.sdk.values.TupleTag<OutputT> mainOutputTag, java.util.List<org.apache.beam.sdk.values.TupleTag<?>> additionalOutputTags, org.apache.beam.sdk.coders.Coder<InputT> inputCoder, java.util.Map<org.apache.beam.sdk.values.TupleTag<?>,​org.apache.beam.sdk.coders.Coder<?>> outputCoders, java.util.Map<org.apache.beam.sdk.values.TupleTag<?>,​org.apache.beam.sdk.values.KV<org.apache.beam.sdk.values.WindowingStrategy<?,​?>,​SideInputBroadcast<?>>> sideInputs, org.apache.beam.sdk.values.WindowingStrategy<?,​?> windowingStrategy, boolean stateful, org.apache.beam.sdk.transforms.DoFnSchemaInformation doFnSchemaInformation, java.util.Map<java.lang.String,​org.apache.beam.sdk.values.PCollectionView<?>> sideInputMapping, boolean useBoundedConcurrentOutput)  
    • Method Summary

      Modifier and Type Method Description
      java.util.Iterator<scala.Tuple2<org.apache.beam.sdk.values.TupleTag<?>,​org.apache.beam.sdk.util.WindowedValue<?>>> call​(java.util.Iterator<org.apache.beam.sdk.util.WindowedValue<InputT>> iter)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • MultiDoFnFunction

        public MultiDoFnFunction​(MetricsContainerStepMapAccumulator metricsAccum,
                                 java.lang.String stepName,
                                 org.apache.beam.sdk.transforms.DoFn<InputT,​OutputT> doFn,
                                 org.apache.beam.runners.core.construction.SerializablePipelineOptions options,
                                 org.apache.beam.sdk.values.TupleTag<OutputT> mainOutputTag,
                                 java.util.List<org.apache.beam.sdk.values.TupleTag<?>> additionalOutputTags,
                                 org.apache.beam.sdk.coders.Coder<InputT> inputCoder,
                                 java.util.Map<org.apache.beam.sdk.values.TupleTag<?>,​org.apache.beam.sdk.coders.Coder<?>> outputCoders,
                                 java.util.Map<org.apache.beam.sdk.values.TupleTag<?>,​org.apache.beam.sdk.values.KV<org.apache.beam.sdk.values.WindowingStrategy<?,​?>,​SideInputBroadcast<?>>> sideInputs,
                                 org.apache.beam.sdk.values.WindowingStrategy<?,​?> windowingStrategy,
                                 boolean stateful,
                                 org.apache.beam.sdk.transforms.DoFnSchemaInformation doFnSchemaInformation,
                                 java.util.Map<java.lang.String,​org.apache.beam.sdk.values.PCollectionView<?>> sideInputMapping,
                                 boolean useBoundedConcurrentOutput)
        Parameters:
        metricsAccum - The Spark AccumulatorV2 that backs the Beam metrics.
        doFn - The DoFn to be wrapped.
        options - The SerializablePipelineOptions.
        mainOutputTag - The main output TupleTag.
        additionalOutputTags - Additional output tags.
        inputCoder - The coder for the input.
        outputCoders - A map of all output coders.
        sideInputs - Side inputs used in this DoFn.
        windowingStrategy - Input WindowingStrategy.
        stateful - Whether the DoFn is stateful.
        useBoundedConcurrentOutput - Whether to use bounded output for processing.
    • Method Detail

      • call

        public java.util.Iterator<scala.Tuple2<org.apache.beam.sdk.values.TupleTag<?>,​org.apache.beam.sdk.util.WindowedValue<?>>> call​(java.util.Iterator<org.apache.beam.sdk.util.WindowedValue<InputT>> iter)
                throws java.lang.Exception
        Specified by:
        call in interface org.apache.spark.api.java.function.PairFlatMapFunction<java.util.Iterator<org.apache.beam.sdk.util.WindowedValue<InputT>>,​org.apache.beam.sdk.values.TupleTag<?>,​org.apache.beam.sdk.util.WindowedValue<?>>
        Throws:
        java.lang.Exception
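
Since call returns (TupleTag, WindowedValue) pairs, a consumer has to select the pairs for one tag to recover a single output. A minimal plain-Java sketch of that selection follows. The names SplitByTagSketch and valuesFor are hypothetical, String labels stand in for TupleTag, and nothing here is taken from the Spark runner's own code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SplitByTagSketch {
    // Keeps only the values whose tag matches, preserving their order.
    static <T> List<T> valuesFor(List<Map.Entry<String, T>> tagged, String tag) {
        List<T> out = new ArrayList<>();
        for (Map.Entry<String, T> e : tagged) {
            if (e.getKey().equals(tag)) {
                out.add(e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical tagged stream holding a main and an additional output.
        List<Map.Entry<String, Integer>> tagged = List.of(
                Map.entry("main", 1),
                Map.entry("errors", 2),
                Map.entry("main", 3));
        System.out.println(valuesFor(tagged, "main"));   // prints [1, 3]
        System.out.println(valuesFor(tagged, "errors")); // prints [2]
    }
}
```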