Class StateSpecFunctions
- java.lang.Object
-
- org.apache.beam.runners.spark.stateful.StateSpecFunctions
-
public class StateSpecFunctions extends java.lang.ObjectA class containingStateSpecmappingFunctions.
-
-
Constructor Summary
Constructors Constructor Description StateSpecFunctions()
-
Method Summary
All Methods Static Methods Concrete Methods Modifier and Type Method Description static <T,CheckpointMarkT extends org.apache.beam.sdk.io.UnboundedSource.CheckpointMark>
scala.Function3<org.apache.beam.sdk.io.Source<T>,scala.Option<CheckpointMarkT>,org.apache.spark.streaming.State<scala.Tuple2<byte[],org.joda.time.Instant>>,scala.Tuple2<java.lang.Iterable<byte[]>,SparkUnboundedSource.Metadata>>mapSourceFunction(org.apache.beam.runners.core.construction.SerializablePipelineOptions options, java.lang.String stepName)AStateSpecfunction to support reading from anUnboundedSource.
-
-
-
Method Detail
-
mapSourceFunction
public static <T,CheckpointMarkT extends org.apache.beam.sdk.io.UnboundedSource.CheckpointMark> scala.Function3<org.apache.beam.sdk.io.Source<T>,scala.Option<CheckpointMarkT>,org.apache.spark.streaming.State<scala.Tuple2<byte[],org.joda.time.Instant>>,scala.Tuple2<java.lang.Iterable<byte[]>,SparkUnboundedSource.Metadata>> mapSourceFunction(org.apache.beam.runners.core.construction.SerializablePipelineOptions options, java.lang.String stepName)
AStateSpecfunction to support reading from anUnboundedSource.This StateSpec function expects the following:
- Key: The (partitioned) Source to read from.
- Value: An optional
UnboundedSource.CheckpointMarkto start from. - State: A byte representation of the (previously) persisted CheckpointMark.
This stateful operation could be described as a flatMap over a single-element stream, which outputs all the elements read from the
UnboundedSourcefor this micro-batch. Since micro-batches are bounded, the provided UnboundedSource is wrapped by aMicrobatchSourcethat applies bounds in the form of duration and max records (per micro-batch).In order to avoid using Spark Guava's classes which pollute the classpath, we use the
StateSpec.function(scala.Function3)signature which employs scala's nativeOption, instead of theStateSpec.function(org.apache.spark.api.java.function.Function3)signature, which employs Guava'sOptional.See also SPARK-4819.
- Type Parameters:
T- The type of the input stream elements.CheckpointMarkT- The type of theUnboundedSource.CheckpointMark.- Parameters:
options- A serializableSerializablePipelineOptions.- Returns:
- The appropriate
StateSpecfunction.
-
-