Class SparkGroupAlsoByWindowViaWindowSet

  • All Implemented Interfaces:
    java.io.Serializable

    public class SparkGroupAlsoByWindowViaWindowSet
    extends java.lang.Object
    implements java.io.Serializable
    An implementation of GroupByKeyViaGroupByKeyOnly.GroupAlsoByWindow logic for grouping by windows and controlling trigger firings and pane accumulation.

    This implementation is a composite of Spark transformations revolving around state management using Spark's PairDStreamFunctions.updateStateByKey(scala.Function1, org.apache.spark.Partitioner, boolean, scala.reflect.ClassTag) to update state with new data and timers.

    Using updateStateByKey makes it possible to scan the entire state on every batch, visiting not only the keys whose state was updated (new values for the key) but also checking whether any timers are ready to fire. Because updateStateByKey requires the state and the output to have the same type, a (state, output) tuple is used; the following steps then filter out the state, and also the output when nothing fired.
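
The (state, output) tuple idea described above can be illustrated without Spark. The following is a minimal plain-Java sketch, not the runner's actual code: the class `StateOutputTupleSketch`, the `StateAndOutput` type, the `update` and `runBatch` methods, and the single fixed-delay timer are all hypothetical simplifications. It only shows why every key is visited on each batch and how the output is filtered afterwards.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration (no Spark, hypothetical names) of the (state, output) tuple pattern. */
public class StateOutputTupleSketch {

    /** State and output share one type, mirroring the constraint described above. */
    static final class StateAndOutput {
        final List<String> buffered; // state: elements waiting for the timer
        final long timerAt;          // state: time at which the timer is due
        final List<String> output;   // output: elements emitted in this batch, empty if nothing fired

        StateAndOutput(List<String> buffered, long timerAt, List<String> output) {
            this.buffered = buffered;
            this.timerAt = timerAt;
            this.output = output;
        }
    }

    /** Per-key update: merges new values into the state and fires if the timer is due. */
    static StateAndOutput update(List<String> newValues, StateAndOutput prev, long now, long delay) {
        List<String> buffered = new ArrayList<>();
        if (prev != null) {
            buffered.addAll(prev.buffered);
        }
        buffered.addAll(newValues);
        // Assumption: a fresh timer is set whenever the buffer was previously empty.
        long timerAt = (prev == null || prev.buffered.isEmpty()) ? now + delay : prev.timerAt;
        if (!buffered.isEmpty() && now >= timerAt) {
            // Timer fired: the buffered elements become the output and the state is cleared.
            return new StateAndOutput(new ArrayList<>(), now + delay, buffered);
        }
        return new StateAndOutput(buffered, timerAt, Collections.emptyList());
    }

    /** Runs one micro-batch over the whole state and returns only the non-empty outputs. */
    static Map<String, List<String>> runBatch(Map<String, StateAndOutput> state,
                                              Map<String, List<String>> batch,
                                              long now, long delay) {
        // Visit every key that has state or new data, not only the keys present in this batch.
        List<String> keys = new ArrayList<>(state.keySet());
        for (String key : batch.keySet()) {
            if (!state.containsKey(key)) {
                keys.add(key);
            }
        }
        Map<String, List<String>> fired = new LinkedHashMap<>();
        for (String key : keys) {
            List<String> newValues = batch.getOrDefault(key, Collections.emptyList());
            StateAndOutput next = update(newValues, state.get(key), now, delay);
            state.put(key, next);
            if (!next.output.isEmpty()) {
                fired.put(key, next.output); // filtering step: keep the output only if something fired
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        Map<String, StateAndOutput> state = new LinkedHashMap<>();
        Map<String, List<String>> firstBatch = new LinkedHashMap<>();
        List<String> values = new ArrayList<>();
        values.add("x");
        values.add("y");
        firstBatch.put("a", values);

        // Batch at time 0: data is buffered, the timer (due at 10) has not fired yet.
        System.out.println(runBatch(state, firstBatch, 0L, 10L));
        // Batch at time 10 carries no data, yet key "a" is still visited and its timer fires.
        System.out.println(runBatch(state, new LinkedHashMap<>(), 10L, 10L));
    }
}
```

The second call is the point of the sketch: an empty batch still produces output for key "a" because the whole state is scanned, which is the behavior the description attributes to updateStateByKey.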

    See Also:
    Serialized Form
    • Method Summary

      Modifier and Type Method Description
      static <K, InputT, W extends org.apache.beam.sdk.transforms.windowing.BoundedWindow>
      org.apache.spark.streaming.api.java.JavaDStream<org.apache.beam.sdk.util.WindowedValue<org.apache.beam.sdk.values.KV<K, java.lang.Iterable<InputT>>>>
      groupByKeyAndWindow(org.apache.spark.streaming.api.java.JavaDStream<org.apache.beam.sdk.util.WindowedValue<org.apache.beam.sdk.values.KV<K, InputT>>> inputDStream, org.apache.beam.sdk.coders.Coder<K> keyCoder, org.apache.beam.sdk.coders.Coder<org.apache.beam.sdk.util.WindowedValue<InputT>> wvCoder, org.apache.beam.sdk.values.WindowingStrategy<?, W> windowingStrategy, org.apache.beam.runners.core.construction.SerializablePipelineOptions options, java.util.List<java.lang.Integer> sourceIds, java.lang.String transformFullName)
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • SparkGroupAlsoByWindowViaWindowSet

        public SparkGroupAlsoByWindowViaWindowSet()
    • Method Detail

      • groupByKeyAndWindow

        public static <K, InputT, W extends org.apache.beam.sdk.transforms.windowing.BoundedWindow>
        org.apache.spark.streaming.api.java.JavaDStream<org.apache.beam.sdk.util.WindowedValue<org.apache.beam.sdk.values.KV<K, java.lang.Iterable<InputT>>>>
        groupByKeyAndWindow(
            org.apache.spark.streaming.api.java.JavaDStream<org.apache.beam.sdk.util.WindowedValue<org.apache.beam.sdk.values.KV<K, InputT>>> inputDStream,
            org.apache.beam.sdk.coders.Coder<K> keyCoder,
            org.apache.beam.sdk.coders.Coder<org.apache.beam.sdk.util.WindowedValue<InputT>> wvCoder,
            org.apache.beam.sdk.values.WindowingStrategy<?, W> windowingStrategy,
            org.apache.beam.runners.core.construction.SerializablePipelineOptions options,
            java.util.List<java.lang.Integer> sourceIds,
            java.lang.String transformFullName)