Class GroupCombineFunctions


  • public class GroupCombineFunctions
    extends java.lang.Object
    A set of group/combine functions to apply to Spark RDDs.
    • Method Summary

      Modifier and Type Method Description
      static <InputT,OutputT,AccumT>
      SparkCombineFn.WindowedAccumulator<InputT,InputT,AccumT,?>
      combineGlobally(org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.util.WindowedValue<InputT>> rdd, SparkCombineFn<InputT,InputT,AccumT,OutputT> sparkCombineFn, org.apache.beam.sdk.coders.Coder<AccumT> aCoder, org.apache.beam.sdk.values.WindowingStrategy<?,?> windowingStrategy)
      Apply a composite Combine.Globally transformation.
      static <K,V,AccumT>
      org.apache.spark.api.java.JavaPairRDD<K,SparkCombineFn.WindowedAccumulator<org.apache.beam.sdk.values.KV<K,V>,V,AccumT,?>>
      combinePerKey(org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.util.WindowedValue<org.apache.beam.sdk.values.KV<K,V>>> rdd, SparkCombineFn<org.apache.beam.sdk.values.KV<K,V>,V,AccumT,?> sparkCombineFn, org.apache.beam.sdk.coders.Coder<K> keyCoder, org.apache.beam.sdk.coders.Coder<V> valueCoder, org.apache.beam.sdk.coders.Coder<AccumT> aCoder, org.apache.beam.sdk.values.WindowingStrategy<?,?> windowingStrategy)
      Apply a composite Combine.PerKey transformation.
      static <K,V>
      org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.values.KV<K,java.lang.Iterable<org.apache.beam.sdk.util.WindowedValue<V>>>>
      groupByKeyOnly(org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.util.WindowedValue<org.apache.beam.sdk.values.KV<K,V>>> rdd, org.apache.beam.sdk.coders.Coder<K> keyCoder, org.apache.beam.sdk.util.WindowedValue.WindowedValueCoder<V> wvCoder, @Nullable org.apache.spark.Partitioner partitioner)
      An implementation of GroupByKeyViaGroupByKeyOnly.GroupByKeyOnly for the Spark runner.
      static <T>
      org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.util.WindowedValue<T>>
      reshuffle(org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.util.WindowedValue<T>> rdd, org.apache.beam.sdk.util.WindowedValue.WindowedValueCoder<T> wvCoder)
      An implementation of Reshuffle for the Spark runner.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • GroupCombineFunctions

        public GroupCombineFunctions()
    • Method Detail

      • groupByKeyOnly

        public static <K,V> org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.values.KV<K,java.lang.Iterable<org.apache.beam.sdk.util.WindowedValue<V>>>> groupByKeyOnly(
            org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.util.WindowedValue<org.apache.beam.sdk.values.KV<K,V>>> rdd,
            org.apache.beam.sdk.coders.Coder<K> keyCoder,
            org.apache.beam.sdk.util.WindowedValue.WindowedValueCoder<V> wvCoder,
            @Nullable org.apache.spark.Partitioner partitioner)
        An implementation of GroupByKeyViaGroupByKeyOnly.GroupByKeyOnly for the Spark runner.
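
Judging only by the signature above, the input elements are windowed key/value pairs and the output pairs each key with an iterable of windowed values, so the grouping is by key alone and each value keeps its own window metadata. The following plain-Java sketch illustrates that shape without Spark or Beam. `Windowed`, `Pair`, and the string window labels are hypothetical stand-ins for illustration, not the runner's classes, and the coders and partitioner are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration only; none of these types come from Beam or Spark. */
class GroupByKeyOnlySketch {

  /** Hypothetical stand-in for a windowed value: a payload tagged with a window label. */
  static final class Windowed<T> {
    final T value;
    final String window;

    Windowed(T value, String window) {
      this.value = value;
      this.window = window;
    }
  }

  /** Hypothetical stand-in for a key/value pair. */
  static final class Pair<K, V> {
    final K key;
    final V value;

    Pair(K key, V value) {
      this.key = key;
      this.value = value;
    }
  }

  /** Groups by key only; the window label moves from the pair onto each grouped value. */
  static <K, V> Map<K, List<Windowed<V>>> groupByKeyOnly(List<Windowed<Pair<K, V>>> input) {
    Map<K, List<Windowed<V>>> grouped = new LinkedHashMap<>();
    for (Windowed<Pair<K, V>> wv : input) {
      grouped
          .computeIfAbsent(wv.value.key, k -> new ArrayList<>())
          .add(new Windowed<>(wv.value.value, wv.window));
    }
    return grouped;
  }

  /** Small fixed input: key "a" appears in two different windows, key "b" in one. */
  static Map<String, List<Windowed<Integer>>> sample() {
    List<Windowed<Pair<String, Integer>>> input = new ArrayList<>();
    input.add(new Windowed<>(new Pair<>("a", 1), "w1"));
    input.add(new Windowed<>(new Pair<>("b", 2), "w1"));
    input.add(new Windowed<>(new Pair<>("a", 3), "w2"));
    return groupByKeyOnly(input);
  }
}
```

In this sketch the two values for key "a" end up in the same group even though they carry different window labels; splitting a group by window would be a separate, later step.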
      • combineGlobally

        public static <InputT,OutputT,AccumT> SparkCombineFn.WindowedAccumulator<InputT,InputT,AccumT,?> combineGlobally(
            org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.util.WindowedValue<InputT>> rdd,
            SparkCombineFn<InputT,InputT,AccumT,OutputT> sparkCombineFn,
            org.apache.beam.sdk.coders.Coder<AccumT> aCoder,
            org.apache.beam.sdk.values.WindowingStrategy<?,?> windowingStrategy)
        Apply a composite Combine.Globally transformation.
      • combinePerKey

        public static <K,V,AccumT> org.apache.spark.api.java.JavaPairRDD<K,SparkCombineFn.WindowedAccumulator<org.apache.beam.sdk.values.KV<K,V>,V,AccumT,?>> combinePerKey(
            org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.util.WindowedValue<org.apache.beam.sdk.values.KV<K,V>>> rdd,
            SparkCombineFn<org.apache.beam.sdk.values.KV<K,V>,V,AccumT,?> sparkCombineFn,
            org.apache.beam.sdk.coders.Coder<K> keyCoder,
            org.apache.beam.sdk.coders.Coder<V> valueCoder,
            org.apache.beam.sdk.coders.Coder<AccumT> aCoder,
            org.apache.beam.sdk.values.WindowingStrategy<?,?> windowingStrategy)
        Apply a composite Combine.PerKey transformation.

        This aggregation applies Beam's Combine.CombineFn through Spark's JavaPairRDD.combineByKey(Function, Function2, Function2). In streaming pipelines, this method is called from within a serialized context (the DStream transform callback), so every argument passed to it must be Serializable.
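
To make the three-function shape of combineByKey concrete, here is a minimal plain-Java sketch of the contract: one function creates an accumulator from the first value seen for a key, one merges a further value into an accumulator, and one merges accumulators produced by different partitions. It does not use Spark or Beam. `Fn1`, `Fn2`, `combinePartition`, and the list-of-lists "partitions" are hypothetical stand-ins, and the function types extend Serializable only to mirror the requirement stated above.

```java
import java.io.Serializable;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration of the combineByKey contract; no Spark or Beam classes are used. */
class CombineByKeySketch {

  /** Hypothetical serializable function types standing in for Spark's Function and Function2. */
  interface Fn1<A, R> extends Serializable { R call(A a); }
  interface Fn2<A, B, R> extends Serializable { R call(A a, B b); }

  /** Within one partition: the first value for a key creates the accumulator, later values merge in. */
  static <K, V, C> Map<K, C> combinePartition(
      List<Map.Entry<K, V>> partition, Fn1<V, C> createCombiner, Fn2<C, V, C> mergeValue) {
    Map<K, C> acc = new LinkedHashMap<>();
    for (Map.Entry<K, V> e : partition) {
      if (acc.containsKey(e.getKey())) {
        acc.put(e.getKey(), mergeValue.call(acc.get(e.getKey()), e.getValue()));
      } else {
        acc.put(e.getKey(), createCombiner.call(e.getValue()));
      }
    }
    return acc;
  }

  /** Across partitions: accumulators for the same key are merged with mergeCombiners. */
  static <K, V, C> Map<K, C> combineByKey(
      List<List<Map.Entry<K, V>>> partitions,
      Fn1<V, C> createCombiner, Fn2<C, V, C> mergeValue, Fn2<C, C, C> mergeCombiners) {
    Map<K, C> result = new LinkedHashMap<>();
    for (List<Map.Entry<K, V>> partition : partitions) {
      for (Map.Entry<K, C> e : combinePartition(partition, createCombiner, mergeValue).entrySet()) {
        result.merge(e.getKey(), e.getValue(), mergeCombiners::call);
      }
    }
    return result;
  }

  static <K, V> Map.Entry<K, V> kv(K key, V value) {
    return new AbstractMap.SimpleImmutableEntry<>(key, value);
  }

  /** Computes {sum, count} per key over two illustrative partitions. */
  static Map<String, long[]> sample() {
    List<Map.Entry<String, Integer>> p1 = new ArrayList<>();
    p1.add(kv("a", 1));
    p1.add(kv("b", 10));
    p1.add(kv("a", 3));
    List<Map.Entry<String, Integer>> p2 = new ArrayList<>();
    p2.add(kv("a", 5));
    p2.add(kv("b", 20));
    List<List<Map.Entry<String, Integer>>> partitions = new ArrayList<>();
    partitions.add(p1);
    partitions.add(p2);

    Fn1<Integer, long[]> createCombiner = v -> new long[] {v, 1};
    Fn2<long[], Integer, long[]> mergeValue = (acc, v) -> {
      acc[0] += v; // running sum
      acc[1]++;    // running count
      return acc;
    };
    Fn2<long[], long[], long[]> mergeCombiners =
        (x, y) -> new long[] {x[0] + y[0], x[1] + y[1]};

    return combineByKey(partitions, createCombiner, mergeValue, mergeCombiners);
  }

  public static void main(String[] args) {
    for (Map.Entry<String, long[]> e : sample().entrySet()) {
      long[] acc = e.getValue();
      System.out.println(e.getKey() + ": sum=" + acc[0] + " count=" + acc[1]);
    }
  }
}
```

Running `main` prints `a: sum=9 count=3` and `b: sum=30 count=2`. The accumulator here is a simple {sum, count} array; in the method documented above, the accumulator role is played by SparkCombineFn.WindowedAccumulator, whose internals this sketch does not attempt to model.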

      • reshuffle

        public static <T> org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.util.WindowedValue<T>> reshuffle(
            org.apache.spark.api.java.JavaRDD<org.apache.beam.sdk.util.WindowedValue<T>> rdd,
            org.apache.beam.sdk.util.WindowedValue.WindowedValueCoder<T> wvCoder)
        An implementation of Reshuffle for the Spark runner.