Class TestSparkRunner


  • public final class TestSparkRunner
    extends org.apache.beam.sdk.PipelineRunner<SparkPipelineResult>
    The SparkRunner translates operations defined on a pipeline into a representation executable by Spark, and then submits the job to Spark to be executed. To run a Beam pipeline with the default options of a single-threaded Spark instance in local mode, we would do the following:

    Pipeline p = [logic for pipeline creation]
    SparkPipelineResult result = (SparkPipelineResult) p.run();

    To create a pipeline runner that runs against a different Spark cluster, with a custom master URL, we would do the following:

    Pipeline p = [logic for pipeline creation]
    SparkPipelineOptions options = SparkPipelineOptionsFactory.create();
    options.setSparkMaster("spark://host:port");
    SparkPipelineResult result = (SparkPipelineResult) p.run();

    • Method Summary

      Modifier and Type         Method                                                              Description
      static TestSparkRunner    fromOptions(org.apache.beam.sdk.options.PipelineOptions options)
      SparkPipelineResult       run(org.apache.beam.sdk.Pipeline pipeline)
      • Methods inherited from class org.apache.beam.sdk.PipelineRunner

        create, run, run
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • fromOptions

        public static TestSparkRunner fromOptions(org.apache.beam.sdk.options.PipelineOptions options)
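
        The static fromOptions signature follows a common factory convention: a framework can take a runner class plus an options object and build the runner reflectively, without knowing the concrete type at compile time. The sketch below illustrates that convention using only the Java standard library. Every type in it (Options, Runner, LocalTestRunner, createRunner) is a hypothetical stand-in, not a Beam class, and the "local[1]" default master is an assumption chosen to mirror the single-threaded local mode described above. It is not a description of how Beam itself is implemented.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

public class FromOptionsSketch {

    /** Hypothetical stand-in for PipelineOptions: a simple bag of string settings. */
    public static final class Options {
        private final Map<String, String> values = new HashMap<>();

        public Options set(String key, String value) {
            values.put(key, value);
            return this;
        }

        public String get(String key, String fallback) {
            return values.getOrDefault(key, fallback);
        }
    }

    /** Hypothetical stand-in for PipelineRunner. */
    public abstract static class Runner {
        public abstract String run(String pipelineName);
    }

    /** Hypothetical runner that exposes a static fromOptions factory. */
    public static final class LocalTestRunner extends Runner {
        private final String master;

        private LocalTestRunner(String master) {
            this.master = master;
        }

        // Assumed default: a single-threaded local master when none is configured.
        public static LocalTestRunner fromOptions(Options options) {
            return new LocalTestRunner(options.get("sparkMaster", "local[1]"));
        }

        @Override
        public String run(String pipelineName) {
            return pipelineName + " on " + master;
        }
    }

    /** Finds the public static fromOptions(Options) method on the class and invokes it. */
    public static Runner createRunner(Class<? extends Runner> runnerClass, Options options)
            throws ReflectiveOperationException {
        Method factory = runnerClass.getMethod("fromOptions", Options.class);
        // A static method is invoked with a null receiver.
        return runnerClass.cast(factory.invoke(null, options));
    }

    public static void main(String[] args) throws Exception {
        Options options = new Options().set("sparkMaster", "spark://host:port");
        Runner runner = createRunner(LocalTestRunner.class, options);
        System.out.println(runner.run("demo")); // prints "demo on spark://host:port"
    }
}
```

        Keeping the constructor private and routing construction through fromOptions means the options object is the only way to configure the runner, which is why the factory takes the options as its sole argument.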