Interface SparkCommonPipelineOptions

  • All Superinterfaces:
    org.apache.beam.sdk.options.ApplicationNameOptions, org.apache.beam.sdk.options.FileStagingOptions, org.apache.beam.sdk.transforms.display.HasDisplayData, org.apache.beam.sdk.options.PipelineOptions, org.apache.beam.sdk.options.StreamingOptions
    All Known Subinterfaces:
    SparkContextOptions, SparkPipelineOptions, SparkPortableStreamingPipelineOptions, SparkStructuredStreamingPipelineOptions, TestSparkPipelineOptions

    public interface SparkCommonPipelineOptions
    extends org.apache.beam.sdk.options.PipelineOptions, org.apache.beam.sdk.options.StreamingOptions, org.apache.beam.sdk.options.ApplicationNameOptions, org.apache.beam.sdk.options.FileStagingOptions
    Spark runner PipelineOptions that handle Spark execution-related configuration, such as the master address, along with other user-facing settings.
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Interface Description
      static class  SparkCommonPipelineOptions.StorageLevelFactory
      Returns Spark's default storage level for the Dataset or RDD API based on the respective runner.
      static class  SparkCommonPipelineOptions.TmpCheckpointDirFactory
      Returns the default checkpoint directory of /tmp/${job.name}.
      • Nested classes/interfaces inherited from interface org.apache.beam.sdk.options.PipelineOptions

        org.apache.beam.sdk.options.PipelineOptions.AtomicLongFactory, org.apache.beam.sdk.options.PipelineOptions.CheckEnabled, org.apache.beam.sdk.options.PipelineOptions.DirectRunner, org.apache.beam.sdk.options.PipelineOptions.JobNameFactory, org.apache.beam.sdk.options.PipelineOptions.UserAgentFactory
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static java.lang.String DEFAULT_MASTER_URL  
    • Method Summary

      Modifier and Type Method Description
      java.lang.String getCheckpointDir()  
      java.lang.Boolean getEnableSparkMetricSinks()  
      java.lang.String getSparkMaster()  
      java.lang.String getStorageLevel()  
      static void prepareFilesToStage​(SparkCommonPipelineOptions options)
      Non-jar classpath entries (e.g. directories with .class files or empty directories) cause an exception in the running log.
      void setCheckpointDir​(java.lang.String checkpointDir)  
      void setEnableSparkMetricSinks​(java.lang.Boolean enableSparkMetricSinks)  
      void setSparkMaster​(java.lang.String master)  
      void setStorageLevel​(java.lang.String storageLevel)  
      • Methods inherited from interface org.apache.beam.sdk.options.ApplicationNameOptions

        getAppName, setAppName
      • Methods inherited from interface org.apache.beam.sdk.options.FileStagingOptions

        getFilesToStage, setFilesToStage
      • Methods inherited from interface org.apache.beam.sdk.transforms.display.HasDisplayData

        populateDisplayData
      • Methods inherited from interface org.apache.beam.sdk.options.PipelineOptions

        as, getJobName, getOptionsId, getRunner, getStableUniqueNames, getTempLocation, getUserAgent, outputRuntimeOptions, revision, setJobName, setOptionsId, setRunner, setStableUniqueNames, setTempLocation, setUserAgent
      • Methods inherited from interface org.apache.beam.sdk.options.StreamingOptions

        getUpdateCompatibilityVersion, isStreaming, setStreaming, setUpdateCompatibilityVersion
    • Field Detail

    • Method Detail

      • getSparkMaster

        @String("local[4]")
        java.lang.String getSparkMaster()
      • setSparkMaster

        void setSparkMaster​(java.lang.String master)
      • setCheckpointDir

        void setCheckpointDir​(java.lang.String checkpointDir)
      • setStorageLevel

        void setStorageLevel​(java.lang.String storageLevel)
      • getEnableSparkMetricSinks

        @Boolean(true)
        java.lang.Boolean getEnableSparkMetricSinks()
      • setEnableSparkMetricSinks

        void setEnableSparkMetricSinks​(java.lang.Boolean enableSparkMetricSinks)
      • prepareFilesToStage

        @Internal
        static void prepareFilesToStage​(SparkCommonPipelineOptions options)
        A classpath that contains non-jar files (e.g. directories with .class files or empty directories) causes an exception in the running log. Although the SparkContext can handle such entries when running with a local master, it is better not to include non-jar files in the classpath.
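
        The advice above can be illustrated with a small, self-contained sketch. It is not the runner's implementation of prepareFilesToStage, which may treat non-jar entries differently; it only shows one way to drop non-jar classpath entries when the master is not local. The class StagingSketch and the helpers isLocalMaster and keepJars are hypothetical names introduced for this example, and the master pattern is an assumption that covers only the forms local, local[N], and local[*].

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical helpers, not part of the Beam Spark runner API.
final class StagingSketch {

    // Assumption: treats "local", "local[N]" and "local[*]" as local masters.
    static boolean isLocalMaster(String master) {
        return master.matches("local(\\[(\\d+|\\*)\\])?");
    }

    // Keeps only entries that are existing regular files whose name ends in ".jar".
    // Directories (with or without .class files) and missing paths are dropped.
    static List<String> keepJars(List<String> classpathEntries) {
        List<String> jars = new ArrayList<>();
        for (String entry : classpathEntries) {
            File file = new File(entry);
            if (file.isFile() && file.getName().endsWith(".jar")) {
                jars.add(file.getAbsolutePath());
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        // Falls back to the default shown for getSparkMaster() when no argument is given.
        String master = args.length > 0 ? args[0] : "local[4]";
        List<String> entries =
            Arrays.asList(System.getProperty("java.class.path").split(File.pathSeparator));

        // With a local master the entries are left untouched; otherwise non-jars are filtered out.
        List<String> toStage = isLocalMaster(master) ? entries : keepJars(entries);
        System.out.println(toStage);
    }
}
```

        Running the sketch with an argument such as spark://host:7077 prints only the jar files found on the current classpath, while running it without arguments prints the classpath unchanged.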