Package org.apache.beam.runners.spark
Internal implementation of the Beam runner for Apache Spark.

Interface Summary
Interface: Description
SparkCommonPipelineOptions: Spark runner PipelineOptions handles Spark execution-related configurations, such as the master address, and other user-related knobs.
SparkContextOptions: A custom PipelineOptions to work with properties related to JavaSparkContext.
SparkPipelineOptions: Spark runner PipelineOptions handles Spark execution-related configurations, such as the master address, batch-interval, and other user-related knobs.
SparkPortableStreamingPipelineOptions: Pipeline options specific to the Spark portable runner running a streaming job.
TestSparkPipelineOptions: A SparkPipelineOptions for tests.

Class Summary
Class: Description
SparkCommonPipelineOptions.StorageLevelFactory: Returns Spark's default storage level for the Dataset or RDD API based on the respective runner.
SparkCommonPipelineOptions.TmpCheckpointDirFactory: Returns the default checkpoint directory of /tmp/${job.name}.
SparkContextOptions.EmptyListenersList: Returns an empty list, to avoid handling null.
SparkJobInvoker: Creates a job invocation to manage the Spark runner's execution of a portable pipeline.
SparkJobServerDriver: Driver program that starts a job server for the Spark runner.
SparkJobServerDriver.SparkServerConfiguration: Spark runner-specific Configuration for the jobServer.
SparkNativePipelineVisitor: Pipeline visitor for translating a Beam pipeline into equivalent Spark operations.
SparkPipelineResult: Represents a Spark pipeline execution result.
SparkPipelineRunner: Runs a portable pipeline on Apache Spark.
SparkRunner: The SparkRunner translates operations defined on a pipeline to a representation executable by Spark, and then submits the job to Spark to be executed.
SparkRunner.Evaluator: Evaluator on the pipeline.
SparkRunnerDebugger: Pipeline runner which translates a Beam pipeline into equivalent Spark operations, without running them.
SparkRunnerDebugger.DebugSparkPipelineResult: PipelineResult of running a Pipeline using SparkRunnerDebugger. Use SparkRunnerDebugger.DebugSparkPipelineResult.getDebugString() to get a String representation of the Pipeline translated into Spark native operations.
SparkRunnerRegistrar
SparkRunnerRegistrar.Options: Registers the SparkPipelineOptions.
SparkRunnerRegistrar.Runner: Registers the SparkRunner.
SparkTransformOverrides: PTransform overrides for Spark runner.
TestSparkPipelineOptions.DefaultStopPipelineWatermarkFactory: A factory to provide the default watermark to stop a pipeline that reads from an unbounded source.
TestSparkRunner: The SparkRunner translates operations defined on a pipeline to a representation executable by Spark, and then submits the job to Spark to be executed.
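
The summaries above mention that SparkPipelineOptions carries settings such as the master address and that SparkRunner submits the translated pipeline to Spark. A minimal sketch of how these two types might be wired together follows. It assumes the Beam Java SDK, the Beam Spark runner, and matching Spark dependencies are on the classpath. The class name SparkRunnerSketch, the helper buildOptions, and the master value "local[2]" are illustrative choices, not part of this package.

```java
import org.apache.beam.runners.spark.SparkPipelineOptions;
import org.apache.beam.runners.spark.SparkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

// Hypothetical example class; not part of org.apache.beam.runners.spark.
public class SparkRunnerSketch {

  // Parses command-line style arguments into SparkPipelineOptions, then
  // selects SparkRunner and an illustrative local master address.
  static SparkPipelineOptions buildOptions(String[] args) {
    SparkPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).as(SparkPipelineOptions.class);
    options.setRunner(SparkRunner.class);
    options.setSparkMaster("local[2]"); // assumption: run on a local Spark master
    return options;
  }

  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create(buildOptions(args));
    // A trivial bounded input so there is something to translate and run.
    pipeline.apply(Create.of("a", "b", "c"));
    // Hands the pipeline to the configured runner and blocks until it finishes.
    pipeline.run().waitUntilFinish();
  }
}
```

Swapping `SparkRunner.class` for `SparkRunnerDebugger.class` in `buildOptions` should, per the class summary, translate the pipeline without running it; the result can then be cast to `SparkRunnerDebugger.DebugSparkPipelineResult` to call `getDebugString()`.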