Class SparkTranslationContext

java.lang.Object
  org.apache.beam.runners.spark.translation.SparkTranslationContext

Direct Known Subclasses:
    SparkStreamingTranslationContext

public class SparkTranslationContext
extends java.lang.Object

Translation context used to lazily store Spark data sets during portable pipeline translation and compute them after translation.
Constructor Summary

Constructors
SparkTranslationContext(org.apache.spark.api.java.JavaSparkContext jsc,
                        org.apache.beam.sdk.options.PipelineOptions options,
                        org.apache.beam.runners.fnexecution.provisioning.JobInfo jobInfo)
Method Summary

All Methods  Instance Methods  Concrete Methods

Modifier and Type    Method and Description
void                 computeOutputs()
                     Compute the outputs for all RDDs that are leaves in the DAG.
org.apache.beam.runners.core.construction.SerializablePipelineOptions
                     getSerializableOptions()
org.apache.spark.api.java.JavaSparkContext
                     getSparkContext()
int                  nextSinkId()
                     Generate a unique pCollection id number to identify runner-generated sinks.
Dataset              popDataset(java.lang.String pCollectionId)
                     Retrieve the dataset for the pCollection id and remove it from the DAG's leaves.
void                 pushDataset(java.lang.String pCollectionId, Dataset dataset)
                     Add the output of a transform to the context.
Method Detail

getSparkContext
public org.apache.spark.api.java.JavaSparkContext getSparkContext()

getSerializableOptions
public org.apache.beam.runners.core.construction.SerializablePipelineOptions getSerializableOptions()
pushDataset
public void pushDataset(java.lang.String pCollectionId, Dataset dataset)
Add the output of a transform to the context.
popDataset
public Dataset popDataset(java.lang.String pCollectionId)
Retrieve the dataset for the pCollection id and remove it from the DAG's leaves.
computeOutputs
public void computeOutputs()
Compute the outputs for all RDDs that are leaves in the DAG.
nextSinkId
public int nextSinkId()
Generate a unique pCollection id number to identify runner-generated sinks.
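Taken together, pushDataset, popDataset, and computeOutputs describe a lazy-evaluation lifecycle: each translated transform pushes its output as a leaf of the DAG, a downstream transform that consumes it pops it (so it is no longer a leaf), and after translation only the remaining leaves are forced. The following self-contained sketch illustrates that pattern; TranslationContextSketch and its minimal Dataset interface are hypothetical stand-ins for illustration, not Beam's actual types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the lazy push/pop/compute pattern described above.
class TranslationContextSketch {

    // Stand-in for Beam's Dataset: something whose evaluation can be forced.
    @FunctionalInterface
    interface Dataset {
        void compute();
    }

    // Current leaves of the DAG: datasets no downstream transform has consumed yet.
    private final Map<String, Dataset> leaves = new LinkedHashMap<>();
    private int sinkId = 0;

    // Add the output of a transform to the context; it becomes a DAG leaf.
    void pushDataset(String pCollectionId, Dataset dataset) {
        leaves.put(pCollectionId, dataset);
    }

    // Retrieve the dataset for the pCollection id and remove it from the
    // leaves: once a consumer takes it, it is no longer a leaf.
    Dataset popDataset(String pCollectionId) {
        return leaves.remove(pCollectionId);
    }

    // After translation, force evaluation of every remaining leaf.
    void computeOutputs() {
        leaves.values().forEach(Dataset::compute);
        leaves.clear();
    }

    // Generate a unique id number for runner-generated sinks.
    int nextSinkId() {
        return sinkId++;
    }
}
```

Tracking only the leaves means nothing is computed during translation itself; datasets consumed by later transforms are evaluated transitively when their consumers are forced, so only the unconsumed outputs need an explicit action at the end.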