Class SparkOpenLineageExtensionVisitor
java.lang.Object
io.openlineage.spark.shade.extension.v1.lifecycle.plan.SparkOpenLineageExtensionVisitor
This class serves as a visitor that wraps method calls for handling input and output lineage in
Spark jobs, as defined in the OpenLineage-Spark extension.
The OpenLineage-Spark library uses reflection to call these wrapper methods when extracting
lineage information from Spark's LogicalPlan and other relevant components. The visitor
handles several types of lineage nodes, such as InputLineageNode and OutputLineageNode, and serializes the extracted lineage into a Map<String,Object> suitable for lineage tracking.
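The reflective access described above can be sketched roughly as follows. This is only an illustration, not the library's actual code: StandInVisitor is a hypothetical stand-in that mimics the isDefinedAt and two-argument apply methods documented on this page, and the helper name callReflectively and the "node" map key are assumptions made for the example.

```java
import java.lang.reflect.Method;
import java.util.Collections;
import java.util.Map;

class ReflectiveVisitorCall {

    // Hypothetical stand-in for the real visitor; only the method names and
    // parameter shapes mirror this page. The bodies are invented.
    public static class StandInVisitor {
        public boolean isDefinedAt(Object lineageNode) {
            return lineageNode instanceof String;
        }

        public Map<String, Object> apply(Object lineageNode, String sparkListenerEventName) {
            return Collections.singletonMap("node", lineageNode);
        }
    }

    // Looks the methods up by name, so the caller needs no compile-time
    // dependency on the visitor's class.
    @SuppressWarnings("unchecked")
    static Map<String, Object> callReflectively(Object visitor, Object node, String eventName) {
        try {
            Method isDefinedAt = visitor.getClass().getMethod("isDefinedAt", Object.class);
            boolean supported = (Boolean) isDefinedAt.invoke(visitor, node);
            if (!supported) {
                return Collections.emptyMap();
            }
            Method apply = visitor.getClass().getMethod("apply", Object.class, String.class);
            return (Map<String, Object>) apply.invoke(visitor, node, eventName);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective visitor call failed", e);
        }
    }

    public static void main(String[] args) {
        Object visitor = new StandInVisitor();
        // Supported node: the stand-in returns a one-entry map.
        System.out.println(callReflectively(visitor, "orders_table", "SparkListenerJobStart"));
        // Unsupported node: the helper returns an empty map.
        System.out.println(callReflectively(visitor, 42, "SparkListenerJobStart"));
    }
}
```

Checking isDefinedAt before apply keeps unsupported nodes from ever reaching the extraction step.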
Constructor Summary
Constructors
SparkOpenLineageExtensionVisitor()
Method Summary
Modifier and Type, Method, Description

apply(lineageNode, sparkListenerEventName)
Applies the visitor to a LineageRelation, InputLineageNode, or OutputLineageNode, extracting and serializing the relevant lineage information.

Map<String,Object> apply(Object lineageNode, String sparkListenerEventName, Object sqlContext, Object parameters)
Applies the visitor to a LineageRelationProvider, extracting lineage information such as the DatasetIdentifier from the provided lineageNode.

boolean isDefinedAt(Object lineageNode)
Determines if the given lineageNode is of a type that this visitor can process.
Constructor Details
SparkOpenLineageExtensionVisitor
public SparkOpenLineageExtensionVisitor()
Method Details
isDefinedAt
public boolean isDefinedAt(Object lineageNode)
Determines if the given lineageNode is of a type that this visitor can process. Specifically, it checks if the object is an instance of LineageRelationProvider, LineageRelation, InputLineageNode, or OutputLineageNode.
Parameters:
lineageNode - the node representing a lineage component
Returns:
true if the node is of a supported type, false otherwise
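The type check described for isDefinedAt could look roughly like the sketch below. The four nested interfaces are empty, hypothetical stand-ins that only borrow the names listed above; the real extension interfaces and the actual method body may differ.

```java
class IsDefinedAtSketch {

    // Hypothetical empty stand-ins for the extension types named on this page.
    interface LineageRelationProvider {}
    interface LineageRelation {}
    interface InputLineageNode {}
    interface OutputLineageNode {}

    // Returns true only when the node implements one of the four supported types.
    static boolean isDefinedAt(Object lineageNode) {
        return lineageNode instanceof LineageRelationProvider
                || lineageNode instanceof LineageRelation
                || lineageNode instanceof InputLineageNode
                || lineageNode instanceof OutputLineageNode;
    }

    public static void main(String[] args) {
        Object supported = new InputLineageNode() {};
        System.out.println(isDefinedAt(supported));      // prints "true"
        System.out.println(isDefinedAt("plain string")); // prints "false"
        System.out.println(isDefinedAt(null));           // prints "false": instanceof is false for null
    }
}
```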
apply
public Map<String,Object> apply(Object lineageNode, String sparkListenerEventName, Object sqlContext, Object parameters)
Applies the visitor to a LineageRelationProvider, extracting lineage information such as the DatasetIdentifier from the provided lineageNode.
Parameters:
lineageNode - the lineage node to process
sparkListenerEventName - the name of the Spark listener event
sqlContext - the SQL context of the current Spark execution
parameters - additional parameters relevant to the lineage extraction
Returns:
a map containing lineage information in a serialized format
apply
Applies the visitor to a LineageRelation, InputLineageNode, or OutputLineageNode, extracting and serializing the relevant lineage information.
Parameters:
lineageNode - the lineage node to process
sparkListenerEventName - the name of the Spark listener event
Returns:
a map containing the serialized lineage data
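Since the two apply overloads target different node types, a caller has to pick one per node. The sketch below shows one way that choice might be made. It is an assumption-laden illustration: StandInVisitor, the dispatch helper, the nested stand-in interfaces, and the "overload" and "event" map keys are all hypothetical and do not reflect the library's real serialization format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ApplyDispatchSketch {

    // Hypothetical empty stand-ins for two of the extension types.
    interface LineageRelationProvider {}
    interface LineageRelation {}

    // Hypothetical visitor exposing both overloads with invented bodies.
    static class StandInVisitor {
        Map<String, Object> apply(Object lineageNode, String sparkListenerEventName,
                                  Object sqlContext, Object parameters) {
            return describe("four-argument", sparkListenerEventName);
        }

        Map<String, Object> apply(Object lineageNode, String sparkListenerEventName) {
            return describe("two-argument", sparkListenerEventName);
        }

        private Map<String, Object> describe(String overload, String eventName) {
            Map<String, Object> result = new LinkedHashMap<>();
            result.put("overload", overload);
            result.put("event", eventName);
            return result;
        }
    }

    // Providers need the SQL context and parameters; other nodes do not.
    static Map<String, Object> dispatch(StandInVisitor visitor, Object node, String eventName,
                                        Object sqlContext, Object parameters) {
        if (node instanceof LineageRelationProvider) {
            return visitor.apply(node, eventName, sqlContext, parameters);
        }
        return visitor.apply(node, eventName);
    }

    public static void main(String[] args) {
        StandInVisitor visitor = new StandInVisitor();
        System.out.println(dispatch(visitor, new LineageRelationProvider() {}, "onJobStart", null, null));
        System.out.println(dispatch(visitor, new LineageRelation() {}, "onJobStart", null, null));
    }
}
```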