Interface LineageRelationProvider


public interface LineageRelationProvider
Interface for classes implementing org.apache.spark.sql.sources.RelationProvider.

The RelationProvider interface defines the createRelation method, which takes an SQLContext and parameters as arguments. This interface enables lineage extraction from relation providers without itself depending on Spark classes, which may differ across Spark versions; this is why sqlContext and parameters are declared as Object in the method signature.

To mirror createRelation, the getLineageDatasetIdentifier method accepts similar arguments. An implementing class can provide two versions of the method: one whose arguments match those of createRelation, and another that throws an exception if it is ever called, which keeps the class compatible across different implementations.

  • Method Summary

    Modifier and Type: io.openlineage.client.utils.DatasetIdentifier
    Method: getLineageDatasetIdentifier(String sparkListenerEventName, io.openlineage.client.OpenLineage openLineage, Object sqlContext, Object parameters)
    Description: Returns a DatasetIdentifier containing the namespace and name of the dataset for lineage tracking purposes.
  • Method Details

    • getLineageDatasetIdentifier

      io.openlineage.client.utils.DatasetIdentifier getLineageDatasetIdentifier(String sparkListenerEventName, io.openlineage.client.OpenLineage openLineage, Object sqlContext, Object parameters)
      Returns a DatasetIdentifier containing the namespace and name of the dataset for lineage tracking purposes.
      Parameters:
      sparkListenerEventName - the name of the Spark listener event triggering the lineage extraction
      openLineage - an instance of OpenLineage used for lineage-related operations
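      sqlContext - the Spark SQLContext, corresponding to the sqlContext argument of createRelation; declared as Object to avoid a dependency on Spark classes
      parameters - the parameters the relation provider uses to create the relation, corresponding to the parameters argument of createRelation; declared as Object for the same reason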
      Returns:
      a DatasetIdentifier representing the dataset associated with the event
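
Example (illustrative sketch, not taken from the library): the code below shows one way a provider might implement getLineageDatasetIdentifier. So that it compiles without Spark or OpenLineage on the classpath, it declares simplified stand-ins for DatasetIdentifier, OpenLineage, and the interface itself. ExampleFileProvider, the "path" option, the "file" namespace, and the event name are all hypothetical. The sketch also treats parameters as a java.util.Map; a real provider would more likely receive the Scala Map that Spark passes to createRelation and would need to cast accordingly.

```java
import java.util.Map;

public class LineageRelationProviderSketch {

    // Simplified stand-in for io.openlineage.client.utils.DatasetIdentifier (assumption).
    public static final class DatasetIdentifier {
        private final String name;
        private final String namespace;

        public DatasetIdentifier(String name, String namespace) {
            this.name = name;
            this.namespace = namespace;
        }

        public String getName() { return name; }

        public String getNamespace() { return namespace; }
    }

    // Empty stand-in for io.openlineage.client.OpenLineage; it is not used in this sketch.
    public static final class OpenLineage {}

    // Local copy of the documented method signature, using the stand-in types above.
    public interface LineageRelationProvider {
        DatasetIdentifier getLineageDatasetIdentifier(
                String sparkListenerEventName,
                OpenLineage openLineage,
                Object sqlContext,
                Object parameters);
    }

    // Hypothetical provider that derives the dataset name from a "path" option.
    public static final class ExampleFileProvider implements LineageRelationProvider {
        @Override
        public DatasetIdentifier getLineageDatasetIdentifier(
                String sparkListenerEventName,
                OpenLineage openLineage,
                Object sqlContext,
                Object parameters) {
            // parameters arrives as Object, so check its type before casting.
            if (!(parameters instanceof Map)) {
                throw new IllegalArgumentException("Unsupported parameters type");
            }
            Object path = ((Map<?, ?>) parameters).get("path");
            if (path == null) {
                throw new IllegalArgumentException("Missing 'path' option");
            }
            // Namespace "file" is an arbitrary choice for this illustration.
            return new DatasetIdentifier(path.toString(), "file");
        }
    }

    public static void main(String[] args) {
        LineageRelationProvider provider = new ExampleFileProvider();
        DatasetIdentifier id = provider.getLineageDatasetIdentifier(
                "SparkListenerJobStart",      // illustrative event name
                new OpenLineage(),
                new Object(),                 // placeholder for the SQLContext
                Map.of("path", "/data/events"));
        System.out.println(id.getNamespace() + " " + id.getName());
    }
}
```

Because sqlContext and parameters are typed as Object, the implementation is responsible for validating and casting them; the sketch fails fast with an IllegalArgumentException when it cannot derive an identifier, though a real provider may prefer a different failure strategy.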