package io.delta.flink.source

Type Members

  1. class DeltaSource[T] extends DeltaSourceInternal[T]

    A unified data source that reads a Delta table in both batch and streaming mode.

    This source supports all (distributed) file systems and object stores that can be accessed via Flink's FileSystem class.

    To create a new instance of DeltaSource for a Delta table that will produce RowData records that contain all table columns:

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        ...
        // Bounded mode: read the table as of version 10.
        DeltaSource<RowData> deltaSource = DeltaSource.forBoundedRowData(
                new Path("s3://some/path"),
                new Configuration()
            )
            .versionAsOf(10)
            .build();

        env.fromSource(deltaSource, WatermarkStrategy.noWatermarks(), "delta-source");

        ..........
        // Continuous mode: start from version 10 and check for updates every 1000 ms.
        DeltaSource<RowData> deltaSource = DeltaSource.forContinuousRowData(
                new Path("s3://some/path"),
                new Configuration()
            )
            .updateCheckIntervalMillis(1000)
            .startingVersion(10)
            .build();

        env.fromSource(deltaSource, WatermarkStrategy.noWatermarks(), "delta-source");

    To create a new instance of DeltaSource for a Delta table that will produce RowData records with user-selected columns:

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        ...
        // Bounded mode: read only col1 and col2 as of version 10.
        DeltaSource<RowData> deltaSource = DeltaSource.forBoundedRowData(
                new Path("s3://some/path"),
                new Configuration()
            )
            .columnNames(Arrays.asList("col1", "col2"))
            .versionAsOf(10)
            .build();

        env.fromSource(deltaSource, WatermarkStrategy.noWatermarks(), "delta-source");

        ..........
        // Continuous mode: read only col1 and col2, starting from version 10.
        DeltaSource<RowData> deltaSource = DeltaSource.forContinuousRowData(
                new Path("s3://some/path"),
                new Configuration()
            )
            .columnNames(Arrays.asList("col1", "col2"))
            .updateCheckIntervalMillis(1000)
            .startingVersion(10)
            .build();

        env.fromSource(deltaSource, WatermarkStrategy.noWatermarks(), "delta-source");
    
    When the columnNames(...) method is used, the source discovers the data types of the given columns from the Delta log.
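
    The type discovery for selected columns can be pictured with a small, self-contained model. This is only a sketch under stated assumptions, not the connector's implementation: the class ColumnProjectionSketch, its resolve method, the string type names, the choice to keep the requested column order, and the choice to reject unknown columns are all hypothetical and use no Delta or Flink classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of resolving column types from a table schema; not connector code. */
final class ColumnProjectionSketch {

    /** Returns "name: type" for each requested column, in the requested order. */
    static List<String> resolve(Map<String, String> tableSchema, List<String> columnNames) {
        List<String> projected = new ArrayList<>();
        for (String name : columnNames) {
            String type = tableSchema.get(name);
            if (type == null) {
                // Assumption: an unknown column is reported as an error, not skipped.
                throw new IllegalArgumentException("Column not found in table schema: " + name);
            }
            projected.add(name + ": " + type);
        }
        return projected;
    }

    public static void main(String[] args) {
        // Stand-in for the schema that would be recorded in the Delta log.
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("col1", "BIGINT");
        schema.put("col2", "STRING");
        schema.put("col3", "DOUBLE");

        // prints [col1: BIGINT, col2: STRING]
        System.out.println(resolve(schema, List.of("col1", "col2")));
    }
}
```

    The point of the model is that the caller supplies only names; the types come from the table metadata, so the projected row type stays consistent with the table.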

  2. class RowDataBoundedDeltaSourceBuilder extends BoundedDeltaSourceBuilder[RowData, RowDataBoundedDeltaSourceBuilder]

    A builder for a DeltaSource that produces a stream of RowData and operates in Bounded mode.

    For most common use cases, use the DeltaSource#forBoundedRowData utility method to create this builder. After instantiating the builder, either call RowDataBoundedDeltaSourceBuilder#build() to obtain a DeltaSource instance or configure additional options through the builder's API first.

  3. class RowDataContinuousDeltaSourceBuilder extends ContinuousDeltaSourceBuilder[RowData, RowDataContinuousDeltaSourceBuilder]

    A builder for a DeltaSource that produces a stream of RowData and operates in Continuous mode.

    In Continuous mode, the DeltaSource will, by default, load the full state of the latest table version, and then start monitoring for changes. If you use either the RowDataContinuousDeltaSourceBuilder#startingVersion or RowDataContinuousDeltaSourceBuilder#startingTimestamp APIs, then the DeltaSource will start monitoring for changes from that historical version. It will not load the full table state at that historical table version.

    For most common use cases, use the DeltaSource#forContinuousRowData utility method to create this builder. After instantiating the builder, either call RowDataContinuousDeltaSourceBuilder#build() to obtain a DeltaSource instance or configure additional options through the builder's API first.
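
    The two Continuous-mode start-up behaviours described for this builder can be contrasted with a small, self-contained model. This is a sketch under stated assumptions, not the connector's implementation: the class ContinuousStartSketch, its plan method, and the action strings are hypothetical, and the model assumes that the starting version itself is included in the changes that are read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical model of what a continuous source reads on start-up; not connector code. */
final class ContinuousStartSketch {

    /**
     * @param latestVersion   latest table version when the source starts
     * @param startingVersion user-supplied starting version, or empty for the default
     * @return ordered actions taken before and while tailing the table
     */
    static List<String> plan(long latestVersion, OptionalLong startingVersion) {
        List<String> actions = new ArrayList<>();
        if (startingVersion.isPresent()) {
            // Historical start: no snapshot load, only per-version changes
            // (assumption: the starting version is inclusive).
            for (long v = startingVersion.getAsLong(); v <= latestVersion; v++) {
                actions.add("changes@" + v);
            }
        } else {
            // Default: load the full state of the latest table version first.
            actions.add("snapshot@" + latestVersion);
        }
        // In both cases the source then keeps monitoring for newer versions.
        actions.add("monitor>" + latestVersion);
        return actions;
    }

    public static void main(String[] args) {
        // prints [snapshot@12, monitor>12]
        System.out.println(plan(12, OptionalLong.empty()));
        // prints [changes@10, changes@11, changes@12, monitor>12]
        System.out.println(plan(12, OptionalLong.of(10)));
    }
}
```

    In the second call no "snapshot" action appears, which mirrors the statement that the full table state at the historical version is not loaded.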
