io.delta.flink.sink.internal.writer

DeltaWriterBucket

class DeltaWriterBucket[IN] extends AnyRef

Internal implementation for writing the actual events to the underlying files in the correct buckets / partitions.

In Flink's org.apache.flink.api.connector.sink.Sink topology, one of the main components is org.apache.flink.api.connector.sink.SinkWriter, which DeltaSink implements as DeltaWriter. To support Delta Lake's partitioned tables, an additional component, DeltaWriterBucket, is responsible for handling writes to exactly one bucket (that is, one partition). These bucket writers are managed by DeltaWriter, which acts as a proxy between the framework's commands (write, prepareCommit, etc.) and the actual write implementation in DeltaWriterBucket. As a result, events received by one DeltaWriter operator during a particular checkpoint interval are always grouped and flushed to the currently open in-progress file.
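The proxy relationship described above can be illustrated with a minimal, hypothetical sketch. SketchWriter and SketchBucket are stand-ins invented for this example and are not the connector's classes; events are plain strings, and the in-progress file is modelled as an in-memory list rather than a real file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for DeltaWriterBucket: buffers events for one partition.
class SketchBucket {
    final String bucketId;
    // Models the currently open in-progress file as an in-memory list (assumption).
    private final List<String> inProgressFile = new ArrayList<>();

    SketchBucket(String bucketId) {
        this.bucketId = bucketId;
    }

    void write(String event) {
        inProgressFile.add(event);
    }

    // Closes the in-progress "file" and returns everything written since the last call.
    List<String> prepareCommit() {
        List<String> flushed = new ArrayList<>(inProgressFile);
        inProgressFile.clear();
        return flushed;
    }
}

// Hypothetical stand-in for DeltaWriter: forwards framework calls to the right bucket.
class SketchWriter {
    private final Map<String, SketchBucket> buckets = new LinkedHashMap<>();

    // Routes the event to the bucket for its partition, creating the bucket on first use.
    void write(String partition, String event) {
        buckets.computeIfAbsent(partition, SketchBucket::new).write(event);
    }

    // Asks every bucket to flush and collects the results per bucket id.
    Map<String, List<String>> prepareCommit() {
        Map<String, List<String>> flushedByBucket = new LinkedHashMap<>();
        for (SketchBucket bucket : buckets.values()) {
            flushedByBucket.put(bucket.bucketId, bucket.prepareCommit());
        }
        return flushedByBucket;
    }
}
```

In this sketch, writing events for two partitions and then calling prepareCommit on the writer yields one group of events per partition, each in arrival order. The real classes deal with part files, committables, and writer state, which this example deliberately leaves out.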

The implementation is derived from org.apache.flink.connector.file.sink.FileSink, which uses the same concept and implements org.apache.flink.connector.file.sink.writer.FileWriter together with its FileWriterBucket implementation. All differences between DeltaSink's and FileSink's writer buckets are explained in the documentation of the individual methods.

The lifecycle of an instance of this class is as follows:

  • An instance is created by the DeltaWriter#write method when the writer receives the first event that belongs to the bucket represented by that DeltaWriterBucket instance. For non-partitioned tables, it is created when the writer receives its very first event, because in that case a single DeltaWriterBucket represents the root path of the table.
  • A DeltaWriter instance can create zero, one, or multiple instances of DeltaWriterBucket during one checkpoint interval. It creates none if it has not received any events (and thus did not have to create buckets for them). It creates one if all received events belong to a single bucket (which is also the case when the table is not partitioned). It creates multiple if the received events belong to more than one bucket.
  • The life span of a DeltaWriterBucket may extend over one or more checkpoint intervals. It remains "active" as long as it receives data. If, for example, a DeltaWriter instance receives no events for a given bucket during a checkpoint interval, the DeltaWriterBucket representing that bucket is de-listed from the writer's internal bucket iterator. If the same DeltaWriter receives more events for that bucket in a later checkpoint interval, it creates a new DeltaWriterBucket instance to represent it.
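The lifecycle above can be sketched as follows. This is a simplified, hypothetical model rather than the connector's code: LifecycleWriter and LifecycleBucket are names invented for the example, and the rule that a bucket is active exactly when it has received events since the last checkpoint is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical stand-in for DeltaWriterBucket; tracks only whether it holds unflushed events.
class LifecycleBucket {
    private int pendingEvents = 0;

    void write(String event) {
        pendingEvents++;
    }

    // Assumption for this sketch: a bucket is active while it has unflushed events.
    boolean isActive() {
        return pendingEvents > 0;
    }

    void prepareCommit() {
        pendingEvents = 0;
    }
}

// Hypothetical stand-in for DeltaWriter; owns the set of currently listed buckets.
class LifecycleWriter {
    private final Map<String, LifecycleBucket> activeBuckets = new HashMap<>();
    int bucketsCreated = 0;

    // Creates a bucket on the first event for its id, then forwards the event to it.
    void write(String bucketId, String event) {
        LifecycleBucket bucket = activeBuckets.get(bucketId);
        if (bucket == null) {
            bucket = new LifecycleBucket();
            activeBuckets.put(bucketId, bucket);
            bucketsCreated++;
        }
        bucket.write(event);
    }

    // At a checkpoint: flush buckets that received data, de-list the ones that did not.
    void prepareCommit() {
        Iterator<Map.Entry<String, LifecycleBucket>> it = activeBuckets.entrySet().iterator();
        while (it.hasNext()) {
            LifecycleBucket bucket = it.next().getValue();
            if (bucket.isActive()) {
                bucket.prepareCommit();
            } else {
                it.remove();
            }
        }
    }

    int activeBucketCount() {
        return activeBuckets.size();
    }
}
```

With this model, a bucket that receives no events during one checkpoint interval is removed at the next prepareCommit call, and a later event for the same bucket id creates a fresh instance, so bucketsCreated increases again. A non-partitioned table corresponds to always passing the same single bucket id.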
Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  10. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. def isActive(): Boolean
  12. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  13. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  14. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  15. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  16. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  17. def toString(): String
    Definition Classes
    AnyRef → Any
  18. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  19. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  20. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
