
class DeltaBulkPartWriter[IN, BucketID] extends AbstractPartFileWriter[IN, BucketID]

This class is an implementation of InProgressFileWriter that writes elements to a part file using BulkPartWriter. It also implements PartFileInfo.

An instance of this class represents one in-progress file that is currently "opened" by one of the io.delta.flink.sink.internal.writer.DeltaWriterBucket instances.

It is provided as a workaround for getting the actual size of an in-progress file right before transitioning it to a pending state ("closing").

Compared to the original BulkPartWriter, this class adds the DeltaBulkPartWriter#closeWriter() method, which is called first during the "close" operation for an in-progress file. Once it has been called, the actual file size can be read safely, and DeltaBulkPartWriter#closeForCommit() can then be called.

This workaround is needed because, for the Parquet format, the writer's buffer must be explicitly flushed before getting the file size (and there is no easy way to track the bytes sent to the writer). If this flush is not performed, PartFileInfo#getSize reports the file size without the data still buffered in the writer's memory, which in most cases is all the events consumed within the given checkpoint interval.
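
The effect described above can be illustrated with a small, self-contained sketch. It does not use any Flink or Delta classes: BufferingWriter and PartWriter are hypothetical stand-ins, and the in-memory stream only imitates a part file. It assumes a writer that holds all records in memory until it is finished, which is the situation the paragraph above describes for Parquet.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Toy model only: none of these types are Flink or Delta classes.
public class FlushBeforeSizeSketch {

    /** Hypothetical bulk writer that keeps records in memory until finish(). */
    static final class BufferingWriter {
        private final ByteArrayOutputStream stream;
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        BufferingWriter(ByteArrayOutputStream stream) {
            this.stream = stream;
        }

        void addElement(String element) {
            byte[] bytes = element.getBytes(StandardCharsets.UTF_8);
            buffer.write(bytes, 0, bytes.length); // stays in memory, not in the "file"
        }

        void finish() {
            byte[] bytes = buffer.toByteArray();
            stream.write(bytes, 0, bytes.length); // flush buffered data to the "file"
            buffer.reset();
        }
    }

    /** Hypothetical part writer whose size is read from the underlying stream. */
    static final class PartWriter {
        private final ByteArrayOutputStream stream = new ByteArrayOutputStream();
        private final BufferingWriter writer = new BufferingWriter(stream);

        void write(String element) {
            writer.addElement(element);
        }

        /** Flushes the writer so that getSize() reflects everything written. */
        void closeWriter() {
            writer.finish();
        }

        long getSize() {
            return stream.size();
        }
    }

    public static void main(String[] args) {
        PartWriter part = new PartWriter();
        part.write("event-1");
        part.write("event-2");

        long before = part.getSize(); // buffered data is not counted yet
        part.closeWriter();
        long after = part.getSize();  // now includes both events

        System.out.println("size before flush: " + before);
        System.out.println("size after flush: " + after);
    }
}
```

In this toy model the size read before closeWriter() misses everything that was written, which is why the size has to be read between closeWriter() and closeForCommit().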

Lifecycle of instances of this class is as follows:

  • Since it is a class member of DeltaInProgressPart, it shares that object's life span.
  • Instances of this class are created inside an io.delta.flink.sink.internal.writer.DeltaWriterBucket method every time a bucket processes its first event or the previously opened file has met the conditions for rolling (e.g. a size threshold).
  • Its life span lasts as long as the underlying file stays in the in-progress state (that is, until it is "rolled"), but no longer than a single checkpoint interval.
  • During the pre-commit phase, every existing DeltaInProgressPart instance is automatically transformed ("rolled") into a DeltaPendingFile instance.
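
The lifecycle above can be sketched as a toy model. All names here (InProgressPart, Bucket, prepareCommit, maxRecordsPerPart) are hypothetical and only mimic the roles of DeltaInProgressPart, DeltaWriterBucket and the pre-commit phase. As a simplifying assumption, the rolling condition is a record count rather than a file size.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: it mirrors the lifecycle described above, not the real classes.
public class PartLifecycleSketch {

    /** Hypothetical in-progress part that only tracks how many records it holds. */
    static final class InProgressPart {
        int records = 0;
    }

    /** Hypothetical bucket that opens, rolls and commits parts. */
    static final class Bucket {
        private final int maxRecordsPerPart;
        private InProgressPart current; // null while no file is open
        private final List<Integer> pending = new ArrayList<>(); // sizes of rolled parts

        Bucket(int maxRecordsPerPart) {
            this.maxRecordsPerPart = maxRecordsPerPart;
        }

        void write(String event) {
            // Roll the open part once it has met the rolling condition.
            if (current != null && current.records >= maxRecordsPerPart) {
                roll();
            }
            // Open a new part for the first event or right after a roll.
            if (current == null) {
                current = new InProgressPart();
            }
            current.records++;
        }

        /** Pre-commit: any part still in progress is rolled into a pending one. */
        List<Integer> prepareCommit() {
            if (current != null) {
                roll();
            }
            List<Integer> result = new ArrayList<>(pending);
            pending.clear();
            return result;
        }

        private void roll() {
            pending.add(current.records);
            current = null;
        }
    }

    public static void main(String[] args) {
        Bucket bucket = new Bucket(2);
        bucket.write("a");
        bucket.write("b");
        bucket.write("c"); // the first part is full, so it is rolled and a new one opened
        System.out.println(bucket.prepareCommit()); // prints [2, 1]
    }
}
```

No in-progress part survives prepareCommit() in this sketch, which corresponds to the rule that an instance lives no longer than a single checkpoint interval.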

This class is an almost exact copy of OutputStreamBasedPartFileWriter. The only modified behaviour is that the DeltaBulkPartWriter#closeWriter() method also flushes the internal buffer.

Linear Supertypes
AbstractPartFileWriter[IN, BucketID], InProgressFileWriter[IN, BucketID], RecordWiseCompactingFileWriter[IN], CompactingFileWriter, PartFileInfo[BucketID], AnyRef, Any

Instance Constructors

  1. new DeltaBulkPartWriter(bucketId: BucketID, currentPartStream: RecoverableFsDataOutputStream, writer: BulkWriter[IN], creationTime: Long)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. def closeForCommit(): PendingFileRecoverable
    Definition Classes
    DeltaBulkPartWriter → InProgressFileWriter → CompactingFileWriter
    Annotations
    @Override()
  7. def closeWriter(): Unit
  8. def dispose(): Unit
    Definition Classes
    DeltaBulkPartWriter → InProgressFileWriter
    Annotations
    @Override()
  9. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  10. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  11. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  12. def getBucketId(): BucketID
    Definition Classes
    AbstractPartFileWriter → PartFileInfo
  13. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  14. def getCreationTime(): Long
    Definition Classes
    AbstractPartFileWriter → PartFileInfo
  15. def getLastUpdateTime(): Long
    Definition Classes
    AbstractPartFileWriter → PartFileInfo
  16. def getSize(): Long
    Definition Classes
    DeltaBulkPartWriter → PartFileInfo
    Annotations
    @Override()
  17. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  18. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  19. def markWrite(arg0: Long): Unit
    Attributes
    protected[filesystem]
    Definition Classes
    AbstractPartFileWriter
  20. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  21. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  22. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  23. def persist(): InProgressFileRecoverable
    Definition Classes
    DeltaBulkPartWriter → InProgressFileWriter
    Annotations
    @Override()
  24. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  25. def toString(): String
    Definition Classes
    AnyRef → Any
  26. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  27. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  28. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  29. def write(element: IN, currentTime: Long): Unit
    Definition Classes
    DeltaBulkPartWriter → InProgressFileWriter
    Annotations
    @Override()
  30. def write(arg0: IN): Unit
    Definition Classes
    InProgressFileWriter → RecordWiseCompactingFileWriter
    Annotations
    @throws( classOf[java.io.IOException] )
