class DeltaBulkPartWriter[IN, BucketID] extends AbstractPartFileWriter[IN, BucketID]
This class is an implementation of InProgressFileWriter for writing elements to a part file
using BulkPartWriter. It also implements PartFileInfo.
An instance of this class represents one in-progress file that is currently "opened" by one of
the io.delta.flink.sink.internal.writer.DeltaWriterBucket instances.
It is provided as a workaround for getting the actual size of an in-progress file right before transitioning it to a pending state ("closing").
The changed behaviour compared to the original BulkPartWriter consists of
adding the DeltaBulkPartWriter#closeWriter method, which is called first during the
"close" operation for an in-progress file. After calling it we can safely get the
actual file size and then call the DeltaBulkPartWriter#closeForCommit() method.
This workaround is needed because for the Parquet format the writer's buffer needs
to be explicitly flushed before getting the file size (and there is also no easy way to track
the bytes sent to the writer). If such a flush is not performed, then
PartFileInfo#getSize will report the file size without considering data buffered in the writer's
memory (which in most cases is all the events consumed within a given checkpoint interval).
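The effect described above can be reproduced with plain JDK streams. The following is only a minimal sketch, not the connector's code: SketchPartWriter is a hypothetical stand-in, a ByteArrayOutputStream plays the role of the part file stream, and a BufferedOutputStream plays the role of the buffering bulk writer. It assumes that flushing the writer is what makes the buffered bytes visible to the size query, and it shows the call order closeWriter, then getSize, then closeForCommit.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for illustration only; this is NOT DeltaBulkPartWriter.
class SketchPartWriter {
    // Stand-in for the part file's output stream.
    private final ByteArrayOutputStream partStream = new ByteArrayOutputStream();
    // Stand-in for a bulk writer that keeps data in memory until it is flushed.
    private final BufferedOutputStream bufferingWriter =
        new BufferedOutputStream(partStream, 8192);

    void write(byte[] element) {
        try {
            bufferingWriter.write(element);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 1 of closing: push buffered data down to the underlying stream.
    void closeWriter() {
        try {
            bufferingWriter.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reports only the bytes that have already reached the underlying stream.
    long getSize() {
        return partStream.size();
    }

    // Step 3 of closing: hand over the finished content (assumed return type).
    byte[] closeForCommit() {
        closeWriter();
        return partStream.toByteArray();
    }

    public static void main(String[] args) {
        SketchPartWriter part = new SketchPartWriter();
        part.write(new byte[100]);
        System.out.println("size before closeWriter: " + part.getSize()); // 0
        part.closeWriter();
        System.out.println("size after closeWriter: " + part.getSize());  // 100
        part.closeForCommit();
    }
}
```

In this sketch the 100 written bytes stay in the 8192-byte buffer, so the size read before closeWriter is 0 and the size read afterwards is 100.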
Lifecycle of instances of this class is as follows:
- Since it is a class member of DeltaInProgressPart, it shares its life span as well.
- Instances of this class are created inside a method of io.delta.flink.sink.internal.writer.DeltaWriterBucket every time a bucket processes its first event or when the previously opened file has met the conditions for rolling (e.g. size threshold).
- Its life span holds as long as the underlying file stays in an in-progress state (that is, until it is "rolled"), but no longer than a single checkpoint interval.
- During the pre-commit phase every existing DeltaInProgressPart instance is automatically transformed ("rolled") into a DeltaPendingFile instance.
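The lifecycle above can be pictured with a toy model. This is a hedged sketch under stated assumptions, not the connector's implementation: SketchBucket, its size-based rolling rule, and the use of strings as "files" are all hypothetical and stand in for DeltaWriterBucket, DeltaInProgressPart and DeltaPendingFile only by analogy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy bucket for illustration; names do not come from the connector.
class SketchBucket {
    private final long rollSizeThreshold;
    private StringBuilder inProgress;                        // stand-in for the in-progress part
    private final List<String> pending = new ArrayList<>();  // stand-in for pending files
    private int partsOpened = 0;

    SketchBucket(long rollSizeThreshold) {
        this.rollSizeThreshold = rollSizeThreshold;
    }

    void write(String element) {
        // Open a new part on the first event, or when the open part met the rolling condition.
        if (inProgress == null || inProgress.length() >= rollSizeThreshold) {
            roll();
            inProgress = new StringBuilder();
            partsOpened++;
        }
        inProgress.append(element);
    }

    // Turns the in-progress part (if any) into a pending one.
    private void roll() {
        if (inProgress != null) {
            pending.add(inProgress.toString());
            inProgress = null;
        }
    }

    // Pre-commit: no part stays in progress beyond this point.
    List<String> prepareCommit() {
        roll();
        List<String> result = new ArrayList<>(pending);
        pending.clear();
        return result;
    }

    int getPartsOpened() {
        return partsOpened;
    }

    boolean hasInProgressPart() {
        return inProgress != null;
    }

    public static void main(String[] args) {
        SketchBucket bucket = new SketchBucket(4);
        bucket.write("ab");
        bucket.write("cd");
        bucket.write("ef"); // the first part reached the threshold, so a second part is opened
        System.out.println(bucket.prepareCommit());
    }
}
```

With a threshold of 4 characters, the third write rolls the first part and opens a second one; the pre-commit call then rolls the remaining part, so nothing is left in progress.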
This class is an almost exact copy of OutputStreamBasedPartFileWriter. The only modified
behaviour is the DeltaBulkPartWriter#closeWriter() method, which additionally flushes the
internal buffer.
Linear Supertypes
- AbstractPartFileWriter
- InProgressFileWriter
- RecordWiseCompactingFileWriter
- CompactingFileWriter
- PartFileInfo
- AnyRef
- Any
Instance Constructors
- new DeltaBulkPartWriter(bucketId: BucketID, currentPartStream: RecoverableFsDataOutputStream, writer: BulkWriter[IN], creationTime: Long)
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
closeForCommit(): PendingFileRecoverable
- Definition Classes
- DeltaBulkPartWriter → InProgressFileWriter → CompactingFileWriter
- Annotations
- @Override()
- def closeWriter(): Unit
-
def
dispose(): Unit
- Definition Classes
- DeltaBulkPartWriter → InProgressFileWriter
- Annotations
- @Override()
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
getBucketId(): BucketID
- Definition Classes
- AbstractPartFileWriter → PartFileInfo
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getCreationTime(): Long
- Definition Classes
- AbstractPartFileWriter → PartFileInfo
-
def
getLastUpdateTime(): Long
- Definition Classes
- AbstractPartFileWriter → PartFileInfo
-
def
getSize(): Long
- Definition Classes
- DeltaBulkPartWriter → PartFileInfo
- Annotations
- @Override()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
markWrite(arg0: Long): Unit
- Attributes
- protected[filesystem]
- Definition Classes
- AbstractPartFileWriter
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
persist(): InProgressFileRecoverable
- Definition Classes
- DeltaBulkPartWriter → InProgressFileWriter
- Annotations
- @Override()
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
write(element: IN, currentTime: Long): Unit
- Definition Classes
- DeltaBulkPartWriter → InProgressFileWriter
- Annotations
- @Override()
-
def
write(arg0: IN): Unit
- Definition Classes
- InProgressFileWriter → RecordWiseCompactingFileWriter
- Annotations
- @throws( classOf[java.io.IOException] )