class DeltaBucketAssigner[T] extends BucketAssigner[T, String]
Custom implementation of the BucketAssigner class, required to define how particular
events are mapped to buckets (aka partitions).
This implementation can be seen as a utility class for complying with Delta Lake's
partitioning style, which follows Apache Hive's partitioning style by encoding the partitioning
columns and their values as file-system directory paths, e.g. "/some_path/table_1/date=2020-01-01".
Users can still roll out their own version of BucketAssigner
and pass it to the DeltaSinkBuilder when creating the sink.
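To make the directory layout concrete, here is a minimal sketch of how an ordered map of partition columns and values could be turned into a Hive-style relative path. `HivePathSketch` and `toPartitionPath` are hypothetical names introduced for illustration and are not part of the Delta or Flink API. The sketch also skips the escaping of special characters that a real implementation would likely need.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Delta or Flink API.
final class HivePathSketch {

    // Joins partition columns and values as "column=value" segments separated by '/',
    // preserving the insertion order of the map. No escaping is applied (assumption).
    static String toPartitionPath(LinkedHashMap<String, String> partitionValues) {
        StringBuilder path = new StringBuilder();
        for (Map.Entry<String, String> entry : partitionValues.entrySet()) {
            if (path.length() > 0) {
                path.append('/');
            }
            path.append(entry.getKey()).append('=').append(entry.getValue());
        }
        return path.toString();
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> values = new LinkedHashMap<>();
        values.put("date", "2020-01-01");
        values.put("country", "PL");
        // Prints: /some_path/table_1/date=2020-01-01/country=PL
        System.out.println("/some_path/table_1/" + toPartitionPath(values));
    }
}
```

Because a LinkedHashMap keeps insertion order, the order in which partition columns are added determines the nesting of the directories.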
This DeltaBucketAssigner is applicable only to DeltaSinkBuilder and not to
RowDataDeltaSinkBuilder. The former lets you use this
DeltaBucketAssigner to provide the required custom bucketing behaviour, while the latter
doesn't expose a custom bucketing API, so you can provide only the partition column keys.
Thus, this DeltaBucketAssigner is currently not exposed to the user through any public
API.
In the future, if you'd like to implement your own custom bucketing...
/////////////////////////////////////////////////////////////////////////////////
// implements a custom partition computer
/////////////////////////////////////////////////////////////////////////////////
static class CustomPartitionColumnComputer implements DeltaPartitionComputer<RowData> {

    @Override
    public LinkedHashMap<String, String> generatePartitionValues(
            RowData element, BucketAssigner.Context context) {
        String f1 = element.getString(0).toString();
        int f3 = element.getInt(2);
        LinkedHashMap<String, String> partitionSpec = new LinkedHashMap<>();
        partitionSpec.put("f1", f1);
        partitionSpec.put("f3", Integer.toString(f3));
        return partitionSpec;
    }
}
...
/////////////////////////////////////////
// creates partition assigner for a custom partition computer
/////////////////////////////////////////
DeltaBucketAssigner<RowData> partitionAssigner =
    new DeltaBucketAssigner<>(new CustomPartitionColumnComputer());
...
/////////////////////////////////////////////////////////////////////////////////
// create the builder
/////////////////////////////////////////////////////////////////////////////////
DeltaSinkBuilder<RowData> foo =
    new DeltaSinkBuilder.DefaultDeltaFormatBuilder<>(
        ...,
        partitionAssigner,
        ...)
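The wiring in this example, a partition computer handed to a bucket assigner, can be tried out without Flink on the classpath. The sketch below uses hypothetical stand-ins (`PartitionComputerSketch`, `BucketAssignerSketch`) instead of the real DeltaPartitionComputer and BucketAssigner types, drops the context argument, and assumes that the bucket id is simply the "column=value" segments joined by '/'. It illustrates the delegation pattern only and is not the connector's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.stream.Collectors;

// Hypothetical stand-in for DeltaPartitionComputer; the context argument is omitted.
interface PartitionComputerSketch<T> {
    LinkedHashMap<String, String> generatePartitionValues(T element);
}

// Hypothetical stand-in for a bucket assigner that delegates to a partition computer.
final class BucketAssignerSketch<T> {
    private final PartitionComputerSketch<T> partitionComputer;

    BucketAssignerSketch(PartitionComputerSketch<T> partitionComputer) {
        this.partitionComputer = partitionComputer;
    }

    // Assumption: the bucket id is the ordered "column=value" segments joined by '/'.
    String getBucketId(T element) {
        return partitionComputer.generatePartitionValues(element).entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("/"));
    }
}

final class BucketAssignerDemo {
    public static void main(String[] args) {
        // Each element is modelled as Object[] {f1, f2, f3}; fields 0 and 2 are used,
        // mirroring the field positions read by CustomPartitionColumnComputer.
        BucketAssignerSketch<Object[]> assigner = new BucketAssignerSketch<>(row -> {
            LinkedHashMap<String, String> spec = new LinkedHashMap<>();
            spec.put("f1", (String) row[0]);
            spec.put("f3", Integer.toString((Integer) row[2]));
            return spec;
        });
        // Prints: f1=a/f3=7
        System.out.println(assigner.getBucketId(new Object[] {"a", "ignored", 7}));
    }
}
```

Keeping the partition computation behind its own interface means the mapping from an element to its partition values can be swapped without touching the assigner.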
Instance Constructors
- new DeltaBucketAssigner(partitionComputer: DeltaPartitionComputer[T])
Value Members
- final def !=(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def ##(): Int
  - Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  - Definition Classes: Any
- def clone(): AnyRef
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()
- final def eq(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- def finalize(): Unit
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( classOf[java.lang.Throwable] )
- def getBucketId(element: T, context: Context): String
  - Definition Classes: DeltaBucketAssigner → BucketAssigner
  - Annotations: @Override()
- final def getClass(): Class[_]
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def getSerializer(): SimpleVersionedSerializer[String]
  - Definition Classes: DeltaBucketAssigner → BucketAssigner
  - Annotations: @Override()
- def hashCode(): Int
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- final def isInstanceOf[T0]: Boolean
  - Definition Classes: Any
- final def ne(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- final def notify(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def notifyAll(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def synchronized[T0](arg0: ⇒ T0): T0
  - Definition Classes: AnyRef
- def toString(): String
  - Definition Classes: DeltaBucketAssigner → AnyRef → Any
  - Annotations: @Override()
- final def wait(): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()