Type Parameters:
T - The type of objects written by the constructed ParquetWriter.
SELF - The type of this builder that is returned by builder methods.

public abstract static class ParquetWriter.Builder<T,SELF extends ParquetWriter.Builder<T,SELF>> extends Object

| Modifier | Constructor and Description |
|---|---|
| protected | Builder(OutputFile path) |
| protected | Builder(org.apache.hadoop.fs.Path path) |

| Modifier and Type | Method and Description |
|---|---|
| ParquetWriter<T> | build()<br>Build a ParquetWriter with the accumulated configuration. |
| SELF | config(String property, String value)<br>Set a property that will be available to the read path. |
| SELF | enableDictionaryEncoding()<br>Enables dictionary encoding for the constructed writer. |
| SELF | enablePageWriteChecksum()<br>Enables writing page level checksums for the constructed writer. |
| SELF | enableValidation()<br>Enables validation for the constructed writer. |
| protected abstract WriteSupport<T> | getWriteSupport(org.apache.hadoop.conf.Configuration conf) |
| protected abstract SELF | self() |
| SELF | withBloomFilterEnabled(boolean enabled)<br>Enables or disables writing bloom filters. |
| SELF | withBloomFilterEnabled(String columnPath, boolean enabled)<br>Enables or disables the bloom filter for the specified column. |
| SELF | withBloomFilterNDV(String columnPath, long ndv)<br>Sets the NDV (number of distinct values) for the specified column. |
| SELF | withByteStreamSplitEncoding(boolean enableByteStreamSplit) |
| SELF | withCompressionCodec(org.apache.parquet.hadoop.metadata.CompressionCodecName codecName)<br>Set the compression codec used by the constructed writer. |
| SELF | withConf(org.apache.hadoop.conf.Configuration conf)<br>Set the Configuration used by the constructed writer. |
| SELF | withDictionaryEncoding(boolean enableDictionary)<br>Enable or disable dictionary encoding for the constructed writer. |
| SELF | withDictionaryEncoding(String columnPath, boolean enableDictionary)<br>Enable or disable dictionary encoding of the specified column for the constructed writer. |
| SELF | withDictionaryPageSize(int dictionaryPageSize)<br>Set the Parquet format dictionary page size used by the constructed writer. |
| SELF | withEncryption(FileEncryptionProperties encryptionProperties)<br>Set the file encryption properties used by the constructed writer. |
| SELF | withMaxPaddingSize(int maxPaddingSize)<br>Set the maximum amount of padding, in bytes, that will be used to align row groups with blocks in the underlying filesystem. |
| SELF | withPageRowCountLimit(int rowCount)<br>Sets the Parquet format page row count limit used by the constructed writer. |
| SELF | withPageSize(int pageSize)<br>Set the Parquet format page size used by the constructed writer. |
| SELF | withPageWriteChecksumEnabled(boolean enablePageWriteChecksum)<br>Enables or disables writing page level checksums for the constructed writer. |
| SELF | withRowGroupSize(int rowGroupSize)<br>Set the Parquet format row group size used by the constructed writer. |
| SELF | withValidation(boolean enableValidation)<br>Enable or disable validation for the constructed writer. |
| SELF | withWriteMode(ParquetFileWriter.Mode mode)<br>Set the write mode used when creating the backing file for this writer. |
| SELF | withWriterVersion(ParquetProperties.WriterVersion version)<br>Set the format version used by the constructed writer. |

protected Builder(org.apache.hadoop.fs.Path path)

protected Builder(OutputFile path)

protected abstract SELF self()
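
The SELF type parameter together with the abstract self() method follows the self-typed (recursive generic) builder idiom: setters declared on the base class return SELF, so a chain of calls keeps the concrete builder type. The sketch below illustrates the idiom with stand-in classes only. SketchBuilder, TextSketchBuilder, create, and withPrefix are hypothetical names for illustration and are not part of Parquet.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical base builder with the same generic shape as Builder<T, SELF>.
abstract class SketchBuilder<T, SELF extends SketchBuilder<T, SELF>> {
    protected final Map<String, String> settings = new LinkedHashMap<>();

    // Subclasses return `this`, typed as the concrete builder.
    protected abstract SELF self();

    // Subclasses decide how the final object is produced (hypothetical hook).
    protected abstract T create(Map<String, String> accumulated);

    // Shared setter: returning SELF keeps subclass-only methods reachable in a chain.
    public SELF config(String property, String value) {
        settings.put(property, value);
        return self();
    }

    public T build() {
        return create(settings);
    }
}

// Hypothetical concrete builder that adds one setter of its own.
class TextSketchBuilder extends SketchBuilder<String, TextSketchBuilder> {
    private String prefix = "";

    @Override
    protected TextSketchBuilder self() {
        return this;
    }

    public TextSketchBuilder withPrefix(String prefix) {
        this.prefix = prefix;
        return this;
    }

    @Override
    protected String create(Map<String, String> accumulated) {
        return prefix + accumulated;
    }
}

class SelfTypedBuilderDemo {
    public static void main(String[] args) {
        // config() is declared on the base class, yet withPrefix() still chains after it.
        String result = new TextSketchBuilder()
                .config("a", "1")
                .withPrefix("cfg=")
                .build();
        System.out.println(result); // prints cfg={a=1}
    }
}
```

Without the SELF parameter, config() would return the base type and the call to withPrefix() would not compile without a cast.
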
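
protected abstract WriteSupport<T> getWriteSupport(org.apache.hadoop.conf.Configuration conf)
Parameters:
conf - a configuration

public SELF withConf(org.apache.hadoop.conf.Configuration conf)
Set the Configuration used by the constructed writer.
Parameters:
conf - a Configuration

public SELF withWriteMode(ParquetFileWriter.Mode mode)
Set the write mode used when creating the backing file for this writer.
Parameters:
mode - a ParquetFileWriter.Mode

public SELF withCompressionCodec(org.apache.parquet.hadoop.metadata.CompressionCodecName codecName)
Set the compression codec used by the constructed writer.
Parameters:
codecName - a CompressionCodecName

public SELF withEncryption(FileEncryptionProperties encryptionProperties)
Set the file encryption properties used by the constructed writer.
Parameters:
encryptionProperties - a FileEncryptionProperties

public SELF withRowGroupSize(int rowGroupSize)
Set the Parquet format row group size used by the constructed writer.
Parameters:
rowGroupSize - an integer size in bytes

public SELF withPageSize(int pageSize)
Set the Parquet format page size used by the constructed writer.
Parameters:
pageSize - an integer size in bytes

public SELF withPageRowCountLimit(int rowCount)
Sets the Parquet format page row count limit used by the constructed writer.
Parameters:
rowCount - limit for the number of rows stored in a page

public SELF withDictionaryPageSize(int dictionaryPageSize)
Set the Parquet format dictionary page size used by the constructed writer.
Parameters:
dictionaryPageSize - an integer size in bytes

public SELF withMaxPaddingSize(int maxPaddingSize)
Set the maximum amount of padding, in bytes, that will be used to align row groups with blocks in the underlying filesystem.
Parameters:
maxPaddingSize - an integer size in bytes

public SELF enableDictionaryEncoding()
Enables dictionary encoding for the constructed writer.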
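
One way to picture the withMaxPaddingSize setting: before a new row group starts, the writer checks how many bytes remain in the current filesystem block and pads up to the block boundary only when that gap is no larger than the configured maximum. The sketch below is a simplified model under that assumption and is not Parquet's implementation. PaddingSketch and paddingFor are hypothetical names, and the block size of 1024 is chosen only to keep the numbers readable.

```java
class PaddingSketch {
    // Returns the number of padding bytes to write at `position` so that the next
    // row group starts on a block boundary, or 0 if no padding should be written.
    static long paddingFor(long position, long blockSize, int maxPaddingSize) {
        long offsetInBlock = position % blockSize;
        if (offsetInBlock == 0) {
            return 0; // already aligned
        }
        long remaining = blockSize - offsetInBlock;
        // Pad only when the gap to the boundary fits within the configured maximum.
        return remaining <= maxPaddingSize ? remaining : 0;
    }

    public static void main(String[] args) {
        long blockSize = 1024;
        int maxPadding = 64;
        System.out.println(paddingFor(1000, blockSize, maxPadding)); // 24 bytes left in block: pad
        System.out.println(paddingFor(100, blockSize, maxPadding));  // 924 bytes left: do not pad
        System.out.println(paddingFor(2048, blockSize, maxPadding)); // on a boundary: nothing to do
    }
}
```

In this model a larger maximum aligns row groups with blocks more often, at the cost of more wasted bytes in the file.
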

public SELF withDictionaryEncoding(boolean enableDictionary)
Enable or disable dictionary encoding for the constructed writer.
Parameters:
enableDictionary - whether dictionary encoding should be enabled

public SELF withByteStreamSplitEncoding(boolean enableByteStreamSplit)
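
withByteStreamSplitEncoding carries no description on this page. As background, the Parquet format's BYTE_STREAM_SPLIT encoding scatters the bytes of each fixed-width value into separate streams (byte 0 of every value, then byte 1 of every value, and so on), which tends to make floating-point data more compressible. The sketch below shows that layout for 4-byte floats. It illustrates the transformation only and is not the library's encoder; ByteStreamSplitSketch is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class ByteStreamSplitSketch {
    // Output layout: [all byte 0s][all byte 1s][all byte 2s][all byte 3s],
    // taking the bytes of each float in little-endian order.
    static byte[] encode(float[] values) {
        int n = values.length;
        byte[] out = new byte[n * Float.BYTES];
        ByteBuffer scratch = ByteBuffer.allocate(Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < n; i++) {
            scratch.putFloat(0, values[i]);
            for (int b = 0; b < Float.BYTES; b++) {
                out[b * n + i] = scratch.get(b);
            }
        }
        return out;
    }

    // Reverses encode(): gathers byte b of value i from stream b, position i.
    static float[] decode(byte[] encoded) {
        int n = encoded.length / Float.BYTES;
        float[] values = new float[n];
        ByteBuffer scratch = ByteBuffer.allocate(Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < n; i++) {
            for (int b = 0; b < Float.BYTES; b++) {
                scratch.put(b, encoded[b * n + i]);
            }
            values[i] = scratch.getFloat(0);
        }
        return values;
    }

    public static void main(String[] args) {
        float[] input = {1.0f, 2.0f};
        byte[] encoded = encode(input);
        System.out.println(Arrays.toString(encoded)); // prints [0, 0, 0, 0, -128, 0, 63, 64]
        System.out.println(Arrays.equals(input, decode(encoded))); // prints true
    }
}
```

The encoding does not shrink the data by itself; the gain comes from the general-purpose compressor seeing similar bytes next to each other.
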
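
public SELF withDictionaryEncoding(String columnPath, boolean enableDictionary)
Enable or disable dictionary encoding of the specified column for the constructed writer.
Parameters:
columnPath - the path of the column (dot-string)
enableDictionary - whether dictionary encoding should be enabled

public SELF enableValidation()
Enables validation for the constructed writer.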
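
withDictionaryEncoding has a writer-wide overload and a per-column overload keyed by a dot-separated column path. A natural reading is that a per-column value takes precedence over the writer-wide default for that column. The sketch below models that lookup under this assumption; ColumnSetting is a hypothetical helper and not a Parquet class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for one setting: a default plus per-column overrides.
class ColumnSetting<V> {
    private V defaultValue;
    private final Map<String, V> perColumn = new HashMap<>();

    ColumnSetting(V defaultValue) {
        this.defaultValue = defaultValue;
    }

    // Models the writer-wide overload, e.g. withDictionaryEncoding(boolean).
    ColumnSetting<V> withDefault(V value) {
        this.defaultValue = value;
        return this;
    }

    // Models the per-column overload, e.g. withDictionaryEncoding(String, boolean).
    ColumnSetting<V> withValue(String columnPath, V value) {
        perColumn.put(columnPath, value);
        return this;
    }

    // Assumed precedence: a per-column value wins, otherwise the default applies.
    V get(String columnPath) {
        return perColumn.getOrDefault(columnPath, defaultValue);
    }
}

class ColumnSettingDemo {
    public static void main(String[] args) {
        ColumnSetting<Boolean> dictionary = new ColumnSetting<>(true)
                .withValue("user.id", false); // dot-string path of a nested column
        System.out.println(dictionary.get("user.id"));   // prints false
        System.out.println(dictionary.get("user.name")); // prints true
    }
}
```
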
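
public SELF withValidation(boolean enableValidation)
Enable or disable validation for the constructed writer.
Parameters:
enableValidation - whether validation should be enabled

public SELF withWriterVersion(ParquetProperties.WriterVersion version)
Set the format version used by the constructed writer.
Parameters:
version - a WriterVersion

public SELF enablePageWriteChecksum()
Enables writing page level checksums for the constructed writer.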
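
Page level checksums let a reader detect pages that were corrupted after they were written. The Parquet format defines the page checksum as a standard CRC-32, and the sketch below computes one with the JDK's java.util.zip.CRC32 over a stand-in byte array. Treating these bytes as a page is an assumption for illustration; which bytes the writer actually checksums is defined by the format and the library, not by this sketch. PageChecksumSketch is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class PageChecksumSketch {
    // Computes a CRC-32 over the given bytes.
    static long checksum(byte[] pageBytes) {
        CRC32 crc = new CRC32();
        crc.update(pageBytes, 0, pageBytes.length);
        return crc.getValue();
    }

    // A reader-side check: recompute and compare with the stored value.
    static boolean verify(byte[] pageBytes, long storedChecksum) {
        return checksum(pageBytes) == storedChecksum;
    }

    public static void main(String[] args) {
        byte[] page = "123456789".getBytes(StandardCharsets.US_ASCII); // stand-in for page data
        long stored = checksum(page);
        System.out.println(Long.toHexString(stored)); // prints cbf43926

        page[0] ^= 0x01; // simulate a single flipped bit
        System.out.println(verify(page, stored)); // prints false
    }
}
```
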

public SELF withPageWriteChecksumEnabled(boolean enablePageWriteChecksum)
Enables or disables writing page level checksums for the constructed writer.
Parameters:
enablePageWriteChecksum - whether page checksums should be written out

public SELF withBloomFilterNDV(String columnPath, long ndv)
Sets the NDV (number of distinct values) for the specified column.
Parameters:
columnPath - the path of the column (dot-string)
ndv - the NDV of the column

public SELF withBloomFilterEnabled(boolean enabled)
Enables or disables writing bloom filters.
Parameters:
enabled - whether to write bloom filters

public SELF withBloomFilterEnabled(String columnPath, boolean enabled)
Enables or disables the bloom filter for the specified column. See withBloomFilterEnabled(boolean).
Parameters:
columnPath - the path of the column (dot-string)
enabled - whether to write bloom filter for the column

public SELF config(String property, String value)
Set a property that will be available to the read path.
Parameters:
property - a String property name
value - a String property value

public ParquetWriter<T> build() throws IOException
Build a ParquetWriter with the accumulated configuration.
Returns:
a ParquetWriter instance.
Throws:
IOException - if there is an error while creating the writer

Copyright © 2021 The Apache Software Foundation. All rights reserved.