public class AvroParquetOutputFormat<T> extends ParquetOutputFormat<T>

An OutputFormat for Parquet files.

Nested classes inherited from class ParquetOutputFormat:
ParquetOutputFormat.JobSummaryLevel

Fields inherited from class ParquetOutputFormat:
ADAPTIVE_BLOOM_FILTER_ENABLED, BLOCK_SIZE, BLOOM_FILTER_CANDIDATES_NUMBER, BLOOM_FILTER_ENABLED, BLOOM_FILTER_EXPECTED_NDV, BLOOM_FILTER_FPP, BLOOM_FILTER_MAX_BYTES, COLUMN_INDEX_TRUNCATE_LENGTH, COMPRESSION, DICTIONARY_PAGE_SIZE, ENABLE_DICTIONARY, ENABLE_JOB_SUMMARY, ESTIMATE_PAGE_SIZE_CHECK, JOB_SUMMARY_LEVEL, MAX_PADDING_BYTES, MAX_ROW_COUNT_FOR_PAGE_SIZE_CHECK, MEMORY_POOL_RATIO, MIN_MEMORY_ALLOCATION, MIN_ROW_COUNT_FOR_PAGE_SIZE_CHECK, PAGE_ROW_COUNT_LIMIT, PAGE_SIZE, PAGE_VALUE_COUNT_THRESHOLD, PAGE_WRITE_CHECKSUM_ENABLED, SIZE_STATISTICS_ENABLED, STATISTICS_ENABLED, STATISTICS_TRUNCATE_LENGTH, VALIDATION, WRITE_SUPPORT_CLASS, WRITER_VERSION

| Constructor and Description |
|---|
| AvroParquetOutputFormat() |
| Modifier and Type | Method and Description |
|---|---|
| static void | setAvroDataSupplier(org.apache.hadoop.mapreduce.Job job, Class<? extends AvroDataSupplier> supplierClass) Sets the AvroDataSupplier class that will be used. |
| static void | setSchema(org.apache.hadoop.mapreduce.Job job, org.apache.avro.Schema schema) Set the Avro schema to use for writing. |
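The two static setters above are typically called while configuring a MapReduce job, before submission. A minimal sketch (assuming parquet-avro and Hadoop MapReduce on the classpath; the record schema, output path, and job name below are illustrative, not part of this page):

```java
import org.apache.avro.Schema;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.parquet.avro.AvroParquetOutputFormat;
import org.apache.parquet.avro.SpecificDataSupplier;

public class WriteAvroToParquetJob {
    public static void main(String[] args) throws Exception {
        // Hypothetical Avro record schema, for illustration only.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
          + "{\"name\":\"name\",\"type\":\"string\"},"
          + "{\"name\":\"age\",\"type\":\"int\"}]}");

        Job job = Job.getInstance(new Configuration(), "avro-to-parquet");

        // Write job output as Parquet via the Avro object model.
        job.setOutputFormatClass(AvroParquetOutputFormat.class);

        // Tell the output format which Avro schema the records conform to.
        AvroParquetOutputFormat.setSchema(job, schema);

        // Optional: choose the GenericData implementation used to
        // deconstruct records (here, the specific-record supplier).
        AvroParquetOutputFormat.setAvroDataSupplier(job, SpecificDataSupplier.class);

        // Output path is an assumption; adjust for your cluster.
        FileOutputFormat.setOutputPath(job, new Path("/tmp/users-parquet"));

        // ... set mapper/reducer classes and submit the job ...
    }
}
```

The mapper would then emit Avro records as output values, which the format's write support deconstructs into Parquet pages.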
Methods inherited from class ParquetOutputFormat:
createEncryptionProperties, getAdaptiveBloomFilterEnabled, getBlockSize, getBlockSize, getBloomFilterEnabled, getBloomFilterMaxBytes, getCompression, getCompression, getDictionaryPageSize, getDictionaryPageSize, getEnableDictionary, getEnableDictionary, getEstimatePageSizeCheck, getJobSummaryLevel, getLongBlockSize, getMaxRowCountForPageSizeCheck, getMemoryManager, getMinRowCountForPageSizeCheck, getOutputCommitter, getPageSize, getPageSize, getPageWriteChecksumEnabled, getRecordWriter, getRecordWriter, getRecordWriter, getRecordWriter, getRecordWriter, getRecordWriter, getSizeStatisticsEnabled, getSizeStatisticsEnabled, getStatisticsEnabled, getStatisticsEnabled, getValidation, getValidation, getValueCountThreshold, getWriterVersion, getWriteSupport, getWriteSupportClass, isCompressionSet, isCompressionSet, setBlockSize, setColumnIndexTruncateLength, setColumnIndexTruncateLength, setCompression, setDictionaryPageSize, setEnableDictionary, setMaxPaddingSize, setMaxPaddingSize, setPageRowCountLimit, setPageRowCountLimit, setPageSize, setPageWriteChecksumEnabled, setPageWriteChecksumEnabled, setSizeStatisticsEnabled, setSizeStatisticsEnabled, setStatisticsEnabled, setStatisticsEnabled, setStatisticsTruncateLength, setValidation, setValidation, setWriteSupportClass, setWriteSupportClass

Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat:
checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCompressorClass, getOutputName, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputName, setOutputPath

Method Detail

setSchema

public static void setSchema(org.apache.hadoop.mapreduce.Job job,
                             org.apache.avro.Schema schema)

Set the Avro schema to use for writing.

Parameters:
job - a job
schema - a schema for the data that will be written

See Also:
AvroParquetInputFormat.setAvroReadSchema(org.apache.hadoop.mapreduce.Job, org.apache.avro.Schema)

setAvroDataSupplier

public static void setAvroDataSupplier(org.apache.hadoop.mapreduce.Job job,
                                       Class<? extends AvroDataSupplier> supplierClass)

Sets the AvroDataSupplier class that will be used. The data supplier provides instances of GenericData that are used to deconstruct records.

Parameters:
job - a Job to configure
supplierClass - a supplier class

Copyright © 2024 The Apache Software Foundation. All rights reserved.