public class UnmaterializableRecordCounter extends Object
These errors are meant to be recoverable record-conversion errors, such as a union missing a value or a schema mismatch. The counter is not meant to recover from corruption in the Parquet columns themselves. The intention is to skip over very rare file corruption, or bugs where the write path has allowed invalid records into the file, while still catching large numbers of failures. This behavior is off by default: unless a threshold is configured, no errors are tolerated.
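The thresholding idea described above can be sketched in plain Java. This is a hypothetical stand-in for illustration, not the real Parquet class: it assumes the counter tracks the number of failed records and fails the read once the observed error fraction exceeds the configured threshold.

```java
// Minimal sketch of the error-thresholding mechanism (hypothetical class,
// not the real org.apache.parquet UnmaterializableRecordCounter):
// tolerate occasional record-materialization failures, but fail the read
// once the fraction of bad records exceeds the configured threshold.
class BadRecordCounter {
    private final double errorThreshold; // allowed fraction of bad records
    private final long totalNumRecords;  // total records in the file/split
    private long numErrors = 0;

    BadRecordCounter(double errorThreshold, long totalNumRecords) {
        this.errorThreshold = errorThreshold;
        this.totalNumRecords = totalNumRecords;
    }

    /** Count one failed record; throw once the tolerated fraction is exceeded. */
    void incErrors(Exception cause) {
        numErrors++;
        double errRate = (double) numErrors / totalNumRecords;
        if (errRate > errorThreshold) {
            throw new RuntimeException(
                "too many record errors: " + numErrors + " of " + totalNumRecords, cause);
        }
    }
}
```

With a threshold of 0.0 (the documented default), the first call to `incErrors` already pushes the error rate above the threshold, so no errors are tolerated.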
| Modifier and Type | Field and Description |
|---|---|
| `static String` | `BAD_RECORD_THRESHOLD_CONF_KEY` |
| Constructor and Description |
|---|
| `UnmaterializableRecordCounter(org.apache.hadoop.conf.Configuration conf, long totalNumRecords)` |
| `UnmaterializableRecordCounter(double errorThreshold, long totalNumRecords)` |
| `UnmaterializableRecordCounter(ParquetReadOptions options, long totalNumRecords)` |
| Modifier and Type | Method and Description |
|---|---|
| `void` | `incErrors(RecordMaterializer.RecordMaterializationException cause)` |
public static final String BAD_RECORD_THRESHOLD_CONF_KEY
public UnmaterializableRecordCounter(org.apache.hadoop.conf.Configuration conf, long totalNumRecords)
public UnmaterializableRecordCounter(ParquetReadOptions options, long totalNumRecords)
public UnmaterializableRecordCounter(double errorThreshold, long totalNumRecords)
public void incErrors(RecordMaterializer.RecordMaterializationException cause) throws ParquetDecodingException
Throws: `ParquetDecodingException`

Copyright © 2023 The Apache Software Foundation. All rights reserved.