public interface ParquetHandler extends FileHandler
| Modifier and Type | Method and Description |
|---|---|
| CloseableIterator<FileDataReadResult> | readParquetFiles(CloseableIterator<FileReadContext> fileIter, StructType physicalSchema) Read the Parquet format files at the given locations and return the data as a ColumnarBatch with the columns requested by physicalSchema. |
Methods inherited from interface FileHandler: contextualizeFileReads

CloseableIterator<FileDataReadResult> readParquetFiles(CloseableIterator<FileReadContext> fileIter, StructType physicalSchema) throws java.io.IOException

Read the Parquet format files at the given locations and return the data as a ColumnarBatch with the columns requested by physicalSchema.

If physicalSchema has a StructField with column name StructField.ROW_INDEX_COLUMN_NAME and the field is a metadata column (StructField.isMetadataColumn()), the column must be populated with the file row index.

Parameters:
fileIter - Iterator of FileReadContext objects to read data from.
physicalSchema - Select list of columns to read from the Parquet file.

Returns:
An iterator of FileDataReadResults containing the data in columnar format and the corresponding scan file information. It is the responsibility of the caller to close the iterator. The data is returned in the same order as the files given in fileIter.

Throws:
java.io.IOException - if an error occurs during the read.
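
The contract described above (results come back in the order of fileIter, and the caller closes the returned iterator) can be illustrated with a minimal, self-contained sketch. Every type below is a simplified stand-in written for this example, not the real Kernel class: the actual FileReadContext and FileDataReadResult carry more information, the real method takes a StructType rather than the hypothetical rowsPerFile parameter, and a real handler would read lazily from actual Parquet files.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReadParquetFilesSketch {
    // Simplified stand-in for Kernel's CloseableIterator (assumption, not the real type).
    interface CloseableIterator<T> extends Iterator<T>, Closeable {}

    // Stand-in for FileReadContext: here it only carries a file path.
    static final class FileReadContext {
        final String path;
        FileReadContext(String path) { this.path = path; }
    }

    // Stand-in for FileDataReadResult: pairs the scan file info with its data.
    static final class FileDataReadResult {
        final FileReadContext context;
        final long[] rowIndexes; // plays the role of the row index metadata column
        FileDataReadResult(FileReadContext context, long[] rowIndexes) {
            this.context = context;
            this.rowIndexes = rowIndexes;
        }
    }

    // Wraps a list as a CloseableIterator and records whether close() was called.
    static final class ListBackedIterator<T> implements CloseableIterator<T> {
        private final Iterator<T> delegate;
        boolean closed = false;
        ListBackedIterator(List<T> items) { this.delegate = items.iterator(); }
        @Override public boolean hasNext() { return delegate.hasNext(); }
        @Override public T next() { return delegate.next(); }
        @Override public void close() { closed = true; }
    }

    // Toy handler: fakes a read of rowsPerFile rows per file. It reads eagerly
    // for brevity and emits one result per file, in the order given by fileIter.
    static ListBackedIterator<FileDataReadResult> readParquetFiles(
            CloseableIterator<FileReadContext> fileIter, int rowsPerFile) throws IOException {
        List<FileDataReadResult> results = new ArrayList<>();
        try {
            while (fileIter.hasNext()) {
                long[] rowIndexes = new long[rowsPerFile];
                for (int i = 0; i < rowsPerFile; i++) {
                    rowIndexes[i] = i; // file row index restarts at 0 for every file
                }
                results.add(new FileDataReadResult(fileIter.next(), rowIndexes));
            }
        } finally {
            fileIter.close();
        }
        return new ListBackedIterator<>(results);
    }

    // Caller side: try-with-resources closes the iterator, as the contract requires.
    static List<String> consume(ListBackedIterator<FileDataReadResult> results) {
        List<String> pathsInOrder = new ArrayList<>();
        try (ListBackedIterator<FileDataReadResult> it = results) {
            while (it.hasNext()) {
                pathsInOrder.add(it.next().context.path);
            }
        }
        return pathsInOrder;
    }

    public static void main(String[] args) throws IOException {
        ListBackedIterator<FileReadContext> files = new ListBackedIterator<>(Arrays.asList(
                new FileReadContext("a.parquet"), new FileReadContext("b.parquet")));
        System.out.println(consume(readParquetFiles(files, 3)));
    }
}
```

Wrapping the returned iterator in try-with-resources is one way for a caller to meet the close requirement even when iteration fails partway through.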