public interface JsonHandler extends FileHandler
Provides JSON handling functionality to Delta Kernel: parsing JSON strings into Row or reading content from JSON files.
Connectors can leverage this interface to provide their best implementation of the JSON parsing
capability to
Delta Kernel.

| Modifier and Type | Method and Description |
|---|---|
| ColumnarBatch | parseJson(ColumnVector jsonStringVector, StructType outputSchema): Parse the given JSON strings and return the fields requested by outputSchema as columns in a ColumnarBatch. |
| CloseableIterator<FileDataReadResult> | readJsonFiles(CloseableIterator<FileReadContext> fileIter, StructType physicalSchema): Read and parse the JSON format files at the given locations and return the data as a ColumnarBatch with the columns requested by physicalSchema. |
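To make the parseJson contract concrete, here is a minimal sketch that uses only the Java standard library. It does not use the Delta Kernel types: plain lists stand in for ColumnVector, StructType, and ColumnarBatch, and the class name, the parseFlat helper, and the regex-based parser are all hypothetical. The parser is a toy that only understands flat objects with scalar values and keeps every value as a string. The sketch shows the documented shape of the result: one entry per input string in every requested column, with null where a requested field is absent.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class ParseJsonContractSketch {
    // Toy pattern for flat objects only: "key": "string" | number | true | false | null.
    // Nested objects, arrays, and escaped quotes are NOT handled.
    private static final Pattern FIELD = Pattern.compile(
        "\"([^\"]*)\"\\s*:\\s*(\"([^\"]*)\"|-?\\d+(?:\\.\\d+)?|true|false|null)");

    /** Hypothetical helper: extracts the scalar fields of one flat JSON object as strings. */
    static Map<String, String> parseFlat(String json) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(json);
        while (m.find()) {
            String raw = m.group(2);
            if (raw.equals("null")) {
                fields.put(m.group(1), null);            // JSON null literal
            } else {
                // group(3) is set only for quoted strings; otherwise keep the raw token.
                fields.put(m.group(1), m.group(3) != null ? m.group(3) : raw);
            }
        }
        return fields;
    }

    /**
     * Stand-in for parseJson: returns one column per requested field, each with
     * one entry per input string. Missing fields become null.
     */
    static Map<String, List<String>> parseJson(List<String> jsonStrings, List<String> outputFields) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        for (String field : outputFields) {
            columns.put(field, new ArrayList<>());
        }
        for (String json : jsonStrings) {
            Map<String, String> parsed = parseFlat(json);
            for (String field : outputFields) {
                columns.get(field).add(parsed.get(field)); // null when the field is absent
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<String> input = List.of(
            "{\"path\": \"a.json\", \"size\": 10}",
            "{\"path\": \"b.json\"}");
        // prints {path=[a.json, b.json], size=[10, null]}
        System.out.println(parseJson(input, List.of("path", "size")));
    }
}
```

A real connector would delegate to a full JSON library and convert each value to the type declared in the schema; the sketch deliberately skips both.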
Methods inherited from interface FileHandler: contextualizeFileReads

ColumnarBatch parseJson(ColumnVector jsonStringVector, StructType outputSchema)

Parse the given JSON strings and return the fields requested by outputSchema as columns in a ColumnarBatch.

Parameters:
jsonStringVector - String ColumnVector of valid JSON strings.
outputSchema - Schema of the data to return from the parsed JSON. If any requested fields are missing in the JSON string, a null is returned for that particular field in the returned Row. The type of each given field is expected to match the type in the JSON.

Returns:
a ColumnarBatch of schema outputSchema with one row for each entry in jsonStringVector.

CloseableIterator<FileDataReadResult> readJsonFiles(CloseableIterator<FileReadContext> fileIter, StructType physicalSchema) throws java.io.IOException

Read and parse the JSON format files at the given locations and return the data as a ColumnarBatch with the columns requested by physicalSchema.

Parameters:
fileIter - Iterator of FileReadContext objects to read data from.
physicalSchema - Select list of columns to read from the JSON file.

Returns:
an iterator of FileDataReadResults containing the data in columnar format and the corresponding scan file information. It is the responsibility of the caller to close the iterator. The data is returned in the same order as the files given in fileIter.

Throws:
java.io.IOException - if an error occurs during the read.
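
The two obligations in the readJsonFiles contract, ordered results and caller-side closing, can be sketched with the standard library alone. This is not the Delta Kernel implementation: JsonLineIterator and LineResult are hypothetical stand-ins for CloseableIterator and FileDataReadResult, and the sketch assumes each file holds one JSON object per line. The iterator keeps the current file's reader open between calls, which is why the caller has to close it.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public final class ReadJsonFilesSketch {
    /** Hypothetical stand-in for FileDataReadResult: one JSON line plus its source file. */
    static final class LineResult {
        final Path file;
        final String json;
        LineResult(Path file, String json) { this.file = file; this.json = json; }
    }

    /** Hypothetical stand-in for CloseableIterator: lazily reads files in the order given. */
    static final class JsonLineIterator implements Iterator<LineResult>, Closeable {
        private final Iterator<Path> files;
        private BufferedReader reader;   // reader for the file currently being consumed
        private Path current;
        private String nextLine;         // look-ahead buffer
        private boolean closed;

        JsonLineIterator(Iterator<Path> files) { this.files = files; }

        @Override public boolean hasNext() {
            try {
                while (nextLine == null) {
                    if (closed) return false;
                    if (reader == null) {
                        if (!files.hasNext()) return false;
                        current = files.next();
                        reader = Files.newBufferedReader(current);
                    }
                    nextLine = reader.readLine();
                    if (nextLine == null) {          // current file exhausted, move on
                        reader.close();
                        reader = null;
                    }
                }
                return true;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        @Override public LineResult next() {
            if (!hasNext()) throw new NoSuchElementException();
            LineResult result = new LineResult(current, nextLine);
            nextLine = null;
            return result;
        }

        @Override public void close() throws IOException {
            closed = true;
            if (reader != null) { reader.close(); reader = null; }
        }
    }

    /** Caller side: try-with-resources closes the iterator even if iteration stops early. */
    static List<String> readAll(List<Path> files) throws IOException {
        List<String> rows = new ArrayList<>();
        try (JsonLineIterator it = new JsonLineIterator(files.iterator())) {
            while (it.hasNext()) rows.add(it.next().json);
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        Path first = Files.createTempFile("first", ".json");
        Path second = Files.createTempFile("second", ".json");
        Files.write(first, Arrays.asList("{\"id\": 1}", "{\"id\": 2}"));
        Files.write(second, Arrays.asList("{\"id\": 3}"));
        // Rows come back in the order the files were supplied.
        System.out.println(readAll(Arrays.asList(first, second)));
    }
}
```

Wrapping the iterator in try-with-resources is the simplest way for a caller to honor the close requirement, since it also covers the case where iteration is abandoned partway through a file.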