package parquet
Type Members
- class GeoParquetFileFormat extends ParquetFileFormat with GeoParquetFileFormatBase with FileFormat with DataSourceRegister with Logging with Serializable
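Because this file format mixes in DataSourceRegister, it is addressable through the standard DataFrame reader/writer API by its registered short name. A minimal usage sketch, assuming the short name is "geoparquet" and that Sedona is on the classpath (the paths are placeholders):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("geoparquet-sketch")
  .master("local[*]")
  .getOrCreate()

// Read a GeoParquet file; geometry columns are decoded into Spark geometry values.
val df = spark.read.format("geoparquet").load("/path/to/data.parquet")

// Write the DataFrame back out as GeoParquet.
df.write.format("geoparquet").save("/path/to/output")
```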
- class GeoParquetFilters extends AnyRef
Utility functions for converting Spark data source filters to Parquet filters.
- class GeoParquetReadSupport extends ParquetReadSupport with Logging
A Parquet ReadSupport implementation for reading Parquet records as Catalyst InternalRows.
The API interface of ReadSupport is a bit over-complicated for historical reasons. In older versions of parquet-mr (say 1.6.0rc3 and prior), ReadSupport needed to be instantiated and initialized twice, on both the driver side and the executor side: the init() method handled driver-side initialization, while prepareForRead() handled the executor side. Starting from parquet-mr 1.6.0, however, this is no longer the case, and ReadSupport is only instantiated and initialized on the executor side. So, theoretically, these two methods could now be combined into a single initialization method. The only reason (I could think of) to still have them here is for parquet-mr API backwards-compatibility.
For this reason, we no longer rely on ReadContext to pass the requested schema from init() to prepareForRead(), but use a private var for simplicity.
- class GeoParquetRecordMaterializer extends RecordMaterializer[InternalRow]
A RecordMaterializer for Catalyst rows.
- class GeoParquetToSparkSchemaConverter extends AnyRef
This converter class is used to convert Parquet MessageType to Spark SQL StructType.
Parquet format backwards-compatibility rules are respected when converting Parquet MessageType schemas.
- See also
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md
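As a hedged sketch of the conversion this class performs: a raw Parquet MessageType can be parsed with parquet-mr's MessageTypeParser and then handed to the converter. The converter's constructor arguments and the convert method name below are assumptions mirroring Spark's analogous ParquetToSparkSchemaConverter, not confirmed by this page:

```scala
import org.apache.parquet.schema.MessageTypeParser

// A raw Parquet schema with a WKB-encoded geometry column stored as BINARY.
val parquetSchema = MessageTypeParser.parseMessageType(
  """message spark_schema {
    |  optional int64 id;
    |  optional binary geometry;
    |}
    |""".stripMargin)

// Assumed call shape, mirroring Spark's ParquetToSparkSchemaConverter
// (constructing `converter` is omitted; its signature is not shown on this page):
// val sparkSchema: StructType = converter.convert(parquetSchema)
```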
- class GeoParquetWriteSupport extends WriteSupport[InternalRow] with Logging
A Parquet WriteSupport implementation that writes Catalyst InternalRows as Parquet messages. This class can write Parquet data in two modes:
- Standard mode: Parquet data are written in standard format defined in parquet-format spec.
- Legacy mode: Parquet data are written in legacy format compatible with Spark 1.4 and prior.
This behavior can be controlled by the SQL option spark.sql.parquet.writeLegacyFormat. The value of this option is propagated to this class by the init() method and its Hadoop configuration argument.
- class SparkToGeoParquetSchemaConverter extends SparkToParquetSchemaConverter
This converter class is used to convert Spark SQL StructType to Parquet MessageType.
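The two write modes described for GeoParquetWriteSupport are toggled through the spark.sql.parquet.writeLegacyFormat option named above. A usage sketch, assuming the data source short name "geoparquet" and a placeholder output path:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("write-mode-sketch")
  .master("local[*]")
  .getOrCreate()

// false -> standard mode (parquet-format spec);
// true  -> legacy mode (compatible with Spark 1.4 and prior).
spark.conf.set("spark.sql.parquet.writeLegacyFormat", "false")

// The option value is picked up by GeoParquetWriteSupport.init() via the
// Hadoop configuration when the write is executed.
val df = spark.range(10).toDF("id")
df.write.format("geoparquet").save("/path/to/output")
```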
Value Members
- object GeoDataSourceUtils
- object GeoDateTimeUtils
- object GeoParquetFileFormat extends Logging with Serializable
- object GeoParquetReadSupport extends Logging
- object GeoParquetUtils
- object GeoParquetWriteSupport
- object GeoSchemaMergeUtils