Package io.trino.plugin.hive.parquet
Class ParquetPageSourceFactory
java.lang.Object
    io.trino.plugin.hive.parquet.ParquetPageSourceFactory
- All Implemented Interfaces:
HivePageSourceFactory
public class ParquetPageSourceFactory extends Object implements HivePageSourceFactory
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static HiveColumnHandle | PARQUET_ROW_INDEX_COLUMN | If this object is passed as one of the columns for createPageSource, it will be populated as an additional column containing the index of each row read.
-
Constructor Summary
Constructors
Constructor | Description
ParquetPageSourceFactory(HdfsEnvironment hdfsEnvironment, FileFormatDataSourceStats stats, ParquetReaderConfig config, HiveConfig hiveConfig) |
-
Method Summary
Methods
Modifier and Type | Method | Description
Optional<ReaderPageSource> | createPageSource(org.apache.hadoop.conf.Configuration configuration, ConnectorSession session, org.apache.hadoop.fs.Path path, long start, long length, long estimatedFileSize, Properties schema, List<HiveColumnHandle> columns, TupleDomain<HiveColumnHandle> effectivePredicate, Optional<AcidInfo> acidInfo, OptionalInt bucketNumber, boolean originalFile, AcidTransaction transaction) |
static ReaderPageSource | createPageSource(org.apache.hadoop.fs.Path path, long start, long length, long estimatedFileSize, List<HiveColumnHandle> columns, TupleDomain<HiveColumnHandle> effectivePredicate, boolean useColumnNames, HdfsEnvironment hdfsEnvironment, org.apache.hadoop.conf.Configuration configuration, ConnectorIdentity identity, org.joda.time.DateTimeZone timeZone, FileFormatDataSourceStats stats, ParquetReaderOptions options) | This method is available for other callers to use directly.
static Optional<org.apache.parquet.schema.Type> | getColumnType(HiveColumnHandle column, org.apache.parquet.schema.MessageType messageType, boolean useParquetColumnNames) |
static TupleDomain<org.apache.parquet.column.ColumnDescriptor> | getParquetTupleDomain(Map<List<String>,RichColumnDescriptor> descriptorsByPath, TupleDomain<HiveColumnHandle> effectivePredicate, org.apache.parquet.schema.MessageType fileSchema, boolean useColumnNames) |
static Optional<org.apache.parquet.schema.Type> | getParquetType(org.apache.parquet.schema.GroupType groupType, boolean useParquetColumnNames, HiveColumnHandle column) |
-
-
-
Field Detail
-
PARQUET_ROW_INDEX_COLUMN
public static final HiveColumnHandle PARQUET_ROW_INDEX_COLUMN
If this object is passed as one of the columns for createPageSource, it will be populated as an additional column containing the index of each row read.
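The behaviour described above can be illustrated with a minimal sketch in plain Java. It does not use any Trino types: RowIndexSketch, ROW_INDEX_MARKER and read are hypothetical stand-ins for the factory, PARQUET_ROW_INDEX_COLUMN and createPageSource. The sketch also assumes the index is the zero-based position of the row in the file, which this page does not state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model only; not the Trino implementation.
final class RowIndexSketch {
    // Stand-in for PARQUET_ROW_INDEX_COLUMN: a marker the caller adds to the requested columns.
    static final String ROW_INDEX_MARKER = "$row_index";

    // Returns one output row per file row, with values in the order of `requested`.
    // A requested column equal to the marker is filled with the row's position
    // (assumed zero-based) instead of data from the file.
    static List<List<Object>> read(
            List<String> fileColumns,
            List<List<Object>> fileRows,
            List<String> requested) {
        List<List<Object>> output = new ArrayList<>();
        for (int rowIndex = 0; rowIndex < fileRows.size(); rowIndex++) {
            List<Object> row = new ArrayList<>();
            for (String column : requested) {
                if (column.equals(ROW_INDEX_MARKER)) {
                    row.add((long) rowIndex);
                } else {
                    int position = fileColumns.indexOf(column);
                    // Columns missing from the file are returned as null.
                    row.add(position < 0 ? null : fileRows.get(rowIndex).get(position));
                }
            }
            output.add(row);
        }
        return output;
    }
}
```

In this model, requesting the columns "b" and ROW_INDEX_MARKER yields rows that hold the value of "b" followed by the row's position, so the caller receives the index as an ordinary extra column.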
-
-
Constructor Detail
-
ParquetPageSourceFactory
@Inject public ParquetPageSourceFactory(HdfsEnvironment hdfsEnvironment, FileFormatDataSourceStats stats, ParquetReaderConfig config, HiveConfig hiveConfig)
-
-
Method Detail
-
createPageSource
public Optional<ReaderPageSource> createPageSource(org.apache.hadoop.conf.Configuration configuration, ConnectorSession session, org.apache.hadoop.fs.Path path, long start, long length, long estimatedFileSize, Properties schema, List<HiveColumnHandle> columns, TupleDomain<HiveColumnHandle> effectivePredicate, Optional<AcidInfo> acidInfo, OptionalInt bucketNumber, boolean originalFile, AcidTransaction transaction)
- Specified by:
createPageSource in interface HivePageSourceFactory
-
createPageSource
public static ReaderPageSource createPageSource(org.apache.hadoop.fs.Path path, long start, long length, long estimatedFileSize, List<HiveColumnHandle> columns, TupleDomain<HiveColumnHandle> effectivePredicate, boolean useColumnNames, HdfsEnvironment hdfsEnvironment, org.apache.hadoop.conf.Configuration configuration, ConnectorIdentity identity, org.joda.time.DateTimeZone timeZone, FileFormatDataSourceStats stats, ParquetReaderOptions options)
This method is available for other callers to use directly.
-
getParquetType
public static Optional<org.apache.parquet.schema.Type> getParquetType(org.apache.parquet.schema.GroupType groupType, boolean useParquetColumnNames, HiveColumnHandle column)
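The useParquetColumnNames flag suggests two ways of matching a Hive column to a field in the Parquet schema: by name or by position. The sketch below models that choice with plain Java types only. ColumnLookupSketch and resolve are hypothetical names, and the case-insensitive name match and the use of the column's ordinal index are assumptions that this page does not confirm.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical model only; field names stand in for org.apache.parquet.schema.Type.
final class ColumnLookupSketch {
    // Resolves a column against the file's field list.
    // useColumnNames == true: match by name (assumed case-insensitive).
    // useColumnNames == false: match by the column's ordinal position.
    // An empty Optional means the file has no matching field.
    static Optional<String> resolve(
            List<String> fileFields,
            boolean useColumnNames,
            String columnName,
            int columnIndex) {
        if (useColumnNames) {
            return fileFields.stream()
                    .filter(field -> field.equalsIgnoreCase(columnName))
                    .findFirst();
        }
        if (columnIndex >= 0 && columnIndex < fileFields.size()) {
            return Optional.of(fileFields.get(columnIndex));
        }
        return Optional.empty();
    }
}
```

Returning Optional rather than throwing mirrors the Optional return type in the signature above, which lets a caller treat a column absent from the file as a normal case.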
-
getColumnType
public static Optional<org.apache.parquet.schema.Type> getColumnType(HiveColumnHandle column, org.apache.parquet.schema.MessageType messageType, boolean useParquetColumnNames)
-
getParquetTupleDomain
public static TupleDomain<org.apache.parquet.column.ColumnDescriptor> getParquetTupleDomain(Map<List<String>,RichColumnDescriptor> descriptorsByPath, TupleDomain<HiveColumnHandle> effectivePredicate, org.apache.parquet.schema.MessageType fileSchema, boolean useColumnNames)
-
-