| Class | Description |
|---|---|
| BootstrapBaseFileSplit | Sub-type of FileSplit which encapsulates both skeleton and bootstrap base file splits. |
| BootstrapBaseFileSplit.WrapperFileSplit | Wrapper for FileSplit just to expose the default constructor to the outer class. |
| BootstrapColumnStichingRecordReader | Stitches together the rows returned by two record readers and presents a concatenated view to clients. |
| FileStatusWithBootstrapBaseFile | Sub-type of FileStatus tracking the status of both the skeleton and the bootstrap base file. |
| HiveHoodieTableFileIndex | Implementation of BaseHoodieTableFileIndex for Hive-based query engines. |
| HoodieColumnProjectionUtils | Utility functions copied from Hive ColumnProjectionUtils.java. |
| HoodieCopyOnWriteTableInputFormat | Base implementation of Hive's FileInputFormat allowing for reading of Hudi's Copy-on-Write (COW) tables in various configurations: snapshot mode (reading the table's state as of a particular timestamp, or instant in Hudi's terms); incremental mode (reading the changes made to the table since a particular timestamp, or instant in Hudi's terms); external mode (reading non-Hudi partitions). NOTE: This class is agnostic to the underlying file format of the files being read. |
| HoodieHFileInputFormat | HoodieInputFormat for HUDI datasets which store data in HFile base file format. |
| HoodieHFileRecordReader | |
| HoodieParquetInputFormat | HoodieInputFormat which understands the Hoodie File Structure and filters files based on the Hoodie Mode. |
| HoodieParquetInputFormatBase | PLEASE READ CAREFULLY. NOTE: Hive applies optimizations that depend on checking whether the FileInputFormat implementation inherits from MapredParquetInputFormat. |
| HoodieROTablePathFilter | Path filter with two cases: if the given path is part of a Hoodie table, it accepts ONLY the latest version of each path; if the path is part of a non-Hoodie table, it always accepts. |
| HoodieTableInputFormat | Abstract base class of Hive's FileInputFormat implementations allowing for reading of Hudi's Copy-on-Write (COW) and Merge-on-Read (MOR) tables. |
| InputPathHandler | InputPathHandler takes in a set of input paths and incremental tables list. |
| InputSplitUtils | |
| LocatedFileStatusWithBootstrapBaseFile | Sub-type of FileStatus tracking the status of both the skeleton and the bootstrap base file. |
| PathWithBootstrapFileStatus | Hacky workaround: with the base input format implementations in Hadoop/Hive, additional information has to be encoded in the Path to track the matching external file. |
| RealtimeFileStatus | With the base input format implementations in Hadoop/Hive, additional information has to be encoded in the Path to track base files and log files for realtime reads. |
| RecordReaderValueIterator<K,V> | Provides an Iterator interface to iterate over the value entries read from a record reader. |
| SafeParquetRecordReaderWrapper | Record reader for Parquet. |
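
The column-stitching idea described for BootstrapColumnStichingRecordReader can be illustrated with a minimal, self-contained sketch. It does not use Hudi or Hadoop classes: `StitchingRowIterator` is a hypothetical name, rows are modelled as plain lists, and the column order (skeleton columns first, then base-file columns) is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical illustration, NOT Hudi's implementation: walks two row sources
 * in lock-step and returns each pair of rows as one concatenated row.
 */
final class StitchingRowIterator implements Iterator<List<Object>> {
  private final Iterator<List<Object>> skeletonRows; // assumed: first group of columns
  private final Iterator<List<Object>> baseRows;     // assumed: second group of columns

  StitchingRowIterator(Iterator<List<Object>> skeletonRows, Iterator<List<Object>> baseRows) {
    this.skeletonRows = skeletonRows;
    this.baseRows = baseRows;
  }

  @Override
  public boolean hasNext() {
    boolean left = skeletonRows.hasNext();
    boolean right = baseRows.hasNext();
    // Both sources must yield the same number of rows for stitching to be meaningful.
    if (left != right) {
      throw new IllegalStateException("Row sources have different lengths");
    }
    return left;
  }

  @Override
  public List<Object> next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    // Copy the left row, then append the right row's columns.
    List<Object> stitched = new ArrayList<>(skeletonRows.next());
    stitched.addAll(baseRows.next());
    return stitched;
  }
}
```

Clients of such an iterator see a single row per call to `next()`, without needing to know that the columns came from two separate sources.
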

| Annotation Type | Description |
|---|---|
| UseFileSplitsFromInputFormat | When annotated on an InputFormat, informs the query engines that they should use the FileSplits provided by the input format to execute the queries. |
| UseRecordReaderFromInputFormat | When annotated on an InputFormat, informs the query engines that they should use the RecordReader provided by the input format to execute the queries. |
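
As a rough sketch of how a query engine might honour a marker annotation of this kind, the example below checks for an annotation on a class via reflection. Every name in it (`UsesOwnFileSplits`, `AnnotatedFormat`, `DerivedFormat`, `PlainFormat`, `MarkerAnnotationCheck`) is hypothetical, and the retention and inheritance settings are assumptions; it does not reflect how the Hudi annotations are declared or how any particular engine consumes them.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical marker annotation; RUNTIME retention makes it visible to reflection. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@Inherited
@interface UsesOwnFileSplits {}

/** Hypothetical input format that opts in to supplying its own splits. */
@UsesOwnFileSplits
class AnnotatedFormat {}

/** Subclass picks up the marker because the annotation is declared @Inherited. */
class DerivedFormat extends AnnotatedFormat {}

/** Hypothetical input format without the marker. */
class PlainFormat {}

final class MarkerAnnotationCheck {
  private MarkerAnnotationCheck() {}

  /** Returns true if the given class (or a superclass) carries the marker. */
  static boolean shouldUseFormatSplits(Class<?> inputFormatClass) {
    return inputFormatClass.isAnnotationPresent(UsesOwnFileSplits.class);
  }
}
```

An engine following this pattern would call `MarkerAnnotationCheck.shouldUseFormatSplits(format.getClass())` before deciding whether to compute splits itself or delegate to the input format.
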
Copyright © 2022 The Apache Software Foundation. All rights reserved.