public interface HoodieTableMetadata extends Serializable, AutoCloseable
| Modifier and Type | Field and Description |
|---|---|
| static String | EMPTY_PARTITION_NAME |
| static String | METADATA_TABLE_NAME_SUFFIX |
| static String | NON_PARTITIONED_NAME |
| static String | RECORDKEY_PARTITION_LIST |
| static String | SOLO_COMMIT_TIMESTAMP<br>Timestamp for a commit when the base dataset had not had any commits yet. |
| Modifier and Type | Method and Description |
|---|---|
| static HoodieTableMetadata | create(HoodieEngineContext engineContext, HoodieStorage storage, HoodieMetadataConfig metadataConfig, String datasetBasePath) |
| static HoodieTableMetadata | create(HoodieEngineContext engineContext, HoodieStorage storage, HoodieMetadataConfig metadataConfig, String datasetBasePath, boolean reuse) |
| static FileSystemBackedTableMetadata | createFSBackedTableMetadata(HoodieEngineContext engineContext, HoodieStorage storage, HoodieMetadataConfig metadataConfig, String datasetBasePath) |
| static HoodieBackedTableMetadata | createHoodieBackedTableMetadata(HoodieEngineContext engineContext, HoodieStorage storage, HoodieMetadataConfig metadataConfig, String datasetBasePath, boolean reuse) |
| List<StoragePathInfo> | getAllFilesInPartition(StoragePath partitionPath)<br>Fetch all the files at the given partition path, per the latest snapshot of the metadata. |
| Map<String,List<StoragePathInfo>> | getAllFilesInPartitions(Collection<String> partitionPaths)<br>Fetch all files for given partition paths. |
| List<String> | getAllPartitionPaths()<br>Fetch list of all partition paths, per the latest snapshot of the metadata. |
| Option<BloomFilter> | getBloomFilter(String partitionName, String fileName)<br>Get the bloom filter for the FileID from the metadata table. |
| Map<Pair<String,String>,BloomFilter> | getBloomFilters(List<Pair<String,String>> partitionNameFileNameList)<br>Get bloom filters for files from the metadata table index. |
| Map<Pair<String,String>,HoodieMetadataColumnStats> | getColumnStats(List<Pair<String,String>> partitionNameFileNameList, String columnName)<br>Get column stats for files from the metadata table index. |
| static String | getDatasetBasePath(String metadataTableBasePath)<br>Return the base path of the dataset. |
| static String | getDataTableBasePathFromMetadataTable(String metadataTableBasePath)<br>Returns the base path of the Dataset provided the base-path of the Metadata Table of this Dataset |
| Option<String> | getLatestCompactionTime()<br>Returns the timestamp of the latest compaction. |
| static StoragePath | getMetadataTableBasePath(StoragePath dataTableBasePath)<br>Return the base-path of the Metadata Table for the given Dataset identified by base-path |
| static String | getMetadataTableBasePath(String dataTableBasePath)<br>Return the base-path of the Metadata Table for the given Dataset identified by base-path |
| int | getNumFileGroupsForPartition(MetadataPartitionType partition)<br>Returns the number of shards in a metadata table partition. |
| List<String> | getPartitionPathWithPathPrefixes(List<String> relativePathPrefixes)<br>Fetches all partition paths that are the sub-directories of the list of provided (relative) paths. |
| List<String> | getPartitionPathWithPathPrefixUsingFilterExpression(List<String> relativePathPrefixes, Types.RecordType partitionFields, Expression expression)<br>Retrieve the paths of partitions under the provided sub-directories, and try to filter these partitions using the provided Expression. |
| HoodieData<HoodieRecord<HoodieMetadataPayload>> | getRecordsByKeyPrefixes(List<String> keyPrefixes, String partitionName, boolean shouldLoadInMemory)<br>Fetch records by key prefixes. |
| Option<String> | getSyncedInstantTime()<br>Get the instant time to which the metadata is synced w.r.t data timeline. |
| static boolean | isMetadataTable(String basePath)<br>Returns True if the given path contains a metadata table. |
| Map<String,HoodieRecordGlobalLocation> | readRecordIndex(List<String> recordKeys)<br>Returns the location of record keys which are found in the record index. |
| void | reset()<br>Clear the states of the table metadata. |
Methods inherited from interface java.lang.AutoCloseable: close
static final String METADATA_TABLE_NAME_SUFFIX
static final String SOLO_COMMIT_TIMESTAMP
Timestamp for a commit when the base dataset had not had any commits yet. This is less than even HoodieTimeline.INIT_INSTANT_TS, so that the metadata table can be prepared even before bootstrap is done.
static final String RECORDKEY_PARTITION_LIST
static final String NON_PARTITIONED_NAME
static final String EMPTY_PARTITION_NAME
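static String getMetadataTableBasePath(String dataTableBasePath)
Return the base path of the Metadata Table for the Dataset identified by the given base path.
static StoragePath getMetadataTableBasePath(StoragePath dataTableBasePath)
Return the base path of the Metadata Table for the Dataset identified by the given base path.
static String getDataTableBasePathFromMetadataTable(String metadataTableBasePath)
Returns the base path of the Dataset, given the base path of the Metadata Table of this Dataset.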
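The two getMetadataTableBasePath overloads and getDataTableBasePathFromMetadataTable map between the two base paths in opposite directions. The block below is a minimal, self-contained sketch of that round trip using plain strings. It is not Hudi's implementation: it assumes the metadata table is stored in a `.hoodie/metadata` folder under the data table base path, and the class `MetadataPathSketch` and all of its method names are hypothetical.

```java
// Illustrative only: a hypothetical helper, not part of Hudi.
class MetadataPathSketch {

    // Assumption: the metadata table lives in this folder under the data table base path.
    static final String METADATA_FOLDER = ".hoodie/metadata";

    // Drops a single trailing slash so that "/data/trips/" and "/data/trips" behave alike.
    static String stripTrailingSlash(String path) {
        return path.endsWith("/") ? path.substring(0, path.length() - 1) : path;
    }

    // Data table base path -> metadata table base path.
    static String metadataTableBasePath(String dataTableBasePath) {
        return stripTrailingSlash(dataTableBasePath) + "/" + METADATA_FOLDER;
    }

    // True if the path ends with the assumed metadata folder.
    static boolean isMetadataTable(String basePath) {
        return stripTrailingSlash(basePath).endsWith("/" + METADATA_FOLDER);
    }

    // Metadata table base path -> data table base path.
    static String dataTableBasePath(String metadataTableBasePath) {
        if (!isMetadataTable(metadataTableBasePath)) {
            throw new IllegalArgumentException("Not a metadata table path: " + metadataTableBasePath);
        }
        String trimmed = stripTrailingSlash(metadataTableBasePath);
        // Remove the metadata folder and the slash in front of it.
        return trimmed.substring(0, trimmed.length() - METADATA_FOLDER.length() - 1);
    }

    public static void main(String[] args) {
        String metadataPath = metadataTableBasePath("/data/trips");
        System.out.println(metadataPath);
        System.out.println(isMetadataTable(metadataPath));
        System.out.println(dataTableBasePath(metadataPath));
    }
}
```

Under these assumptions, `metadataTableBasePath("/data/trips")` yields `/data/trips/.hoodie/metadata`, and passing that result to `dataTableBasePath` returns `/data/trips` again.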
static String getDatasetBasePath(String metadataTableBasePath)
Return the base path of the dataset.
Parameters:
metadataTableBasePath - The base path of the metadata table
static boolean isMetadataTable(String basePath)
Returns true if the given path contains a metadata table.
Parameters:
basePath - The base path to check
static HoodieTableMetadata create(HoodieEngineContext engineContext, HoodieStorage storage, HoodieMetadataConfig metadataConfig, String datasetBasePath)
static HoodieTableMetadata create(HoodieEngineContext engineContext, HoodieStorage storage, HoodieMetadataConfig metadataConfig, String datasetBasePath, boolean reuse)
static FileSystemBackedTableMetadata createFSBackedTableMetadata(HoodieEngineContext engineContext, HoodieStorage storage, HoodieMetadataConfig metadataConfig, String datasetBasePath)
static HoodieBackedTableMetadata createHoodieBackedTableMetadata(HoodieEngineContext engineContext, HoodieStorage storage, HoodieMetadataConfig metadataConfig, String datasetBasePath, boolean reuse)
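List<StoragePathInfo> getAllFilesInPartition(StoragePath partitionPath) throws IOException
Fetch all the files at the given partition path, per the latest snapshot of the metadata.
Throws: IOException
List<String> getPartitionPathWithPathPrefixUsingFilterExpression(List<String> relativePathPrefixes, Types.RecordType partitionFields, Expression expression) throws IOException
Retrieve the paths of partitions under the provided sub-directories, and try to filter these partitions using the provided Expression.
Throws: IOException
List<String> getPartitionPathWithPathPrefixes(List<String> relativePathPrefixes) throws IOException
Fetches all partition paths that are the sub-directories of the list of provided (relative) paths.
E.g., a table has 4 partitions: year=2022/month=08/day=30, year=2022/month=08/day=31, year=2022/month=07/day=03, year=2022/month=07/day=04. The relative path "year=2022" returns all partitions, while the relative path "year=2022/month=07" returns only two partitions.
Throws: IOException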
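The prefix example above can be reproduced with a small, self-contained sketch over an in-memory list of partition paths. This is an illustration only, not the code behind getPartitionPathWithPathPrefixes: the class `PartitionPrefixSketch` and its method are hypothetical, and matching on whole directory names (so that "year=202" does not match "year=2022") is an assumption about the intended semantics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a hypothetical helper, not part of Hudi.
class PartitionPrefixSketch {

    // Returns the partitions that equal one of the prefixes or lie in a sub-directory of it.
    static List<String> filterByPrefixes(List<String> allPartitions, List<String> relativePathPrefixes) {
        List<String> matched = new ArrayList<>();
        for (String partition : allPartitions) {
            for (String prefix : relativePathPrefixes) {
                // Assumption: compare on directory boundaries, not raw string prefixes.
                if (partition.equals(prefix) || partition.startsWith(prefix + "/")) {
                    matched.add(partition);
                    break; // do not add the same partition twice
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList(
            "year=2022/month=08/day=30",
            "year=2022/month=08/day=31",
            "year=2022/month=07/day=03",
            "year=2022/month=07/day=04");
        System.out.println(filterByPrefixes(partitions, Arrays.asList("year=2022")));
        System.out.println(filterByPrefixes(partitions, Arrays.asList("year=2022/month=07")));
    }
}
```

With the four partitions from the example, the prefix "year=2022" keeps all four paths and "year=2022/month=07" keeps only the two July paths.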
List<String> getAllPartitionPaths() throws IOException
Fetch list of all partition paths, per the latest snapshot of the metadata.
Throws: IOException
Map<String,List<StoragePathInfo>> getAllFilesInPartitions(Collection<String> partitionPaths) throws IOException
Fetch all files for given partition paths.
Throws: IOException
Option<BloomFilter> getBloomFilter(String partitionName, String fileName) throws HoodieMetadataException
Get the bloom filter for the FileID from the metadata table.
Parameters:
partitionName - Partition name
fileName - File name for which the bloom filter needs to be retrieved
Throws: HoodieMetadataException
Map<Pair<String,String>,BloomFilter> getBloomFilters(List<Pair<String,String>> partitionNameFileNameList) throws HoodieMetadataException
Get bloom filters for files from the metadata table index.
Parameters:
partitionNameFileNameList - List of partition and file name pairs for which bloom filters need to be retrieved
Throws: HoodieMetadataException
Map<Pair<String,String>,HoodieMetadataColumnStats> getColumnStats(List<Pair<String,String>> partitionNameFileNameList, String columnName) throws HoodieMetadataException
Get column stats for files from the metadata table index.
Parameters:
partitionNameFileNameList - List of partition and file name pairs for which column stats need to be retrieved
columnName - Column name for which stats are needed
Throws: HoodieMetadataException
Map<String,HoodieRecordGlobalLocation> readRecordIndex(List<String> recordKeys)
Returns the location of record keys which are found in the record index.
HoodieData<HoodieRecord<HoodieMetadataPayload>> getRecordsByKeyPrefixes(List<String> keyPrefixes, String partitionName, boolean shouldLoadInMemory)
Fetch records by key prefixes.
Parameters:
keyPrefixes - list of key prefixes for which records are looked up
partitionName - partition name in the metadata table where the records are looked up
Returns: HoodieData of HoodieRecords matching the passed-in key prefixes
Option<String> getSyncedInstantTime()
Get the instant time to which the metadata is synced w.r.t. the data timeline.
Option<String> getLatestCompactionTime()
Returns the timestamp of the latest compaction.
void reset()
Clear the states of the table metadata.
int getNumFileGroupsForPartition(MetadataPartitionType partition)
Returns the number of shards in a metadata table partition.
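getRecordsByKeyPrefixes, listed above, returns the records whose keys start with any of the given prefixes. As a rough illustration of what a key-prefix lookup means, the sketch below runs one over an in-memory sorted map. It says nothing about how Hudi stores or scans metadata records; the class `KeyPrefixLookupSketch`, its method, and the sample keys are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative only: a hypothetical in-memory stand-in, not Hudi's metadata table.
class KeyPrefixLookupSketch {

    // Returns the values whose keys start with any of the given prefixes.
    // Overlapping prefixes (e.g. "a" and "ab") would return the same value more than once.
    static List<String> lookupByKeyPrefixes(NavigableMap<String, String> sortedRecords, List<String> keyPrefixes) {
        List<String> matches = new ArrayList<>();
        for (String prefix : keyPrefixes) {
            // In a sorted map, keys sharing a prefix are contiguous: start at the first
            // key >= prefix and stop at the first key that no longer starts with it.
            for (Map.Entry<String, String> entry : sortedRecords.tailMap(prefix, true).entrySet()) {
                if (!entry.getKey().startsWith(prefix)) {
                    break;
                }
                matches.add(entry.getValue());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> records = new TreeMap<>();
        records.put("colA_file1", "stats for colA in file1");
        records.put("colA_file2", "stats for colA in file2");
        records.put("colB_file1", "stats for colB in file1");
        System.out.println(lookupByKeyPrefixes(records, Arrays.asList("colA_")));
    }
}
```

A sorted structure is used here only because it lets the scan stop early; any collection that can be filtered with `startsWith` would illustrate the same lookup.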
Copyright © 2024 The Apache Software Foundation. All rights reserved.