public class AcidUtils extends Object
| Modifier and Type | Class and Description |
|---|---|
| static class | AcidUtils.AcidBaseFileType |
| static class | AcidUtils.AcidOperationalProperties<br>Current syntax for creating full acid transactional tables is any one of the following 3 ways: create table T (a int, b int) stored as orc tblproperties('transactional'='true'). |
| static class | AcidUtils.AnyIdDirFilter |
| static class | AcidUtils.BucketMetaData<br>Represents the bucketId and copy_N suffix. |
| static interface | AcidUtils.Directory |
| static class | AcidUtils.FileInfo<br>A simple wrapper class that stores the information about a base file and its type. |
| static interface | AcidUtils.HdfsDirSnapshot<br>DFS dir listing. |
| static class | AcidUtils.HdfsDirSnapshotImpl |
| static class | AcidUtils.IdFullPathFiler<br>Full recursive PathFilter version of IdPathFilter (filtering files for a given writeId and stmtId). |
| static class | AcidUtils.IdPathFilter |
| static class | AcidUtils.MetaDataFile<br>General facility to place a metadata file into a dir created by an acid/compactor write. |
| static class | AcidUtils.Operation |
| static class | AcidUtils.OrcAcidVersion<br>Logic related to versioning the acid data format. |
| static class | AcidUtils.ParsedBase<br>In addition to AcidUtils.ParsedBaseLight, this knows whether the data is in raw format. |
| static class | AcidUtils.ParsedBaseLight<br>Since version 3 but prior to version 4, the format of a base is "base_X" where X is a writeId. |
| static class | AcidUtils.ParsedDelta<br>In addition to AcidUtils.ParsedDeltaLight, this knows whether the data is in raw format. |
| static class | AcidUtils.ParsedDeltaLight<br>This encapsulates info obtained from the file path. |
| static interface | AcidUtils.ParsedDirectory |
| static class | AcidUtils.TableSnapshot |
| Modifier and Type | Field and Description |
|---|---|
| static org.apache.hadoop.fs.PathFilter | acidHiddenFileFilter |
| static org.apache.hadoop.fs.PathFilter | acidTempDirFilter |
| static String | BASE_PREFIX |
| static org.apache.hadoop.fs.PathFilter | baseFileFilter |
| static String | BUCKET_DIGITS |
| static Pattern | BUCKET_PATTERN |
| static String | BUCKET_PREFIX |
| static org.apache.hadoop.fs.PathFilter | bucketFileFilter |
| static String | COMPACTOR_TABLE_PROPERTY |
| static String | CONF_ACID_KEY |
| static String | DELETE_DELTA_PREFIX |
| static org.apache.hadoop.fs.PathFilter | deleteEventDeltaDirFilter |
| static String | DELTA_DIGITS |
| static String | DELTA_PREFIX |
| static String | DELTA_SIDE_FILE_SUFFIX<br>Acid Streaming Ingest writes multiple transactions to the same file. |
| static org.apache.hadoop.fs.PathFilter | deltaFileFilter |
| static Pattern | LEGACY_BUCKET_DIGIT_PATTERN |
| static String | LEGACY_FILE_BUCKET_DIGITS |
| static int | MAX_STATEMENTS_PER_TXN<br>This must be in sync with STATEMENT_DIGITS. |
| static Pattern | ORIGINAL_PATTERN |
| static Pattern | ORIGINAL_PATTERN_COPY |
| static org.apache.hadoop.fs.PathFilter | originalBucketFilter<br>A write into a non-acid table produces files like 0000_0 or 0000_0_copy_1 (unless via a Load Data statement). |
| static String | STATEMENT_DIGITS<br>10K statements per txn. |
| static Pattern | VISIBILITY_PATTERN |
| static String | VISIBILITY_PREFIX |
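The naming constants above describe fixed-width, lexicographically sortable directory and file names. The following is a minimal sketch of those formats; the `%07d`/`%05d`/`%04d` widths are assumptions inferred from DELTA_DIGITS, BUCKET_DIGITS, and STATEMENT_DIGITS (MAX_STATEMENTS_PER_TXN = 10K matches four statement digits), and the real names are produced by AcidUtils.deltaSubdir, baseDir, and createBucketFile:

```java
// Sketch of ACID directory/file naming. The widths mirror the assumed
// values of DELTA_DIGITS ("%07d"), BUCKET_DIGITS ("%05d") and
// STATEMENT_DIGITS ("%04d"); this does not call into AcidUtils itself.
public class AcidNamingSketch {
    static String deltaSubdir(long min, long max) {
        return String.format("delta_%07d_%07d", min, max);
    }
    static String deltaSubdir(long min, long max, int statementId) {
        // per-statement delta dir within a multi-statement transaction
        return deltaSubdir(min, max) + String.format("_%04d", statementId);
    }
    static String baseDir(long writeId) {
        return String.format("base_%07d", writeId);
    }
    static String bucketFile(int bucket) {
        return String.format("bucket_%05d", bucket);
    }
    public static void main(String[] args) {
        System.out.println(deltaSubdir(5, 5));     // delta_0000005_0000005
        System.out.println(deltaSubdir(5, 5, 1));  // delta_0000005_0000005_0001
        System.out.println(baseDir(10));           // base_0000010
        System.out.println(bucketFile(0));         // bucket_00000
    }
}
```

The zero-padded widths matter: they keep plain string sorting of sibling directories consistent with write-id order.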
| Modifier and Type | Method and Description |
|---|---|
| static boolean | acidTableWithoutTransactions(Table table) |
| static String | addVisibilitySuffix(String baseOrDeltaDir, long visibilityTxnId)<br>Since Hive 4.0, the compactor produces directories with a VISIBILITY_PATTERN suffix. |
| static String | baseDir(long writeId) |
| static String | baseOrDeltaSubdir(boolean baseDirRequired, long min, long max, int statementId)<br>Return a base or delta directory string according to the given "baseDirRequired". |
| static org.apache.hadoop.fs.Path | baseOrDeltaSubdirPath(org.apache.hadoop.fs.Path directory, AcidOutputFormat.Options options)<br>Return a base or delta directory path according to the given "options". |
| static boolean | canBeMadeAcid(String fullTableName, org.apache.hadoop.hive.metastore.api.StorageDescriptor sd) |
| static CompactionState | compactionStateStr2Enum(String inputValue) |
| static org.apache.hadoop.hive.metastore.api.CompactionType | compactionTypeStr2ThriftType(String inputValue) |
| static org.apache.hadoop.fs.Path | createBucketFile(org.apache.hadoop.fs.Path subdir, int bucket)<br>Create the bucket filename in Acid format. |
| static org.apache.hadoop.fs.Path | createBucketFile(org.apache.hadoop.fs.Path subdir, int bucket, Integer attemptId) |
| static org.apache.hadoop.fs.Path | createFilename(org.apache.hadoop.fs.Path directory, AcidOutputFormat.Options options)<br>Create a filename for a bucket file. |
| static String | deleteDeltaSubdir(long min, long max)<br>This is the format of the delete delta dir name prior to Hive 2.2.x. |
| static String | deleteDeltaSubdir(long min, long max, int statementId)<br>Each write statement in a transaction creates its own delete delta dir when the split-update acid operational property is turned on. |
| static String | deltaSubdir(long min, long max)<br>This is the format of the delta dir name prior to Hive 1.3.x. |
| static String | deltaSubdir(long min, long max, int statementId)<br>Each write statement in a transaction creates its own delta dir. |
| static org.apache.hadoop.fs.Path[] | deserializeDeleteDeltas(org.apache.hadoop.fs.Path root, List<AcidInputFormat.DeltaMetaData> deleteDeltas, Map<String,AcidInputFormat.DeltaMetaData> pathToDeltaMetaData)<br>Convert the list of begin/end write id pairs to a list of delete delta directories. |
| static Long | extractWriteId(org.apache.hadoop.fs.Path file) |
| static List<org.apache.hadoop.fs.FileStatus> | getAcidFilesForStats(Table table, org.apache.hadoop.fs.Path dir, org.apache.hadoop.conf.Configuration jc, org.apache.hadoop.fs.FileSystem fs) |
| static AcidUtils.AcidOperationalProperties | getAcidOperationalProperties(org.apache.hadoop.conf.Configuration conf)<br>Returns the acidOperationalProperties for a given configuration. |
| static AcidUtils.AcidOperationalProperties | getAcidOperationalProperties(Map<String,String> parameters)<br>Returns the acidOperationalProperties for a given map. |
| static AcidUtils.AcidOperationalProperties | getAcidOperationalProperties(Properties props)<br>Returns the acidOperationalProperties for a given set of properties. |
| static AcidUtils.AcidOperationalProperties | getAcidOperationalProperties(Table table)<br>Returns the acidOperationalProperties for a given table. |
| static AcidDirectory | getAcidState(org.apache.hadoop.fs.FileSystem fileSystem, org.apache.hadoop.fs.Path candidateDirectory, org.apache.hadoop.conf.Configuration conf, ValidWriteIdList writeIdList, Ref<Boolean> useFileIds, boolean ignoreEmptyFiles)<br>Get the ACID state of the given directory. |
| static AcidDirectory | getAcidState(org.apache.hadoop.fs.FileSystem fileSystem, org.apache.hadoop.fs.Path candidateDirectory, org.apache.hadoop.conf.Configuration conf, ValidWriteIdList writeIdList, Ref<Boolean> useFileIds, boolean ignoreEmptyFiles, Map<org.apache.hadoop.fs.Path,AcidUtils.HdfsDirSnapshot> dirSnapshots)<br>getAcidState implementation which uses the provided dirSnapshots. |
| static AcidDirectory | getAcidStateFromCache(Supplier<org.apache.hadoop.fs.FileSystem> fileSystem, org.apache.hadoop.fs.Path candidateDirectory, org.apache.hadoop.conf.Configuration conf, ValidWriteIdList writeIdList, Ref<Boolean> useFileIds, boolean ignoreEmptyFiles)<br>Tries to get directory details from the cache. |
| static String | getAcidSubDir(org.apache.hadoop.fs.Path dataPath) |
| static List<VirtualColumn> | getAcidVirtualColumns(Table table)<br>Returns the virtual columns needed for update queries. |
| static Map<String,Integer> | getDeltaToAttemptIdMap(Map<String,AcidInputFormat.DeltaMetaData> pathToDeltaMetaData, org.apache.hadoop.fs.Path[] deleteDeltaDirs, int bucket)<br>If direct insert is on for ACID tables, the files will contain an "_attemptId" suffix. |
| static String | getFirstLevelAcidDirPath(org.apache.hadoop.fs.Path dataPath, org.apache.hadoop.fs.FileSystem fileSystem) |
| static String | getFullTableName(String dbName, String tableName) |
| static Map<org.apache.hadoop.fs.Path,AcidUtils.HdfsDirSnapshot> | getHdfsDirSnapshots(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path) |
| static Map<org.apache.hadoop.fs.Path,AcidUtils.HdfsDirSnapshot> | getHdfsDirSnapshotsForCleaner(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)<br>In the case of the cleaner we don't need to go down to file level; it is enough to collect the base/delta/deletedelta directories. |
| static long | getLogicalLength(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus file)<br>See comments at DELTA_SIDE_FILE_SUFFIX. |
| static String | getPartitionName(Map<String,String> partitionSpec) |
| static org.apache.hadoop.fs.Path[] | getPaths(List<AcidUtils.ParsedDelta> deltas)<br>Convert a list of deltas to a list of delta directories. |
| static String | getPathSuffix(long txnId) |
| static AcidUtils.TableSnapshot | getTableSnapshot(org.apache.hadoop.conf.Configuration conf, Table tbl) |
| static AcidUtils.TableSnapshot | getTableSnapshot(org.apache.hadoop.conf.Configuration conf, Table tbl, boolean isStatsUpdater)<br>Note: this is generally called in Hive.java; the callers of Hive.java make sure to set up the acid state during compile, and Hive.java retrieves it if needed. |
| static AcidUtils.TableSnapshot | getTableSnapshot(org.apache.hadoop.conf.Configuration conf, Table tbl, String dbName, String tblName, boolean isStatsUpdater)<br>Note: this is generally called in Hive.java; the callers of Hive.java make sure to set up the acid state during compile, and Hive.java retrieves it if needed. |
| static ValidWriteIdList | getTableValidWriteIdList(org.apache.hadoop.conf.Configuration conf, String fullTableName)<br>Extract the ValidWriteIdList for the given table from the list of the tables' ValidWriteIdLists. |
| static ValidWriteIdList | getTableValidWriteIdListWithTxnList(org.apache.hadoop.conf.Configuration conf, String dbName, String tableName)<br>Returns the ValidWriteIdList for the table with the given "dbName" and "tableName". |
| static org.apache.hadoop.hive.metastore.api.TxnType | getTxnType(org.apache.hadoop.conf.Configuration conf, ASTNode tree)<br>Determines the transaction type based on the query AST. |
| static List<org.apache.hadoop.fs.Path> | getValidDataPaths(org.apache.hadoop.fs.Path dataPath, org.apache.hadoop.conf.Configuration conf, String validWriteIdStr) |
| static ValidTxnWriteIdList | getValidTxnWriteIdList(org.apache.hadoop.conf.Configuration conf)<br>Get the ValidTxnWriteIdList saved in the configuration. |
| static void | initDirCache(int durationInMts) |
| static boolean | isAcid(org.apache.hadoop.fs.FileSystem fileSystem, org.apache.hadoop.fs.Path directory, org.apache.hadoop.conf.Configuration conf) |
| static boolean | isAcid(org.apache.hadoop.fs.Path directory, org.apache.hadoop.conf.Configuration conf)<br>Is the given directory in ACID format? |
| static boolean | isAcidEnabled(HiveConf hiveConf) |
| static boolean | isChildOfDelta(org.apache.hadoop.fs.Path childDir, org.apache.hadoop.fs.Path rootPath) |
| static boolean | isCompactionTable(Map<String,String> parameters)<br>Determine whether a table is used during query-based compaction. |
| static boolean | isCompactionTable(Properties tblProperties)<br>Determine whether a table is used during query-based compaction. |
| static boolean | isDeleteDelta(org.apache.hadoop.fs.Path p) |
| static boolean | isExclusiveCTAS(Set<WriteEntity> outputs, HiveConf conf) |
| static boolean | isExclusiveCTASEnabled(org.apache.hadoop.conf.Configuration conf) |
| static boolean | isFullAcidScan(org.apache.hadoop.conf.Configuration conf) |
| static boolean | isFullAcidTable(CreateTableDesc td) |
| static boolean | isFullAcidTable(Map<String,String> params) |
| static boolean | isFullAcidTable(Table table)<br>Should produce the same result as TxnUtils.isAcidTable(org.apache.hadoop.hive.metastore.api.Table). |
| static boolean | isFullAcidTable(org.apache.hadoop.hive.metastore.api.Table table)<br>Should produce the same result as TxnUtils.isAcidTable(org.apache.hadoop.hive.metastore.api.Table). |
| static boolean | isInsertDelta(org.apache.hadoop.fs.Path p) |
| static boolean | isInsertOnlyFetchBucketId(org.apache.hadoop.conf.Configuration conf) |
| static boolean | isInsertOnlyTable(Map<String,String> params)<br>Checks whether a table is a transactional table that only supports INSERT, but not UPDATE/DELETE. |
| static boolean | isInsertOnlyTable(Properties params) |
| static boolean | isInsertOnlyTable(Table table) |
| static boolean | isLocklessReadsEnabled(Table table, HiveConf conf) |
| static boolean | isNonNativeAcidTable(Table table) |
| static boolean | isRemovedInsertOnlyTable(Set<String> removedSet) |
| static boolean | isTablePropertyTransactional(Map<String,String> parameters) |
| static boolean | isTablePropertyTransactional(Properties props) |
| static boolean | isTableSoftDeleteEnabled(Table table, HiveConf conf) |
| static Boolean | isToFullAcid(Table table, Map<String,String> props) |
| static Boolean | isToInsertOnlyTable(Table tbl, Map<String,String> props)<br>The method for altering table props; may set the table to MM, non-MM, or not affect MM. |
| static boolean | isTransactionalTable(CreateTableDesc table) |
| static boolean | isTransactionalTable(Map<String,String> props) |
| static boolean | isTransactionalTable(Table table) |
| static boolean | isTransactionalTable(org.apache.hadoop.hive.metastore.api.Table table) |
| static boolean | isTransactionalView(CreateMaterializedViewDesc view) |
| static List<org.apache.hadoop.hive.metastore.api.LockComponent> | makeLockComponents(Set<WriteEntity> outputs, Set<ReadEntity> inputs, Context.Operation operation, HiveConf conf)<br>Create lock components from write/read entities. |
| static Integer | parseAttemptId(org.apache.hadoop.fs.Path bucketFile) |
| static AcidOutputFormat.Options | parseBaseOrDeltaBucketFilename(org.apache.hadoop.fs.Path bucketFile, org.apache.hadoop.conf.Configuration conf)<br>Parse a bucket filename back into the options that would have created the file. |
| static int | parseBucketId(org.apache.hadoop.fs.Path bucketFile)<br>Get the bucket id from the file path. |
| static AcidUtils.ParsedDelta | parsedDelta(org.apache.hadoop.fs.Path deltaDir, org.apache.hadoop.fs.FileSystem fs)<br>This will look at the footer of one of the files in the delta to see whether the file is in Acid format. |
| static List<AcidInputFormat.DeltaMetaData> | serializeDeleteDeltas(List<AcidUtils.ParsedDelta> deltas, org.apache.hadoop.fs.FileSystem fs)<br>Convert the list of deltas into an equivalent list of begin/end write id pairs. |
| static void | setAcidOperationalProperties(org.apache.hadoop.conf.Configuration conf, boolean isTxnTable, AcidUtils.AcidOperationalProperties properties)<br>Sets the acidOperationalProperties in the configuration object argument. |
| static void | setAcidOperationalProperties(Map<String,String> parameters, boolean isTxnTable, AcidUtils.AcidOperationalProperties properties)<br>Sets the acidOperationalProperties in the map object argument. |
| static void | setNonTransactional(Map<String,String> tblProps) |
| static void | setValidWriteIdList(org.apache.hadoop.conf.Configuration conf, TableScanDesc tsDesc)<br>Set the valid write id list for the current table scan. |
| static void | setValidWriteIdList(org.apache.hadoop.conf.Configuration conf, ValidWriteIdList validWriteIds)<br>Set the valid write id list for the current table scan. |
| static org.apache.hadoop.hive.metastore.api.DataOperationType | toDataOperationType(AcidUtils.Operation op)<br>Logically this should have been defined in Operation, but that causes a dependency on the metastore package from the exec jar (from the cluster), which is not allowed. |
| static void | tryInvalidateDirCache(org.apache.hadoop.hive.metastore.api.Table table) |
| static void | validateAcidFiles(Table table, org.apache.hadoop.fs.FileStatus[] srcs, org.apache.hadoop.fs.FileSystem fs)<br>Safety check to make sure a file taken from one acid table is not added into another acid table, since the ROW__IDs embedded as part of a write to one table won't make sense in a different table/cluster. |
| static void | validateAcidPartitionLocation(String location, org.apache.hadoop.conf.Configuration conf)<br>Safety check to make sure the given location is not the location of an acid table and that its files will not be added into another acid table. |
public static final String CONF_ACID_KEY
public static final String BASE_PREFIX
public static final String COMPACTOR_TABLE_PROPERTY
public static final org.apache.hadoop.fs.PathFilter baseFileFilter
public static final String DELTA_PREFIX
public static final String DELETE_DELTA_PREFIX
public static final String DELTA_SIDE_FILE_SUFFIX
The OrcAcidUtils.getSideFile(Path) side file stores the length of the primary file as of the last commit (OrcRecordUpdater.flush()); that is the 'logical length'. Once the primary is closed, the side file is deleted (logical length = actual length), but if the writer dies, or the primary file is being read while it is still being written to, anything past the logical length should be ignored.
See also: OrcAcidUtils.DELTA_SIDE_FILE_SUFFIX, OrcAcidUtils.getLastFlushLength(FileSystem, Path), getLogicalLength(FileSystem, FileStatus), Constant Field Values
public static final org.apache.hadoop.fs.PathFilter deltaFileFilter
public static final org.apache.hadoop.fs.PathFilter deleteEventDeltaDirFilter
public static final String BUCKET_PREFIX
public static final org.apache.hadoop.fs.PathFilter bucketFileFilter
public static final String BUCKET_DIGITS
public static final String LEGACY_FILE_BUCKET_DIGITS
public static final String DELTA_DIGITS
public static final String STATEMENT_DIGITS
public static final int MAX_STATEMENTS_PER_TXN
See also: STATEMENT_DIGITS
public static final Pattern LEGACY_BUCKET_DIGIT_PATTERN
public static final Pattern BUCKET_PATTERN
public static final org.apache.hadoop.fs.PathFilter originalBucketFilter
public static final Pattern ORIGINAL_PATTERN
public static final Pattern ORIGINAL_PATTERN_COPY
See also: Utilities.COPY_KEYWORD
public static final org.apache.hadoop.fs.PathFilter acidHiddenFileFilter
public static final org.apache.hadoop.fs.PathFilter acidTempDirFilter
public static final String VISIBILITY_PREFIX
public static final Pattern VISIBILITY_PATTERN
public static org.apache.hadoop.fs.Path createBucketFile(org.apache.hadoop.fs.Path subdir, int bucket)
  subdir - the subdirectory for the bucket
  bucket - the bucket number
public static org.apache.hadoop.fs.Path createBucketFile(org.apache.hadoop.fs.Path subdir, int bucket, Integer attemptId)
public static String deltaSubdir(long min, long max)
public static String deltaSubdir(long min, long max, int statementId)
public static String deleteDeltaSubdir(long min, long max)
public static String deleteDeltaSubdir(long min, long max, int statementId)
public static String baseDir(long writeId)
public static String baseOrDeltaSubdir(boolean baseDirRequired, long min, long max, int statementId)
public static org.apache.hadoop.fs.Path baseOrDeltaSubdirPath(org.apache.hadoop.fs.Path directory, AcidOutputFormat.Options options)
public static org.apache.hadoop.fs.Path createFilename(org.apache.hadoop.fs.Path directory, AcidOutputFormat.Options options)
  directory - the partition directory
  options - the options for writing the bucket
public static String addVisibilitySuffix(String baseOrDeltaDir, long visibilityTxnId)
Since Hive 4.0, the compactor produces directories with a VISIBILITY_PATTERN suffix. _v0 is equivalent to no suffix, for backwards compatibility.
public static boolean isCompactionTable(Properties tblProperties)
  tblProperties - table properties
See also: COMPACTOR_TABLE_PROPERTY
public static boolean isCompactionTable(Map<String,String> parameters)
  parameters - table properties map
See also: COMPACTOR_TABLE_PROPERTY
public static int parseBucketId(org.apache.hadoop.fs.Path bucketFile)
  bucketFile - bucket file path
public static Integer parseAttemptId(org.apache.hadoop.fs.Path bucketFile)
public static AcidOutputFormat.Options parseBaseOrDeltaBucketFilename(org.apache.hadoop.fs.Path bucketFile, org.apache.hadoop.conf.Configuration conf)
  bucketFile - the path to a bucket file
  conf - the configuration
public static Map<String,Integer> getDeltaToAttemptIdMap(Map<String,AcidInputFormat.DeltaMetaData> pathToDeltaMetaData, org.apache.hadoop.fs.Path[] deleteDeltaDirs, int bucket)
public static org.apache.hadoop.hive.metastore.api.DataOperationType toDataOperationType(AcidUtils.Operation op)
public static org.apache.hadoop.fs.Path[] getPaths(List<AcidUtils.ParsedDelta> deltas)
  deltas - the list of deltas out of a Directory object
public static List<AcidInputFormat.DeltaMetaData> serializeDeleteDeltas(List<AcidUtils.ParsedDelta> deltas, org.apache.hadoop.fs.FileSystem fs) throws IOException
Assumes deltas is sorted.
  deltas - sorted delete delta list
  fs - FileSystem
Throws: IOException
public static org.apache.hadoop.fs.Path[] deserializeDeleteDeltas(org.apache.hadoop.fs.Path root, List<AcidInputFormat.DeltaMetaData> deleteDeltas, Map<String,AcidInputFormat.DeltaMetaData> pathToDeltaMetaData)
  root - the root directory
  deleteDeltas - list of begin/end write id pairs
See also: deltaSubdir(long, long, int)
public static AcidUtils.ParsedDelta parsedDelta(org.apache.hadoop.fs.Path deltaDir, org.apache.hadoop.fs.FileSystem fs) throws IOException
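The bucket-file shapes that parseBucketId and parseAttemptId decode can be sketched with a plain regex. This is a hypothetical re-implementation for illustration only; the `bucket_00000` / `bucket_00000_3` shapes are assumptions based on BUCKET_PREFIX, BUCKET_DIGITS, and the "_attemptId" suffix mentioned at getDeltaToAttemptIdMap:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only; the real parsing lives in
// AcidUtils.parseBucketId / parseAttemptId and works on Hadoop Paths.
public class BucketNameSketch {
    // bucket_<5 digits>, optionally followed by _<attemptId>
    static final Pattern P = Pattern.compile("bucket_(\\d{5})(?:_(\\d+))?");

    static int parseBucketId(String fileName) {
        Matcher m = P.matcher(fileName);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }

    static Integer parseAttemptId(String fileName) {
        Matcher m = P.matcher(fileName);
        return (m.matches() && m.group(2) != null)
            ? Integer.valueOf(m.group(2)) : null;
    }

    public static void main(String[] args) {
        System.out.println(parseBucketId("bucket_00000"));     // 0
        System.out.println(parseAttemptId("bucket_00001_3"));  // 3
    }
}
```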
Throws: IOException
public static boolean isAcid(org.apache.hadoop.fs.Path directory, org.apache.hadoop.conf.Configuration conf) throws IOException
  directory - the partition directory to check
  conf - the query configuration
Throws: IOException
public static boolean isAcid(org.apache.hadoop.fs.FileSystem fileSystem, org.apache.hadoop.fs.Path directory, org.apache.hadoop.conf.Configuration conf) throws IOException
Throws: IOException
public static AcidDirectory getAcidState(org.apache.hadoop.fs.FileSystem fileSystem, org.apache.hadoop.fs.Path candidateDirectory, org.apache.hadoop.conf.Configuration conf, ValidWriteIdList writeIdList, Ref<Boolean> useFileIds, boolean ignoreEmptyFiles) throws IOException
  fileSystem - optional; if it is not provided, it will be derived from the candidateDirectory
  candidateDirectory - the partition directory to analyze
  conf - the configuration
  writeIdList - the list of write ids that we are reading
  useFileIds - will be set to true if the FileSystem supports listing with fileIds
  ignoreEmptyFiles - ignore files with 0 length
Throws: IOException - on filesystem errors
public static AcidDirectory getAcidState(org.apache.hadoop.fs.FileSystem fileSystem, org.apache.hadoop.fs.Path candidateDirectory, org.apache.hadoop.conf.Configuration conf, ValidWriteIdList writeIdList, Ref<Boolean> useFileIds, boolean ignoreEmptyFiles, Map<org.apache.hadoop.fs.Path,AcidUtils.HdfsDirSnapshot> dirSnapshots) throws IOException
  fileSystem - optional; if it is not provided, it will be derived from the candidateDirectory
  candidateDirectory - the partition directory to analyze
  conf - the configuration
  writeIdList - the list of write ids that we are reading
  useFileIds - will be set to true if the FileSystem supports listing with fileIds
  ignoreEmptyFiles - ignore files with 0 length
  dirSnapshots - the listed directory snapshot; if null, a new one will be generated
Throws: IOException - on filesystem errors
public static Map<org.apache.hadoop.fs.Path,AcidUtils.HdfsDirSnapshot> getHdfsDirSnapshotsForCleaner(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path) throws IOException
  fs - the filesystem used for the directory lookup
  path - the path of the table or partition that needs to be cleaned
Throws: IOException - on filesystem errors
public static Map<org.apache.hadoop.fs.Path,AcidUtils.HdfsDirSnapshot> getHdfsDirSnapshots(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path) throws IOException
Throws: IOException
public static boolean isChildOfDelta(org.apache.hadoop.fs.Path childDir, org.apache.hadoop.fs.Path rootPath)
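Conceptually, getAcidState picks the newest usable base and the deltas above it. The following is a much-simplified, self-contained sketch of that selection, operating on plain directory names and a single high-watermark write id; the real method additionally consults the full ValidWriteIdList, visibility transaction ids, raw/original files, and the file system itself:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch: choose the best base_<writeId> at or below the
// high watermark, then keep delta_<min>_<max> dirs above that base.
// Directory names follow the assumed base_%07d / delta_%07d_%07d shapes.
public class AcidStateSketch {
    static String bestBase(List<String> dirs, long highWatermark) {
        String best = null;
        long bestId = -1;
        for (String d : dirs) {
            if (d.startsWith("base_")) {
                long id = Long.parseLong(d.substring("base_".length()));
                if (id <= highWatermark && id > bestId) {
                    bestId = id;
                    best = d;
                }
            }
        }
        return best;
    }

    static List<String> deltasAbove(List<String> dirs, long baseWriteId, long highWatermark) {
        List<String> result = new ArrayList<>();
        for (String d : dirs) {
            if (d.startsWith("delta_")) {
                String[] parts = d.split("_");
                long min = Long.parseLong(parts[1]);
                long max = Long.parseLong(parts[2]);
                // keep only deltas newer than the base and fully readable
                if (min > baseWriteId && max <= highWatermark) {
                    result.add(d);
                }
            }
        }
        return result;
    }
}
```

Older bases and the deltas they already absorb become obsolete; this is what the cleaner eventually removes.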
public static boolean isTablePropertyTransactional(Properties props)
public static boolean isTablePropertyTransactional(Map<String,String> parameters)
public static boolean isDeleteDelta(org.apache.hadoop.fs.Path p)
  p - not null
public static boolean isInsertDelta(org.apache.hadoop.fs.Path p)
public static boolean isTransactionalTable(CreateTableDesc table)
public static boolean isTransactionalTable(Table table)
public static boolean isTransactionalTable(org.apache.hadoop.hive.metastore.api.Table table)
public static boolean isTransactionalView(CreateMaterializedViewDesc view)
public static boolean isFullAcidTable(CreateTableDesc td)
public static boolean isFullAcidTable(Table table)
See also: TxnUtils.isAcidTable(org.apache.hadoop.hive.metastore.api.Table)
public static boolean isFullAcidTable(org.apache.hadoop.hive.metastore.api.Table table)
See also: TxnUtils.isAcidTable(org.apache.hadoop.hive.metastore.api.Table)
public static boolean isFullAcidScan(org.apache.hadoop.conf.Configuration conf)
public static boolean isInsertOnlyFetchBucketId(org.apache.hadoop.conf.Configuration conf)
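The isTransactionalTable / isFullAcidTable / isInsertOnlyTable family ultimately boils down to inspecting two table properties. A hedged sketch of that classification over a raw property map (the 'transactional' and 'transactional_properties'='insert_only' keys follow standard Hive conventions; the real methods additionally handle Table, CreateTableDesc, and Properties wrappers plus null safety):

```java
import java.util.Map;

// Sketch of the property checks behind isTransactionalTable,
// isFullAcidTable and isInsertOnlyTable, applied to a raw map.
public class TxnPropsSketch {
    static boolean isTransactional(Map<String, String> props) {
        return "true".equalsIgnoreCase(props.get("transactional"));
    }

    static boolean isInsertOnly(Map<String, String> props) {
        // insert-only (micromanaged / MM) transactional table
        return isTransactional(props)
            && "insert_only".equalsIgnoreCase(props.get("transactional_properties"));
    }

    static boolean isFullAcid(Map<String, String> props) {
        // full ACID: transactional, but not restricted to insert-only
        return isTransactional(props) && !isInsertOnly(props);
    }
}
```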
public static void setAcidOperationalProperties(org.apache.hadoop.conf.Configuration conf, boolean isTxnTable, AcidUtils.AcidOperationalProperties properties)
  conf - mutable configuration object
  properties - an acidOperationalProperties object to initialize from; if this is null, we assume this is a full transactional table
public static void setAcidOperationalProperties(Map<String,String> parameters, boolean isTxnTable, AcidUtils.AcidOperationalProperties properties)
  parameters - mutable map object
  properties - an acidOperationalProperties object to initialize from
public static AcidUtils.AcidOperationalProperties getAcidOperationalProperties(Table table)
  table - a table object
public static AcidUtils.AcidOperationalProperties getAcidOperationalProperties(org.apache.hadoop.conf.Configuration conf)
  conf - a configuration object
public static AcidUtils.AcidOperationalProperties getAcidOperationalProperties(Properties props)
  props - a properties object
public static AcidUtils.AcidOperationalProperties getAcidOperationalProperties(Map<String,String> parameters)
  parameters - a parameters object
public static long getLogicalLength(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus file) throws IOException
See comments at DELTA_SIDE_FILE_SUFFIX. Returns the logical end of file for an acid data file. This relies on the fact that if delta_x_y has no committed transactions it will be filtered out by getAcidState(FileSystem, Path, Configuration, ValidWriteIdList, Ref, boolean) and so won't be read at all.
  file - data file to read/compute splits on
Throws: IOException
public static boolean isInsertOnlyTable(Map<String,String> params)
  params - table properties
public static boolean isInsertOnlyTable(Table table)
public static boolean isInsertOnlyTable(Properties params)
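The side-file mechanism behind getLogicalLength can be sketched with plain local I/O: the side file records flushed lengths, and the last recorded value is the logical length. The `_flush_length` suffix and the sequence-of-longs layout are assumptions about the OrcAcidUtils side-file format; the real logic lives in OrcAcidUtils.getLastFlushLength and AcidUtils.getLogicalLength and works against a Hadoop FileSystem:

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

// Sketch only: readers should ignore anything past the returned offset,
// since it may belong to an uncommitted (or abandoned) write.
public class LogicalLengthSketch {
    static long logicalLength(File dataFile) throws IOException {
        File side = new File(dataFile.getPath() + "_flush_length");
        if (!side.exists()) {
            return dataFile.length(); // closed cleanly: logical == actual
        }
        long last = -1;
        try (DataInputStream in = new DataInputStream(new FileInputStream(side))) {
            while (in.available() >= 8) {
                last = in.readLong(); // last committed length wins
            }
        }
        return last;
    }
}
```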
public static Boolean isToInsertOnlyTable(Table tbl, Map<String,String> props)
The method for altering table props; may set the table to MM, non-MM, or not affect MM.
  tbl - object image before the alter table command (or null if not retrieved yet)
  props - prop values set in this alter table command
public static boolean canBeMadeAcid(String fullTableName, org.apache.hadoop.hive.metastore.api.StorageDescriptor sd)
public static ValidTxnWriteIdList getValidTxnWriteIdList(org.apache.hadoop.conf.Configuration conf)
public static ValidWriteIdList getTableValidWriteIdList(org.apache.hadoop.conf.Configuration conf, String fullTableName)
public static void setValidWriteIdList(org.apache.hadoop.conf.Configuration conf, ValidWriteIdList validWriteIds)
public static void setValidWriteIdList(org.apache.hadoop.conf.Configuration conf, TableScanDesc tsDesc)
public static AcidUtils.TableSnapshot getTableSnapshot(org.apache.hadoop.conf.Configuration conf, Table tbl) throws LockException
Throws: LockException
public static AcidUtils.TableSnapshot getTableSnapshot(org.apache.hadoop.conf.Configuration conf, Table tbl, boolean isStatsUpdater) throws LockException
Throws: LockException
public static AcidUtils.TableSnapshot getTableSnapshot(org.apache.hadoop.conf.Configuration conf, Table tbl, String dbName, String tblName, boolean isStatsUpdater) throws LockException, AssertionError
Throws: LockException, AssertionError
public static ValidWriteIdList getTableValidWriteIdListWithTxnList(org.apache.hadoop.conf.Configuration conf, String dbName, String tableName) throws LockException
  conf - Configuration
Throws: LockException
public static List<org.apache.hadoop.fs.FileStatus> getAcidFilesForStats(Table table, org.apache.hadoop.fs.Path dir, org.apache.hadoop.conf.Configuration jc, org.apache.hadoop.fs.FileSystem fs) throws IOException
Throws: IOException
public static List<org.apache.hadoop.fs.Path> getValidDataPaths(org.apache.hadoop.fs.Path dataPath, org.apache.hadoop.conf.Configuration conf, String validWriteIdStr) throws IOException
Throws: IOException
public static String getAcidSubDir(org.apache.hadoop.fs.Path dataPath)
public static String getFirstLevelAcidDirPath(org.apache.hadoop.fs.Path dataPath, org.apache.hadoop.fs.FileSystem fileSystem) throws IOException
Throws: IOException
public static boolean isAcidEnabled(HiveConf hiveConf)
public static Long extractWriteId(org.apache.hadoop.fs.Path file)
public static List<org.apache.hadoop.hive.metastore.api.LockComponent> makeLockComponents(Set<WriteEntity> outputs, Set<ReadEntity> inputs, Context.Operation operation, HiveConf conf)
  outputs - write entities
  inputs - read entities
public static boolean isExclusiveCTASEnabled(org.apache.hadoop.conf.Configuration conf)
public static boolean isExclusiveCTAS(Set<WriteEntity> outputs, HiveConf conf)
public static void validateAcidFiles(Table table, org.apache.hadoop.fs.FileStatus[] srcs, org.apache.hadoop.fs.FileSystem fs) throws SemanticException
Throws: SemanticException
public static void validateAcidPartitionLocation(String location, org.apache.hadoop.conf.Configuration conf) throws SemanticException
Throws: SemanticException
public static org.apache.hadoop.hive.metastore.api.TxnType getTxnType(org.apache.hadoop.conf.Configuration conf, ASTNode tree)
  tree - AST
public static String getPathSuffix(long txnId)
public static void initDirCache(int durationInMts)
public static AcidDirectory getAcidStateFromCache(Supplier<org.apache.hadoop.fs.FileSystem> fileSystem, org.apache.hadoop.fs.Path candidateDirectory, org.apache.hadoop.conf.Configuration conf, ValidWriteIdList writeIdList, Ref<Boolean> useFileIds, boolean ignoreEmptyFiles) throws IOException
  fileSystem - file system supplier
  candidateDirectory - the partition directory to analyze
  conf - the configuration
  writeIdList - the list of write ids that we are reading
  useFileIds - will be set to true if the FileSystem supports listing with fileIds
  ignoreEmptyFiles - ignore files with 0 length
Throws: IOException - on errors
public static void tryInvalidateDirCache(org.apache.hadoop.hive.metastore.api.Table table)
public static boolean isNonNativeAcidTable(Table table)
public static List<VirtualColumn> getAcidVirtualColumns(Table table)
  table - the table for which we run the query
See also: HiveStorageHandler.acidVirtualColumns()
public static boolean acidTableWithoutTransactions(Table table)
public static String getPartitionName(Map<String,String> partitionSpec) throws SemanticException
Throws: SemanticException
public static org.apache.hadoop.hive.metastore.api.CompactionType compactionTypeStr2ThriftType(String inputValue) throws SemanticException
Throws: SemanticException
public static CompactionState compactionStateStr2Enum(String inputValue) throws SemanticException
Throws: SemanticException

Copyright © 2022 The Apache Software Foundation. All rights reserved.