public final class Paths extends Object
| Modifier and Type | Method and Description |
|---|---|
| static String | addUUID(String pathStr, String uuid): Insert the UUID to a path if it is not there already. |
| static void | clearTempFolderInfo(org.apache.hadoop.mapreduce.TaskAttemptID attemptID): Remove all information held about task attempts. |
| static org.apache.hadoop.fs.Path | getLocalTaskAttemptTempDir(org.apache.hadoop.conf.Configuration conf, String uuid, org.apache.hadoop.mapreduce.TaskAttemptID attemptID): Get the task attempt temporary directory in the local filesystem. |
| static org.apache.hadoop.fs.Path | getMultipartUploadCommitsDirectory(org.apache.hadoop.conf.Configuration conf, String uuid): Build a qualified temporary path for the multipart upload commit information in the cluster filesystem. |
| static String | getParent(String pathStr): Get the parent path of a string path: everything up to but excluding the last "/" in the path. |
| protected static String | getPartition(String relative): Returns the partition of a relative file path, or null if the path is a file name with no relative directory. |
| static Set<String> | getPartitions(org.apache.hadoop.fs.Path attemptPath, List<? extends org.apache.hadoop.fs.FileStatus> taskOutput): Get the set of partitions from the list of files being staged. |
| static String | getRelativePath(org.apache.hadoop.fs.Path basePath, org.apache.hadoop.fs.Path fullPath): Using URI#relativize(), build the relative path from the base path to the full path. |
| static org.apache.hadoop.fs.Path | getStagingUploadsParentDirectory(org.apache.hadoop.conf.Configuration conf, String uuid): Build a qualified parent path for the temporary multipart upload commit directory built by getMultipartUploadCommitsDirectory(Configuration, String). |
| static org.apache.hadoop.fs.Path | path(org.apache.hadoop.fs.Path parent, String... child): Varargs constructor of paths. |
| static void | resetTempFolderCache(): Reset the temp folder cache; useful in tests. |
| static org.apache.hadoop.fs.Path | tempDirForStaging(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf): Try to come up with a good temp directory for different filesystems. |

public static String addUUID(String pathStr, String uuid)
Insert the UUID to a path if it is not there already. Examples:
/example/part-0000 ==> /example/part-0000-0ab34
/example/part-0001.gz.csv ==> /example/part-0001-0ab34.gz.csv
/example/part-0002-0abc3.gz.csv ==> /example/part-0002-0abc3.gz.csv
/example0abc3/part-0002.gz.csv ==> /example0abc3/part-0002.gz.csv
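The mappings above can be reproduced with a few lines of string handling. The following is a minimal sketch that matches those four examples, not the Hadoop implementation; the class name AddUuidSketch is hypothetical, and the choice to insert before the first "." of the final path element is an assumption inferred from the part-0001.gz.csv example.

```java
final class AddUuidSketch {

  /** Insert uuid into pathStr unless the path already contains it. */
  static String addUUID(String pathStr, String uuid) {
    if (uuid == null || uuid.isEmpty()) {
      throw new IllegalArgumentException("empty uuid");
    }
    // UUID already present anywhere in the path (directory or file name):
    // return the path unchanged.
    if (pathStr.contains(uuid)) {
      return pathStr;
    }
    // Assumption: the UUID goes before the first '.' of the final path
    // element, so a multi-part extension such as ".gz.csv" stays intact.
    int lastSlash = pathStr.lastIndexOf('/');
    int dot = pathStr.indexOf('.', lastSlash + 1);
    if (dot >= 0) {
      return pathStr.substring(0, dot) + "-" + uuid + pathStr.substring(dot);
    }
    // No extension: append the UUID to the end of the name.
    return pathStr + "-" + uuid;
  }

  public static void main(String[] args) {
    System.out.println(addUUID("/example/part-0000", "0ab34"));
    System.out.println(addUUID("/example/part-0001.gz.csv", "0ab34"));
  }
}
```

Checking for the UUID anywhere in the path, rather than only in the file name, is what leaves /example0abc3/part-0002.gz.csv untouched in the last example.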
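Parameters:
pathStr - path as a string; must not have a trailing "/".
uuid - UUID to append; must not be empty

public static String getParent(String pathStr)
Get the parent path of a string path: everything up to but excluding the last "/" in the path.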
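As an illustration of "everything up to but excluding the last "/"", here is a minimal sketch of getParent on plain strings. The class name GetParentSketch is hypothetical, and returning null when the string contains no "/" is an assumption; this page does not state what happens in that case.

```java
final class GetParentSketch {

  /** Everything up to, but excluding, the last "/" in the path. */
  static String getParent(String pathStr) {
    int lastSlash = pathStr.lastIndexOf('/');
    // Assumption: null signals "no parent" when there is no "/" at all.
    return lastSlash >= 0 ? pathStr.substring(0, lastSlash) : null;
  }

  public static void main(String[] args) {
    System.out.println(getParent("/example/part-0000"));
    System.out.println(getParent("part-0000"));
  }
}
```

Under this reading, a file directly under the root, such as "/part-0000", has the empty string as its parent.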
Parameters:
pathStr - path as a string

public static String getRelativePath(org.apache.hadoop.fs.Path basePath, org.apache.hadoop.fs.Path fullPath)
Using URI#relativize(), build the relative path from the base path to the full path.
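The relativize step can be tried with java.net.URI alone. This is a sketch of the technique, not the Hadoop method: it takes URIs directly instead of org.apache.hadoop.fs.Path, and the class name RelativePathSketch, the method name relativePath and the s3a://bucket URIs are made up for illustration.

```java
import java.net.URI;

final class RelativePathSketch {

  /** Path of full relative to base, computed with URI.relativize(). */
  static String relativePath(URI base, URI full) {
    return base.relativize(full).getPath();
  }

  public static void main(String[] args) {
    URI base = URI.create("s3a://bucket/output");
    URI full = URI.create("s3a://bucket/output/year=2024/part-0000");
    System.out.println(relativePath(base, full));
  }
}
```

When the second URI is not under the first, java.net.URI.relativize() returns the second URI unchanged, so this sketch would hand back its full path rather than a relative one.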
If fullPath is not a child of basePath the outcome is undefined.
Parameters:
basePath - base path
fullPath - full path under the base path.

public static org.apache.hadoop.fs.Path path(org.apache.hadoop.fs.Path parent,
String... child)
Varargs constructor of paths.
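A varargs path builder of this shape can be sketched as follows. To stay runnable without Hadoop on the classpath, the sketch substitutes java.nio.file.Path for org.apache.hadoop.fs.Path; the class name PathJoinSketch and the /staging example are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class PathJoinSketch {

  /** Append each non-empty child element to parent, in order. */
  static Path path(Path parent, String... child) {
    Path p = parent;
    for (String c : child) {
      // "" elements are skipped, as the child parameter is documented.
      if (!c.isEmpty()) {
        p = p.resolve(c);
      }
    }
    return p;
  }

  public static void main(String[] args) {
    System.out.println(path(Paths.get("/staging"), "job", "", "attempt"));
  }
}
```

Calling the method with no child elements returns the parent itself.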
Parameters:
parent - parent path
child - child entries. "" elements are skipped.

public static org.apache.hadoop.fs.Path getLocalTaskAttemptTempDir(org.apache.hadoop.conf.Configuration conf,
String uuid,
org.apache.hadoop.mapreduce.TaskAttemptID attemptID)
throws IOException
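Get the task attempt temporary directory in the local filesystem.
Parameters:
conf - configuration
uuid - some UUID, such as a job UUID
attemptID - attempt ID
Throws:
IOException - IO problem.

public static void clearTempFolderInfo(org.apache.hadoop.mapreduce.TaskAttemptID attemptID)
Remove all information held about task attempts.
Parameters:
attemptID - attempt ID.

@VisibleForTesting
public static void resetTempFolderCache()
Reset the temp folder cache; useful in tests.
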
public static org.apache.hadoop.fs.Path tempDirForStaging(org.apache.hadoop.fs.FileSystem fs,
org.apache.hadoop.conf.Configuration conf)
Try to come up with a good temp directory for different filesystems.
Parameters:
fs - filesystem
conf - configuration

public static org.apache.hadoop.fs.Path getStagingUploadsParentDirectory(org.apache.hadoop.conf.Configuration conf,
String uuid)
throws IOException
Build a qualified parent path for the temporary multipart upload commit directory built by getMultipartUploadCommitsDirectory(Configuration, String).
Parameters:
conf - configuration defining default FS.
uuid - uuid of job
Throws:
IOException - on an IO failure.

public static org.apache.hadoop.fs.Path getMultipartUploadCommitsDirectory(org.apache.hadoop.conf.Configuration conf,
String uuid)
throws IOException
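Build a qualified temporary path for the multipart upload commit information in the cluster filesystem. See getMultipartUploadCommitsDirectory(FileSystem, Configuration, String).
Parameters:
conf - configuration defining default FS.
uuid - uuid of job
Throws:
IOException - on an IO failure.

protected static String getPartition(String relative)
Returns the partition of a relative file path, or null if the path is a file name with no relative directory.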
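The sketch below shows one way to read "partition of a relative file path": the directory part of the path, with null for a bare file name. It also collects the partitions of several relative paths into a set, loosely mirroring what getPartitions is described as doing. Everything here is an assumption for illustration: the class name PartitionSketch, the helper partitionsOf, and the ROOT_PLACEHOLDER value, which merely stands in for StagingCommitterConstants.TABLE_ROOT (its real value is not shown on this page).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class PartitionSketch {

  /** Hypothetical marker for files that sit directly in the root. */
  static final String ROOT_PLACEHOLDER = "<root>";

  /** Directory part of a relative path, or null for a bare file name. */
  static String getPartition(String relative) {
    int lastSlash = relative.lastIndexOf('/');
    return lastSlash >= 0 ? relative.substring(0, lastSlash) : null;
  }

  /** Distinct partitions of the given relative paths, in encounter order. */
  static Set<String> partitionsOf(List<String> relativePaths) {
    Set<String> partitions = new LinkedHashSet<>();
    for (String relative : relativePaths) {
      String partition = getPartition(relative);
      // Files with no directory are grouped under the placeholder.
      partitions.add(partition != null ? partition : ROOT_PLACEHOLDER);
    }
    return partitions;
  }

  public static void main(String[] args) {
    System.out.println(partitionsOf(Arrays.asList(
        "year=2024/month=01/part-0000",
        "year=2024/month=01/part-0001",
        "part-0002")));
  }
}
```

A LinkedHashSet is used so that duplicates collapse while the order in which partitions were first seen is kept.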
Parameters:
relative - a relative file path

public static Set<String> getPartitions(org.apache.hadoop.fs.Path attemptPath, List<? extends org.apache.hadoop.fs.FileStatus> taskOutput) throws IOException
Get the set of partitions from the list of files being staged. If a file is in the root directory, its partition is declared to be StagingCommitterConstants.TABLE_ROOT.
Parameters:
attemptPath - path for the attempt
taskOutput - list of output files.
Throws:
IOException - IO failure

Copyright © 2008–2024 Apache Software Foundation. All rights reserved.