public class DirMarkerTracker extends Object
Designed to be used while scanning through the results of listObject calls, where we assume that the results come in alphanumeric sort order, with parent entries before children.
This lets us identify all leaf markers as those markers which were added to the set of leaf markers and not subsequently removed as child entries were discovered.
To avoid scanning data structures excessively, the path of the parent directory of the last file added is cached. This allows for a quick bailout when many children of the same directory are returned in a listing.
Consult the directory_markers document for details on this feature, including terminology.
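The scanning approach described above can be illustrated with a small, self-contained sketch. This is not the Hadoop implementation: the class name `LeafMarkerSketch`, the helper `parentOf`, and the use of plain strings in place of `Path` and `DirMarkerTracker.Marker` are all assumptions made for illustration. It only shows the idea of adding each marker as a candidate leaf, demoting ancestors to surplus when a child is discovered, and skipping the ancestor walk when the parent directory is the one checked last.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for the tracker; paths are plain strings such as "/a/b". */
class LeafMarkerSketch {
  private final Map<String, String> leafMarkers = new TreeMap<>();
  private final Map<String, String> surplusMarkers = new TreeMap<>();
  /** Cached parent directory of the last entry processed. */
  private String lastDirChecked;

  /** Parent of a path, or null when the parent would be the root. */
  static String parentOf(String path) {
    int slash = path.lastIndexOf('/');
    return slash <= 0 ? null : path.substring(0, slash);
  }

  /** A marker is a candidate leaf until a child entry is seen beneath it. */
  List<String> markerFound(String path, String key) {
    leafMarkers.put(path, key);
    return pathFound(path);
  }

  /** A file can never be a leaf marker, but it demotes markers above it. */
  List<String> fileFound(String path) {
    return pathFound(path);
  }

  private List<String> pathFound(String path) {
    List<String> removed = new ArrayList<>();
    String parent = parentOf(path);
    // Quick bailout: siblings of the previous entry need no further work.
    if (parent == null || parent.equals(lastDirChecked)) {
      return removed;
    }
    // Every ancestor with a marker now has a child, so it is not a leaf.
    for (String dir = parent; dir != null; dir = parentOf(dir)) {
      String key = leafMarkers.remove(dir);
      if (key != null) {
        surplusMarkers.put(dir, key);
        removed.add(dir);
      }
    }
    lastDirChecked = parent;
    return removed;
  }

  Map<String, String> getLeafMarkers() {
    return leafMarkers;
  }

  Map<String, String> getSurplusMarkers() {
    return surplusMarkers;
  }

  public static void main(String[] args) {
    LeafMarkerSketch tracker = new LeafMarkerSketch();
    tracker.markerFound("/a", "a/");
    tracker.markerFound("/a/b", "a/b/");
    tracker.fileFound("/a/b/file1");
    tracker.fileFound("/a/b/file2"); // same parent as before: bailout path
    tracker.markerFound("/c", "c/");
    System.out.println("leaf: " + tracker.getLeafMarkers().keySet());
    System.out.println("surplus: " + tracker.getSurplusMarkers().keySet());
  }
}
```

In this sketch, the listing in `main` leaves `/c` as the only leaf marker, while `/a` and `/a/b` end up in the surplus map because entries were later found beneath them.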
| Modifier and Type | Class and Description |
|---|---|
| static class | DirMarkerTracker.Marker: This is a marker entry stored in the map and returned as markers are deleted. |
| Constructor and Description |
|---|
| DirMarkerTracker(org.apache.hadoop.fs.Path basePath, boolean recordSurplusMarkers): Construct. |
| Modifier and Type | Method and Description |
|---|---|
| List<DirMarkerTracker.Marker> | fileFound(org.apache.hadoop.fs.Path path, String key, S3ALocatedFileStatus source): A file has been found. |
| org.apache.hadoop.fs.Path | getBasePath(): Get the base path of the tracker. |
| int | getFilesFound() |
| org.apache.hadoop.fs.Path | getLastDirChecked() |
| Map<org.apache.hadoop.fs.Path,DirMarkerTracker.Marker> | getLeafMarkers(): Get the map of leaf markers. |
| int | getMarkersFound() |
| int | getObjectsFound(): How many objects were found. |
| int | getScanCount() |
| Map<org.apache.hadoop.fs.Path,DirMarkerTracker.Marker> | getSurplusMarkers(): Get the map of surplus markers. |
| List<DirMarkerTracker.Marker> | markerFound(org.apache.hadoop.fs.Path path, String key, S3ALocatedFileStatus source): A marker has been found; this may or may not be a leaf. |
| List<org.apache.hadoop.fs.Path> | removeAllowedMarkers(DirectoryPolicy policy): Scan the surplus marker list and remove from it all where the directory policy says "keep". |
| String | toString() |
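The removeAllowedMarkers entry in the method summary describes filtering the surplus markers by a directory policy. The following is a minimal sketch of that kind of filter, under stated assumptions: the class `SurplusFilterSketch` is hypothetical, a `Predicate<String>` stands in for `DirectoryPolicy`, string paths stand in for `Path`, and the returned list is assumed to hold the paths that were removed from the surplus map. It is not taken from the Hadoop source.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

/** Hypothetical illustration of filtering surplus markers by a "keep" policy. */
class SurplusFilterSketch {

  /**
   * Remove every entry whose path the policy says to keep.
   * Assumption: the result lists the paths taken out of the map.
   */
  static List<String> removeAllowedMarkers(Map<String, String> surplus,
      Predicate<String> keepPolicy) {
    List<String> removed = new ArrayList<>();
    Iterator<Map.Entry<String, String>> entries = surplus.entrySet().iterator();
    while (entries.hasNext()) {
      String path = entries.next().getKey();
      if (keepPolicy.test(path)) {
        // Iterator.remove avoids a ConcurrentModificationException.
        entries.remove();
        removed.add(path);
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    Map<String, String> surplus = new TreeMap<>();
    surplus.put("/auth/x", "auth/x/");
    surplus.put("/data/y", "data/y/");
    // Stand-in policy: markers under /auth/ are allowed to remain in the store.
    List<String> allowed = removeAllowedMarkers(surplus, p -> p.startsWith("/auth/"));
    System.out.println("allowed: " + allowed);
    System.out.println("still surplus: " + surplus.keySet());
  }
}
```

After the call, only the markers that the policy does not permit remain in the map, which is the set an audit or cleanup tool would then report or delete.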
public DirMarkerTracker(org.apache.hadoop.fs.Path basePath,
boolean recordSurplusMarkers)
The base path is currently only used for information rather than validating paths supplied in other methods.
basePath - base path of the tracker
recordSurplusMarkers - whether to save surplus markers to a map
public org.apache.hadoop.fs.Path getBasePath()
public List<DirMarkerTracker.Marker> markerFound(org.apache.hadoop.fs.Path path, String key, S3ALocatedFileStatus source)
Trigger a move of all markers above it into the surplus map.
path - marker path
key - object key
source - listing source
public List<DirMarkerTracker.Marker> fileFound(org.apache.hadoop.fs.Path path, String key, S3ALocatedFileStatus source)
path - path of the file
key - object key
source - listing source
public Map<org.apache.hadoop.fs.Path,DirMarkerTracker.Marker> getLeafMarkers()
public Map<org.apache.hadoop.fs.Path,DirMarkerTracker.Marker> getSurplusMarkers()
Empty if they were not being recorded.
public org.apache.hadoop.fs.Path getLastDirChecked()
public int getObjectsFound()
public int getScanCount()
public int getFilesFound()
public int getMarkersFound()
public List<org.apache.hadoop.fs.Path> removeAllowedMarkers(DirectoryPolicy policy)
policy - policy to use when auditing markers for inclusion/exclusion.
Copyright © 2008–2024 Apache Software Foundation. All rights reserved.