public class LogReplay
extends Object
- The most recent AddFile and accompanying metadata for any `(path, dv id)` tuple wins.
- RemoveFile deletes a corresponding AddFile. A RemoveFile "corresponds" to the AddFile that matches both the parquet file URI *and* the deletion vector's URI (if any).
- The most recent Metadata wins.
- The most recent Protocol version wins.
- For each `(path, dv id)` tuple, this class should always output only one FileAction (either AddFile or RemoveFile).

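
The file-action rules above can be illustrated with a minimal, self-contained sketch. It is not this class's implementation: `LogReplaySketch`, `FileAction`, and `activeAddFiles` are hypothetical names introduced only for illustration, and the sketch assumes the actions are already available in memory, ordered oldest to newest. It walks the actions newest first, so the first action seen for a `(path, dv id)` tuple is the most recent one, and keeps that action only if it is an add.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LogReplaySketch {

    /** Hypothetical stand-in for an AddFile or RemoveFile action; not a Delta Kernel type. */
    static final class FileAction {
        final boolean isAdd; // true for an AddFile, false for a RemoveFile
        final String path;   // data file URI
        final String dvId;   // deletion vector id, or null when the file has none

        FileAction(boolean isAdd, String path, String dvId) {
            this.isAdd = isAdd;
            this.path = path;
            this.dvId = dvId;
        }

        /** The (path, dv id) tuple used to match actions against each other. */
        List<String> key() {
            return Arrays.asList(path, dvId); // Arrays.asList tolerates a null dvId
        }
    }

    /** Returns the active AddFiles for a list of actions ordered oldest to newest. */
    static List<FileAction> activeAddFiles(List<FileAction> oldestFirst) {
        Set<List<String>> seenKeys = new HashSet<>();
        List<FileAction> active = new ArrayList<>();
        // Walk newest to oldest: the first action seen for a key is the most recent one.
        for (int i = oldestFirst.size() - 1; i >= 0; i--) {
            FileAction action = oldestFirst.get(i);
            boolean isMostRecentForKey = seenKeys.add(action.key());
            // Keep the action only if it decides the key and is an add, not a tombstone.
            if (isMostRecentForKey && action.isAdd) {
                active.add(action);
            }
        }
        return active;
    }

    public static void main(String[] args) {
        List<FileAction> log = Arrays.asList(
            new FileAction(true, "a.parquet", null),   // add a.parquet without a DV
            new FileAction(true, "b.parquet", null),   // add b.parquet without a DV
            new FileAction(false, "a.parquet", null),  // tombstone (a.parquet, no DV)
            new FileAction(true, "a.parquet", "dv1")); // re-add a.parquet with a DV
        for (FileAction add : activeAddFiles(log)) {
            System.out.println(add.path + " dv=" + add.dvId);
        }
    }
}
```

In this example the tombstone for `(a.parquet, no dv)` cancels only the first add. The later add of `a.parquet` with deletion vector `dv1` has a different key, so it stays active alongside `b.parquet`.
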
This class exposes the following public APIs:

- getProtocol(): latest non-null Protocol
- getMetadata(): latest non-null Metadata
- getAddFilesAsColumnarBatches(io.delta.kernel.engine.Engine, boolean, java.util.Optional<io.delta.kernel.expressions.Predicate>): returns all active (not tombstoned) AddFiles as ColumnarBatches

| Modifier and Type | Field and Description |
|---|---|
| static int | ADD_FILE_DV_ORDINAL |
| static int | ADD_FILE_ORDINAL |
| static int | ADD_FILE_PATH_ORDINAL |
| static String | ADDFILE_FIELD_NAME |
| static StructType | DOMAIN_METADATA_READ_SCHEMA: Read schema when searching for just the domain metadata. |
| static StructType | PROTOCOL_METADATA_READ_SCHEMA: Read schema when searching for the latest Protocol and Metadata. |
| static int | REMOVE_FILE_DV_ORDINAL |
| static int | REMOVE_FILE_ORDINAL |
| static int | REMOVE_FILE_PATH_ORDINAL |
| static String | REMOVEFILE_FIELD_NAME |
| static StructType | SET_TRANSACTION_READ_SCHEMA: Read schema when searching for just the transaction identifiers. |
| static String | SIDECAR_FIELD_NAME |

| Constructor and Description |
|---|
| LogReplay(Path logPath, Path dataPath, long snapshotVersion, Engine engine, LogSegment logSegment, java.util.Optional<SnapshotHint> snapshotHint) |

| Modifier and Type | Method and Description |
|---|---|
| static boolean | containsAddOrRemoveFileActions(StructType schema) |
| CloseableIterator<FilteredColumnarBatch> | getAddFilesAsColumnarBatches(Engine engine, boolean shouldReadStats, java.util.Optional<Predicate> checkpointPredicate): Returns an iterator of FilteredColumnarBatch representing all the active AddFiles in the table. |
| static StructType | getAddRemoveReadSchema(boolean shouldReadStats): Read schema when searching for all the active AddFiles. |
| java.util.Map<String,DomainMetadata> | getDomainMetadataMap() |
| java.util.Optional<Long> | getLatestTransactionIdentifier(Engine engine, String applicationId) |
| Metadata | getMetadata() |
| Protocol | getProtocol() |
| static StructType | withSidecarFileSchema(StructType schema) |

public static final StructType PROTOCOL_METADATA_READ_SCHEMA
public static final StructType SET_TRANSACTION_READ_SCHEMA
public static final StructType DOMAIN_METADATA_READ_SCHEMA
public static String SIDECAR_FIELD_NAME
public static String ADDFILE_FIELD_NAME
public static String REMOVEFILE_FIELD_NAME
public static int ADD_FILE_ORDINAL
public static int ADD_FILE_PATH_ORDINAL
public static int ADD_FILE_DV_ORDINAL
public static int REMOVE_FILE_ORDINAL
public static int REMOVE_FILE_PATH_ORDINAL
public static int REMOVE_FILE_DV_ORDINAL
public LogReplay(Path logPath, Path dataPath, long snapshotVersion, Engine engine, LogSegment logSegment, java.util.Optional<SnapshotHint> snapshotHint)
public static StructType withSidecarFileSchema(StructType schema)
public static boolean containsAddOrRemoveFileActions(StructType schema)
public static StructType getAddRemoveReadSchema(boolean shouldReadStats)
public Protocol getProtocol()
public Metadata getMetadata()
public java.util.Optional<Long> getLatestTransactionIdentifier(Engine engine, String applicationId)
public java.util.Map<String,DomainMetadata> getDomainMetadataMap()
public CloseableIterator<FilteredColumnarBatch> getAddFilesAsColumnarBatches(Engine engine, boolean shouldReadStats, java.util.Optional<Predicate> checkpointPredicate)
Returns an iterator of FilteredColumnarBatch representing all the active AddFiles in the table.
Statistics are conditionally read for the AddFiles based on shouldReadStats. The returned batches have schema:

- add, type: AddFile.SCHEMA_WITH_STATS if shouldReadStats=true, otherwise AddFile.SCHEMA_WITHOUT_STATS