Package io.delta.kernel
Interface Transaction
- All Known Implementing Classes:
TransactionImpl
Represents a transaction to mutate a Delta table.
- Since:
- 3.2.0
-
Method Summary
Modifier and TypeMethodDescriptioncommit(Engine engine, CloseableIterable<Row> dataActions) Commit the transaction including the data action rows generated bygenerateAppendActions(io.delta.kernel.engine.Engine, io.delta.kernel.data.Row, io.delta.kernel.utils.CloseableIterator<io.delta.kernel.utils.DataFileStatus>, io.delta.kernel.DataWriteContext).static CloseableIterator<Row>generateAppendActions(Engine engine, Row transactionState, CloseableIterator<DataFileStatus> fileStatusIter, DataWriteContext dataWriteContext) For given data files, generate Delta actions that can be committed in a transaction.getPartitionColumns(Engine engine) Get the list of logical names of the partition columns.Get the schema of the table.getTransactionState(Engine engine) Get the state of the transaction.static DataWriteContextGet the context for writing data into a table.transformLogicalData(Engine engine, Row transactionState, CloseableIterator<FilteredColumnarBatch> dataIter, Map<String, Literal> partitionValues) Given the logical data that needs to be written to the table, convert it into the required physical data depending upon the table Delta protocol and features enabled on the table.
-
Method Details
-
getSchema
Get the schema of the table. If the connector is adding any data to the table through this transaction, it should have the same schema as the table schema. -
getPartitionColumns
Get the list of logical names of the partition columns. This helps the connector to do physical partitioning of the data before asking the Kernel to stage the data per partition. -
getTransactionState
Get the state of the transaction. The state helps Kernel do the transformations to logical data according to the Delta protocol and table features enabled on the table. The engine should use this at the data writer task to transform the logical data that the engine wants to write to the table in to physical data that goes in data files usingtransformLogicalData(Engine, Row, CloseableIterator, Map) -
commit
TransactionCommitResult commit(Engine engine, CloseableIterable<Row> dataActions) throws ConcurrentWriteException Commit the transaction including the data action rows generated bygenerateAppendActions(io.delta.kernel.engine.Engine, io.delta.kernel.data.Row, io.delta.kernel.utils.CloseableIterator<io.delta.kernel.utils.DataFileStatus>, io.delta.kernel.DataWriteContext).- Parameters:
engine-Engineinstance.dataActions- Iterable of data actions to commit. These data actions are generated by thegenerateAppendActions(Engine, Row, CloseableIterator, DataWriteContext). TheCloseableIterableallows the Kernel to access the list of actions multiple times (in case of retries to resolve the conflicts due to other writers to the table). Kernel provides a in-memory based implementation ofCloseableIterablewith utility APICloseableIterable.inMemoryIterable(CloseableIterator)- Returns:
TransactionCommitResultstatus of the successful transaction.- Throws:
ConcurrentWriteException- when the transaction has encountered a non-retryable conflicts or exceeded the maximum number of retries reached. The connector needs to rerun the query on top of the latest table state and retry the transaction.
-
transformLogicalData
static CloseableIterator<FilteredColumnarBatch> transformLogicalData(Engine engine, Row transactionState, CloseableIterator<FilteredColumnarBatch> dataIter, Map<String, Literal> partitionValues) Given the logical data that needs to be written to the table, convert it into the required physical data depending upon the table Delta protocol and features enabled on the table. Kernel takes care of adding any additional column or removing existing columns that doesn't need to be in physical data files. All these transformations are driven by the Delta protocol and table features enabled on the table.The given data should belong to exactly one partition. It is the job of the connector to do partitioning of the data before calling the API. Partition values are provided as map of column name to partition value (as
Literal). If the table is an un-partitioned table, then map should be empty.- Parameters:
engine-Engineinstance to use.transactionState- The transaction statedataIter- Iterator of logical data (with schema same as the table schema) to transform to physical data. All the data n this iterator should belong to one physical partition and it should also include the partition data.partitionValues- The partition values for the data. If the table is un-partitioned, the map should be empty- Returns:
- Iterator of physical data to write to the data files.
-
getWriteContext
static DataWriteContext getWriteContext(Engine engine, Row transactionState, Map<String, Literal> partitionValues) Get the context for writing data into a table. The context tells the connector where the data should be written. For partitioned table context is generated per partition. So, the connector should call this API for each partition. For un-partitioned table, the context is same for all the data.- Parameters:
engine-Engineinstance to use.transactionState- The transaction statepartitionValues- The partition values for the data. If the table is un-partitioned, the map should be empty- Returns:
DataWriteContextcontaining metadata about where and how the data for partition should be written.
-
generateAppendActions
static CloseableIterator<Row> generateAppendActions(Engine engine, Row transactionState, CloseableIterator<DataFileStatus> fileStatusIter, DataWriteContext dataWriteContext) For given data files, generate Delta actions that can be committed in a transaction. These data files are the result of writing the data returned bytransformLogicalData(io.delta.kernel.engine.Engine, io.delta.kernel.data.Row, io.delta.kernel.utils.CloseableIterator<io.delta.kernel.data.FilteredColumnarBatch>, java.util.Map<java.lang.String, io.delta.kernel.expressions.Literal>)with the context returned bygetWriteContext(io.delta.kernel.engine.Engine, io.delta.kernel.data.Row, java.util.Map<java.lang.String, io.delta.kernel.expressions.Literal>).- Parameters:
engine-Engineinstance.transactionState- State of the transaction.fileStatusIter- Iterator of row objects representing each data file written.dataWriteContext- The context used when writing the data files given infileStatusIter- Returns:
CloseableIteratorofRowrepresenting the actions to commit usingcommit(io.delta.kernel.engine.Engine, io.delta.kernel.utils.CloseableIterable<io.delta.kernel.data.Row>).
-