package metering
Type Members
- trait DeltaLogging extends DeltaProgressReporter with DatabricksLogging
Convenience wrappers for logging that include Delta-specific options and avoid the need to predeclare all operations. Metrics in Delta should respect the following conventions:
- Tags should identify the context of the event (which shard, user, table, machine, etc).
- All actions initiated by a user should be wrapped in a recordOperation so we can track usage, latency, and failures. If there is a significant subaction (one taking more than a few seconds), such as identifying candidate files, consider a nested recordOperation.
- Events should be used to return detailed statistics about usage. Generally these should be defined with a case class to ease analysis later.
- Events can also be used to record that a particular codepath was hit (e.g. a checkpoint failure, a conflict, or a specific optimization).
- Both events and operations should be named hierarchically to allow for analysis at different levels. For example, to look at the latency of all DDL operations we could scan for operations that match "delta.ddl.%".
Underneath, these functions use the standard usage log reporting defined in com.databricks.spark.util.DatabricksLogging.
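The conventions above can be illustrated with a minimal, self-contained sketch. It does not use the real DeltaLogging or DatabricksLogging API: the `MeteringSketch` object, its `recordOperation` and `recordEvent` helpers, the `OpRecord` and `ConflictEvent` classes, and all operation names are hypothetical stand-ins that keep records in memory so the hierarchical naming and nesting are easy to see.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the logging helpers; NOT the Delta API.
object MeteringSketch {
  // One finished operation: name, context tags, latency, and outcome.
  final case class OpRecord(
      opType: String,
      tags: Map[String, String],
      durationMs: Long,
      failed: Boolean)

  // Event payloads are case classes so they are easy to analyze later.
  final case class ConflictEvent(tableId: String, attemptedVersion: Long)

  val operations: ArrayBuffer[OpRecord] = ArrayBuffer.empty
  val events: ArrayBuffer[(String, Any)] = ArrayBuffer.empty

  // Wraps a block, recording its latency and whether it threw.
  def recordOperation[A](opType: String, tags: Map[String, String] = Map.empty)(
      thunk: => A): A = {
    val start = System.nanoTime()
    var failed = true
    try {
      val result = thunk
      failed = false
      result
    } finally {
      val elapsedMs = (System.nanoTime() - start) / 1000000L
      operations += OpRecord(opType, tags, elapsedMs, failed)
    }
  }

  // Records that a codepath was hit, with a structured payload.
  def recordEvent(opType: String, data: Any): Unit =
    events += ((opType, data))

  // Prefix match standing in for a scan such as LIKE 'delta.ddl.%'.
  def operationsUnder(prefix: String): Seq[OpRecord] =
    operations.filter(_.opType.startsWith(prefix + ".")).toSeq

  def demo(): Unit = {
    // Tags identify the context of the event (table, user, ...).
    val tags = Map("tableId" -> "t1", "user" -> "alice")

    // A user-initiated action with a nested, significant subaction.
    recordOperation("delta.ddl.createTable", tags) {
      recordOperation("delta.ddl.createTable.listFiles", tags) { () }
    }

    // An event marking that a particular codepath (a conflict) was hit.
    recordOperation("delta.dml.merge", tags) {
      recordEvent("delta.commit.conflict", ConflictEvent("t1", 7L))
    }
  }
}
```

After calling `MeteringSketch.demo()`, `MeteringSketch.operationsUnder("delta.ddl")` returns only the two DDL operations, which is the kind of level-by-level analysis that hierarchical names are meant to allow. Because the nested operation finishes first, it is recorded before its parent.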
- case class ScanReport(tableId: String, path: String, scanType: String, deltaDataSkippingType: String, partitionFilters: Seq[String], dataFilters: Seq[String], unusedFilters: Seq[String], size: Map[String, DataSize], metrics: Map[String, Long], versionScanned: Option[Long], annotations: Map[String, Long], usedPartitionColumns: Seq[String], numUsedPartitionColumns: Long, allPartitionColumns: Seq[String], numAllPartitionColumns: Long, parentFilterOutputRows: Option[Long]) extends Product with Serializable
Value Members
- object DeltaLogging
- object ScanReport extends Serializable