org.apache.spark.sql.rapids

ColumnarWriteTaskStatsTracker

trait ColumnarWriteTaskStatsTracker extends AnyRef

A trait for classes that are capable of collecting statistics on columnar data that's being processed by a single write task in GpuFileFormatDataWriter - i.e. there should be one instance per executor.

This trait is coupled with the way GpuFileFormatWriter works, in the sense that its methods will be called according to how column batches are being written out to disk, namely in sorted order according to partitionValue(s), then bucketId.

As such, a typical call scenario is:

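newPartition -> newBucket -> newFile -> newBatch

The sequence can loop back at any point: once a batch has been processed, the next event may be another batch, a new file, or a new partition or bucket.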

newPartition and newBucket events are only triggered if the relation to be written out is partitioned and/or bucketed, respectively.
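The call order described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: TrackerEvents, RecordingTracker, and the drive helper are hypothetical stand-ins (not part of spark-rapids), newBatch takes a plain row count instead of a ColumnarBatch so the code runs without Spark, and the file-naming scheme is made up. The real writer decides for itself when files are opened.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the trait's event methods. The real newBatch
// takes a Spark ColumnarBatch; a plain row count is used here instead.
trait TrackerEvents {
  def newPartition(): Unit
  def newBucket(bucketId: Int): Unit
  def newFile(filePath: String): Unit
  def newBatch(numRows: Int): Unit
}

// Records each event as a string so the call order can be inspected.
class RecordingTracker extends TrackerEvents {
  val events: ArrayBuffer[String] = ArrayBuffer.empty[String]
  def newPartition(): Unit = events += "newPartition"
  def newBucket(bucketId: Int): Unit = events += s"newBucket($bucketId)"
  def newFile(filePath: String): Unit = events += s"newFile($filePath)"
  def newBatch(numRows: Int): Unit = events += s"newBatch($numRows)"
}

object CallOrderDemo {
  // Hypothetical driver. Input is assumed to be sorted by partition, then
  // bucket; each element is (partition, bucketId, rowsInBatch).
  def drive(tracker: TrackerEvents, batches: Seq[(String, Int, Int)]): Unit = {
    var lastPartition: Option[String] = None
    var lastBucket: Option[Int] = None
    for ((partition, bucket, rows) <- batches) {
      val partitionChanged = !lastPartition.contains(partition)
      val bucketChanged = partitionChanged || !lastBucket.contains(bucket)
      if (partitionChanged) tracker.newPartition()
      if (bucketChanged) {
        tracker.newBucket(bucket)
        // Made-up file name, for illustration only.
        tracker.newFile(s"$partition/bucket-$bucket.parquet")
      }
      tracker.newBatch(rows)
      lastPartition = Some(partition)
      lastBucket = Some(bucket)
    }
  }

  def main(args: Array[String]): Unit = {
    val tracker = new RecordingTracker
    drive(tracker, Seq(("p=a", 0, 10), ("p=a", 0, 5), ("p=a", 1, 7), ("p=b", 0, 3)))
    tracker.events.foreach(println)
  }
}
```

Because the input is sorted, a change in partition or bucket value is enough to detect the boundary, which is why the tracker only needs to count events rather than remember every value it has seen.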

Linear Supertypes
AnyRef, Any

Abstract Value Members

  1. abstract def getFinalStats(): WriteTaskStats

    Returns the final statistics computed so far.

    returns

    An instance of a subtype of org.apache.spark.sql.execution.datasources.WriteTaskStats, to be sent to the driver.

    Note

    This may only be called once. Further use of the object may lead to undefined behavior.

  2. abstract def newBatch(batch: ColumnarBatch): Unit

    Process a new column batch to update the tracked statistics accordingly. The batch will be written to the most recently witnessed file (via newFile).

    batch

    Current data batch to be processed.

  3. abstract def newBucket(bucketId: Int): Unit

    Process the fact that a new bucket is about to be written. Only triggered when the relation is bucketed by a (non-empty) sequence of columns.

    bucketId

    The bucket number.

  4. abstract def newFile(filePath: String): Unit

    Process the fact that a new file is about to be written.

    filePath

    Path of the file into which future rows will be written.

  5. abstract def newPartition(): Unit

    Process the fact that a new partition is about to be written. Only triggered when the relation is partitioned by a (non-empty) sequence of columns.

    Note

    The partition values are stubbed for now, as the original code only updated a count of partitions without examining the values. This method therefore takes no partitionValues parameter.
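A minimal sketch of what an implementation of these five members might look like is given here. It is not the spark-rapids implementation: BatchStandIn, SimpleWriteStats, StatsTrackerSketch, and CountingTracker are hypothetical names, and the stand-in types replace Spark's ColumnarBatch and WriteTaskStats so the code runs without Spark on the classpath. Failing on a second getFinalStats() call is a choice made by this sketch; the documentation only says that further use may lead to undefined behavior.

```scala
// Hypothetical stand-ins so the sketch compiles without Spark.
// In the real API these are Spark's ColumnarBatch and WriteTaskStats.
final case class BatchStandIn(numRows: Int)
final case class SimpleWriteStats(
    numPartitions: Int, numBuckets: Int, numFiles: Int, numRows: Long)

// Mirrors the five abstract members documented above, using stand-in types.
trait StatsTrackerSketch {
  def newPartition(): Unit
  def newBucket(bucketId: Int): Unit
  def newFile(filePath: String): Unit
  def newBatch(batch: BatchStandIn): Unit
  def getFinalStats(): SimpleWriteStats
}

// Counts events and rows; one instance is meant to serve one write task.
class CountingTracker extends StatsTrackerSketch {
  private var partitions = 0
  private var buckets = 0
  private var files = 0
  private var rows = 0L
  private var finished = false

  def newPartition(): Unit = partitions += 1
  def newBucket(bucketId: Int): Unit = buckets += 1
  def newFile(filePath: String): Unit = files += 1
  def newBatch(batch: BatchStandIn): Unit = rows += batch.numRows

  // Assumption of this sketch: fail fast if called more than once.
  def getFinalStats(): SimpleWriteStats = {
    require(!finished, "getFinalStats() may only be called once")
    finished = true
    SimpleWriteStats(partitions, buckets, files, rows)
  }
}

object CountingTrackerDemo {
  def main(args: Array[String]): Unit = {
    val tracker = new CountingTracker
    tracker.newPartition()
    tracker.newFile("part-00000.parquet") // made-up file name
    tracker.newBatch(BatchStandIn(100))
    tracker.newBatch(BatchStandIn(50))
    println(tracker.getFinalStats())
  }
}
```

Keeping the counters as private mutable fields fits the one-tracker-per-task usage described at the top of this page, since no synchronization is needed when a single task drives the calls in order.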

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  10. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  12. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  13. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  14. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  15. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  16. def toString(): String
    Definition Classes
    AnyRef → Any
  17. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  18. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  19. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
