com.nvidia.spark.rapids.window

BatchedUnboundedToUnboundedWindowFixer

trait BatchedUnboundedToUnboundedWindowFixer extends AutoCloseable

Provides a way to process window operations without needing to buffer and split the batches on partition-by boundaries. When batches are not split this way, part of a partition-by key set may have been processed in previous batches and may need to be updated. For example, suppose we are doing a min operation with unbounded preceding and unbounded following. We may first receive a batch like

PARTS: 1, 1, 2, 2
VALUES: 2, 3, 10, 9

Processing this batch produces a new column that looks like

MINS: 2, 2, 9, 9

But we don't know whether the group with 2 in PARTS is done. So the fixer saves the last value in MINS, which is 9, and caches the batch. When the next batch shows up

PARTS: 2, 2, 3, 3
VALUES: 11, 5, 13, 14

We generate the window result again and get

MINS: 5, 5, 13, 13

Now we need to take the first entry, which is 5, and combine it with the cached value using another min: min(9, 5) = 5. The cached value for PARTS=2 is now 5. We then need to go back and fix up all of the previous batches that contain rows for PARTS=2. The first batch is pulled from the cache and updated to look like

PARTS: 1, 1, 2, 2
VALUES: 2, 3, 10, 9
MINS: 2, 2, 5, 5

This batch can now be output because we were able to fix up all of the PARTS in it.
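The walk-through above can be reproduced with plain Scala collections. This is only an illustrative sketch: `UnboundedMinSketch`, `Batch`, `batchMins` and this simplified `fixUp` are hypothetical names, and `Vector[Int]` stands in for the GPU column types that the real trait works with.

```scala
object UnboundedMinSketch {
  // One batch of row-aligned partition keys and values (hypothetical stand-in type).
  final case class Batch(parts: Vector[Int], values: Vector[Int])

  // Per-batch window result: the min of VALUES for each key, repeated on every row of that key.
  def batchMins(b: Batch): Vector[Int] = {
    val minByPart: Map[Int, Int] =
      b.parts.zip(b.values).groupBy(_._1).map { case (k, rows) => k -> rows.map(_._2).min }
    b.parts.map(minByPart)
  }

  // Overwrite the result for rows whose key is `part` with the final min for that key.
  def fixUp(b: Batch, mins: Vector[Int], part: Int, finalMin: Int): Vector[Int] =
    b.parts.zip(mins).map { case (p, m) => if (p == part) finalMin else m }

  def run(): Vector[Int] = {
    val first = Batch(Vector(1, 1, 2, 2), Vector(2, 3, 10, 9))
    val second = Batch(Vector(2, 2, 3, 3), Vector(11, 5, 13, 14))

    val firstMins = batchMins(first)   // MINS for the first batch
    val secondMins = batchMins(second) // MINS for the second batch

    // Save the last value of the first batch, then fold in the first value of the second.
    val cached = firstMins.last
    val updated = math.min(cached, secondMins.head)

    // Go back and fix the cached first batch for the key that spanned both batches.
    fixUp(first, firstMins, first.parts.last, updated)
  }
}
```

Calling `UnboundedMinSketch.run()` yields the fixed MINS column for the first batch, matching the values shown above.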


Abstract Value Members

  1. abstract def close(): Unit
    Definition Classes
    AutoCloseable
    Annotations
    @throws( classOf[java.lang.Exception] )
  2. abstract def fixUp(samePartitionMask: Either[ColumnVector, Boolean], column: ColumnVector): ColumnVector

    Called to fix up a batch. There is no guarantee on the order in which the batches are fixed. The only ordering guarantee is that the state will be updated for all batches before any are "fixed".

    samePartitionMask

    indicates which rows are a part of the same partition.

    column

    the column of data to be fixed.

    returns

    a column of data that was fixed.

  3. abstract def reset(): Unit

    Clear any state so that updateState can be called again for a new partition by group.

  4. abstract def updateState(scalar: Scalar): Unit

    Cache and update any state needed. Because this is specific to unbounded preceding to unbounded following, the result should be the same for any row within a batch. As such, this is only guaranteed to be called once per batch with the value from a row within the batch.

    scalar

    the value to use to update what is cached.
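The order in which the abstract members are meant to be used (updateState for every batch, then fixUp, then reset) can be sketched with a simplified model. Everything here is an assumption made for illustration: `SimpleFixer` and `MinFixer` are hypothetical, `Vector[Int]` and `Int` replace `ColumnVector` and `Scalar`, and the reading of `samePartitionMask` (Left is a per-row mask, Right says whether all rows or no rows belong to the partition) is a guess that the documentation above does not confirm.

```scala
object FixerLifecycleSketch {
  // Simplified stand-in for the trait; not the real signatures.
  trait SimpleFixer extends AutoCloseable {
    def updateState(value: Int): Unit
    def fixUp(samePartitionMask: Either[Vector[Boolean], Boolean], column: Vector[Int]): Vector[Int]
    def reset(): Unit
  }

  final class MinFixer extends SimpleFixer {
    // Running min for the current partition-by group, empty until updateState is called.
    private var current: Option[Int] = None

    override def updateState(value: Int): Unit =
      current = Some(current.fold(value)(c => math.min(c, value)))

    override def fixUp(
        samePartitionMask: Either[Vector[Boolean], Boolean],
        column: Vector[Int]): Vector[Int] = {
      val fixed = current.getOrElse(
        throw new IllegalStateException("updateState must be called before fixUp"))
      samePartitionMask match {
        case Right(true)  => column.map(_ => fixed) // assumed: every row is in the partition
        case Right(false) => column                 // assumed: no row is in the partition
        case Left(rows)   => column.zip(rows).map { case (v, same) => if (same) fixed else v }
      }
    }

    override def reset(): Unit = current = None

    // Nothing to free in this model; a real implementation would release its cached state here.
    override def close(): Unit = reset()
  }

  def demo(): (Vector[Int], Vector[Int]) = {
    val fixer = new MinFixer
    try {
      // State is updated for all batches first...
      fixer.updateState(9)
      fixer.updateState(5)
      // ...and only then are batches fixed, in any order.
      val secondFixed = fixer.fixUp(Left(Vector(true, true, false, false)), Vector(5, 5, 13, 13))
      val firstFixed = fixer.fixUp(Left(Vector(false, false, true, true)), Vector(2, 2, 9, 9))
      fixer.reset() // ready for the next partition-by group
      (firstFixed, secondFixed)
    } finally fixer.close()
  }
}
```

Because the state is complete before any call to fixUp, the order of the two fixUp calls in `demo()` does not affect the result, which is the property the ordering guarantee above provides.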

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  10. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  12. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  13. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  14. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  15. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  16. def toString(): String
    Definition Classes
    AnyRef → Any
  17. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  18. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  19. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
