Packages

package rapids


Type Members

  1. abstract class AbstractGpuCoalesceIterator extends Iterator[ColumnarBatch] with Arm with Logging
  2. class AcceleratedColumnarToRowIterator extends Iterator[InternalRow] with Arm with Serializable

    An iterator that uses the GPU for columnar to row conversion of fixed width types.

  3. class AddressSpaceAllocator extends AnyRef

    Allocates blocks from an address space using a best-fit algorithm.
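
    The best-fit selection described above can be sketched as follows (`BestFitSketch` and its free-list representation are illustrative only; the real AddressSpaceAllocator also splits blocks on allocation and coalesces neighbors on free):

```scala
// Illustrative best-fit block selection over a free list.
// Free blocks are modeled as (address, size) pairs.
object BestFitSketch {
  // Returns the smallest free block that can satisfy the request, if any.
  def bestFit(free: Seq[(Long, Long)], request: Long): Option[(Long, Long)] = {
    val fits = free.filter { case (_, size) => size >= request }
    if (fits.isEmpty) None else Some(fits.minBy(_._2))
  }
}
```

    Best fit minimizes wasted space per allocation at the cost of scanning the free list; the production allocator keeps additional bookkeeping to make this efficient.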

  4. case class AggAndReplace(agg: Aggregation, nullReplacePolicy: Option[ReplacePolicy]) extends Product with Serializable

    For Scan and GroupBy Scan aggregations nulls are not always treated the same way as they are in window operations. Often we have to run a post processing step and replace them. This groups those two together so we can have a complete picture of how to perform these types of aggregations.

  5. abstract class AggExprMeta[INPUT <: AggregateFunction] extends ExprMeta[INPUT]

    Base class for metadata around AggregateFunction.

  6. case class AggregateModeInfo(uniqueModes: Seq[AggregateMode], hasPartialMode: Boolean, hasPartialMergeMode: Boolean, hasFinalMode: Boolean, hasCompleteMode: Boolean) extends Product with Serializable

    Utility class to convey information on the aggregation modes being used

  7. case class AllowSpillOnlyLazySpillableColumnarBatchImpl(wrapped: LazySpillableColumnarBatch) extends LazySpillableColumnarBatch with Product with Serializable

    A version of LazySpillableColumnarBatch where instead of closing the underlying batch it is only spilled. This is used for cases, like a streaming hash join, where the data itself needs to outlive the JoinGatherer it is handed off to.

  8. trait Arm extends AnyRef

    Implementation of the automatic-resource-management pattern
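
    A minimal sketch of the `withResource` idiom this pattern provides (the standalone object and the `Tracked` class below are illustrative only; the plugin's Arm trait offers many more overloads):

```scala
// Sketch of automatic resource management: the resource is closed when the
// body completes, whether it returns normally or throws.
object ArmSketch {
  def withResource[R <: AutoCloseable, T](r: R)(body: R => T): T =
    try body(r) finally r.close()

  // Hypothetical resource used only to demonstrate the pattern.
  class Tracked extends AutoCloseable {
    var closed = false
    override def close(): Unit = closed = true
  }
}
```

    Mixing in a trait like this lets GPU code hold device buffers for exactly the scope that needs them.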

  9. class AutoCloseColumnBatchIterator[U] extends Iterator[ColumnarBatch]

    For columnar code on the CPU it is the responsibility of the SparkPlan exec that creates a ColumnarBatch to close it. In the case of code running on the GPU that would waste too much memory, so it is the responsibility of the code receiving the batch to close it when it is no longer needed.

    This class provides a simple way for CPU batch code to be sure that a batch gets closed. If your code is executing on the GPU do not use this class.
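
    The close-on-advance idea can be sketched like this (`CloseOnAdvanceIterator` and `Box` are illustrative names, not the plugin's API; the real class also registers a task-completion callback to close the last batch):

```scala
// Wrap an iterator of closeable batches so the previously returned batch is
// closed as soon as the next one is requested.
class CloseOnAdvanceIterator[T <: AutoCloseable](wrapped: Iterator[T])
    extends Iterator[T] {
  private var current: Option[T] = None
  override def hasNext: Boolean = wrapped.hasNext
  override def next(): T = {
    current.foreach(_.close()) // release the batch handed out previously
    val n = wrapped.next()
    current = Some(n)
    n
  }
}

// Hypothetical closeable payload, only for demonstration.
class Box(val v: Int) extends AutoCloseable {
  var closed = false
  override def close(): Unit = closed = true
}
```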

  10. case class AvoidAdaptiveTransitionToRow(child: SparkPlan) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable

    This operator will attempt to optimize the case when we are writing the results of an adaptive query to disk so that we remove the redundant transitions from columnar to row within AdaptiveSparkPlanExec followed by a row to columnar transition.

    Specifically, this is the plan we see in this case:

    GpuRowToColumnar(AdaptiveSparkPlanExec(GpuColumnarToRow(child))

    We perform this optimization at runtime rather than during planning, because when the adaptive plan is being planned and executed, we don't know whether it is being called from an operation that wants rows (such as CollectTailExec) or from an operation that wants columns (such as GpuDataWritingCommandExec).

    Spark does not provide a mechanism for executing an adaptive plan and retrieving columnar results and the internal methods that we need to call are private, so we use reflection to call them.

    child

    The plan to execute

  11. case class AvoidTransition[INPUT <: SparkPlan](plan: SparkPlanMeta[INPUT]) extends Optimization with Product with Serializable
  12. abstract class BaseCrossJoinGatherMap extends LazySpillableGatherMap
  13. abstract class BaseExprMeta[INPUT <: Expression] extends RapidsMeta[INPUT, Expression, Expression]

    Base class for metadata around Expression.

  14. trait BasicWindowCalc extends Arm

    Calculates the results of window operations. It assumes that any batching of the data or fixups after the fact to get the right answer is done outside of this.

  15. abstract class BatchedBufferDecompressor extends AutoCloseable with Arm with Logging

    Base class for batched decompressors

  16. case class BatchedByKey(order: Seq[SortOrder]) extends CoalesceGoal with Product with Serializable

    Split the data into batches where a set of keys are all within a single batch. This is generally used for things like a window operation or a sort based aggregation where you want all of the keys for a given operation to be available so the GPU can produce a correct answer. There is no limit on the target size, so if there is a lot of data skew for a key the batch may still run into limits set by Spark or cudf. Note that a node in the Spark plan that requires this goal should also require an input ordering that satisfies it.

    order

    the keys that should be used for batching.

  17. class BatchedCopyCompressor extends BatchedTableCompressor
  18. class BatchedCopyDecompressor extends BatchedBufferDecompressor
  19. class BatchedNvcompLZ4Compressor extends BatchedTableCompressor
  20. class BatchedNvcompLZ4Decompressor extends BatchedBufferDecompressor
  21. class BatchedRunningWindowBinaryFixer extends BatchedRunningWindowFixer with Arm with Logging

    This class fixes up batched running windows by performing a binary op on the previous value and those in the same partition by key group. It does not deal with nulls, so it works for things like row_number and count, which cannot produce nulls, or for NULL_MIN and NULL_MAX, which do the right thing when they see a null.

  22. trait BatchedRunningWindowFixer extends AutoCloseable

    Provides a way to process running window operations without needing to buffer and split the batches on partition by boundaries. When this happens, part of a partition by key set may have been processed in the last batch and the rest of it will need to be updated. For example, if we are doing a running min operation we may first get in something like

    PARTS: 1, 1, 2, 2 VALUES: 2, 3, 10, 9

    The output of processing this would result in a new column that would look like MINS: 2, 2, 10, 9

    But we don't know if the group with 2 in PARTS is done or not. So the fixer saves the last value in MINS, which is a 9. When the next batch shows up

    PARTS: 2, 2, 3, 3 VALUES: 11, 5, 13, 14

    We generate the window result again and get

    MINS: 11, 5, 13, 13

    But we cannot output this yet because there may have been overlap with the previous batch. The framework will figure that out and pass data into fixUp to do the fixing. It will pass in MINS, and also a column of boolean values true, true, false, false to indicate which rows overlapped with the previous batch. In our min example fixUp will do a min between the last value in the previous batch and the values that could overlap with it.

    RESULT: 9, 5, 13, 13 which can be output.
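
    The fix-up step in the min example above can be sketched as follows (`RunningMinFixerSketch` and its signature are illustrative, not the BatchedRunningWindowFixer API; it works on plain arrays rather than columns):

```scala
// Combine the carried-over value from the previous batch with the freshly
// computed window results, but only for rows flagged as overlapping.
object RunningMinFixerSketch {
  // prevLast: last min emitted for the group that spans the batch break
  // mins: window results computed for the new batch in isolation
  // overlap: true for rows that belong to the spanning group
  def fixUp(prevLast: Int, mins: Array[Int], overlap: Array[Boolean]): Array[Int] =
    mins.zip(overlap).map { case (v, o) => if (o) math.min(prevLast, v) else v }
}
```

    With prevLast = 9, mins = 11, 5, 13, 13 and overlap = true, true, false, false this reproduces the RESULT row from the example.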

  23. abstract class BatchedTableCompressor extends AutoCloseable with Arm with Logging

    Base class for batched compressors

  24. abstract class BinaryExprMeta[INPUT <: BinaryExpression] extends ExprMeta[INPUT]

    Base class for metadata around BinaryExpression.

  25. case class BoundGpuWindowFunction(windowFunc: GpuWindowFunction, boundInputLocations: Array[Int]) extends Arm with Product with Serializable

    The class represents a window function and the locations of its deduped inputs after an initial projection.

  26. class CSVPartitionReader extends PartitionReader[ColumnarBatch] with ScanWithMetrics with Arm
  27. class CastChecks extends ExprChecks
  28. class CastExprMeta[INPUT <: CastBase] extends UnaryExprMeta[INPUT]

    Meta-data for cast and ansi_cast.

  29. case class ClouderaShimVersion(major: Int, minor: Int, patch: Int, clouderaVersion: String) extends ShimVersion with Product with Serializable
  30. sealed abstract class CoalesceGoal extends Expression with GpuUnevaluable

    Provides a goal for batching of data.

  31. sealed abstract class CoalesceSizeGoal extends CoalesceGoal
  32. class CollectTimeIterator extends Iterator[ColumnarBatch]
  33. trait ColumnarFileFormat extends AnyRef

    Used to write columnar data to files.

  34. abstract class ColumnarOutputWriter extends HostBufferConsumer with Arm

    This is used to write columnar data to a file system. Subclasses of ColumnarOutputWriter must provide a zero-argument constructor. This is the columnar version of org.apache.spark.sql.execution.datasources.OutputWriter.

  35. abstract class ColumnarOutputWriterFactory extends Serializable

    A factory that produces ColumnarOutputWriters. A new ColumnarOutputWriterFactory is created on the driver side, and then gets serialized to executor side to create ColumnarOutputWriters. This is the columnar version of org.apache.spark.sql.execution.datasources.OutputWriterFactory.

  36. case class ColumnarOverrideRules() extends ColumnarRule with Logging with Product with Serializable
  37. class ColumnarPartitionReaderWithPartitionValues extends PartitionReader[ColumnarBatch]

    A wrapper reader that always appends partition values to the ColumnarBatch produced by the input reader fileReader. Each scalar value is splatted to a column with the same number of rows as the batch returned by the reader.

  38. class ColumnarToRowIterator extends Iterator[InternalRow] with Arm
  39. abstract class ComplexTypeMergingExprMeta[INPUT <: ComplexTypeMergingExpression] extends ExprMeta[INPUT]

    Base class for metadata around ComplexTypeMergingExpression.

  40. case class CompressedTable(compressedSize: Long, meta: TableMeta, buffer: DeviceMemoryBuffer) extends AutoCloseable with Product with Serializable

    Compressed table descriptor

    compressedSize

    size of the compressed data in bytes

    meta

    metadata describing the table layout when uncompressed

    buffer

    buffer containing the compressed data

  41. class ConfBuilder extends AnyRef
  42. abstract class ConfEntry[T] extends AnyRef
  43. class ConfEntryWithDefault[T] extends ConfEntry[T]
  44. case class ContextChecks(outputCheck: TypeSig, sparkOutputSig: TypeSig, paramCheck: Seq[ParamCheck] = Seq.empty, repeatingParamCheck: Option[RepeatingParamCheck] = None) extends TypeChecks[Map[String, SupportLevel]] with Product with Serializable

    Checks an expression that has input parameters and a single output. This is intended to be given for a specific ExpressionContext. If your expression does not meet this pattern you may need to create a custom ExprChecks instance.

  45. class CopyCompressionCodec extends TableCompressionCodec with Arm

    A table compression codec used only for testing that copies the data.

  46. class CostBasedOptimizer extends Optimizer with Logging

    Experimental cost-based optimizer that aims to avoid moving sections of the plan to the GPU when it would be better to keep that part of the plan on the CPU. For example, we don't want to move data to the GPU just for a trivial projection and then have to move data back to the CPU on the next step.

  47. trait CostModel extends AnyRef

    The cost model is behind a trait so that we can consider making this pluggable in the future so that users can override the cost model to suit specific use cases.

  48. class CpuCostModel extends CostModel
  49. final class CreateDataSourceTableAsSelectCommandMeta extends DataWritingCommandMeta[CreateDataSourceTableAsSelectCommand]
  50. trait CudfBinaryExpression extends BinaryExpression with GpuBinaryExpression
  51. abstract class CudfBinaryOperator extends BinaryOperator with GpuBinaryOperator with CudfBinaryExpression
  52. trait CudfUnaryExpression extends GpuUnaryExpression
  53. case class CudfVersionMismatchException(errorMsg: String) extends PluginException with Product with Serializable
  54. trait DataBlockBase extends AnyRef
  55. trait DataFromReplacementRule extends AnyRef
  56. class DataTypeMeta extends AnyRef

    The metadata around DataType, which records the original data type, the desired data type for GPU overrides, and the reason for any potential conversion. The metadata ensures that TypeChecks tags the actual data types for the GPU runtime, since the data types of GPU overrides may differ slightly from their original CPU counterparts.

  57. abstract class DataWritingCommandMeta[INPUT <: DataWritingCommand] extends RapidsMeta[INPUT, DataWritingCommand, GpuDataWritingCommand]

    Base class for metadata around DataWritingCommand.

  58. class DataWritingCommandRule[INPUT <: DataWritingCommand] extends ReplacementRule[INPUT, DataWritingCommand, DataWritingCommandMeta[INPUT]]

    Holds everything that is needed to replace a DataWritingCommand with a GPU enabled version.

  59. case class DatabricksShimVersion(major: Int, minor: Int, patch: Int) extends ShimVersion with Product with Serializable
  60. sealed class DegenerateRapidsBuffer extends RapidsBuffer with Arm

    A buffer with no corresponding device data (zero rows or columns). These buffers are not tracked in buffer stores since they have no device memory. They are only tracked in the catalog and provide a representative ColumnarBatch but cannot provide a MemoryBuffer.

  61. class DenseRankFixer extends BatchedRunningWindowFixer with Arm with Logging

    Fix up dense rank batches. A dense rank has no gaps in the rank values. The rank corresponds to equality of the ordering column(s). So when a batch finishes and another starts, that split can either be at the beginning of a new order by section or part way through one. If it is at the beginning, then like row number we want to just add in the previous value and go on. If it was part way through, then we want to add in the previous value minus 1. The minus one is to pick up where we left off. If anything is outside of a contiguous partition by group then we just keep those values unchanged.
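
    The carry-over rule described above can be sketched like this (`DenseRankSketch` and its parameters are illustrative only, far removed from the columnar implementation):

```scala
// Shift the dense ranks of a new batch so they continue from the prior batch.
object DenseRankSketch {
  // ranks: dense ranks computed for the new batch in isolation (starting at 1)
  // sameGroupAsPrev: whether the batch's first row continues the previous
  //                  batch's last order by group
  // prevRank: last dense rank emitted for this partition in the prior batch
  def fix(ranks: Array[Int], sameGroupAsPrev: Boolean, prevRank: Int): Array[Int] = {
    // Part way through a group: add the previous value minus 1 so the first
    // row keeps the same rank. New group: add the previous value unchanged.
    val offset = if (sameGroupAsPrev) prevRank - 1 else prevRank
    ranks.map(_ + offset)
  }
}
```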

  62. class DeviceMemoryEventHandler extends RmmEventHandler with Logging

    RMM event handler to trigger spilling from the device memory store.

  63. class DirectByteBufferFactory extends ByteBufferFactory
  64. final class DoNotReplaceOrWarnSparkPlanMeta[INPUT <: SparkPlan] extends SparkPlanMeta[INPUT]

    Metadata for SparkPlan that should not be replaced or have any kind of warning emitted for it

  65. class DuplicateBufferException extends RuntimeException

    Exception thrown when inserting a buffer into the catalog with a duplicate buffer ID and storage tier combination.

  66. case class EMRShimVersion(major: Int, minor: Int, patch: Int) extends ShimVersion with Product with Serializable
  67. class ExecChecks extends TypeChecks[SupportLevel]

    Checks the input and output types supported by a SparkPlan node. We don't currently separate input checks from output checks. We can add this in if something needs it.

  68. class ExecRule[INPUT <: SparkPlan] extends ReplacementRule[INPUT, SparkPlan, SparkPlanMeta[INPUT]]

    Holds everything that is needed to replace a SparkPlan with a GPU enabled version.

  69. class ExecutionPlanCaptureCallback extends QueryExecutionListener

    Used as a part of testing to capture the executed query plan.

  70. abstract class ExprChecks extends TypeChecks[Map[ExpressionContext, Map[String, SupportLevel]]]

    Base class all Expression checks must follow.

  71. case class ExprChecksImpl(contexts: Map[ExpressionContext, ContextChecks]) extends ExprChecks with Product with Serializable
  72. abstract class ExprMeta[INPUT <: Expression] extends BaseExprMeta[INPUT]
  73. class ExprRule[INPUT <: Expression] extends ReplacementRule[INPUT, Expression, BaseExprMeta[INPUT]]

    Holds everything that is needed to replace an Expression with a GPU enabled version.

  74. sealed abstract class ExpressionContext extends AnyRef
  75. trait ExtraInfo extends AnyRef

    A common trait for the extra information for different file formats

  76. class FileFormatChecks extends TypeChecks[SupportLevel]

    Checks for either a read or a write of a given file format.

  77. sealed trait FileFormatOp extends AnyRef
  78. sealed trait FileFormatType extends AnyRef
  79. abstract class FilePartitionReaderBase extends PartitionReader[ColumnarBatch] with Logging with ScanWithMetrics with Arm

    The base class for PartitionReader

  80. abstract class GeneratorExprMeta[INPUT <: Generator] extends ExprMeta[INPUT]
  81. trait GpuAggregateWindowFunction[T <: Aggregation with RollingAggregation[T]] extends Expression with GpuWindowFunction

    GPU Counterpart of AggregateWindowFunction. On the CPU this would extend DeclarativeAggregate and use the provided methods to build up the expressions needed to produce a result. For window operations we do it in a single pass, where all of the data is available, so instead we have our own set of expressions.

  82. case class GpuAlias(child: Expression, name: String)(exprId: ExprId = NamedExpression.newExprId, qualifier: Seq[String] = Seq.empty, explicitMetadata: Option[Metadata] = None) extends GpuUnaryExpression with NamedExpression with Product with Serializable
  83. case class GpuAtLeastNNonNulls(n: Int, exprs: Seq[Expression]) extends Expression with GpuExpression with Predicate with Product with Serializable

    A GPU accelerated predicate that is evaluated to be true if there are at least n non-null and non-NaN values.

  84. abstract class GpuBaseAggregateMeta[INPUT <: SparkPlan] extends SparkPlanMeta[INPUT]
  85. trait GpuBaseLimitExec extends SparkPlan with LimitExec with GpuExec

    Helper trait which defines methods that are shared by both GpuLocalLimitExec and GpuGlobalLimitExec.

  86. abstract class GpuBaseWindowExecMeta[WindowExecType <: SparkPlan] extends SparkPlanMeta[WindowExecType] with Logging

    Base class for GPU Execs that implement window functions. This abstracts the method by which the window function's input expressions, partition specs, order-by specs, etc. are extracted from the specific WindowExecType.

    WindowExecType

    The Exec class that implements window functions (E.g. o.a.s.sql.execution.window.WindowExec.)

  87. case class GpuBatchScanExec(output: Seq[AttributeReference], scan: Scan) extends SparkPlan with DataSourceV2ScanExecBase with GpuExec with Product with Serializable
  88. trait GpuBatchedRunningWindowWithFixer extends AnyRef

    For many operations a running window (unbounded preceding to current row) can process the data without dividing the data up into batches that contain all of the data for a given group by key set. Instead we store a small amount of state from a previous result and use it to fix the final result. This is a memory optimization.

  89. trait GpuBinaryExpression extends BinaryExpression with GpuExpression
  90. trait GpuBinaryOperator extends BinaryOperator with GpuBinaryExpression
  91. case class GpuBoundReference(ordinal: Int, dataType: DataType, nullable: Boolean) extends GpuLeafExpression with Product with Serializable
  92. case class GpuBringBackToHost(child: SparkPlan) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable

    Pull back any data on the GPU to the host so the host can access it.

  93. abstract class GpuBroadcastJoinMeta[INPUT <: SparkPlan] extends SparkPlanMeta[INPUT]
  94. sealed abstract class GpuBuildSide extends AnyRef

    Spark's BuildSide, BuildRight, and BuildLeft moved packages in Spark 3.1, so we create GPU versions of these that are agnostic to the Spark version.

  95. case class GpuCSVPartitionReaderFactory(sqlConf: SQLConf, broadcastedConf: Broadcast[SerializableConfiguration], dataSchema: StructType, readDataSchema: StructType, partitionSchema: StructType, parsedOptions: CSVOptions, maxReaderBatchSizeRows: Integer, maxReaderBatchSizeBytes: Long, metrics: Map[String, GpuMetric]) extends FilePartitionReaderFactory with Product with Serializable
  96. case class GpuCSVScan(sparkSession: SparkSession, fileIndex: PartitioningAwareFileIndex, dataSchema: StructType, readDataSchema: StructType, readPartitionSchema: StructType, options: CaseInsensitiveStringMap, partitionFilters: Seq[Expression], dataFilters: Seq[Expression], maxReaderBatchSizeRows: Integer, maxReaderBatchSizeBytes: Long) extends TextBasedFileScan with ScanWithMetrics with Product with Serializable
  97. case class GpuCaseWhen(branches: Seq[(Expression, Expression)], elseValue: Option[Expression] = None) extends Expression with GpuConditionalExpression with Serializable with Product
  98. case class GpuCast(child: Expression, dataType: DataType, ansiMode: Boolean = false, timeZoneId: Option[String] = None, legacyCastToString: Boolean = false) extends GpuUnaryExpression with TimeZoneAwareExpression with NullIntolerant with Product with Serializable

    Casts using the GPU

  99. case class GpuCheckOverflow(child: Expression, dataType: DecimalType, nullOnOverflow: Boolean) extends GpuUnaryExpression with Product with Serializable

    A GPU substitution of CheckOverflow. It does not actually check for overflow, because the precision checks for 64-bit support prevent the need for that.

  100. case class GpuCoalesce(children: Seq[Expression]) extends Expression with GpuExpression with ComplexTypeMergingExpression with Product with Serializable
  101. case class GpuCoalesceBatches(child: SparkPlan, goal: CoalesceGoal) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable
  102. case class GpuCoalesceExec(numPartitions: Int, child: SparkPlan) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable
  103. class GpuCoalesceIterator extends AbstractGpuCoalesceIterator with Arm
  104. class GpuCollectLimitMeta extends SparkPlanMeta[CollectLimitExec]
  105. class GpuColumnVector extends GpuColumnVectorBase

    A GPU accelerated version of the Spark ColumnVector. Most of the standard Spark APIs should never be called, as they assume that the data is on the host, and we want to keep as much of the data on the device as possible. We also provide GPU accelerated versions of the transitions to and from rows.

  106. final class GpuColumnVectorFromBuffer extends GpuColumnVector

    GPU column vector carved from a single buffer, like those from cudf's contiguousSplit.

  107. class GpuColumnarBatchSerializer extends Serializer with Serializable

    Serializer for serializing ColumnarBatchs for use during normal shuffle.

    The serialization write path takes the cudf Table that is described by the ColumnarBatch and uses cudf APIs to serialize the data into a sequence of bytes on the host. The data is returned to the Spark shuffle code where it is compressed by the CPU and written to disk.

    The serialization read path is notably different. The sequence of serialized bytes IS NOT deserialized into a cudf Table but rather tracked in host memory by a ColumnarBatch that contains a SerializedTableColumn. During query planning, each GPU columnar shuffle exchange is followed by a GpuShuffleCoalesceExec that expects to receive only these custom batches of SerializedTableColumn. GpuShuffleCoalesceExec coalesces the smaller shuffle partitions into larger tables before placing them on the GPU for further processing.

    Note

    The RAPIDS shuffle does not use this code.

  108. case class GpuColumnarToRowExec(child: SparkPlan, exportColumnarRdd: Boolean = false) extends GpuColumnarToRowExecParent with Product with Serializable
  109. abstract class GpuColumnarToRowExecParent extends SparkPlan with UnaryExecNode with GpuExec
  110. trait GpuComplexTypeMergingExpression extends Expression with ComplexTypeMergingExpression with GpuExpression
  111. final class GpuCompressedColumnVector extends GpuColumnVectorBase with WithTableBuffer

    A column vector that tracks a compressed table. Unlike a normal GPU column vector, the columnar data within cannot be accessed directly. This class primarily serves the role of tracking the compressed data and table metadata so it can be decompressed later.

  112. trait GpuConditionalExpression extends Expression with ComplexTypeMergingExpression with GpuExpression
  113. class GpuCostModel extends CostModel
  114. class GpuDataSourceRDD extends DataSourceRDD

    A replacement for DataSourceRDD that does NOT compute the bytes read input metric. DataSourceRDD assumes all reads occur on the task thread, and some GPU input sources use multithreaded readers that cannot generate proper metrics with DataSourceRDD.

    Note

    It is the responsibility of users of this RDD to generate the bytes read input metric explicitly!

  115. trait GpuDataWritingCommand extends LogicalPlan with DataWritingCommand

    An extension of DataWritingCommand that allows columnar execution.

  116. case class GpuDataWritingCommandExec(cmd: GpuDataWritingCommand, child: SparkPlan) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable
  117. case class GpuDenseRank(children: Seq[Expression]) extends Expression with GpuRunningWindowFunction with GpuBatchedRunningWindowWithFixer with Product with Serializable

    Dense Rank is a special window operation where it is only supported as a running window. In cudf it is only supported as a scan and a group by scan.

    children

    the order by columns.

    Note

    this is a running window only operator

  118. trait GpuExec extends SparkPlan with Arm
  119. case class GpuExpandExec(projections: Seq[Seq[Expression]], output: Seq[Attribute], child: SparkPlan) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable

    Apply all of the GroupExpressions to every input row, hence we will get multiple output rows for an input row.

    projections

    The group of expressions; all of the group expressions should output the same schema, specified by the parameter output

    output

    Attribute references to Output

    child

    Child operator

  120. class GpuExpandExecMeta extends SparkPlanMeta[ExpandExec]
  121. class GpuExpandIterator extends Iterator[ColumnarBatch] with Arm
  122. case class GpuExplode(child: Expression) extends GpuExplodeBase with Product with Serializable
  123. abstract class GpuExplodeBase extends GpuUnevaluableUnaryExpression with GpuGenerator
  124. trait GpuExpression extends Expression with Arm

    An Expression that cannot be evaluated in the traditional row-by-row sense (hence Unevaluable) but instead can be evaluated on an entire column batch at once.

  125. case class GpuFilterExec(condition: Expression, child: SparkPlan) extends SparkPlan with UnaryExecNode with GpuPredicateHelper with GpuExec with Product with Serializable
  126. case class GpuGenerateExec(generator: GpuGenerator, requiredChildOutput: Seq[Attribute], outer: Boolean, generatorOutput: Seq[Attribute], child: SparkPlan) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable
  127. class GpuGenerateExecSparkPlanMeta extends SparkPlanMeta[GenerateExec]
  128. trait GpuGenerator extends Expression with GpuUnevaluable

    GPU override of Generator, cooperating with GpuGenerateExec.

  129. case class GpuGetJsonObject(json: Expression, path: Expression) extends BinaryExpression with GpuBinaryExpression with ExpectsInputTypes with Product with Serializable
  130. case class GpuGlobalLimitExec(limit: Int, child: SparkPlan) extends SparkPlan with GpuBaseLimitExec with Product with Serializable

    Take the first limit elements of the child's single output partition.

  131. case class GpuHashAggregateExec(requiredChildDistributionExpressions: Option[Seq[Expression]], groupingExpressions: Seq[Expression], aggregateExpressions: Seq[GpuAggregateExpression], aggregateAttributes: Seq[Attribute], resultExpressions: Seq[NamedExpression], child: SparkPlan, configuredTargetBatchSize: Long) extends SparkPlan with UnaryExecNode with GpuExec with Arm with Product with Serializable

    The GPU version of HashAggregateExec

    requiredChildDistributionExpressions

    this is unchanged by the GPU. It is used in EnsureRequirements to be able to add shuffle nodes

    groupingExpressions

    The expressions that, when applied to the input batch, return the grouping key

    aggregateExpressions

    The GpuAggregateExpression instances for this node

    aggregateAttributes

    References to each GpuAggregateExpression (attribute references)

    resultExpressions

    the expected output expression of this hash aggregate (which this node should project)

    child

    incoming plan (where we get input columns from)

    configuredTargetBatchSize

    user-configured maximum device memory size of a batch

  132. class GpuHashAggregateIterator extends Iterator[ColumnarBatch] with Arm with AutoCloseable with Logging

    Iterator that takes another columnar batch iterator as input and emits new columnar batches that are aggregated based on the specified grouping and aggregation expressions. This iterator tries to perform a hash-based aggregation but is capable of falling back to a sort-based aggregation which can operate on data that is either larger than can be represented by a cudf column or larger than can fit in GPU memory.

    The iterator starts by pulling all batches from the input iterator, performing an initial projection and aggregation on each individual batch via aggregateInputBatches(). The resulting aggregated batches are cached in memory as spillable batches. Once all input batches have been aggregated, tryMergeAggregatedBatches() is called to attempt a merge of the aggregated batches into a single batch. If this is successful then the resulting batch can be returned, otherwise buildSortFallbackIterator is used to sort the aggregated batches by the grouping keys and performs a final merge aggregation pass on the sorted batches.
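
    The two-phase flow described above can be illustrated with a toy row-based sketch (a sum per key; `AggSketch` and its types are illustrative only, and the real iterator works on spillable columnar batches with a sort fallback):

```scala
// Phase 1: aggregate each input batch independently.
// Phase 2: merge the per-batch partial results into one final result.
object AggSketch {
  def aggBatch(rows: Seq[(String, Int)]): Map[String, Int] =
    rows.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }

  def merge(parts: Seq[Map[String, Int]]): Map[String, Int] =
    parts.flatten.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }
}
```

    When the merged result is too large to build in one pass, the real iterator instead sorts the partial aggregates by the grouping keys and merges them in a final streaming pass.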

  133. class GpuHashAggregateMeta extends GpuBaseAggregateMeta[HashAggregateExec]
  134. case class GpuHashAggregateMetrics(numOutputRows: GpuMetric, numOutputBatches: GpuMetric, numTasksFallBacked: GpuMetric, computeAggTime: GpuMetric, concatTime: GpuMetric, sortTime: GpuMetric, spillCallback: SpillCallback) extends Product with Serializable

    Utility class to hold all of the metrics related to hash aggregation

  135. case class GpuHashPartitioning(expressions: Seq[Expression], numPartitions: Int) extends Expression with GpuExpression with GpuPartitioning with Product with Serializable
  136. case class GpuIf(predicateExpr: Expression, trueExpr: Expression, falseExpr: Expression) extends Expression with GpuConditionalExpression with Product with Serializable
  137. case class GpuInSet(child: Expression, list: Seq[Any]) extends GpuUnaryExpression with Predicate with Product with Serializable
  138. case class GpuIsNan(child: Expression) extends GpuUnaryExpression with Predicate with Product with Serializable
  139. case class GpuIsNotNull(child: Expression) extends GpuUnaryExpression with Predicate with Product with Serializable
  140. case class GpuIsNull(child: Expression) extends GpuUnaryExpression with Predicate with Product with Serializable
  141. class GpuKeyBatchingIterator extends Iterator[ColumnarBatch] with Arm

    Given a stream of data that is sorted by a set of keys, split the data so each batch output contains all of the keys for a given key set. This tries to get the batch sizes close to the target size. It assumes that the input batches will already be close to that size and does not try to split them too much further.
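A minimal sketch of the splitting rule this implies, assuming rows arrive sorted by key (the types and names here are illustrative, not the real iterator's API): rows that share a key with the start of the next batch are carried over, so a key group is never split across output batches.

```scala
// Illustrative only: rows are (key, value) pairs already sorted by key.
def splitAtKeyBoundary(
    rows: Vector[(String, Int)],
    nextBatchFirstKey: Option[String]): (Vector[(String, Int)], Vector[(String, Int)]) =
  nextBatchFirstKey match {
    // Hold back the trailing rows whose key continues into the next batch.
    case Some(k) => rows.span(_._1 != k)
    // No more input: everything can be emitted.
    case None    => (rows, Vector.empty)
  }

// splitAtKeyBoundary(Vector("a" -> 1, "b" -> 2, "b" -> 3), Some("b"))
//   emits Vector("a" -> 1) and carries Vector("b" -> 2, "b" -> 3) forward
```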

  142. case class GpuKnownFloatingPointNormalized(child: Expression) extends UnaryExpression with TaggingExpression with GpuExpression with Product with Serializable

    This is a TaggingExpression in spark, which gets matched in NormalizeFloatingNumbers (which is a Rule).

  143. case class GpuKnownNotNull(child: Expression) extends UnaryExpression with TaggingExpression with GpuExpression with Product with Serializable

    GPU version of the 'KnownNotNull', a TaggingExpression in spark, to tag an expression as known to not be null.

  144. class GpuKryoRegistrator extends KryoRegistrator
  145. case class GpuLag(input: Expression, offset: Expression, default: Expression) extends Expression with GpuOffsetWindowFunction[LagAggregation] with Product with Serializable
  146. case class GpuLead(input: Expression, offset: Expression, default: Expression) extends Expression with GpuOffsetWindowFunction[LeadAggregation] with Product with Serializable
  147. abstract class GpuLeafExpression extends Expression with GpuExpression
  148. case class GpuLiteral(value: Any, dataType: DataType) extends GpuLeafExpression with Product with Serializable

    In order to do type conversion and checking, use GpuLiteral.create() instead of the constructor.

  149. case class GpuLocalLimitExec(limit: Int, child: SparkPlan) extends SparkPlan with GpuBaseLimitExec with Product with Serializable

    Take the first limit elements of each child partition, but do not collect or shuffle them.

  150. case class GpuMakeDecimal(child: Expression, precision: Int, sparkScale: Int, nullOnOverflow: Boolean) extends GpuUnaryExpression with Product with Serializable
  151. sealed abstract class GpuMetric extends Serializable
  152. case class GpuMonotonicallyIncreasingID() extends GpuLeafExpression with Product with Serializable

    An expression that returns monotonically increasing 64-bit integers just like org.apache.spark.sql.catalyst.expressions.MonotonicallyIncreasingID

    The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. This implementation should match what Spark does, which is to put the partition ID in the upper 31 bits, with the lower 33 bits representing the record number within each partition.
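The bit layout described above can be written out directly (a sketch of the computation, not the actual GPU kernel):

```scala
// Partition ID in the upper 31 bits, per-partition record number in the
// lower 33 bits, matching Spark's MonotonicallyIncreasingID layout.
def monotonicallyIncreasingId(partitionId: Int, recordNumber: Long): Long =
  (partitionId.toLong << 33) | recordNumber

// monotonicallyIncreasingId(0, 0) == 0L
// monotonicallyIncreasingId(1, 0) == 8589934592L  (1L << 33)
// monotonicallyIncreasingId(1, 5) == 8589934597L
```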

  153. case class GpuNaNvl(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpression with Product with Serializable
  154. trait GpuOffsetWindowFunction[T <: Aggregation with RollingAggregation[T]] extends Expression with GpuAggregateWindowFunction[T]
  155. case class GpuOrcMultiFilePartitionReaderFactory(sqlConf: SQLConf, broadcastedConf: Broadcast[SerializableConfiguration], dataSchema: StructType, readDataSchema: StructType, partitionSchema: StructType, filters: Array[Filter], rapidsConf: RapidsConf, metrics: Map[String, GpuMetric], queryUsesInputFile: Boolean) extends MultiFilePartitionReaderFactoryBase with Product with Serializable

    The multi-file partition reader factory for creating either cloud-style or coalescing readers for the ORC file format.

    sqlConf

    the SQLConf

    broadcastedConf

    the Hadoop configuration

    dataSchema

    schema of the data

    readDataSchema

    the Spark schema describing what will be read

    partitionSchema

    schema of partitions.

    filters

    filters on non-partition columns

    rapidsConf

    the Rapids configuration

    metrics

    the metrics

    queryUsesInputFile

    this is a parameter to easily allow turning it off in GpuTransitionOverrides if InputFileName, InputFileBlockStart, or InputFileBlockLength are used

  156. class GpuOrcPartitionReader extends FilePartitionReaderBase with OrcPartitionReaderBase

    A PartitionReader that reads an ORC file split on the GPU.

    Efficiently reading an ORC split on the GPU requires rebuilding the ORC file in memory such that only relevant data is present in the memory file. This avoids sending unnecessary data to the GPU and saves GPU memory.

  157. case class GpuOrcPartitionReaderFactory(sqlConf: SQLConf, broadcastedConf: Broadcast[SerializableConfiguration], dataSchema: StructType, readDataSchema: StructType, partitionSchema: StructType, pushedFilters: Array[Filter], rapidsConf: RapidsConf, metrics: Map[String, GpuMetric]) extends FilePartitionReaderFactory with Arm with Product with Serializable
  158. abstract class GpuOrcScanBase extends ScanWithMetrics with Logging
  159. case class GpuOutOfCoreSortIterator(iter: Iterator[ColumnarBatch], sorter: GpuSorter, cpuOrd: LazilyGeneratedOrdering, targetSize: Long, totalTime: GpuMetric, sortTime: GpuMetric, outputBatches: GpuMetric, outputRows: GpuMetric, peakDevMemory: GpuMetric, spillCallback: SpillCallback) extends Iterator[ColumnarBatch] with Arm with AutoCloseable with Product with Serializable

    Sorts incoming batches of data, spilling if needed.

    The algorithm for this is a modified version of an external merge sort with multiple passes for large data. https://en.wikipedia.org/wiki/External_sorting#External_merge_sort

    The main difference is that we cannot stream the data when doing a merge sort, so we instead divide the data into batches that are small enough that we can do a merge sort on N batches and still fit the output within the target batch size.

    When merging batches instead of individual rows we cannot assume that all of the resulting data is globally sorted. Hopefully most of it is, but we have to use the first row from the next pending batch to determine the cutoff point between globally sorted data and data that still needs to be merged with other batches. The globally sorted portion is put into a sorted queue while the rest of the merged data is split and put back into a pending queue. The process repeats until there is enough data to output.
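The cutoff step can be sketched as follows (illustrative types; the real implementation works on GPU batches, not integer vectors): after merge-sorting a group of batches, only rows ordered before the first row of the next pending batch are known to be globally sorted.

```scala
// Split merged output into (globally sorted, needs further merging) using
// the first row of the next pending batch as the cutoff. Sketch only.
def splitAtCutoff(merged: Vector[Int], nextPendingHead: Option[Int]): (Vector[Int], Vector[Int]) =
  nextPendingHead match {
    case Some(cutoff) => merged.span(_ < cutoff)
    case None         => (merged, Vector.empty) // nothing pending: all sorted
  }

// splitAtCutoff(Vector(1, 3, 7, 9), Some(5)) == (Vector(1, 3), Vector(7, 9))
```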

  160. case class GpuOverrides() extends Rule[SparkPlan] with Logging with Product with Serializable
  161. trait GpuOverridesListener extends AnyRef

    Listener trait so that tests can confirm that the expected optimizations are being applied

  162. final class GpuPackedTableColumn extends GpuColumnVectorBase with WithTableBuffer

    A GPU column tracking a packed table such as one generated by contiguous split. Unlike GpuColumnVectorFromBuffer, the columnar data cannot be accessed directly.

    This class primarily serves the role of tracking the packed table data in a ColumnarBatch without requiring the underlying table to be manifested along with all of the child columns. The typical use-case generates one of these columns per task output partition, and then the RAPIDS shuffle transmits the opaque host metadata and GPU data buffer to another host.

    NOTE: There should only be one instance of this column per ColumnarBatch.

  163. class GpuParquetFileFormat extends ColumnarFileFormat with Logging
  164. case class GpuParquetMultiFilePartitionReaderFactory(sqlConf: SQLConf, broadcastedConf: Broadcast[SerializableConfiguration], dataSchema: StructType, readDataSchema: StructType, partitionSchema: StructType, filters: Array[Filter], rapidsConf: RapidsConf, metrics: Map[String, GpuMetric], queryUsesInputFile: Boolean) extends MultiFilePartitionReaderFactoryBase with Product with Serializable

    Similar to GpuParquetPartitionReaderFactory but extended for reading multiple files in an iteration. This will allow us to read multiple small files and combine them on the CPU side before sending them down to the GPU.

  165. case class GpuParquetPartitionReaderFactory(sqlConf: SQLConf, broadcastedConf: Broadcast[SerializableConfiguration], dataSchema: StructType, readDataSchema: StructType, partitionSchema: StructType, filters: Array[Filter], rapidsConf: RapidsConf, metrics: Map[String, GpuMetric]) extends FilePartitionReaderFactory with Arm with Logging with Product with Serializable
  166. abstract class GpuParquetScanBase extends ScanWithMetrics with Logging

    Base GpuParquetScan used for common code across Spark versions. Gpu version of Spark's 'ParquetScan'.

  167. class GpuParquetWriter extends ColumnarOutputWriter
  168. trait GpuPartitioning extends Partitioning with Arm
  169. case class GpuPosExplode(child: Expression) extends GpuExplodeBase with Product with Serializable
  170. case class GpuProjectExec(projectList: List[Expression], child: SparkPlan) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable
  171. case class GpuPromotePrecision(child: Expression) extends GpuUnaryExpression with Product with Serializable

    A GPU substitution of PromotePrecision, which is a NOOP in Spark too.

  172. case class GpuQueryStagePrepOverrides() extends Rule[SparkPlan] with Logging with Product with Serializable

    Tag the initial plan when AQE is enabled

  173. case class GpuRangeExec(range: Range, targetSizeBytes: Long) extends SparkPlan with LeafExecNode with GpuExec with Product with Serializable

    Physical plan for range (generating a range of 64 bit numbers).

  174. case class GpuRangePartitioner(rangeBounds: Array[InternalRow], sorter: GpuSorter) extends Expression with GpuExpression with GpuPartitioning with Product with Serializable
  175. case class GpuRangePartitioning(gpuOrdering: Seq[SortOrder], numPartitions: Int) extends Expression with GpuExpression with GpuPartitioning with Product with Serializable

    A GPU accelerated org.apache.spark.sql.catalyst.plans.physical.Partitioning that partitions sortable records by range into roughly equal ranges. The ranges are determined by sampling the content of the RDD passed in.

    Note

    The actual number of partitions created might not be the same as the numPartitions parameter, in the case where the number of sampled records is less than the value of partitions. The GpuRangePartitioner is where all of the processing actually happens.

  176. case class GpuRank(children: Seq[Expression]) extends Expression with GpuRunningWindowFunction with GpuBatchedRunningWindowWithFixer with Product with Serializable

    Rank is a special window operation where it is only supported as a running window. In cudf it is only supported as a scan and a group by scan. But there are special requirements beyond that when doing the computation as a running batch. To fix up each batch it needs both the rank and the row number. To make this work and be efficient there is different behavior for batched running window vs non-batched. If it is for a running batch we include the row number values, in both the initial projections and in the corresponding aggregations. Then we combine them into a struct column in scanCombine before it is passed on to the RankFixer. If it is not a running batch, then we drop the row number part because it is just not needed.

    children

    the order by columns.

    Note

    this is a running window only operator.

  177. class GpuReadCSVFileFormat extends CSVFileFormat with GpuReadFileFormatWithMetrics

    A FileFormat that allows reading CSV files with the GPU.

  178. trait GpuReadFileFormatWithMetrics extends FileFormat
  179. class GpuReadOrcFileFormat extends OrcFileFormat with GpuReadFileFormatWithMetrics

    A FileFormat that allows reading ORC files with the GPU.

  180. class GpuReadParquetFileFormat extends ParquetFileFormat with GpuReadFileFormatWithMetrics

    A FileFormat that allows reading Parquet files with the GPU.

  181. case class GpuRoundRobinPartitioning(numPartitions: Int) extends Expression with GpuExpression with GpuPartitioning with Product with Serializable

    Represents a partitioning where incoming columnar batched rows are distributed evenly across output partitions by starting from a zero-th partition number and distributing rows in a round-robin fashion. This partitioning is used when implementing the DataFrame.repartition() operator.

  182. case class GpuRowToColumnarExec(child: SparkPlan, goal: CoalesceSizeGoal) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable

    GPU version of row to columnar transition.

  183. case class GpuRunningWindowExec(windowOps: Seq[NamedExpression], partitionSpec: Seq[Expression], orderSpec: Seq[SortOrder], child: SparkPlan) extends SparkPlan with GpuWindowBaseExec with Product with Serializable
  184. trait GpuRunningWindowFunction extends Expression with GpuWindowFunction

    A window function that is optimized for running windows using the cudf scan and group by scan operations. In some cases, like row number and rank, Spark only supports them as running window operations. This is why it directly extends GpuWindowFunction because it can be a stand alone window function. In all other cases it should be combined with GpuAggregateWindowFunction to provide a fully functional window operation. It should be noted that WindowExec tries to deduplicate input projections and aggregations to reduce memory usage. Because of tracking requirements it is required that there is a one to one relationship between an input projection and a corresponding aggregation.

  185. class GpuRunningWindowIterator extends Iterator[ColumnarBatch] with BasicWindowCalc

    An iterator that can do row based aggregations on running window queries (Unbounded preceding to current row) if and only if the aggregations are instances of GpuBatchedRunningWindowFunction which can fix up the window output when an aggregation is only partly done in one batch of data. Because of this there is no requirement about how the input data is batched, but it must be sorted by both partitioning and ordering.

  186. class GpuScalar extends Arm with AutoCloseable

    The wrapper of a Scala value and its corresponding cudf Scalar, along with its DataType.

    This class is introduced because many expressions require both the cudf Scalar and its corresponding Scala value to complete their computations, e.g. 'GpuStringSplit', 'GpuStringLocate', 'GpuDivide', 'GpuDateAddInterval', 'GpuTimeMath' ... So keeping only a cudf Scalar or only a Scala value cannot support such cases without copying data between the host and the device each time it is asked for.

    This GpuScalar can be created from either a cudf Scalar or a Scala value. By initializing the cudf Scalar or the Scala value lazily and caching them after being created, it can reduce the unnecessary data copies.

    If a GpuScalar is created from a Scala value and is used only on the host side, there will be no data copy and no cudf Scalar created. If it is later used on the device side, the data only needs to be copied to the device once to create a cudf Scalar.

    Similarly, if a GpuScalar is created from a cudf Scalar, no data needs to be copied to the host if it is used only on the device side (this is the ideal case, since everything stays on the GPU), and the data only needs to be copied to the host once if it is used on the host side.

    So a GpuScalar incurs at most one data copy while supporting all of these cases, and no round trips happen.

    Another reason for storing the Scala value in addition to the cudf Scalar is that GpuDateAddInterval and 'GpuTimeMath' have algorithms that use the three members of a CalendarInterval separately, which cannot currently be represented by a single cudf Scalar.

    Do not create a GpuScalar from the constructor, instead call the factory APIs above.
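A minimal sketch of the lazy dual-representation idea described above (DeviceScalar below is a stand-in for a cudf Scalar; none of these names are the real API): whichever side is requested first is materialized once and cached, so at most one host/device copy ever happens.

```scala
// Stand-in for a device-resident cudf Scalar (illustrative only).
final case class DeviceScalar(bits: Long) { def copyToHost(): Long = bits }
object DeviceScalar { def fromHost(v: Long): DeviceScalar = DeviceScalar(v) }

// Caches both representations lazily; construct from either side.
final class DualScalar private (
    private var host: Option[Long],
    private var device: Option[DeviceScalar]) {
  def hostValue: Long = host.getOrElse {
    val v = device.get.copyToHost() // one device -> host copy, then cached
    host = Some(v); v
  }
  def deviceValue: DeviceScalar = device.getOrElse {
    val d = DeviceScalar.fromHost(host.get) // one host -> device copy, then cached
    device = Some(d); d
  }
}
object DualScalar {
  def fromHost(v: Long) = new DualScalar(Some(v), None)
  def fromDevice(d: DeviceScalar) = new DualScalar(None, Some(d))
}
```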

  187. case class GpuShuffleCoalesceExec(child: SparkPlan, targetBatchByteSize: Long) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable

    Coalesces serialized tables on the host up to the target batch size before transferring the coalesced result to the GPU. This reduces the overhead of copying data to the GPU and also helps avoid holding onto the GPU semaphore while shuffle I/O is being performed.

    Note

    This should ALWAYS appear in the plan after a GPU shuffle when RAPIDS shuffle is not being used.

  188. class GpuShuffleCoalesceIterator extends Iterator[ColumnarBatch] with Arm with AutoCloseable

    Iterator that coalesces columnar batches that are expected to only contain SerializedTableColumn. The serialized tables within are collected up to the target batch size and then concatenated on the host before the data is transferred to the GPU.

  189. abstract class GpuShuffledHashJoinBase extends SparkPlan with BinaryExecNode with GpuHashJoin
  190. class GpuSortAggregateMeta extends GpuBaseAggregateMeta[SortAggregateExec]
  191. case class GpuSortEachBatchIterator(iter: Iterator[ColumnarBatch], sorter: GpuSorter, singleBatch: Boolean, totalTime: GpuMetric = NoopMetric, sortTime: GpuMetric = NoopMetric, outputBatches: GpuMetric = NoopMetric, outputRows: GpuMetric = NoopMetric, peakDevMemory: GpuMetric = NoopMetric) extends Iterator[ColumnarBatch] with Arm with Product with Serializable
  192. case class GpuSortExec(sortOrder: Seq[SortOrder], global: Boolean, child: SparkPlan, sortType: SortExecType) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable
  193. class GpuSortMeta extends SparkPlanMeta[SortExec]
  194. class GpuSorter extends Arm with Serializable

    A class that provides convenience methods for sorting batches of data. A Spark SortOrder typically just references a single column through an AttributeReference. That is the simplest situation, where we only need to bind the attribute references to their locations, but it is also possible for some computation to be done in the SortOrder. An example would be sorting strings by their length instead of in lexicographical order. Because cudf does not support this directly, we instead go through the SortOrder instances that are a part of this sorter and find the ones that require computation.

    We then do the sort in a few stages. First we compute any needed columns from the SortOrder instances that require computation and add them to the original batch; the method appendProjectedColumns does this. The class then provides a number of methods that can operate on a batch with these new columns added, including sorting, merge sorting, and finding bounds. These can be combined in various ways to implement different algorithms. When you are done with these operations you can drop the temporary columns, which were added just for the computation, using removeProjectedColumns.

    Sometimes you may want to pull data back to the CPU and sort rows there too. We provide cpuOrders that lets you do this on rows that have had the extra ordering columns added to them. This class also provides fullySortBatch as an optimization: if all you want to do is sort a batch, you don't want to have to sort the temporary columns too, and fullySortBatch provides that.
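The computed-sort-key case described above (sorting strings by length, which cudf cannot express directly) can be illustrated on the CPU: project the key column, sort on it, then drop it.

```scala
// CPU illustration of the project-sort-drop staging; not the GpuSorter API.
val rows    = Vector("ccc", "a", "bb")
val withKey = rows.map(s => (s.length, s)) // like appendProjectedColumns
val sorted  = withKey.sortBy(_._1)         // sort on the computed column
val result  = sorted.map(_._2)             // like removeProjectedColumns
// result == Vector("a", "bb", "ccc")
```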

  195. case class GpuSparkPartitionID() extends GpuLeafExpression with Product with Serializable

    An expression that returns the current partition id just like org.apache.spark.sql.catalyst.expressions.SparkPartitionID

  196. case class GpuSpecialFrameBoundary(boundary: SpecialFrameBoundary) extends Expression with GpuExpression with GpuUnevaluable with Product with Serializable
  197. case class GpuSpecifiedWindowFrame(frameType: FrameType, lower: Expression, upper: Expression) extends Expression with GpuWindowFrame with Product with Serializable
  198. class GpuSpecifiedWindowFrameMeta extends ExprMeta[SpecifiedWindowFrame]
  199. trait GpuString2TrimExpression extends Expression with String2TrimExpression with GpuExpression
  200. trait GpuTernaryExpression extends TernaryExpression with GpuExpression
  201. case class GpuTopN(limit: Int, sortOrder: Seq[SortOrder], projectList: Seq[NamedExpression], child: SparkPlan) extends SparkPlan with GpuExec with UnaryExecNode with Product with Serializable

    Take the first limit elements as defined by the sortOrder, and do projection if needed. This is logically equivalent to having a Limit operator after a SortExec operator, or having a ProjectExec operator between them. This could have been named TopK, but Spark's top operator does the opposite in ordering so we name it TakeOrdered to avoid confusion.

  202. class GpuTransitionOverrides extends Rule[SparkPlan]

    Rules that run after the row to columnar and columnar to row transitions have been inserted. These rules insert transitions to and from the GPU, and then optimize various transitions.

  203. abstract class GpuUnaryExpression extends UnaryExpression with GpuExpression
  204. trait GpuUnevaluable extends Expression with GpuExpression
  205. abstract class GpuUnevaluableUnaryExpression extends GpuUnaryExpression with GpuUnevaluable
  206. case class GpuUnionExec(children: Seq[SparkPlan]) extends SparkPlan with GpuExec with Product with Serializable
  207. case class GpuUnscaledValue(child: Expression) extends GpuUnaryExpression with Product with Serializable
  208. trait GpuUserDefinedFunction extends Expression with GpuExpression with UserDefinedExpression with Serializable

    Common implementation across all RAPIDS accelerated UDF types

  209. trait GpuWindowBaseExec extends SparkPlan with UnaryExecNode with GpuExec
  210. case class GpuWindowExec(windowOps: Seq[NamedExpression], partitionSpec: Seq[Expression], orderSpec: Seq[SortOrder], child: SparkPlan) extends SparkPlan with GpuWindowBaseExec with Product with Serializable
  211. class GpuWindowExecMeta extends GpuBaseWindowExecMeta[WindowExec]

    Specialization of GpuBaseWindowExecMeta for org.apache.spark.sql.window.WindowExec. This class implements methods to extract the window-expressions, partition columns, order-by columns, etc. from WindowExec.

  212. case class GpuWindowExpression(windowFunction: Expression, windowSpec: GpuWindowSpecDefinition) extends Expression with GpuUnevaluable with Product with Serializable
  213. class GpuWindowExpressionMeta extends ExprMeta[WindowExpression]
  214. trait GpuWindowFrame extends Expression with GpuExpression with GpuUnevaluable
  215. trait GpuWindowFunction extends Expression with GpuUnevaluable
  216. class GpuWindowIterator extends Iterator[ColumnarBatch] with BasicWindowCalc

    An Iterator that performs window operations on the input data. It is required that the input data is batched so all of the data for a given key is in the same batch. The input data must also be sorted by both partition by keys and order by keys.

  217. case class GpuWindowSpecDefinition(partitionSpec: Seq[Expression], orderSpec: Seq[SortOrder], frameSpecification: GpuWindowFrame) extends Expression with GpuExpression with GpuUnevaluable with Product with Serializable
  218. class GpuWindowSpecDefinitionMeta extends ExprMeta[WindowSpecDefinition]
  219. class GroupedAggregations extends Arm

    Window aggregations that are grouped together. It holds the aggregation and the offsets of its input columns, along with the output columns it should write the result to.

  220. final class HashedPriorityQueue[T] extends AbstractQueue[T]

    Implements a priority queue based on a heap. Like many priority queue implementations, this provides logarithmic time for inserting elements and removing the top element. However unlike many implementations, this provides logarithmic rather than linear time for the random-access contains and remove methods. The queue also provides a mechanism for updating the heap after an element's priority has changed via the priorityUpdated method instead of requiring the element to be removed and re-inserted.

    The queue is NOT thread-safe.

    The iterator does NOT return elements in priority order.
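A minimal sketch of the heap-plus-index-map idea described above (not the real class): a hash map from element to heap slot makes contains O(1) and random-access remove O(log n) instead of linear. This sketch assumes distinct elements and is not thread-safe.

```scala
import scala.collection.mutable

final class IndexedMinHeap[T](priority: T => Int) {
  private val heap = mutable.ArrayBuffer.empty[T]
  private val slot = mutable.HashMap.empty[T, Int] // element -> heap index

  def contains(x: T): Boolean = slot.contains(x)

  def offer(x: T): Unit = {
    heap += x; slot(x) = heap.size - 1; siftUp(heap.size - 1)
  }

  def poll(): T = { val top = heap(0); removeAt(0); top }

  // O(log n): find the slot via the map, swap with the last element, sift.
  def remove(x: T): Boolean = slot.get(x) match {
    case Some(i) => removeAt(i); true
    case None    => false
  }

  private def removeAt(i: Int): Unit = {
    swap(i, heap.size - 1)
    slot.remove(heap.remove(heap.size - 1))
    if (i < heap.size) { siftDown(i); siftUp(i) } // restore heap order locally
  }

  private def swap(i: Int, j: Int): Unit = {
    val t = heap(i); heap(i) = heap(j); heap(j) = t
    slot(heap(i)) = i; slot(heap(j)) = j
  }

  private def siftUp(i0: Int): Unit = {
    var i = i0
    while (i > 0 && priority(heap(i)) < priority(heap((i - 1) / 2))) {
      swap(i, (i - 1) / 2); i = (i - 1) / 2
    }
  }

  private def siftDown(i0: Int): Unit = {
    var i = i0
    var moved = true
    while (moved) {
      var m = i
      val l = 2 * i + 1; val r = 2 * i + 2
      if (l < heap.size && priority(heap(l)) < priority(heap(m))) m = l
      if (r < heap.size && priority(heap(r)) < priority(heap(m))) m = r
      moved = m != i
      if (moved) { swap(i, m); i = m }
    }
  }
}
```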

  221. class HostByteBufferIterator extends Iterator[ByteBuffer]

    Create an iterator that will emit ByteBuffer instances sequentially to work around the 2GB ByteBuffer size limitation. This allows the entire address range of a >2GB host buffer to be covered by a sequence of ByteBuffer instances.

    NOTE: It is the caller's responsibility to ensure this iterator does not outlive the host buffer. The iterator DOES NOT increment the reference count of the host buffer to ensure it remains valid.

    returns

    ByteBuffer iterator
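The chunking idea can be sketched as follows (illustrative only, not the real class): cover a large address range with (offset, length) pairs, each at most Int.MaxValue bytes, since a single ByteBuffer cannot exceed 2GB. The real iterator would wrap each range in a ByteBuffer view over the host buffer.

```scala
// Emit (offset, length) chunks covering [0, totalSize).
def chunkRanges(totalSize: Long, maxChunk: Long = Int.MaxValue.toLong): Iterator[(Long, Int)] =
  Iterator.iterate(0L)(_ + maxChunk)
    .takeWhile(_ < totalSize)
    .map(off => (off, math.min(maxChunk, totalSize - off).toInt))

// chunkRanges(5L << 30).toList covers a 5 GiB buffer with three chunks:
// (0, 2147483647), (2147483647, 2147483647), (4294967294, 1073741826)
```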

  222. case class HostColumnarToGpu(child: SparkPlan, goal: CoalesceSizeGoal) extends SparkPlan with UnaryExecNode with GpuExec with Product with Serializable

    Put columnar formatted data on the GPU.

  223. trait HostMemoryBuffersWithMetaDataBase extends AnyRef

    The base HostMemoryBuffer information read from a single file.

  224. class HostMemoryInputStream extends InputStream

    An implementation of InputStream that reads from a HostMemoryBuffer.

    NOTE: Closing this input stream does NOT close the buffer!

  225. class HostMemoryOutputStream extends OutputStream

    An implementation of OutputStream that writes to a HostMemoryBuffer.

    NOTE: Closing this output stream does NOT close the buffer!

  226. class HostToGpuCoalesceIterator extends AbstractGpuCoalesceIterator

    This iterator builds GPU batches from host batches. The host batches potentially use Spark's UnsafeRow so it is not safe to cache these batches. Rows must be read and immediately written to CuDF builders.

  227. abstract class ImperativeAggExprMeta[INPUT <: ImperativeAggregate] extends ExprMeta[INPUT]

    Base class for metadata around ImperativeAggregate.

  228. final class InsertIntoHadoopFsRelationCommandMeta extends DataWritingCommandMeta[InsertIntoHadoopFsRelationCommand]
  229. trait JoinGatherer extends LazySpillable with Arm

    Generic trait for all join gather instances. A JoinGatherer takes the gather maps that are the result of a cudf join call along with the data batches that need to be gathered and allow someone to materialize the join in batches. It also provides APIs to help decide on how many rows to gather.

    This is a LazySpillable instance so the life cycle follows that too.

  230. class JoinGathererImpl extends JoinGatherer

    JoinGatherer for a single map/table

  231. class JustRowsColumnarBatch extends SpillableColumnarBatch

    Cudf does not support a table with columns and no rows. This takes care of making one of those spillable, even though in reality there is no backing buffer. It does this by just keeping the row count in memory, and not dealing with the catalog at all.

  232. trait LazySpillable extends AutoCloseable

    Holds something that can be spilled if it is marked as such, but it does not modify the data until it is ready to be spilled. This avoids the performance penalty of reformatting the underlying data so it is ready to be spilled.

    Call allowSpilling to indicate that the data can be released for spilling and call close to indicate that the data is not needed any longer.

    If the data is needed after allowSpilling is called the implementations should get the data back and cache it again until allowSpilling is called once more.

  233. trait LazySpillableColumnarBatch extends LazySpillable

    Holds a Columnar batch that is LazySpillable.

  234. class LazySpillableColumnarBatchImpl extends LazySpillableColumnarBatch with Arm

    Holds a columnar batch that is cached until it is marked that it can be spilled.

  235. trait LazySpillableGatherMap extends LazySpillable with Arm
  236. class LazySpillableGatherMapImpl extends LazySpillableGatherMap

    Holds a gather map that is also lazy spillable.

  237. class LeftCrossGatherMap extends BaseCrossJoinGatherMap
  238. class LiteralExprMeta extends ExprMeta[Literal]
  239. sealed trait MemoryState extends AnyRef
  240. class MetricRange extends AutoCloseable
  241. sealed class MetricsLevel extends Serializable
  242. class MultiFileCloudOrcPartitionReader extends MultiFileCloudPartitionReaderBase with MultiFileReaderFunctions with OrcPartitionReaderBase

    A PartitionReader that can read multiple ORC files in parallel. This is most efficient when running in a cloud environment where the I/O of reading is slow.

    Efficiently reading an ORC split on the GPU requires reconstructing, in memory, an ORC file that contains just the stripes that are needed. This avoids sending unnecessary data to the GPU and saves GPU memory.

  243. class MultiFileCloudParquetPartitionReader extends MultiFileCloudPartitionReaderBase with ParquetPartitionReaderBase

    A PartitionReader that can read multiple Parquet files in parallel. This is most efficient when running in a cloud environment where the I/O of reading is slow.

    Efficiently reading a Parquet split on the GPU requires reconstructing, in memory, a Parquet file that contains just the column chunks that are needed. This avoids sending unnecessary data to the GPU and saves GPU memory.

  244. abstract class MultiFileCloudPartitionReaderBase extends FilePartitionReaderBase

    The abstract multi-file cloud reading framework.

    The data flow: next() -> if (first time) initAndStartReaders -> submit tasks (getBatchRunner) -> wait for tasks to finish sequentially -> decode on the GPU (readBatch)

  245. abstract class MultiFileCoalescingPartitionReaderBase extends FilePartitionReaderBase with MultiFileReaderFunctions

    The abstract multi-file coalescing reading class, which tries to coalesce small ColumnarBatches into a bigger ColumnarBatch according to maxReadBatchSizeRows, maxReadBatchSizeBytes, and checkIfNeedToSplitDataBlock.

    Note that this class applies to file formats with the following layout:

    | HEADER | -> optional

    | block | -> repeated

    | FOOTER | -> optional

    The data flow:

    next() -> populateCurrentBlockChunk (try the best to coalesce ColumnarBatches) -> allocate a bigger HostMemoryBuffer for HEADER + the populated block chunks + FOOTER -> write the header to the HostMemoryBuffer -> launch tasks to copy the blocks to the HostMemoryBuffer -> wait for all tasks to finish -> write the footer to the HostMemoryBuffer -> decode the HostMemoryBuffer on the GPU
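    The buffer-assembly step of that flow can be sketched as follows (illustrative only: byte arrays stand in for the HostMemoryBuffer, and the real code copies the blocks in a thread pool rather than sequentially):

    ```scala
    // Sketch of the coalescing layout: one buffer holding
    // HEADER + coalesced block chunks + FOOTER.
    def assemble(header: Array[Byte], blocks: Seq[Array[Byte]],
                 footer: Array[Byte]): Array[Byte] = {
      val total = header.length + blocks.map(_.length).sum + footer.length
      val buf = new Array[Byte](total)
      var pos = 0
      def copy(src: Array[Byte]): Unit = {
        System.arraycopy(src, 0, buf, pos, src.length)
        pos += src.length
      }
      copy(header)          // write the header first
      blocks.foreach(copy)  // block chunks follow (parallel in the real code)
      copy(footer)          // footer goes last
      buf
    }
    ```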

  246. class MultiFileOrcPartitionReader extends MultiFileCoalescingPartitionReaderBase with OrcCommonFunctions

  247. class MultiFileParquetPartitionReader extends MultiFileCoalescingPartitionReaderBase with ParquetPartitionReaderBase

    A PartitionReader that can read multiple Parquet files up to a certain size. It will coalesce small files together and copy the block data in a separate thread pool to speed up processing of the small files before sending them down to the GPU.

    Efficiently reading a Parquet split on the GPU requires reconstructing, in memory, a Parquet file that contains just the column chunks that are needed. This avoids sending unnecessary data to the GPU and saves GPU memory.

  248. abstract class MultiFilePartitionReaderFactoryBase extends PartitionReaderFactory with Arm with Logging

    The base multi-file partition reader factory, used to create either the cloud reader or the coalescing reader.

  249. trait MultiFileReaderFunctions extends Arm
  250. case class MultiJoinGather(left: JoinGatherer, right: JoinGatherer) extends JoinGatherer with Product with Serializable

    Join Gatherer for a left table and a right table

  251. final class NoRuleDataFromReplacementRule extends DataFromReplacementRule

    A version of DataFromReplacementRule that is used when no replacement rule can be found.

  252. class NvcompLZ4CompressionCodec extends TableCompressionCodec with Arm

    A table compression codec that uses nvcomp's LZ4-GPU codec

  253. class NvtxWithMetrics extends NvtxRange

    NvtxRange with option to pass one or more nano timing metric(s) that are updated upon close by the amount of time spent in the range

  254. abstract class OffsetWindowFunctionMeta[INPUT <: OffsetWindowFunction] extends ExprMeta[INPUT]
  255. sealed abstract class Optimization extends AnyRef
  256. trait Optimizer extends AnyRef

    Optimizer that can operate on a physical query plan.

  257. class OptionalConfEntry[T] extends ConfEntry[Option[T]]
  258. trait OrcCodecWritingHelper extends Arm
  259. trait OrcCommonFunctions extends OrcCodecWritingHelper

    Collections of some common functions for ORC

  260. case class OrcExtraInfo(requestedMapping: Option[Array[Int]]) extends ExtraInfo with Product with Serializable

    ORC extra information containing the requested column ids for the current coalescing stripes

  261. case class OrcOutputStripe(infoBuilder: Builder, footer: StripeFooter, inputDataRanges: DiskRangeList) extends Product with Serializable

    This class describes a stripe that will appear in the ORC output memory file.

    infoBuilder

    builder for output stripe info that has been populated with all fields except those that can only be known when the file is being written (e.g.: file offset, compressed footer length)

    footer

    stripe footer

    inputDataRanges

    input file ranges (based at file offset 0) of stripe data

  262. trait OrcPartitionReaderBase extends OrcCommonFunctions with Logging with Arm with ScanWithMetrics

    A base ORC partition reader that provides some common methods

  263. case class OrcPartitionReaderContext(filePath: Path, conf: Configuration, fileSchema: TypeDescription, updatedReadSchema: TypeDescription, evolution: SchemaEvolution, fileTail: FileTail, compressionSize: Int, compressionKind: CompressionKind, readerOpts: Options, blockIterator: BufferedIterator[OrcOutputStripe], requestedMapping: Option[Array[Int]]) extends Product with Serializable

    This class holds fields needed to read and iterate over the OrcFile

    filePath

    ORC file path

    conf

    the Hadoop configuration

    fileSchema

    the schema of the whole ORC file

    updatedReadSchema

    read schema mapped to the file's field names

    evolution

    infer and track the evolution between the schema as stored in the file and the schema that has been requested by the reader.

    fileTail

    the ORC FileTail

    compressionSize

    the ORC compression size

    compressionKind

    the ORC compression type

    readerOpts

    options for creating a RecordReader.

    blockIterator

    an iterator over the ORC output stripes

    requestedMapping

    the optional requested column ids

  264. case class OrcStripeWithMeta(stripe: OrcOutputStripe, ctx: OrcPartitionReaderContext) extends Product with Serializable
  265. case class OutOfCoreBatch(buffer: SpillableColumnarBatch, firstRow: UnsafeRow) extends AutoCloseable with Product with Serializable

    Holds data for the out of core sort. It includes the batch of data and the first row in that batch so we can sort the batches.

  266. case class ParamCheck(name: String, cudf: TypeSig, spark: TypeSig) extends Product with Serializable

    Checks a single parameter TypeSig

  267. case class ParquetExtraInfo(isCorrectedRebaseMode: Boolean) extends ExtraInfo with Product with Serializable

    Parquet extra information containing isCorrectedRebaseMode

  268. class ParquetPartitionReader extends FilePartitionReaderBase with ParquetPartitionReaderBase

    A PartitionReader that reads a Parquet file split on the GPU.

    Efficiently reading a Parquet split on the GPU requires reconstructing, in memory, a Parquet file that contains just the column chunks that are needed. This avoids sending unnecessary data to the GPU and saves GPU memory.

  269. trait ParquetPartitionReaderBase extends Logging with Arm with ScanWithMetrics with MultiFileReaderFunctions
  270. case class ParsedBoundary(isUnbounded: Boolean, valueAsLong: Long) extends Product with Serializable
  271. abstract class PartChecks extends TypeChecks[Map[String, SupportLevel]]

    Base class all Partition checks must follow

  272. case class PartChecksImpl(paramCheck: Seq[ParamCheck] = Seq.empty, repeatingParamCheck: Option[RepeatingParamCheck] = None) extends PartChecks with Product with Serializable
  273. abstract class PartMeta[INPUT <: Partitioning] extends RapidsMeta[INPUT, Partitioning, GpuPartitioning]

    Base class for metadata around Partitioning.

  274. class PartRule[INPUT <: Partitioning] extends ReplacementRule[INPUT, Partitioning, PartMeta[INPUT]]

    Holds everything that is needed to replace a Partitioning with a GPU enabled version.

  275. class PartiallySupported extends SupportLevel

    The plugin partially supports this type.

  276. class PartitionReaderIterator extends Iterator[ColumnarBatch] with AutoCloseable

    An adaptor class that provides an Iterator interface for a PartitionReader.

  277. class PartitionReaderWithBytesRead extends PartitionReader[ColumnarBatch]

    Wraps a columnar PartitionReader to update bytes read metric based on filesystem statistics.

  278. class Pending extends AutoCloseable

    Data that the out of core sort algorithm has not finished sorting. This acts as a priority queue with each batch sorted by the first row in that batch.

  279. class PluginException extends RuntimeException
  280. class RankFixer extends BatchedRunningWindowFixer with Arm with Logging

    Rank is more complicated than DenseRank to fix. This is because there are gaps in the rank values. The rank value of each group is the row number of the first row in the group. So values in the same partition group but not the same ordering group are fixed by adding the row number from the previous batch to them. If they are part of the same ordering group and part of the same partition, then we just put in the previous rank value.

    Because we need both a rank and a row number to fix things up the input to this is a struct containing a rank column as the first entry and a row number column as the second entry. This happens in the scanCombine method for GpuRank. It is a little ugly but it works to maintain the requirement that the input to the fixer is a single column.
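    The per-row fix-up rule described above can be captured numerically (hypothetical helper, not the GpuRank code): a row that ties with the last row of the previous batch keeps the previous rank, while a new ordering group's local rank is shifted by the previous batch's row count.

    ```scala
    // Sketch of fixing rank values across batch boundaries.
    // localRank:    rank computed within the current batch alone
    // prevRowCount: rows seen before this batch for the same partition
    // prevRank:     rank of the last row of the previous batch
    // sameOrder:    does the row tie with the previous batch's last row?
    def fixRank(localRank: Long, prevRowCount: Long, prevRank: Long,
                sameOrder: Boolean): Long =
      if (sameOrder) prevRank            // ties keep the previous rank
      else localRank + prevRowCount      // otherwise shift by prior rows
    ```

    For instance, with 5 rows already seen and a previous rank of 4, a tying row keeps rank 4, while a row starting a new ordering group (local rank 1) becomes rank 6.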

  281. trait RapidsBuffer extends AutoCloseable

    Interface provided by all types of RAPIDS buffers

  282. class RapidsBufferCatalog extends Logging

    Catalog for lookup of buffers by ID. The constructor is only visible for testing, generally RapidsBufferCatalog.singleton should be used instead.

  283. trait RapidsBufferId extends AnyRef

    An identifier for a RAPIDS buffer that can be automatically spilled between buffer stores. NOTE: Derived classes MUST implement proper hashCode and equals methods, as these objects are used as keys in hash maps. Scala case classes are recommended.

  284. abstract class RapidsBufferStore extends AutoCloseable with Logging with Arm

    Base class for all buffer store types.

  285. class RapidsConf extends Logging
  286. class RapidsDeviceMemoryStore extends RapidsBufferStore with Arm

    Buffer storage using device memory.

  287. class RapidsDiskStore extends RapidsBufferStore

    A buffer store using files on the local disks.

  288. class RapidsDriverPlugin extends DriverPlugin with Logging

    The Spark driver plugin provided by the RAPIDS Spark plugin.

  289. case class RapidsExecutorHeartbeatMsg(id: BlockManagerId) extends Product with Serializable

    Executor heartbeat message. This gives the driver an opportunity to respond with RapidsExecutorUpdateMsg

  290. class RapidsExecutorPlugin extends ExecutorPlugin with Logging

    The Spark executor plugin provided by the RAPIDS Spark plugin.

  291. case class RapidsExecutorStartupMsg(id: BlockManagerId) extends Product with Serializable

    This is the first message sent from the executor to the driver.

    id

    BlockManagerId for the executor

  292. case class RapidsExecutorUpdateMsg(ids: Array[BlockManagerId]) extends Product with Serializable

    Driver response to a startup or heartbeat message, with executors that are new (to the peer) since the last heartbeat.

  293. class RapidsGdsStore extends RapidsBufferStore with Arm

    A buffer store using GPUDirect Storage (GDS).

    GDS is more efficient when IO is aligned.

    An IO is unaligned if one of the following conditions is true:

    • The file_offset that was issued in cuFileRead/cuFileWrite is not 4K aligned.
    • The size that was issued in cuFileRead/cuFileWrite is not 4K aligned.
    • The devPtr_base that was issued in cuFileRead/cuFileWrite is not 4K aligned.
    • The devPtr_offset that was issued in cuFileRead/cuFileWrite is not 4K aligned.

    To avoid unaligned IO, when GDS spilling is enabled, the RMM aligned_resource_adapter is used so that large buffers above a certain size threshold are allocated with a 4K-aligned base pointer and size.

    When reading and writing these large buffers through GDS, the size is aligned up to the next 4K boundary. Although the aligned size appears to be out of bound, the extra space needed is held in reserve by the RMM aligned_resource_adapter.
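    The "align up to the next 4K boundary" arithmetic implied above can be sketched as follows (illustrative only; this is not the RMM aligned_resource_adapter itself):

    ```scala
    // 4K alignment arithmetic used when sizing GDS-friendly buffers.
    val Alignment = 4096L // 4K

    // Round size up to the next multiple of 4K.
    def alignUp(size: Long): Long = (size + Alignment - 1) & ~(Alignment - 1)

    // True when a value (offset, size, or pointer) is 4K aligned.
    def isAligned(value: Long): Boolean = (value & (Alignment - 1)) == 0
    ```

    For example, a 5000-byte buffer would be aligned up to 8192 bytes, with the extra space held in reserve as described.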

  294. final class RapidsHostColumnVector extends RapidsHostColumnVectorCore

    A GPU accelerated version of the Spark ColumnVector. Most of the standard Spark APIs should never be called, as they assume that the data is on the host, and we want to keep as much of the data on the device as possible. We also provide GPU accelerated versions of the transitions to and from rows.

  295. class RapidsHostColumnVectorCore extends ColumnVector

    A GPU accelerated version of the Spark ColumnVector. Most of the standard Spark APIs should never be called, as they assume that the data is on the host, and we want to keep as much of the data on the device as possible. We also provide GPU accelerated versions of the transitions to and from rows.

  296. class RapidsHostMemoryStore extends RapidsBufferStore

    A buffer store using host memory.

  297. abstract class RapidsMeta[INPUT <: BASE, BASE, OUTPUT <: BASE] extends AnyRef

    Holds metadata about a stage in the physical plan that is separate from the plan itself. This is helpful in deciding when to replace part of the plan with a GPU enabled version.

    INPUT

    the exact type of the class we are wrapping.

    BASE

    the generic base class for this type of stage, i.e. SparkPlan, Expression, etc.

    OUTPUT

    when converting to a GPU enabled version of the plan, the generic base type for all GPU enabled versions.

  298. class RapidsShuffleHeartbeatEndpoint extends Logging with AutoCloseable
  299. trait RapidsShuffleHeartbeatHandler extends AnyRef
  300. class RapidsShuffleHeartbeatManager extends Logging
  301. case class RepeatingParamCheck(name: String, cudf: TypeSig, spark: TypeSig) extends Product with Serializable

    Checks the type signature for a parameter that repeats (can only be used at the end of a list of parameters)

  302. case class ReplaceSection[INPUT <: SparkPlan](plan: SparkPlanMeta[INPUT], totalCpuCost: Double, totalGpuCost: Double) extends Optimization with Product with Serializable
  303. abstract class ReplacementRule[INPUT <: BASE, BASE, WRAP_TYPE <: RapidsMeta[INPUT, BASE, _]] extends DataFromReplacementRule

    Base class for all ReplacementRules

    INPUT

    the exact type of the class we are wrapping.

    BASE

    the generic base class for this type of stage, i.e. SparkPlan, Expression, etc.

    WRAP_TYPE

    base class that should be returned by doWrap.

  304. class RightCrossGatherMap extends BaseCrossJoinGatherMap
  305. class RowToColumnarIterator extends Iterator[ColumnarBatch] with Arm
  306. final class RuleNotFoundDataWritingCommandMeta[INPUT <: DataWritingCommand] extends DataWritingCommandMeta[INPUT]

    Metadata for DataWritingCommand with no rule found

  307. final class RuleNotFoundExprMeta[INPUT <: Expression] extends ExprMeta[INPUT]

    Metadata for Expression with no rule found

  308. final class RuleNotFoundPartMeta[INPUT <: Partitioning] extends PartMeta[INPUT]

    Metadata for Partitioning with no rule found

  309. final class RuleNotFoundScanMeta[INPUT <: Scan] extends ScanMeta[INPUT]

    Metadata for Scan with no rule found

  310. final class RuleNotFoundSparkPlanMeta[INPUT <: SparkPlan] extends SparkPlanMeta[INPUT]

    Metadata for SparkPlan with no rule found

  311. class SQLExecPlugin extends (SparkSessionExtensions) ⇒ Unit with Logging

    Extension point to enable GPU SQL processing.

  312. abstract class ScanMeta[INPUT <: Scan] extends RapidsMeta[INPUT, Scan, Scan]

    Base class for metadata around Scan.

  313. class ScanRule[INPUT <: Scan] extends ReplacementRule[INPUT, Scan, ScanMeta[INPUT]]

    Holds everything that is needed to replace a Scan with a GPU enabled version.

  314. trait ScanWithMetrics extends AnyRef
  315. trait SchemaBase extends AnyRef

    A common trait for the different schemas in the MultiFileCoalescingPartitionReaderBase.

    The subclass should wrap the real schema for the specific file format.

  316. class SerializedTableColumn extends GpuColumnVectorBase

    A special ColumnVector that describes a serialized table read from shuffle. This appears in a ColumnarBatch to pass serialized tables to GpuShuffleCoalesceExec which should always appear in the query plan immediately after a shuffle.

  317. sealed abstract class ShimVersion extends AnyRef
  318. class ShuffleBufferCatalog extends Arm with Logging

    Catalog for lookup of shuffle buffers by block ID

  319. case class ShuffleBufferId(blockId: ShuffleBlockId, tableId: Int) extends RapidsBufferId with Product with Serializable

    Identifier for a shuffle buffer that holds the data for a table

  320. class ShuffleReceivedBufferCatalog extends Logging

    Catalog for lookup of shuffle buffers by block ID

  321. case class ShuffleReceivedBufferId(tableId: Int) extends RapidsBufferId with Product with Serializable

    Identifier for a shuffle buffer that holds the data for a table on the read side

  322. trait SingleDataBlockInfo extends AnyRef

    Information for a single block of a file. E.g., a Parquet file with 3 RowGroups will produce 3 SingleBlockInfoWithMeta instances.

  323. class SlicedGpuColumnVector extends ColumnVector

    Wraps a GpuColumnVector but only points to a slice of it. This is intended to only be used during shuffle after the data is partitioned and before it is serialized.

  324. sealed trait SortExecType extends Serializable
  325. abstract class SparkPlanMeta[INPUT <: SparkPlan] extends RapidsMeta[INPUT, SparkPlan, GpuExec]

    Base class for metadata around SparkPlan.

  326. trait SparkShimServiceProvider extends AnyRef

    A Spark version shim layer interface.

  327. case class SparkShimVersion(major: Int, minor: Int, patch: Int) extends ShimVersion with Product with Serializable
  328. trait SparkShims extends AnyRef
  329. class SpillableBuffer extends AutoCloseable with Arm

    Just like a SpillableColumnarBatch but for buffers.

  330. trait SpillableColumnarBatch extends AutoCloseable

    Holds a ColumnarBatch whose backing buffers can be spilled.

  331. class SpillableColumnarBatchImpl extends SpillableColumnarBatch with Arm

    The implementation of SpillableColumnarBatch that points to buffers that can be spilled.

    Note

    the buffer should be in the cache by the time this is created, and this takes over ownership of the life cycle of the batch. So don't call this constructor directly; please use SpillableColumnarBatch.apply instead.

  332. abstract class String2TrimExpressionMeta[INPUT <: String2TrimExpression] extends ExprMeta[INPUT]
  333. class SumBinaryFixer extends BatchedRunningWindowFixer with Arm with Logging

    This class fixes up batched running windows for sum. Sum is a lot like other binary op fixers, but it has to special case nulls and that is not super generic. In the future we might be able to make this more generic but we need to see what the use case really is.

  334. sealed abstract class SupportLevel extends AnyRef

    The level of support that the plugin has for a given type. Used for documentation generation.

  335. class Supported extends SupportLevel

    Both Spark and the plugin support this.

  336. trait TableCompressionCodec extends AnyRef

    An interface to a compression codec that can compress a contiguous Table on the GPU

  337. case class TargetSize(targetSizeBytes: Long) extends CoalesceSizeGoal with Product with Serializable

    Produce a stream of batches that are at most the given size in bytes. The size is estimated in some cases so it may go over a little, but it should generally be very close to the target size. Generally you should not go over 2 GiB to avoid limitations in cudf for nested type columns.

    targetSizeBytes

    the size of each batch in bytes.

  338. abstract class TernaryExprMeta[INPUT <: TernaryExpression] extends ExprMeta[INPUT]

    Base class for metadata around TernaryExpression.

  339. abstract class TypeChecks[RET] extends AnyRef
  340. final class TypeSig extends AnyRef

    A type signature. This is a bit limited in what it supports right now, but can express a set of base types and a separate set of types that can be nested under the base types. It can also express if a particular base type has to be a literal or not.

  341. class TypedConfBuilder[T] extends AnyRef
  342. abstract class UnaryExprMeta[INPUT <: UnaryExpression] extends ExprMeta[INPUT]

    Base class for metadata around UnaryExpression.

  343. abstract class UnsafeRowToColumnarBatchIterator extends Iterator[ColumnarBatch]

    This class converts UnsafeRow instances to ColumnarBatches on the GPU through the magic of code generation. This just provides most of the framework; a concrete implementation will be generated based on the schema.

  344. trait WithTableBuffer extends AnyRef

    An interface for obtaining the device buffer backing a contiguous/packed table

  345. case class WrappedGpuMetric(sqlMetric: SQLMetric) extends GpuMetric with Product with Serializable

Value Members

  1. object AggregateModeInfo extends Serializable
  2. object AggregateUtils
  3. object AutoCloseColumnBatchIterator
  4. object CaseWhenCheck extends ExprChecks

    This is specific to CaseWhen, because it does not follow the typical parameter convention.

  5. object CoalesceGoal
  6. object ColumnCastUtil extends Arm

    Casts a column to another column if the predicate passed resolves to true. These methods should be able to handle nested or non-nested types.

    At this time this is strictly a place for casting methods.

  7. object ColumnarPartitionReaderWithPartitionValues extends Arm
  8. object ColumnarRdd

    This provides a way to get GPU columnar data back out as an RDD[Table]. Each Table will have the same schema as the DataFrame passed in. If the schema of the DataFrame is something that RAPIDS does not currently support, an IllegalArgumentException will be thrown.

    The size of each table will be determined by what is producing that table but typically will be about the number of bytes set by RapidsConf.GPU_BATCH_SIZE_BYTES.

    Table is not a typical thing in an RDD, so special care needs to be taken when working with it. By default it is not serializable, so repartitioning the RDD or any other operation that involves a shuffle will not work. This is because it is very expensive to serialize and deserialize a GPU Table using a conventional Spark shuffle. Also, most of the memory associated with the Table is on the GPU itself, so each table must be closed when it is no longer needed to avoid running out of GPU memory. By convention it is the responsibility of the one consuming the data to close it when it is no longer needed.
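    Given the close-when-done convention described above, a consumer would typically guard each Table with try/finally. A hedged sketch using a generic AutoCloseable stand-in (the real element type is a cudf Table, not shown here):

    ```scala
    // Sketch of the consumer-side close convention: each element must be
    // closed once the consumer is done with it, even if processing fails.
    def withResource[T <: AutoCloseable, R](r: T)(body: T => R): R =
      try body(r) finally r.close()

    // Hypothetical stand-in for a GPU Table, tracking whether it was closed.
    final class FakeTable(val rows: Int) extends AutoCloseable {
      var closed = false
      def close(): Unit = closed = true
    }

    // e.g. over an RDD of tables:
    //   rdd.map { table => withResource(table)(t => process(t)) }
    ```

    Wrapping each element this way guarantees the GPU memory backing the table is released even when the processing body throws.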

  9. object ConcatAndConsumeAll

    Consumes an Iterator of ColumnarBatches and concatenates them into a single ColumnarBatch. The batches will be closed when this operation is done.

  10. object ConfHelper
  11. object CreateNamedStructCheck extends ExprChecks

    A check for CreateNamedStruct. The parameter values alternate between one type and another. If this pattern shows up again we can make this more generic at that point.

  12. object CsvFormatType extends FileFormatType
  13. object CudfRowTransitions
  14. object DataTypeMeta
  15. object DateUtils

    Helper functions for dates

  16. object DecimalUtil
  17. object DenseRankFixer extends Arm
  18. object ExecChecks

    Gives users an API to create ExecChecks.

  19. object ExecutionPlanCaptureCallback
  20. object Explain
  21. object ExprChecks
  22. object ExpressionContext
  23. object FileFormatChecks
  24. object FileUtils
  25. object FloatUtils extends Arm
  26. object FullSortSingleBatch extends SortExecType
  27. object GeneratedUnsafeRowToCudfRowIterator extends Logging
  28. object GpuBatchUtils

    Utility class with methods for calculating various metrics about GPU memory usage prior to allocation.

  29. object GpuBindReferences extends Logging
  30. object GpuBuildLeft extends GpuBuildSide with Product with Serializable
  31. object GpuBuildRight extends GpuBuildSide with Product with Serializable
  32. object GpuCSVScan extends Serializable
  33. object GpuCanonicalize

    Rewrites an expression using rules that are guaranteed to preserve the result while attempting to remove cosmetic variations. Deterministic expressions that are equal after canonicalization will always return the same answer given the same input (i.e. false positives should not be possible). However, it is possible that two canonical expressions that are not equal will in fact return the same answer given any input (i.e. false negatives are possible).

    The following rules are applied:

    • Names and nullability hints for org.apache.spark.sql.types.DataTypes are stripped.
    • Names for GetStructField are stripped.
    • TimeZoneId for Cast and AnsiCast are stripped if needsTimeZone is false.
    • Commutative and associative operations (Add and Multiply) have their children ordered by hashCode.
    • EqualTo and EqualNullSafe are reordered by hashCode.
    • Other comparisons (GreaterThan, LessThan) are reversed by hashCode.
    • Elements in In are reordered by hashCode.

    This is essentially a copy of the Spark Canonicalize class but updated for GPU operators.

  34. object GpuCast extends Arm with Serializable
  35. object GpuCoalesceExec extends Serializable
  36. object GpuColumnarToRowExecParent extends Serializable
  37. object GpuDeviceManager extends Logging
  38. object GpuExec extends Serializable
  39. object GpuExpressionsUtils extends Arm
  40. object GpuFilter extends Arm

    Run a filter on a batch. The batch will be consumed.

  41. object GpuKeyBatchingIterator
  42. object GpuLiteral extends Serializable
  43. object GpuMetric extends Logging with Serializable
  44. object GpuNvl extends Arm
  45. object GpuOrcScanBase
  46. object GpuOverrides extends Serializable
  47. object GpuParquetFileFormat
  48. object GpuParquetPartitionReaderFactoryBase

    Base object that has common functions shared by the Parquet partition reader factories

  49. object GpuParquetScanBase
  50. object GpuProjectExec extends Arm with Serializable
  51. object GpuRangePartitioner extends Serializable
  52. object GpuReadCSVFileFormat
  53. object GpuReadOrcFileFormat extends Serializable
  54. object GpuReadParquetFileFormat extends Serializable
  55. object GpuRowNumber extends Expression with GpuRunningWindowFunction with GpuBatchedRunningWindowWithFixer with Product with Serializable

    The row number in the window.

    Note

    this is a running window only operator

  56. object GpuRunningWindowIterator extends Arm
  57. object GpuScalar extends Arm with Logging
  58. object GpuSemaphore
  59. object GpuSinglePartitioning extends Expression with GpuExpression with GpuPartitioning with Product with Serializable
  60. object GpuTopN extends Arm with Serializable
  61. object GpuTransitionOverrides
  62. object GpuUnspecifiedFrame extends Expression with GpuWindowFrame with Product with Serializable
  63. object GpuUserDefinedFunction extends Serializable
  64. object GpuWindowExec extends Arm with Serializable
  65. object GroupByAggExprContext extends ExpressionContext
  66. object GroupedAggregations extends Arm
  67. object HostColumnarToGpu extends Logging with Serializable
  68. object JoinGatherer extends Arm
  69. object JoinGathererImpl
  70. object LambdaExprContext extends ExpressionContext
  71. object LazySpillableColumnarBatch
  72. object LazySpillableGatherMap
  73. object MemoryCostHelper
  74. object MetaUtils extends Arm
  75. object MetricsLevel extends Serializable
  76. object MultiFileThreadPoolUtil
  77. object NoopMetric extends GpuMetric
  78. object NotApplicable extends SupportLevel

    N/A: neither Spark nor the plugin supports this.

  79. object NotSupported extends SupportLevel

    Spark supports this but the plugin does not.

  80. object NvcompLZ4CompressionCodec extends Arm
  81. object NvtxWithMetrics
  82. object OrcFormatType extends FileFormatType
  83. object OrcMultiFileThreadPoolFactory
  84. object OutOfCoreSort extends SortExecType
  85. object ParquetFormatType extends FileFormatType
  86. object ParquetMultiFileThreadPoolFactory
  87. object ParquetPartitionReader
  88. object PartChecks
  89. object PartitionReaderIterator
  90. object PlanUtils
  91. object ProjectExprContext extends ExpressionContext
  92. object RankFixer extends Arm
  93. object RapidsBuffer
  94. object RapidsBufferCatalog extends Logging with Arm
  95. object RapidsBufferStore
  96. object RapidsConf
  97. object RapidsExecutorPlugin
  98. object RapidsGdsStore
  99. object RapidsPluginImplicits

    RapidsPluginImplicits adds implicit functions for ColumnarBatch, Seq, Seq[AutoCloseable], and Array[AutoCloseable] that help make resource management easier within the project.
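The implicit-helper pattern described above can be sketched as follows. This is a minimal illustration, not the plugin's actual implementation; the class and method names (`AutoCloseableSeq`, `safeClose`) are assumptions for demonstration.

```scala
// Hypothetical sketch of resource-management implicits; names are assumptions.
object ImplicitsSketch {
  implicit class AutoCloseableSeq[A <: AutoCloseable](val in: Seq[A]) {
    /** Close every element, attaching any later failures to the first as suppressed. */
    def safeClose(): Unit = {
      var error: Throwable = null
      in.foreach { c =>
        try c.close()
        catch {
          case t: Throwable =>
            if (error == null) error = t else error.addSuppressed(t)
        }
      }
      if (error != null) throw error
    }
  }
}
```

With such an implicit in scope, a `Seq` of columnar resources can be closed in one call even when individual `close()` invocations throw, which is why this style simplifies cleanup code throughout a project.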

  100. object RapidsPluginUtils extends Logging
  101. object ReadFileOp extends FileFormatOp
  102. object ReductionAggExprContext extends ExpressionContext
  103. object RequireSingleBatch extends CoalesceSizeGoal with Product with Serializable

    A single batch is required as the input to a node in the SparkPlan.

    A single batch is required as the input to a node in the SparkPlan. This means all of the data for a given task is in a single batch. This should be avoided as much as possible because it can result in running out of memory or running into batch-size limitations imposed by both Spark and cudf.
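The trade-off described above can be modeled with a toy goal hierarchy. This is an illustrative sketch only, not the plugin's `CoalesceSizeGoal` API; the names `TargetSizeSketch` and `SingleBatchSketch` are assumptions.

```scala
// Toy model of coalescing goals; the real hierarchy lives in the plugin.
sealed trait CoalesceGoalSketch
case class TargetSizeSketch(bytes: Long) extends CoalesceGoalSketch // emit once big enough
case object SingleBatchSketch extends CoalesceGoalSketch            // all task data in one batch

/** Decide whether a coalesce pass may emit the accumulated batch early. */
def canEmitEarly(goal: CoalesceGoalSketch, bytesSoFar: Long): Boolean = goal match {
  case TargetSizeSketch(limit) => bytesSoFar >= limit
  // A single-batch requirement must buffer all input, risking memory pressure.
  case SingleBatchSketch       => false
}
```

The single-batch case never emits early, which is exactly why the documentation warns that it can exhaust memory or hit batch-size limits.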

  104. object RowCountPlanVisitor

    Estimate the number of rows that an operator will output.

    Estimate the number of rows that an operator will output. Note that these row counts are the aggregate across all output partitions.

    Logic is based on Spark's SizeInBytesOnlyStatsPlanVisitor, which operates on logical plans and only computes data sizes, not row counts.
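The visitor-style estimation described above can be sketched on a toy plan tree. This is not the plugin's `RowCountPlanVisitor`; the node types (`LeafScanSketch`, `FilterSketch`) and the fixed selectivity are assumptions for illustration.

```scala
// Toy plan tree with aggregate row-count estimation across all partitions.
sealed trait PlanSketch
case class LeafScanSketch(rows: Long) extends PlanSketch
case class FilterSketch(selectivity: Double, child: PlanSketch) extends PlanSketch

/** Recursively estimate the number of rows an operator will output. */
def estimateRows(plan: PlanSketch): Long = plan match {
  case LeafScanSketch(rows)       => rows
  case FilterSketch(sel, child)   => (estimateRows(child) * sel).toLong
}
```

A real visitor would handle many more operator types (joins, aggregates, limits), but the recursive shape — derive a node's estimate from its children's estimates — is the same.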

  105. object SamplingUtils extends Arm
  106. object SerializedTableColumn
  107. object ShimLoader extends Logging
  108. object ShuffleBufferCatalog
  109. object ShuffleMetadata extends Logging
  110. object ShuffleReceivedBufferCatalog
  111. object SortEachBatch extends SortExecType
  112. object SortUtils extends Arm
  113. object SpillPriorities

    Utility methods for managing spillable buffer priorities.

    Utility methods for managing spillable buffer priorities. The spill priority numerical space is divided into potentially overlapping ranges based on the type of buffer.
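The overlapping-range scheme described above can be illustrated with a small sketch. The numeric ranges and names here are invented for demonstration; the actual values are defined in SpillPriorities.

```scala
// Hypothetical priority ranges per buffer type; actual values differ.
object SpillPrioritiesSketch {
  // Ranges may overlap: a hot shuffle buffer can outrank a cold input batch.
  val InputBatchRange: (Long, Long)    = (0L, 1000000L)
  val ShuffleBufferRange: (Long, Long) = (500000L, 2000000L)

  /** Clamp a requested priority into the range allotted to a buffer type. */
  def clamp(priority: Long, range: (Long, Long)): Long =
    math.max(range._1, math.min(range._2, priority))
}
```

Clamping keeps each buffer type inside its allotted slice of the priority space while still allowing the slices themselves to overlap.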

  114. object SpillableBuffer extends Arm
  115. object SpillableColumnarBatch extends Arm
  116. object StorageTier extends Enumeration

    Enumeration of the storage tiers
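A storage-tier enumeration of this shape can be sketched with Scala's standard `Enumeration`. The tier names below are assumptions chosen to illustrate the pattern, not the plugin's actual values.

```scala
// Sketch of a storage-tier Enumeration; tier names are assumptions.
object StorageTierSketch extends Enumeration {
  type StorageTier = Value
  // Lower ids represent faster tiers, so tiers compare by speed.
  val DEVICE: StorageTier = Value(0, "device memory")
  val HOST: StorageTier   = Value(1, "host memory")
  val DISK: StorageTier   = Value(2, "local disk")
}
```

Because `Enumeration#Value` is `Ordered`, tiers can be compared directly, which is convenient when deciding where a buffer should spill next.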

  117. object SupportedOpsDocs

    Used for generating the support docs.

  118. object SupportedOpsForTools
  119. object TableCompressionCodec
  120. object TypeEnum extends Enumeration

    The Supported Types.

    The Supported Types. The TypeSig API should be preferred for this, except in a few cases when TypeSig asks for a TypeEnum.

  121. object TypeSig
  122. object WindowAggExprContext extends ExpressionContext
  123. object WindowSpecCheck extends ExprChecks

    This is specific to WindowSpec, because it does not follow the typical parameter convention.

  124. object WriteFileOp extends FileFormatOp

Ungrouped