Packages

package rapids

Type Members

  1. trait AdaptiveSparkPlanHelperShim extends AnyRef
  2. case class AvroBatchContext(origChunkedBlocks: LinkedHashMap[Path, ArrayBuffer[DataBlockBase]], schema: SchemaBase, mergedHeader: Header) extends BatchContext with Product with Serializable
  3. case class AvroBlockMeta(header: Header, headerSize: Long, blocks: Seq[BlockInfo]) extends Product with Serializable

    Avro block meta info

    header
        the header of the Avro file
    blocks
        the total block info of the Avro file

  4. case class AvroDataBlock(blockInfo: BlockInfo) extends DataBlockBase with Product with Serializable

    avro BlockInfo wrapper

  5. case class AvroExtraInfo() extends ExtraInfo with Product with Serializable

    Extra information

  6. case class AvroFileFilterHandler(hadoopConf: Configuration, options: AvroOptions) extends Logging with Product with Serializable

    A tool to filter Avro blocks

  7. class AvroProviderImpl extends AvroProvider
  8. case class AvroSchemaWrapper(schema: Schema) extends SchemaBase with Product with Serializable

    avro schema wrapper

  9. case class AvroSingleDataBlockInfo(filePath: Path, dataBlock: AvroDataBlock, partitionValues: InternalRow, schema: AvroSchemaWrapper, readSchema: StructType, extraInfo: AvroExtraInfo) extends SingleDataBlockInfo with Product with Serializable
  10. trait BasePad extends TernaryExpression with GpuTernaryExpressionArgsAnyScalarScalar with ImplicitCastInputTypes with NullIntolerant
  11. class BasicColumnarWriteJobStatsTracker extends ColumnarWriteJobStatsTracker

    Simple ColumnarWriteJobStatsTracker implementation that's serializable, capable of instantiating BasicColumnarWriteTaskStatsTracker on executors and processing the BasicColumnarWriteTaskStats they produce by aggregating the metrics and posting them as DriverMetricUpdates.

  12. case class BasicColumnarWriteTaskStats(numPartitions: Int, numFiles: Int, numBytes: Long, numRows: Long) extends WriteTaskStats with Product with Serializable

    Simple metrics collected during an instance of GpuFileFormatDataWriter. These were first introduced in https://github.com/apache/spark/pull/18159 (SPARK-20703).

  13. class BasicColumnarWriteTaskStatsTracker extends ColumnarWriteTaskStatsTracker with Logging

    Simple metrics collected during an instance of GpuFileFormatDataWriter. This is the columnar version of org.apache.spark.sql.execution.datasources.BasicWriteTaskStatsTracker.

  14. trait ColumnarWriteJobStatsTracker extends Serializable

    A class implementing this trait is basically a collection of parameters that are necessary for instantiating a (derived type of) ColumnarWriteTaskStatsTracker on all executors and then processing the statistics produced by them (e.g. save them to memory/disk, issue warnings, etc.). It is therefore important that such an object is Serializable, as it will be sent from the driver to all executors.

  15. trait ColumnarWriteTaskStatsTracker extends AnyRef

    A trait for classes that are capable of collecting statistics on columnar data that's being processed by a single write task in GpuFileFormatDataWriter - i.e. there should be one instance per executor.

    newPartition event is only triggered if the relation to be written out is partitioned.

  16. trait CpuToGpuAggregateBufferConverter extends AnyRef
  17. trait CpuToGpuBufferTransition extends UnaryExpression with ShimUnaryExpression with CodegenFallback
  18. class CpuToGpuCollectBufferConverter extends CpuToGpuAggregateBufferConverter
  19. case class CpuToGpuCollectBufferTransition(child: Expression, elementType: DataType) extends UnaryExpression with CpuToGpuBufferTransition with Product with Serializable
  20. trait CudfAggregate extends Serializable
  21. abstract class CudfBinaryArithmetic extends CudfBinaryOperator with NullIntolerant
  22. abstract class CudfBinaryComparison extends CudfBinaryOperator with Predicate
  23. abstract class CudfBinaryMathExpression extends BinaryExpression with CudfBinaryExpression with Serializable with ImplicitCastInputTypes
  24. abstract class CudfBinaryPredicateWithSideEffect extends CudfBinaryOperator with Predicate
  25. class CudfCollectList extends CudfAggregate
  26. class CudfCollectSet extends CudfAggregate

    Spark handles NaN equality differently for non-nested float/double values and for float/double values in nested types. For non-nested floats and doubles, NaN values are considered unequal, but when sets of nested types are collected, NaNs are considered equal on the CPU. So we set NaNEquality dynamically in CudfCollectSet and CudfMergeSets. Note that dataType is ArrayType(child.dataType) here.
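
    The effect of switching NaN equality can be illustrated with a minimal sketch in plain Scala. The object, the method, and the nansEqual flag are hypothetical names chosen for this illustration; they are not part of the plugin or of cuDF.

```scala
object CollectSetNanSketch {
  // Collects distinct doubles in first-seen order.
  // `nansEqual` (hypothetical flag) decides whether one NaN matches another NaN.
  def collectSet(values: Seq[Double], nansEqual: Boolean): Seq[Double] =
    values.foldLeft(Vector.empty[Double]) { (seen, v) =>
      // Primitive == follows IEEE 754, so NaN == NaN is false here.
      val duplicate = seen.exists(s => s == v || (nansEqual && s.isNaN && v.isNaN))
      if (duplicate) seen else seen :+ v
    }
}
```

    With nansEqual = true, repeated NaNs collapse into one element; with nansEqual = false, every NaN is kept as a separate element.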

  27. class CudfCount extends CudfAggregate
  28. class CudfM2 extends CudfAggregate
  29. class CudfMax extends CudfAggregate
  30. class CudfMean extends CudfAggregate

    This class is only used by the M2 class of aggregates; do not confuse it with GpuAverage. In the future, this aggregate class should be removed and the mean values should be generated in the output of libcudf's M2 aggregate.

  31. class CudfMergeLists extends CudfAggregate
  32. class CudfMergeM2 extends CudfAggregate
  33. class CudfMergeSets extends CudfAggregate
  34. class CudfMin extends CudfAggregate
  35. class CudfNthLikeAggregate extends CudfAggregate
  36. class CudfSum extends CudfAggregate
  37. abstract class CudfUnaryMathExpression extends GpuUnaryMathExpression with CudfUnaryExpression
  38. class ExecutionPlanCaptureCallback extends QueryExecutionListener

    Used as a part of testing to capture the executed query plan.

  39. trait ExecutionPlanCaptureCallbackBase extends AnyRef
  40. class FromUTCTimestampExprMeta extends BinaryExprMeta[FromUTCTimestamp]
  41. case class GpuAbs(child: Expression, failOnError: Boolean) extends GpuUnaryExpression with CudfUnaryExpression with ExpectsInputTypes with NullIntolerant with Product with Serializable
  42. case class GpuAcos(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  43. case class GpuAcoshCompat(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  44. case class GpuAcoshImproved(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  45. case class GpuAdd(left: Expression, right: Expression, failOnError: Boolean) extends GpuAddBase with Product with Serializable
  46. abstract class GpuAddBase extends CudfBinaryArithmetic with Serializable
  47. case class GpuAggregateExpression(origAggregateFunction: GpuAggregateFunction, mode: AggregateMode, isDistinct: Boolean, filter: Option[Expression], resultId: ExprId) extends Expression with GpuExpression with ShimExpression with GpuUnevaluable with Product with Serializable
  48. trait GpuAggregateFunction extends Expression with GpuExpression with ShimExpression with GpuUnevaluable

    Trait that all aggregate functions implement.

    Aggregates start with some input from the child plan or from another aggregate (or from itself if the aggregate is merging several batches).

    In general terms, an aggregate function can be in one of two modes of operation: update or merge. Either the function is aggregating raw input, or it is merging previously aggregated data. Normally, Spark breaks up the processing of the aggregate into two exec nodes (a partial aggregate and a final aggregate), and they are separated by a shuffle boundary. That is not true for all aggregates, especially when looking at other flavors of Spark. What doesn't change is the core function of updating or merging. Note that an aggregate can merge right after an update is performed, as we have cases where input batches are update-aggregated and then a bigger batch is built by merging together those pre-aggregated inputs.

    Aggregates have an interface to Spark and that is defined by aggBufferAttributes. This collection of attributes must match the Spark equivalent of the aggregate, so that if half of the aggregate (update or merge) executes on the CPU, we can be compatible. The GpuAggregateFunction adds special steps to ensure that it can produce (and consume) batches in the shape of aggBufferAttributes.

    The general transitions that are implemented in the aggregate function are as follows:

    1) inputProjection -> updateAggregates: inputProjection creates a sequence of values that are operated on by the updateAggregates. The length of inputProjection must be the same as updateAggregates, and updateAggregates (cuDF aggregates) should be able to work with the product of the inputProjection (i.e. types are compatible)

    2) updateAggregates -> postUpdate: after the cuDF update aggregate, a post process step can (optionally) be performed. The postUpdate takes the output of updateAggregate that must match the order of columns and types as specified in aggBufferAttributes.

    3) postUpdate -> preMerge: preMerge prepares batches before they go into the mergeAggregate. The preMerge step binds to aggBufferAttributes, so it can be used to transform a Spark-compatible batch into a batch that the cuDF merge aggregate expects. Its input has the same shape as that produced by postUpdate.

    4) mergeAggregates -> postMerge: postMerge optionally transforms the output of the cuDF merge aggregate in two situations: (1) to match the aggBufferAttributes references for partial aggregates, where each partially aggregated batch is merged with AggHelper(merge=true); (2) in a final aggregate, where the merged batches are transformed to what evaluateExpression expects. For simple aggregates like sum or count, evaluateExpression is just aggBufferAttributes, but for more complex aggregates it is an expression (see the GpuAverage and GpuM2 subclasses) that relies on the merge step producing columns in the shape of aggBufferAttributes.
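
    The update/merge/evaluate flow described above can be sketched for an average using plain Scala collections in place of GPU batches. Everything in this sketch (the object, AvgBuffer, and the method bodies) is a hypothetical stand-in for illustration; only the step names are borrowed from the description, and the real trait operates on cuDF columns.

```scala
object AggregateFlowSketch {
  // Buffer shape shared by update and merge; plays the role of aggBufferAttributes.
  final case class AvgBuffer(sum: Double, count: Long)

  // inputProjection: turn each raw value into the columns the update aggregates consume.
  def inputProjection(value: Double): (Double, Long) = (value, 1L)

  // updateAggregates (+ postUpdate): aggregate raw input into a buffer-shaped result.
  def update(batch: Seq[Double]): AvgBuffer = {
    val projected = batch.map(inputProjection)
    AvgBuffer(projected.map(_._1).sum, projected.map(_._2).sum)
  }

  // preMerge + mergeAggregates (+ postMerge): combine previously aggregated buffers.
  def merge(buffers: Seq[AvgBuffer]): AvgBuffer =
    AvgBuffer(buffers.map(_.sum).sum, buffers.map(_.count).sum)

  // evaluateExpression: derive the final value from the merged buffer.
  def evaluate(buffer: AvgBuffer): Option[Double] =
    if (buffer.count == 0L) None else Some(buffer.sum / buffer.count)
}
```

    Because update and merge both produce the same buffer shape, pre-aggregated batches can be merged immediately after an update or after a shuffle, which is the property the description relies on.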

  49. case class GpuAnd(left: Expression, right: Expression) extends CudfBinaryPredicateWithSideEffect with Product with Serializable
  50. trait GpuArrayBinaryLike extends Expression with GpuComplexTypeMergingExpression with NullIntolerant
  51. case class GpuArrayContains(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpression with NullIntolerant with Product with Serializable

    Checks if the array (left) has the element (right)

  52. case class GpuArrayExcept(left: Expression, right: Expression) extends Expression with GpuArrayBinaryLike with ExpectsInputTypes with Product with Serializable
  53. case class GpuArrayIntersect(left: Expression, right: Expression) extends Expression with GpuArrayBinaryLike with ExpectsInputTypes with Product with Serializable
  54. abstract class GpuArrayMax extends GpuUnaryExpression with ImplicitCastInputTypes with Serializable
  55. abstract class GpuArrayMin extends GpuUnaryExpression with ImplicitCastInputTypes with Serializable
  56. case class GpuArrayRemove(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpression with Product with Serializable
  57. case class GpuArrayRepeat(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpression with Product with Serializable
  58. case class GpuArrayUnion(left: Expression, right: Expression) extends Expression with GpuArrayBinaryLike with ExpectsInputTypes with Product with Serializable
  59. case class GpuArraysOverlap(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpression with ExpectsInputTypes with NullIntolerant with Product with Serializable
  60. case class GpuArraysZip(children: Seq[Expression]) extends Expression with GpuExpression with ShimExpression with ExpectsInputTypes with Product with Serializable
  61. case class GpuAsin(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  62. case class GpuAsinhCompat(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  63. case class GpuAsinhImproved(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  64. case class GpuAssembleSumChunks(chunkAttrs: Seq[AttributeReference], dataType: DecimalType, nullOnOverflow: Boolean) extends Expression with GpuExpression with ShimExpression with Product with Serializable

    Reassembles a 128-bit value from four separate 64-bit sum results

    chunkAttrs
        attributes for the four 64-bit sum chunks ordered from least significant to most significant
    dataType
        output type of the reconstructed 128-bit value
    nullOnOverflow
        whether to produce null on overflows

  65. case class GpuAtan(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  66. case class GpuAtanh(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  67. abstract class GpuAverage extends Expression with GpuAggregateFunction with GpuReplaceWindowFunction with Serializable
  68. case class GpuAvroMultiFilePartitionReaderFactory(sqlConf: SQLConf, rapidsConf: RapidsConf, broadcastedConf: Broadcast[SerializableConfiguration], dataSchema: StructType, readDataSchema: StructType, partitionSchema: StructType, options: AvroOptions, metrics: Map[String, GpuMetric], filters: Array[Filter], queryUsesInputFile: Boolean) extends MultiFilePartitionReaderFactoryBase with Product with Serializable

    The multi-file partition reader factory for cloud or coalescing reading of avro file format.

  69. class GpuAvroPartitionReader extends FilePartitionReaderBase with GpuAvroReaderBase

    A PartitionReader that reads an AVRO file split on the GPU.

  70. case class GpuAvroPartitionReaderFactory(sqlConf: SQLConf, rapidsConf: RapidsConf, broadcastedConf: Broadcast[SerializableConfiguration], dataSchema: StructType, readDataSchema: StructType, partitionSchema: StructType, avroOptions: AvroOptions, metrics: Map[String, GpuMetric], params: Map[String, String]) extends ShimFilePartitionReaderFactory with Logging with Product with Serializable

    Avro partition reader factory to build columnar reader

  71. trait GpuAvroReaderBase extends Logging

    A trait collecting common methods across the 3 kinds of avro readers

  72. case class GpuAvroScan(sparkSession: SparkSession, fileIndex: PartitioningAwareFileIndex, dataSchema: StructType, readDataSchema: StructType, readPartitionSchema: StructType, options: CaseInsensitiveStringMap, pushedFilters: Array[Filter], rapidsConf: RapidsConf, partitionFilters: Seq[Expression] = Seq.empty, dataFilters: Seq[Expression] = Seq.empty, queryUsesInputFile: Boolean = false) extends FileScan with GpuScan with Product with Serializable
  73. case class GpuBRound(child: Expression, scale: Expression, outputType: DataType) extends GpuRoundBase with Product with Serializable
  74. case class GpuBasicArrayMax(child: Expression) extends GpuArrayMax with Product with Serializable

    ArrayMax without NaN handling

  75. case class GpuBasicArrayMin(child: Expression) extends GpuArrayMin with Product with Serializable

    ArrayMin without NaN handling

  76. case class GpuBasicAverage(child: Expression, dt: DataType) extends GpuAverage with Product with Serializable
  77. case class GpuBasicDecimalAverage(child: Expression, dt: DecimalType) extends GpuDecimalAverage with Product with Serializable
  78. case class GpuBasicDecimalSum(child: Expression, dt: DecimalType, failOnErrorOverride: Boolean) extends GpuDecimalSum with Product with Serializable

    Sum aggregations for decimals up to and including DECIMAL64

  79. case class GpuBasicMax(child: Expression) extends GpuMax with Product with Serializable

    Max aggregation without NaN handling

  80. case class GpuBasicMin(child: Expression) extends GpuMin with Product with Serializable

    Min aggregation without NaN handling

  81. case class GpuBasicSum(child: Expression, resultType: DataType, failOnErrorOverride: Boolean) extends GpuSum with Product with Serializable

    Sum aggregation for non-decimal types

  82. case class GpuBitLength(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  83. case class GpuBitwiseAnd(left: Expression, right: Expression) extends CudfBinaryArithmetic with Product with Serializable
  84. case class GpuBitwiseNot(child: Expression) extends GpuUnaryExpression with CudfUnaryExpression with ExpectsInputTypes with Product with Serializable
  85. case class GpuBitwiseOr(left: Expression, right: Expression) extends CudfBinaryArithmetic with Product with Serializable
  86. case class GpuBitwiseXor(left: Expression, right: Expression) extends CudfBinaryArithmetic with Product with Serializable
  87. class GpuCartesianPartition extends Partition
  88. case class GpuCartesianProductExec(left: SparkPlan, right: SparkPlan, condition: Option[Expression], targetSizeBytes: Long) extends SparkPlan with ShimBinaryExecNode with GpuExec with Product with Serializable
  89. class GpuCartesianRDD extends RDD[ColumnarBatch] with Serializable
  90. case class GpuCbrt(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  91. case class GpuCeil(child: Expression, outputType: DataType) extends CudfUnaryMathExpression with Product with Serializable
  92. case class GpuCheckOverflowAfterSum(data: Expression, isEmpty: Expression, dataType: DecimalType, nullOnOverflow: Boolean) extends Expression with GpuExpression with ShimExpression with Product with Serializable

    This is equivalent to what Spark does after a sum to check for overflow: If(isEmpty, Literal.create(null, resultType), CheckOverflowInSum(sum, d, !SQLConf.get.ansiEnabled))

    But we are renaming it to avoid confusion with the overflow detection we do as a part of sum itself that takes the place of the overflow checking that happens with add.
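
    The shape of that check can be sketched in plain Scala. This is a simplified illustration only: the object and method are hypothetical, Option stands in for a nullable SQL value, and overflow is approximated by comparing the number of digits against a target precision rather than by Spark's actual decimal logic.

```scala
object CheckOverflowAfterSumSketch {
  // Returns None (SQL null) for an empty group, or on overflow when nullOnOverflow is set.
  // Throws on overflow otherwise, mirroring ANSI-style failure.
  def check(
      sum: BigDecimal,
      isEmpty: Boolean,
      precision: Int,
      nullOnOverflow: Boolean): Option[BigDecimal] =
    if (isEmpty) None
    else if (sum.precision <= precision) Some(sum)
    else if (nullOnOverflow) None
    else throw new ArithmeticException(s"Overflow in sum of decimals: $sum")
}
```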

  93. trait GpuCollectBase extends Expression with GpuAggregateFunction with GpuDeterministicFirstLastCollectShim with GpuAggregateWindowFunction
  94. case class GpuCollectList(child: Expression, mutableAggBufferOffset: Int = 0, inputAggBufferOffset: Int = 0) extends Expression with GpuCollectBase with Product with Serializable

    Collects and returns a list of non-unique elements.

    The two 'offset' parameters are not used by the GPU version, but are kept for compatibility with the CPU version and automated checks.

  95. case class GpuCollectSet(child: Expression, mutableAggBufferOffset: Int = 0, inputAggBufferOffset: Int = 0) extends Expression with GpuCollectBase with Product with Serializable

    Collects and returns a set of unique elements.

    The two 'offset' parameters are not used by the GPU version, but are kept for compatibility with the CPU version and automated checks.

  96. case class GpuConcat(children: Seq[Expression]) extends Expression with GpuComplexTypeMergingExpression with Product with Serializable
  97. case class GpuConcatWs(children: Seq[Expression]) extends Expression with GpuExpression with ShimExpression with ImplicitCastInputTypes with Product with Serializable
  98. case class GpuContains(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with Predicate with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  99. case class GpuConv(num: Expression, fromBase: Expression, toBase: Expression) extends TernaryExpression with GpuTernaryExpression with Product with Serializable
  100. class GpuConvMeta extends TernaryExprMeta[Conv]
  101. case class GpuCos(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  102. case class GpuCosh(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  103. case class GpuCot(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  104. case class GpuCount(children: Seq[Expression], failOnError: Boolean = SQLConf.get.ansiEnabled) extends Expression with GpuAggregateFunction with GpuBatchedRunningWindowWithFixer with GpuUnboundToUnboundWindowWithFixer with GpuAggregateWindowFunction with GpuRunningWindowFunction with Product with Serializable
  105. case class GpuCreateArray(children: Seq[Expression], useStringTypeWhenEmpty: Boolean) extends Expression with GpuExpression with ShimExpression with Product with Serializable
  106. case class GpuCreateMap(children: Seq[Expression], useStringTypeWhenEmpty: Boolean) extends Expression with GpuExpression with ShimExpression with Product with Serializable
  107. case class GpuCreateNamedStruct(children: Seq[Expression]) extends Expression with GpuExpression with ShimExpression with Product with Serializable
  108. case class GpuDataSource(sparkSession: SparkSession, className: String, paths: Seq[String] = Nil, userSpecifiedSchema: Option[StructType] = None, partitionColumns: Seq[String] = Seq.empty, bucketSpec: Option[BucketSpec] = None, options: Map[String, String] = Map.empty, catalogTable: Option[CatalogTable] = None, origProvider: Class[_], gpuFileFormat: ColumnarFileFormat) extends GpuDataSourceBase with Product with Serializable
  109. abstract class GpuDataSourceBase extends Logging

    A truncated version of Spark DataSource that converts to use the GPU version of InsertIntoHadoopFsRelationCommand for FileFormats we support. This does not support DataSource V2 writing at this point because the Spark code did not support it at the time it was copied.

  110. trait GpuDataSourceScanExec extends SparkPlan with ShimLeafExecNode with GpuExec

    GPU implementation of Spark's DataSourceScanExec

  111. case class GpuDateAdd(startDate: Expression, days: Expression) extends BinaryExpression with GpuDateMathBase with Product with Serializable
  112. case class GpuDateAddInterval(start: Expression, interval: Expression, timeZoneId: Option[String] = None, ansiEnabled: Boolean = SQLConf.get.ansiEnabled) extends GpuTimeMath with Product with Serializable
  113. case class GpuDateDiff(endDate: Expression, startDate: Expression) extends BinaryExpression with GpuBinaryExpression with ImplicitCastInputTypes with Product with Serializable
  114. case class GpuDateFormatClass(timestamp: Expression, format: Expression, strfFormat: String, timeZoneId: Option[String] = None) extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with TimeZoneAwareExpression with ImplicitCastInputTypes with Product with Serializable
  115. trait GpuDateMathBase extends BinaryExpression with GpuBinaryExpression with ExpectsInputTypes
  116. case class GpuDateSub(startDate: Expression, days: Expression) extends BinaryExpression with GpuDateMathBase with Product with Serializable
  117. trait GpuDateUnaryExpression extends GpuUnaryExpression with ImplicitCastInputTypes
  118. case class GpuDayOfMonth(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  119. case class GpuDayOfWeek(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  120. case class GpuDayOfYear(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  121. case class GpuDecimal128Average(child: Expression, dt: DecimalType) extends GpuDecimalAverage with Product with Serializable

    Average aggregations for DECIMAL128.

    To avoid the significantly slower sort-based aggregations in cudf for DECIMAL128 columns, the incoming DECIMAL128 values are split into four 32-bit chunks which are summed separately into 64-bit intermediate results and then recombined into a 128-bit result with overflow checking. See GpuDecimal128Sum for more details.

  122. case class GpuDecimal128Sum(child: Expression, dt: DecimalType, failOnErrorOverride: Boolean, forceWindowSumToNotBeReplaced: Boolean) extends GpuDecimalSum with GpuReplaceWindowFunction with Product with Serializable

    Sum aggregations for DECIMAL128.

    The sum aggregation is performed by splitting the original 128-bit values into 32-bit "chunks" and summing those. The chunking accomplishes two things. First, it helps avoid cudf resorting to a much slower aggregation since currently DECIMAL128 sums are only implemented for sort-based aggregations. Second, chunking allows detection of overflows.

    The chunked approach to sum aggregation works as follows. The 128-bit value is split into its four 32-bit chunks, with the most significant chunk being an INT32 and the remaining three chunks being UINT32. When these are sum aggregated, cudf will implicitly upscale the accumulated result to a 64-bit value. Since cudf only allows up to 2**31 rows to be aggregated at a time, the "extra" upper 32-bits of the upscaled 64-bit accumulation values will be enough to hold the worst-case "carry" bits from summing each 32-bit chunk.

    After the cudf aggregation has completed, the four 64-bit chunks are reassembled into a 128-bit value. The lowest 32-bits of the least significant 64-bit chunk are used directly as the lowest 32-bits of the final value, and the remaining 32-bits are added to the next most significant 64-bit chunk. The lowest 32-bits of that chunk then become the next 32-bits of the 128-bit value and the remaining 32-bits are added to the next 64-bit chunk, and so on. Finally after the 128-bit value is constructed, the remaining "carry" bits of the most significant chunk after reconstruction are checked against the sign bit of the 128-bit result to see if there was an overflow.
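
    The chunked sum and reassembly can be sketched with plain Scala values. In this illustration BigInt stands in for the 128-bit unscaled decimal value and Long for cudf's 64-bit accumulators; the object and method names are hypothetical. The overflow test is written as a range check on the top chunk after carries are applied, which is one way to express the sign-bit check described above.

```scala
object ChunkedSumSketch {
  private val Mask32 = BigInt(0xFFFFFFFFL)

  // Split a signed 128-bit value into four 32-bit chunks, least significant first.
  // The three low chunks are unsigned; the top chunk keeps the sign.
  def split(value: BigInt): Array[Long] = {
    val low = (0 until 3).map(i => ((value >> (32 * i)) & Mask32).toLong)
    val high = (value >> 96).toLong // arithmetic shift preserves the sign
    (low :+ high).toArray
  }

  // Sum each chunk position independently into a 64-bit accumulator.
  def sumChunks(values: Seq[BigInt]): Array[Long] =
    values.map(split).foldLeft(Array.fill(4)(0L)) { (acc, chunks) =>
      Array.tabulate(4)(i => acc(i) + chunks(i))
    }

  // Rebuild the 128-bit result, propagating carries upward.
  // Returns None when the sum does not fit in a signed 128-bit value.
  def reassemble(chunks: Array[Long]): Option[BigInt] = {
    var carry = 0L
    var result = BigInt(0)
    for (i <- 0 until 3) {
      val total = chunks(i) + carry
      result = result | (BigInt(total & 0xFFFFFFFFL) << (32 * i)) // keep the low 32 bits
      carry = total >> 32                                         // pass the rest upward
    }
    val top = chunks(3) + carry
    // The top chunk must still fit in a signed 32-bit value, otherwise the sum overflowed.
    if (top < Int.MinValue || top > Int.MaxValue) None
    else Some((BigInt(top) << 96) + result)
  }
}
```

    For example, reassemble(sumChunks(values)) yields Some(values.sum) whenever the true sum fits in 128 bits, and None otherwise.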

  123. abstract class GpuDecimalAverage extends GpuDecimalAverageBase
  124. abstract class GpuDecimalAverageBase extends GpuAverage
  125. case class GpuDecimalDivide(left: Expression, right: Expression, dataType: DecimalType, failOnError: Boolean = SQLConf.get.ansiEnabled) extends Expression with ShimExpression with GpuDecimalDivideBase with Product with Serializable
  126. trait GpuDecimalDivideBase extends Expression with GpuExpression

    A version of Divide specifically for DecimalType that does not force the left and right to be the same type. This lets us calculate the correct result on a wider range of values without the need for unbounded precision in the processing.

  127. case class GpuDecimalMultiply(left: Expression, right: Expression, dataType: DecimalType, useLongMultiply: Boolean = false, failOnError: Boolean = SQLConf.get.ansiEnabled) extends Expression with ShimExpression with GpuDecimalMultiplyBase with Product with Serializable
  128. trait GpuDecimalMultiplyBase extends Expression with GpuExpression
  129. abstract class GpuDecimalSum extends GpuSum
  130. case class GpuDecimalSumHighDigits(input: Expression, originalInputType: DecimalType) extends Expression with GpuExpression with ShimExpression with Product with Serializable

    This extracts the highest digits from a Decimal value as a part of doing a SUM.

  131. trait GpuDivModLike extends CudfBinaryArithmetic
  132. case class GpuDivide(left: Expression, right: Expression, failOnErrorOverride: Boolean = SQLConf.get.ansiEnabled) extends CudfBinaryArithmetic with GpuDivModLike with Product with Serializable
  133. class GpuDynamicPartitionDataConcurrentWriter extends GpuDynamicPartitionDataSingleWriter

    Dynamic partition writer with concurrent writers, meaning multiple concurrent writers are opened for writing.

    The process has the following steps:

    • Step 1: Maintain a map of output writers, one per partition. Keep all writers open. Cache the input batches by splitting them into sub-groups, so that each partition holds a list of spillable sub-groups. If the total cache size exceeds the limit, find and write out the partition with the most pending data.
    • Step 2: If the number of concurrent writers exceeds the limit, fall back to sort-based writing (GpuDynamicPartitionDataSingleWriter): sort the remaining batches on the partition columns, write batch by batch, and eagerly close each writer when it finishes.

    The caller is expected to call writeWithIterator() instead of write() to write records. Note: when falling back to GpuDynamicPartitionDataSingleWriter, the single writer should restore unclosed writers and handle unflushed spillable caches.
  134. class GpuDynamicPartitionDataSingleWriter extends GpuFileFormatDataWriter

    Dynamic partition writer with a single writer: only one writer is open at any time, and that one writer handles writing to multiple directories (partitions) or files (bucketing) in turn. The data to be written must be sorted on the partition and/or bucket column(s) before writing.
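
    Why the input must be sorted can be seen in a minimal sketch of a one-writer-at-a-time loop. The object and method are hypothetical and use in-memory tuples instead of columnar batches and files; each returned entry represents one writer that was opened and then closed.

```scala
object SingleWriterSketch {
  // Input: (partitionKey, row) pairs. Output: one (partitionKey, rows) entry per writer opened.
  def write(rows: Seq[(String, Int)]): Seq[(String, Seq[Int])] = {
    val files = Vector.newBuilder[(String, Seq[Int])]
    var current: Option[String] = None
    var buffer = Vector.empty[Int]
    for ((key, row) <- rows) {
      if (!current.contains(key)) {
        // Partition changed: close the previous writer before opening the next one.
        current.foreach(k => files += (k -> buffer))
        current = Some(key)
        buffer = Vector.empty
      }
      buffer :+= row
    }
    current.foreach(k => files += (k -> buffer)) // close the last writer
    files.result()
  }
}
```

    With sorted input each partition gets exactly one writer; with unsorted input the same partition would be reopened each time its key reappears, producing extra files.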

  135. case class GpuElementAt(left: Expression, right: Expression, failOnError: Boolean) extends BinaryExpression with GpuBinaryExpression with ExpectsInputTypes with Product with Serializable
  136. class GpuEmptyDirectoryDataWriter extends GpuFileFormatDataWriter

    GPU data writer for empty partitions

  137. case class GpuEndsWith(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with Predicate with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  138. case class GpuEqualNullSafe(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable
  139. case class GpuEqualTo(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    The table below shows how the result is calculated for equal-to. To simplify the calculation, we leverage the fact that the cudf result (r) is always false when either operand is NaN, so that result is used in place of false when needed.

    Return (lhs.nan && rhs.nan) || result[i]

    +-------------+------------+------------------+---------------+----+
    | lhs.isNan() | rhs.isNan  | cudf-result(r)   | final-result  | eq |
    +-------------+------------+------------------+---------------+----+
    | t           | f          | f                | r             | f  |
    | f           | t          | f                | r             | f  |
    | t           | t          | f                | t             | t  |
    | f           | f          | r                | r             | na |
    +-------------+------------+------------------+---------------+----+
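
    The rule in the table can be written as a short sketch on scalar doubles. The object and method names are hypothetical, and the IEEE 754 comparison of primitive doubles stands in for the cudf result; the real expression operates on whole columns.

```scala
object NanAwareEqualsSketch {
  // Stand-in for the cudf result: IEEE 754 comparison, where NaN == NaN is false.
  def cudfEquals(lhs: Double, rhs: Double): Boolean = lhs == rhs

  // Spark semantics: two NaNs are equal; otherwise fall through to the cudf result.
  def sparkEquals(lhs: Double, rhs: Double): Boolean =
    (lhs.isNaN && rhs.isNaN) || cudfEquals(lhs, rhs)
}
```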

  140. case class GpuEqualToNoNans(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    This implementation leverages the default implementation of equal-to on the GPU to perform the binary equals comparison. This is used for operations like PivotFirst, where NaN != NaN (unlike most other cases) when pivoting on a float or double column.

  141. case class GpuExp(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  142. case class GpuExpm1(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  143. case class GpuExtractChunk32(data: Expression, chunkIdx: Int, replaceNullsWithZero: Boolean) extends Expression with GpuExpression with ShimExpression with Product with Serializable

    Extracts a 32-bit chunk from a 128-bit value

    data

    expression producing 128-bit values

    chunkIdx

    index of chunk to extract (0-3)

    replaceNullsWithZero

    whether to replace nulls with zero
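    As a rough illustration of the chunk extraction described above, the sketch below uses `BigInt` for the 128-bit value and `Option` for nulls. The names are hypothetical, and two details are assumptions not stated in this listing: chunk 0 is taken to be the least significant 32 bits, and every chunk is returned as an unsigned value held in a `Long`.

```scala
object Chunk32Sketch {
  private val Mask32 = BigInt(0xFFFFFFFFL)

  // Returns chunk `chunkIdx` (0-3) of a 128-bit value as an unsigned 32-bit
  // number held in a Long. Assumption: chunk 0 is the least significant.
  def extractChunk32(
      value: Option[BigInt],
      chunkIdx: Int,
      replaceNullsWithZero: Boolean): Option[Long] = {
    require(chunkIdx >= 0 && chunkIdx <= 3, "chunkIdx must be in 0-3")
    value match {
      case Some(v) => Some(((v >> (32 * chunkIdx)) & Mask32).toLong)
      case None    => if (replaceNullsWithZero) Some(0L) else None
    }
  }
}
```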

  144. abstract class GpuFileFormatDataWriter extends DataWriter[ColumnarBatch]

    Abstract class for writing out data in a single Spark task using the GPU. This is the GPU version of org.apache.spark.sql.execution.datasources.FileFormatDataWriter.

  145. case class GpuFileSourceScanExec(relation: HadoopFsRelation, originalOutput: Seq[Attribute], requiredSchema: StructType, partitionFilters: Seq[Expression], optionalBucketSet: Option[BitSet], optionalNumCoalescedBuckets: Option[Int], dataFilters: Seq[Expression], tableIdentifier: Option[TableIdentifier], disableBucketedScan: Boolean = false, queryUsesInputFile: Boolean = false, alluxioPathsMap: Option[Map[String, String]], requiredPartitionSchema: Option[StructType] = None)(rapidsConf: RapidsConf) extends SparkPlan with GpuDataSourceScanExec with GpuExec with Product with Serializable

    GPU version of Spark's FileSourceScanExec

    relation

    The file-based relation to scan.

    originalOutput

    Output attributes of the scan, including data attributes and partition attributes.

    requiredSchema

    Required schema of the underlying relation, excluding partition columns.

    partitionFilters

    Predicates to use for partition pruning.

    optionalBucketSet

    Bucket ids for bucket pruning.

    optionalNumCoalescedBuckets

    Number of coalesced buckets.

    dataFilters

    Filters on non-partition columns.

    tableIdentifier

    Identifier for the table in the metastore.

    disableBucketedScan

    Disable bucketed scan based on physical query plan.

    queryUsesInputFile

    This is a parameter to easily allow turning it off in GpuTransitionOverrides if InputFileName, InputFileBlockStart, or InputFileBlockLength are used

    alluxioPathsMap

    Map containing mapping of DFS scheme to Alluxio scheme

    rapidsConf

    Rapids conf

  146. case class GpuFirst(child: Expression, ignoreNulls: Boolean) extends Expression with GpuAggregateFunction with GpuAggregateWindowFunction with GpuDeterministicFirstLastCollectShim with ImplicitCastInputTypes with Serializable with Product
  147. case class GpuFlattenArray(child: Expression) extends GpuUnaryExpression with NullIntolerant with Product with Serializable
  148. case class GpuFloatArrayMax(child: Expression) extends GpuArrayMax with Product with Serializable

    ArrayMax for FloatType and DoubleType to handle NaNs.

    In Spark, NaN is treated as the largest float value, whereas in cuDF the result of a calculation involving NaN is undefined. This workaround matches Spark's behaviour: first check whether each list contains NaN. If it does, the max value is NaN; otherwise use the cuDF kernel to calculate the max value.
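    The check-then-delegate idea can be sketched for a single list in plain Scala. The names are hypothetical, and a simple `reduce` stands in for the cuDF kernel, which is only ever given NaN-free input here.

```scala
object FloatArrayMaxSketch {
  // Stand-in for the cuDF max kernel; only called on NaN-free input.
  private def plainMax(xs: Seq[Double]): Double =
    xs.reduce((a, b) => if (a >= b) a else b)

  // Max of one list; None for an empty list.
  def arrayMax(xs: Seq[Double]): Option[Double] =
    if (xs.isEmpty) None
    else if (xs.exists(_.isNaN)) Some(Double.NaN)
    else Some(plainMax(xs))
}
```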

  149. case class GpuFloatArrayMin(child: Expression) extends GpuArrayMin with Product with Serializable

    ArrayMin for FloatType and DoubleType to handle NaNs.

    In Spark, NaN is treated as the largest float value, whereas in cuDF the result of a calculation involving NaN is undefined. This workaround matches Spark's behaviour. The high-level idea is:

      if a list contains only NaNs or nulls then
        if the list contains NaN then
          return NaN
        else
          return null
      else
        replace all NaNs with nulls;
        use the cuDF kernel to find the min value
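    A minimal sketch of that branching for one list, with `Option[Double]` modelling nullable elements. The names are hypothetical and a `reduce` stands in for the cuDF min kernel.

```scala
object FloatArrayMinSketch {
  // Min of one list whose elements may be null (None) or NaN.
  def arrayMin(xs: Seq[Option[Double]]): Option[Double] = {
    // Replace all NaNs with nulls, then drop the nulls.
    val nonNan = xs.flatten.filterNot(_.isNaN)
    if (nonNan.isEmpty) {
      // Only NaNs or nulls: NaN if any NaN is present, else null.
      if (xs.flatten.exists(_.isNaN)) Some(Double.NaN) else None
    } else {
      // Stand-in for the cuDF min kernel on NaN-free data.
      Some(nonNan.reduce((a, b) => if (a <= b) a else b))
    }
  }
}
```

    NaN is returned only when nothing smaller exists, which is consistent with NaN being the largest value.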

  150. case class GpuFloatMax(child: Expression) extends GpuMax with GpuReplaceWindowFunction with Product with Serializable

    Max aggregation for FloatType and DoubleType to handle NaNs.

    In Spark, NaN is treated as the largest float value, whereas in cuDF the result of a calculation involving NaN is undefined. This workaround matches Spark's behaviour: in the projection stage, create an additional isNan column. If any value in that column is true, return NaN; otherwise return what GpuBasicMax returns.

  151. case class GpuFloatMin(child: Expression) extends GpuMin with GpuReplaceWindowFunction with Product with Serializable

    GpuMin for FloatType and DoubleType to handle NaNs.

    In Spark, NaN is treated as the largest float value, whereas in cuDF the result of a calculation involving NaN is undefined. This workaround matches Spark's behaviour. The high-level idea is:

      if the column contains only NaNs or nulls then
        if the column contains NaN then
          return NaN
        else
          return null
      else
        replace all NaNs with nulls;
        use the cuDF kernel to find the min value

  152. case class GpuFloor(child: Expression, outputType: DataType) extends CudfUnaryMathExpression with Product with Serializable
  153. case class GpuFormatNumber(x: Expression, d: Expression) extends BinaryExpression with GpuBinaryExpression with ExpectsInputTypes with NullIntolerant with Product with Serializable
  154. case class GpuFromUTCTimestamp(timestamp: Expression, timezone: Expression) extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  155. case class GpuFromUnixTime(sec: Expression, format: Expression, strfFormat: String, timeZoneId: Option[String] = None) extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with TimeZoneAwareExpression with ImplicitCastInputTypes with Product with Serializable
  156. case class GpuGetArrayItem(child: Expression, ordinal: Expression, failOnError: Boolean) extends BinaryExpression with GpuBinaryExpression with ExpectsInputTypes with ShimGetArrayItem with Product with Serializable

    Returns the field at ordinal in the Array child.

    Type checking is needed here because the ordinal expression may be unresolved.

  157. case class GpuGetArrayStructFields(child: Expression, field: StructField, ordinal: Int, numFields: Int, containsNull: Boolean) extends GpuUnaryExpression with ShimGetArrayStructFields with NullIntolerant with Product with Serializable

    For a child whose data type is an array of structs, extracts the ordinal-th fields of all array elements, and returns them as a new array.

    No need to do type checking since it is handled by 'ExtractValue'.
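    As a small illustration of the extraction described above, the sketch below models a struct as a positional sequence of field values. The type alias and method name are hypothetical and have nothing to do with the columnar representation the plugin actually uses.

```scala
object ArrayStructFieldsSketch {
  // A struct is modelled as a positional sequence of field values.
  type Struct = Seq[Any]

  // Pick field `ordinal` from every struct in the array.
  def getArrayStructFields(array: Seq[Struct], ordinal: Int): Seq[Any] =
    array.map(struct => struct(ordinal))
}
```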

  158. class GpuGetArrayStructFieldsMeta extends UnaryExprMeta[GetArrayStructFields]
  159. case class GpuGetMapValue(child: Expression, key: Expression, failOnError: Boolean) extends BinaryExpression with GpuBinaryExpression with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  160. case class GpuGetStructField(child: Expression, ordinal: Int, name: Option[String] = None) extends UnaryExpression with ShimUnaryExpression with GpuExpression with ShimGetStructField with NullIntolerant with Product with Serializable
  161. case class GpuGetTimestamp(strTs: Expression, format: Expression, sparkFormat: String, strf: String, timeZoneId: Option[String] = None) extends GpuToTimestamp with Product with Serializable
  162. case class GpuGreaterThan(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    The table below shows how the result is calculated for greater-than. To simplify the calculation, it relies on the fact that the cuDF result (r) is always false whenever either operand is NaN, so r can stand in for false where needed.

    In this case, return (lhs.nan && !rhs.nan) || result[i]

    +-------------+------------+-----------------+---------------+----+
    | lhs.isNan() | rhs.isNan  | cudf-result(r)  | final-result  | gt |
    +-------------+------------+-----------------+---------------+----+
    | t           | f          | f               | t             | t  |
    | f           | t          | f               | r             | f  |
    | t           | t          | f               | r             | f  |
    | f           | f          | r               | r             | na |
    +-------------+------------+-----------------+---------------+----+

  163. case class GpuGreaterThanOrEqual(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    The table below shows how the result is calculated for greater-than-or-equal. To simplify the calculation, it relies on the fact that the cuDF result (r) is always false whenever either operand is NaN, so r can stand in for false where needed.

    In this case, return lhs.isNan || result[i]

    +-------------+------------+-----------------+---------------+-----+
    | lhs.isNan() | rhs.isNan  | cudf-result(r)  | final-result  | gte |
    +-------------+------------+-----------------+---------------+-----+
    | t           | f          | f               | t             | t   |
    | f           | t          | f               | r             | f   |
    | t           | t          | f               | t             | t   |
    | f           | f          | r               | r             | NA  |
    +-------------+------------+-----------------+---------------+-----+
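    The rows of the greater-than and greater-than-or-equal tables can be reproduced on plain Scala doubles. This is a sketch with hypothetical names, in which the primitive `>` and `>=` operators (false whenever a NaN is involved) stand in for the cuDF result.

```scala
object NanAwareGreater {
  // Spark-style greater-than: NaN is greater than any non-NaN value.
  def sparkGt(lhs: Double, rhs: Double): Boolean =
    (lhs.isNaN && !rhs.isNaN) || lhs > rhs

  // Spark-style greater-than-or-equal: a NaN on the left always wins.
  def sparkGte(lhs: Double, rhs: Double): Boolean =
    lhs.isNaN || lhs >= rhs
}
```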

  164. case class GpuGreatest(children: Seq[Expression]) extends Expression with GpuGreatestLeastBase with Product with Serializable
  165. trait GpuGreatestLeastBase extends Expression with ComplexTypeMergingExpression with GpuExpression with ShimExpression
  166. abstract class GpuHashExpression extends Expression with GpuExpression with ShimExpression
  167. case class GpuHour(child: Expression, timeZoneId: Option[String] = None) extends GpuUnaryExpression with GpuTimeUnaryExpression with Product with Serializable
  168. case class GpuHypot(left: Expression, right: Expression) extends CudfBinaryMathExpression with Product with Serializable
  169. case class GpuInMemoryTableScanExec(attributes: Seq[Attribute], predicates: Seq[Expression], relation: InMemoryRelation) extends SparkPlan with ShimLeafExecNode with GpuExec with Product with Serializable
  170. case class GpuInitCap(child: Expression) extends GpuUnaryExpression with ImplicitCastInputTypes with Product with Serializable
  171. case class GpuInputFileBlockLength() extends GpuLeafExpression with Product with Serializable

    Returns the length of the block being read, or -1 if not available. This is especially difficult because batches cannot be coalesced between the input file and the point where this expression is used; otherwise the wrong value could be returned.

  172. case class GpuInputFileBlockStart() extends GpuLeafExpression with Product with Serializable

    Returns the start offset of the block being read, or -1 if not available. This is especially difficult because batches cannot be coalesced between the input file and the point where this expression is used; otherwise the wrong value could be returned.

  173. case class GpuInputFileName() extends GpuLeafExpression with Product with Serializable

    Returns the name of the file being read, or an empty string if not available. This is especially difficult because batches cannot be coalesced between the input file and the point where this expression is used; otherwise the wrong value could be returned.

  174. case class GpuInsertIntoHadoopFsRelationCommand(outputPath: Path, staticPartitions: TablePartitionSpec, ifPartitionNotExists: Boolean, partitionColumns: Seq[Attribute], bucketSpec: Option[BucketSpec], fileFormat: ColumnarFileFormat, options: Map[String, String], query: LogicalPlan, mode: SaveMode, catalogTable: Option[CatalogTable], fileIndex: Option[FileIndex], outputColumnNames: Seq[String], useStableSort: Boolean, concurrentWriterPartitionFlushSize: Long) extends LogicalPlan with GpuDataWritingCommand with Product with Serializable
  175. case class GpuIntegralDivide(left: Expression, right: Expression) extends GpuIntegralDivideParent with Product with Serializable
  176. abstract class GpuIntegralDivideParent extends CudfBinaryArithmetic with GpuDivModLike with Serializable
  177. case class GpuJsonToStructs(schema: DataType, options: Map[String, String], child: Expression, timeZoneId: Option[String] = None) extends GpuUnaryExpression with TimeZoneAwareExpression with ExpectsInputTypes with NullIntolerant with Product with Serializable
  178. case class GpuLast(child: Expression, ignoreNulls: Boolean) extends Expression with GpuAggregateFunction with GpuAggregateWindowFunction with GpuDeterministicFirstLastCollectShim with ImplicitCastInputTypes with Serializable with Product
  179. case class GpuLastDay(startDate: Expression) extends GpuUnaryExpression with ImplicitCastInputTypes with Product with Serializable
  180. case class GpuLeast(children: Seq[Expression]) extends Expression with GpuGreatestLeastBase with Product with Serializable
  181. case class GpuLength(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  182. case class GpuLessThan(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    The table below shows how the result is calculated for less-than. To simplify the calculation, it relies on the fact that the cuDF result (r) is always false whenever either operand is NaN, so r can stand in for false where needed.

    In this case, return (!lhs.nan && rhs.nan) || result[i]

    +-------------+------------+-----------------+---------------+-----+
    | lhs.isNan() | rhs.isNan  | cudf-result(r)  | final-result  | lt  |
    +-------------+------------+-----------------+---------------+-----+
    | t           | f          | f               | r             | f   |
    | f           | t          | f               | t             | t   |
    | t           | t          | f               | r             | f   |
    | f           | f          | r               | r             | NA  |
    +-------------+------------+-----------------+---------------+-----+

  183. case class GpuLessThanOrEqual(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    The table below shows how the result is calculated for less-than-or-equal. To simplify the calculation, it relies on the fact that the cuDF result (r) is always false whenever either operand is NaN, so r can stand in for false where needed.

    In this case, return rhs.nan || result[i]

    +-------------+------------+------------------+---------------+-----+
    | lhs.isNan() | rhs.isNan  | cudf-result(r)   | final-result  | lte |
    +-------------+------------+------------------+---------------+-----+
    | t           | f          | f                | r             | f   |
    | f           | t          | f                | t             | t   |
    | t           | t          | f                | t             | t   |
    | f           | f          | r                | r             | NA  |
    +-------------+------------+------------------+---------------+-----+
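    The less-than and less-than-or-equal tables mirror the greater-than ones, with the NaN test moved to the right-hand side. A sketch on plain Scala doubles, with hypothetical names and the primitive `<` and `<=` operators standing in for the cuDF result:

```scala
object NanAwareLess {
  // Spark-style less-than: any non-NaN value is less than NaN.
  def sparkLt(lhs: Double, rhs: Double): Boolean =
    (!lhs.isNaN && rhs.isNaN) || lhs < rhs

  // Spark-style less-than-or-equal: a NaN on the right always wins.
  def sparkLte(lhs: Double, rhs: Double): Boolean =
    rhs.isNaN || lhs <= rhs
}
```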

  184. case class GpuLike(left: Expression, right: Expression, escapeChar: Char) extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  185. case class GpuLog(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  186. case class GpuLogarithm(left: Expression, right: Expression) extends CudfBinaryMathExpression with Product with Serializable
  187. case class GpuLower(child: Expression) extends GpuUnaryString2StringExpression with Product with Serializable
  188. abstract class GpuM2 extends Expression with GpuAggregateFunction with ImplicitCastInputTypes with Serializable

    Base class for overriding standard deviation and variance aggregations. This is also a GPU-based implementation of 'CentralMomentAgg' aggregation class in Spark with the fixed 'momentOrder' variable set to '2'.
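    For readers unfamiliar with the second central moment, the sketch below shows one common way to maintain it: a Welford-style running update plus a pairwise merge of partial states. This is an assumption made for illustration, not a description of how GpuM2 is implemented, and all names are hypothetical.

```scala
object M2Sketch {
  // Running state: count, mean, and M2 (sum of squared deviations from the mean).
  final case class State(n: Long, mean: Double, m2: Double)

  val Empty: State = State(0L, 0.0, 0.0)

  // Welford update for one new value.
  def update(s: State, x: Double): State = {
    val n = s.n + 1
    val delta = x - s.mean
    val mean = s.mean + delta / n
    State(n, mean, s.m2 + delta * (x - mean))
  }

  // Merge two partial states, e.g. from different batches.
  def merge(a: State, b: State): State =
    if (a.n == 0) b
    else if (b.n == 0) a
    else {
      val n = a.n + b.n
      val delta = b.mean - a.mean
      State(n, a.mean + delta * b.n / n, a.m2 + b.m2 + delta * delta * a.n * b.n / n)
    }

  // Population and sample variance derived from M2.
  def variancePop(s: State): Double = s.m2 / s.n
  def varianceSamp(s: State): Double = s.m2 / (s.n - 1)
}
```

    Standard deviation is the square root of the corresponding variance, which is why one M2 buffer can serve both kinds of aggregation.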

  189. case class GpuMapConcat(children: Seq[Expression]) extends Expression with GpuComplexTypeMergingExpression with Product with Serializable
  190. case class GpuMapEntries(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  191. case class GpuMapKeys(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  192. case class GpuMapValues(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  193. abstract class GpuMax extends Expression with GpuAggregateFunction with GpuBatchedRunningWindowWithFixer with GpuUnboundToUnboundWindowWithFixer with GpuAggregateWindowFunction with GpuRunningWindowFunction with Serializable
  194. case class GpuMd5(child: Expression) extends GpuUnaryExpression with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  195. case class GpuMicrosToTimestamp(child: Expression) extends GpuUnaryExpression with GpuNumberToTimestampUnaryExpression with Product with Serializable
  196. case class GpuMillisToTimestamp(child: Expression) extends GpuUnaryExpression with GpuNumberToTimestampUnaryExpression with Product with Serializable
  197. abstract class GpuMin extends Expression with GpuAggregateFunction with GpuBatchedRunningWindowWithFixer with GpuUnboundToUnboundWindowWithFixer with GpuAggregateWindowFunction with GpuRunningWindowFunction with Serializable
  198. case class GpuMinute(child: Expression, timeZoneId: Option[String] = None) extends GpuUnaryExpression with GpuTimeUnaryExpression with Product with Serializable
  199. case class GpuMonth(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  200. class GpuMultiFileAvroPartitionReader extends MultiFileCoalescingPartitionReaderBase with GpuAvroReaderBase

    A PartitionReader that can read multiple AVRO files up to a certain size. It coalesces small files together and copies the block data in a separate thread pool to speed up processing of the small files before sending them down to the GPU.

  201. class GpuMultiFileCloudAvroPartitionReader extends MultiFileCloudPartitionReaderBase with MultiFileReaderFunctions with GpuAvroReaderBase

    A PartitionReader that can read multiple AVRO files in parallel. This is most efficient running in a cloud environment where the I/O of reading is slow.

    When reading a file, it

    • seeks to the start position of the first block located in this partition;
    • parses the meta and sync, rewrites the meta and sync, and copies the data to a batch buffer per block, until it reaches the last block of the current partition;
    • finally, sends the batches to the GPU.
  202. case class GpuMultiply(left: Expression, right: Expression) extends CudfBinaryArithmetic with Product with Serializable
  203. case class GpuMurmur3Hash(children: Seq[Expression], seed: Int) extends GpuHashExpression with Product with Serializable
  204. case class GpuNormalizeNaNAndZero(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  205. case class GpuNot(child: Expression) extends GpuUnaryExpression with CudfUnaryExpression with Predicate with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  206. case class GpuNthValue(child: Expression, offset: Expression, ignoreNulls: Boolean) extends Expression with GpuAggregateWindowFunction with ImplicitCastInputTypes with Serializable with Product
  207. trait GpuNumberToTimestampUnaryExpression extends GpuUnaryExpression
  208. case class GpuOctetLength(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  209. case class GpuOr(left: Expression, right: Expression) extends CudfBinaryPredicateWithSideEffect with Product with Serializable
  210. class GpuOrcFileFormat extends ColumnarFileFormat with Logging
  211. class GpuOrcWriter extends ColumnarOutputWriter
  212. class GpuPartitionwiseSampledRDD extends PartitionwiseSampledRDD[ColumnarBatch, ColumnarBatch]
  213. case class GpuPivotFirst(pivotColumn: Expression, valueColumn: Expression, pivotColumnValues: Seq[Any]) extends Expression with GpuAggregateFunction with Product with Serializable
  214. case class GpuPmod(left: Expression, right: Expression) extends GpuPmodBase with Product with Serializable
  215. abstract class GpuPmodBase extends CudfBinaryArithmetic with GpuDivModLike with Serializable
  216. class GpuPoissonSampler extends PoissonSampler[ColumnarBatch]
  217. case class GpuPow(left: Expression, right: Expression) extends CudfBinaryMathExpression with Product with Serializable
  218. case class GpuPreciseTimestampConversion(child: Expression, fromType: DataType, toType: DataType) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable

    Expression used internally to convert the TimestampType to Long and back without losing precision, i.e. in microseconds. Used in time windowing.
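    To illustrate what a lossless round trip at microsecond precision looks like, the sketch below uses `java.time.Instant` as a stand-in for a timestamp value. The object and method names are hypothetical; the GPU expression itself operates on columns rather than on individual instants.

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

object PreciseTimestampSketch {
  // Timestamp -> microseconds since the Unix epoch.
  def toMicros(ts: Instant): Long =
    ChronoUnit.MICROS.between(Instant.EPOCH, ts)

  // Microseconds since the Unix epoch -> timestamp.
  def fromMicros(micros: Long): Instant =
    Instant.EPOCH.plus(micros, ChronoUnit.MICROS)
}
```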

  219. case class GpuQuarter(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  220. case class GpuRLike(left: Expression, right: Expression, pattern: String) extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  221. class GpuRLikeMeta extends BinaryExprMeta[RLike]
  222. case class GpuRaiseError(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  223. class GpuReadAvroFileFormat extends AvroFileFormat with GpuReadFileFormatWithMetrics

    A FileFormat that allows reading Avro files with the GPU.

  224. case class GpuRegExpExtract(subject: Expression, regexp: Expression, idx: Expression)(cudfRegexPattern: String) extends GpuRegExpTernaryBase with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  225. case class GpuRegExpExtractAll(str: Expression, regexp: Expression, idx: Expression)(cudfRegexPattern: String) extends GpuRegExpTernaryBase with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  226. class GpuRegExpExtractAllMeta extends TernaryExprMeta[RegExpExtractAll]
  227. class GpuRegExpExtractMeta extends TernaryExprMeta[RegExpExtract]
  228. case class GpuRegExpReplace(srcExpr: Expression, searchExpr: Expression, replaceExpr: Expression)(javaRegexpPattern: String, cudfRegexPattern: String, cudfReplacementString: String, searchList: Option[Seq[String]], replaceOpt: Option[GpuRegExpReplaceOpt]) extends GpuRegExpTernaryBase with ImplicitCastInputTypes with HasGpuStringReplace with Product with Serializable
  229. case class GpuRegExpReplaceWithBackref(child: Expression, searchExpr: Expression, replaceExpr: Expression)(javaRegexpPattern: String, cudfRegexPattern: String, cudfReplacementString: String) extends GpuUnaryExpression with ImplicitCastInputTypes with Product with Serializable
  230. abstract class GpuRegExpTernaryBase extends TernaryExpression with GpuTernaryExpressionArgsAnyScalarScalar
  231. case class GpuRemainder(left: Expression, right: Expression) extends GpuRemainderBase with Product with Serializable
  232. abstract class GpuRemainderBase extends CudfBinaryArithmetic with GpuDivModLike with Serializable
  233. case class GpuReverse(child: Expression) extends GpuUnaryExpression with Product with Serializable
  234. case class GpuRint(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  235. case class GpuRound(child: Expression, scale: Expression, outputType: DataType) extends GpuRoundBase with Product with Serializable
  236. abstract class GpuRoundBase extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with Serializable with ImplicitCastInputTypes
  237. case class GpuRowBasedScalaUDF(sparkFunc: AnyRef, dataType: DataType, children: Seq[Expression], inputEncoders: Seq[Option[ExpressionEncoder[_]]], outputEncoder: Option[ExpressionEncoder[_]], udfName: Option[String], nullable: Boolean, udfDeterministic: Boolean) extends Expression with GpuRowBasedUserDefinedFunction with Product with Serializable
  238. case class GpuScalaUDF(function: RapidsUDF, dataType: DataType, children: Seq[Expression], udfName: Option[String], nullable: Boolean, udfDeterministic: Boolean) extends Expression with GpuUserDefinedFunction with Product with Serializable
  239. case class GpuScalarSubquery(plan: BaseSubqueryExec, exprId: ExprId) extends ExecSubqueryExpression with GpuExpression with ShimExpression with Product with Serializable

    GPU placeholder of ScalarSubquery, which returns the scalar result with the columnarEval method. This placeholder makes ScalarSubquery work as a GpuExpression so that it can cooperate with other GPU overrides.

  240. case class GpuSecond(child: Expression, timeZoneId: Option[String] = None) extends GpuUnaryExpression with GpuTimeUnaryExpression with Product with Serializable
  241. case class GpuSecondsToTimestamp(child: Expression) extends GpuUnaryExpression with GpuNumberToTimestampUnaryExpression with Product with Serializable
  242. case class GpuSequence(start: Expression, stop: Expression, stepOpt: Option[Expression], timeZoneId: Option[String] = None) extends Expression with TimeZoneAwareExpression with GpuExpression with ShimExpression with Product with Serializable
  243. class GpuSequenceMeta extends ExprMeta[Sequence]
  244. class GpuSerializableBatch extends Serializable with AutoCloseable
    Annotations
    @SerialVersionUID()
  245. trait GpuShiftBase extends BinaryExpression with GpuBinaryExpression with ImplicitCastInputTypes
  246. case class GpuShiftLeft(left: Expression, right: Expression) extends BinaryExpression with GpuShiftBase with Product with Serializable
  247. case class GpuShiftRight(left: Expression, right: Expression) extends BinaryExpression with GpuShiftBase with Product with Serializable
  248. case class GpuShiftRightUnsigned(left: Expression, right: Expression) extends BinaryExpression with GpuShiftBase with Product with Serializable
  249. abstract class GpuShuffleBlockResolverBase extends ShuffleBlockResolver with Logging
  250. class GpuShuffleDependency[K, V, C] extends ShuffleDependency[K, V, C]
  251. class GpuShuffleEnv extends Logging
  252. class GpuShuffleHandle[K, V] extends BaseShuffleHandle[K, V, V]
  253. case class GpuSignum(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  254. case class GpuSin(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  255. class GpuSingleDirectoryDataWriter extends GpuFileFormatDataWriter

    Writes data to a single directory (used for non-dynamic-partition writes).

  256. case class GpuSinh(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  257. case class GpuSize(child: Expression, legacySizeOfNull: Boolean) extends GpuUnaryExpression with Product with Serializable
  258. case class GpuSortArray(base: Expression, ascendingOrder: Expression) extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with ExpectsInputTypes with Product with Serializable
  259. case class GpuSqrt(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  260. case class GpuStartsWith(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with Predicate with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  261. case class GpuStddevPop(child: Expression, nullOnDivideByZero: Boolean) extends GpuM2 with Product with Serializable
  262. case class GpuStddevSamp(child: Expression, nullOnDivideByZero: Boolean) extends GpuM2 with GpuReplaceWindowFunction with Product with Serializable
  263. case class GpuStringInstr(str: Expression, substr: Expression) extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  264. case class GpuStringLPad(str: Expression, len: Expression, pad: Expression) extends TernaryExpression with BasePad with Product with Serializable
  265. case class GpuStringLocate(substr: Expression, col: Expression, start: Expression) extends TernaryExpression with GpuTernaryExpressionArgsScalarAnyScalar with ImplicitCastInputTypes with Product with Serializable
  266. case class GpuStringRPad(str: Expression, len: Expression, pad: Expression) extends TernaryExpression with BasePad with Product with Serializable
  267. case class GpuStringRepeat(input: Expression, repeatTimes: Expression) extends BinaryExpression with GpuBinaryExpression with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  268. case class GpuStringReplace(srcExpr: Expression, searchExpr: Expression, replaceExpr: Expression) extends TernaryExpression with GpuTernaryExpressionArgsAnyScalarScalar with ImplicitCastInputTypes with HasGpuStringReplace with Product with Serializable
  269. case class GpuStringSplit(str: Expression, regex: Expression, limit: Expression, pattern: String, isRegExp: Boolean) extends TernaryExpression with GpuTernaryExpression with ImplicitCastInputTypes with Product with Serializable
  270. class GpuStringSplitMeta extends StringSplitRegExpMeta[StringSplit]
  271. case class GpuStringToMap(strExpr: Expression, pairDelimExpr: Expression, keyValueDelimExpr: Expression, pairDelim: String, isPairDelimRegExp: Boolean, keyValueDelim: String, isKeyValueDelimRegExp: Boolean) extends Expression with GpuExpression with ShimExpression with ExpectsInputTypes with Product with Serializable
  272. class GpuStringToMapMeta extends StringSplitRegExpMeta[StringToMap]
  273. case class GpuStringTranslate(srcExpr: Expression, fromExpr: Expression, toExpr: Expression) extends TernaryExpression with GpuTernaryExpressionArgsAnyScalarScalar with ImplicitCastInputTypes with Product with Serializable
  274. case class GpuStringTrim(column: Expression, trimParameters: Option[Expression] = None) extends Expression with GpuString2TrimExpression with ImplicitCastInputTypes with Product with Serializable
  275. case class GpuStringTrimLeft(column: Expression, trimParameters: Option[Expression] = None) extends Expression with GpuString2TrimExpression with ImplicitCastInputTypes with Product with Serializable
  276. case class GpuStringTrimRight(column: Expression, trimParameters: Option[Expression] = None) extends Expression with GpuString2TrimExpression with ImplicitCastInputTypes with Product with Serializable
  277. case class GpuSubstring(str: Expression, pos: Expression, len: Expression) extends TernaryExpression with GpuTernaryExpression with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  278. case class GpuSubstringIndex(strExpr: Expression, regexp: String, ignoredDelimExpr: Expression, ignoredCountExpr: Expression) extends TernaryExpression with GpuTernaryExpressionArgsAnyScalarScalar with ImplicitCastInputTypes with Product with Serializable
  279. case class GpuSubtract(left: Expression, right: Expression, failOnError: Boolean) extends GpuSubtractBase with Product with Serializable
  280. abstract class GpuSubtractBase extends CudfBinaryArithmetic with Serializable
  281. abstract class GpuSum extends Expression with GpuAggregateFunction with ImplicitCastInputTypes with GpuBatchedRunningWindowWithFixer with GpuAggregateWindowFunction with GpuRunningWindowFunction with Serializable
  282. case class GpuTan(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  283. case class GpuTanh(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  284. class GpuTaskMetrics extends Serializable
  285. abstract class GpuTimeMath extends BinaryExpression with ShimBinaryExpression with GpuExpression with TimeZoneAwareExpression with ExpectsInputTypes with Serializable
  286. trait GpuTimeUnaryExpression extends GpuUnaryExpression with TimeZoneAwareExpression with ImplicitCastInputTypes with NullIntolerant
  287. trait GpuToCpuAggregateBufferConverter extends AnyRef
  288. trait GpuToCpuBufferTransition extends UnaryExpression with ShimUnaryExpression with CodegenFallback
  289. class GpuToCpuCollectBufferConverter extends GpuToCpuAggregateBufferConverter
  290. case class GpuToCpuCollectBufferTransition(child: Expression) extends UnaryExpression with GpuToCpuBufferTransition with Product with Serializable
  291. case class GpuToDegrees(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  292. case class GpuToRadians(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  293. abstract class GpuToTimestamp extends BinaryExpression with GpuBinaryExpressionArgsAnyScalar with TimeZoneAwareExpression with ExpectsInputTypes

    A direct conversion of Spark's ToTimestamp class which converts time to UNIX timestamp by first converting to microseconds and then dividing by the downScaleFactor

  294. abstract class GpuToTimestampImproved extends GpuToTimestamp

    An improved version of GpuToTimestamp conversion which converts time to UNIX timestamp without first converting to microseconds

  295. case class GpuToUnixTimestamp(strTs: Expression, format: Expression, sparkFormat: String, strf: String, timeZoneId: Option[String] = None) extends GpuToTimestamp with Product with Serializable
  296. case class GpuToUnixTimestampImproved(strTs: Expression, format: Expression, sparkFormat: String, strf: String, timeZoneId: Option[String] = None) extends GpuToTimestampImproved with Product with Serializable
  297. abstract class GpuUnaryMathExpression extends GpuUnaryExpression with Serializable with ImplicitCastInputTypes
  298. case class GpuUnaryMinus(child: Expression, failOnError: Boolean) extends GpuUnaryExpression with ExpectsInputTypes with NullIntolerant with Product with Serializable
  299. case class GpuUnaryPositive(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with NullIntolerant with Product with Serializable
  300. abstract class GpuUnaryString2StringExpression extends GpuUnaryExpression with ExpectsInputTypes
  301. case class GpuUnixTimestamp(strTs: Expression, format: Expression, sparkFormat: String, strf: String, timeZoneId: Option[String] = None) extends GpuToTimestamp with Product with Serializable
  302. case class GpuUnixTimestampImproved(strTs: Expression, format: Expression, sparkFormat: String, strf: String, timeZoneId: Option[String] = None) extends GpuToTimestampImproved with Product with Serializable
  303. case class GpuUpper(child: Expression) extends GpuUnaryString2StringExpression with Product with Serializable
  304. case class GpuVariancePop(child: Expression, nullOnDivideByZero: Boolean) extends GpuM2 with Product with Serializable
  305. case class GpuVarianceSamp(child: Expression, nullOnDivideByZero: Boolean) extends GpuM2 with Product with Serializable
  306. case class GpuWeekDay(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  307. class GpuWriteJobDescription extends Serializable

    A shared job description for all the GPU write tasks. This is the GPU version of org.apache.spark.sql.execution.datasources.WriteJobDescription.

  308. class GpuWriteJobStatsTracker extends BasicColumnarWriteJobStatsTracker

    Simple ColumnarWriteJobStatsTracker implementation that's serializable, capable of instantiating GpuWriteTaskStatsTracker on executors and processing the WriteTaskStats they produce by aggregating the metrics and posting them as DriverMetricUpdates.

  309. class GpuWriteTaskStatsTracker extends BasicColumnarWriteTaskStatsTracker

    ColumnarWriteTaskStatsTracker implementation that produces WriteTaskStats and tracks writing times per task.

  310. case class GpuWriterBucketSpec(bucketIdExpression: Expression, bucketFileNamePrefix: (Int) ⇒ String) extends Product with Serializable

    Bucketing specification for all the write tasks. This is the GPU version of org.apache.spark.sql.execution.datasources.WriterBucketSpec

    bucketIdExpression

    Expression to calculate bucket id based on bucket column(s).

    bucketFileNamePrefix

    Prefix of output file name based on bucket id.

  311. case class GpuXxHash64(children: Seq[Expression], seed: Long) extends GpuHashExpression with Product with Serializable
  312. case class GpuYear(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  313. trait HasGpuStringReplace extends AnyRef
  314. class InMemoryTableScanMeta extends SparkPlanMeta[InMemoryTableScanExec]
  315. class NanoSecondAccumulator extends AccumulatorV2[Long, NanoTime]
  316. case class NanoTime(value: Long) extends Product with Serializable
  317. case class ParseFormatMeta(separator: Option[Char], isTimestamp: Boolean, validRegex: String) extends Product with Serializable
  318. class ProxyRapidsShuffleInternalManagerBase extends RapidsShuffleManagerLike with Proxy

    A simple proxy wrapper allowing to delay loading of the real implementation to a later point when ShimLoader has already updated Spark classloaders.
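
    The deferred-loading idea described above can be sketched with a lazily initialized delegate. This is only an illustration under assumed names: Manager, RealManager, and LazyProxy are hypothetical and are not classes from the plugin, and the real proxy may resolve its implementation differently.

```scala
// Hypothetical names throughout; this is not the plugin's implementation.
object LazyProxyExample {
  trait Manager { def name: String }

  // Counts constructions so we can observe when the real object is built.
  var constructed = 0

  class RealManager extends Manager {
    constructed += 1
    def name: String = "real"
  }

  // The proxy is cheap to create; the factory runs only on first use,
  // which is the point at which the real implementation gets loaded.
  class LazyProxy(factory: () => Manager) extends Manager {
    private lazy val delegate: Manager = factory()
    def name: String = delegate.name
  }
}
```

    Creating a LazyProxy does not construct the RealManager; the first call to name does, and later calls reuse the same delegate.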

  319. class RapidsCachingReader[K, C] extends ShuffleReader[K, C] with Logging
  320. class RapidsCachingWriter[K, V] extends ShuffleWriter[K, V] with Logging
  321. class RapidsDiskBlockManager extends AnyRef

    Maps logical blocks to local disk locations.

  322. abstract class RapidsShuffleInternalManagerBase extends ShuffleManager with RapidsShuffleHeartbeatHandler with Logging

    A shuffle manager optimized for the RAPIDS Plugin For Apache Spark.

    Note

    This is an internal class to obtain access to the private ShuffleManager and SortShuffleManager classes. When configuring Apache Spark to use the RAPIDS shuffle manager,

  323. trait RapidsShuffleManagerLike extends AnyRef

    Trait that makes it easy to check whether we are dealing with a RAPIDS Shuffle Manager.

  324. abstract class RapidsShuffleThreadedReaderBase[K, C] extends ShuffleReader[K, C] with Logging
  325. abstract class RapidsShuffleThreadedWriterBase[K, V] extends ShuffleWriter[K, V] with RapidsShuffleWriterShimHelper with Logging
  326. trait RapidsShuffleWriterShimHelper extends AnyRef
  327. case class RegexReplace(search: String, replace: String) extends Product with Serializable
  328. class ShimmedExecutionPlanCaptureCallbackImpl extends ExecutionPlanCaptureCallbackBase

    Note that the name is prefixed with "Shimmed" such that wildcard rules under unshimmed-common-from-spark311.txt don't get confused and pick this class to be un-shimmed.

  329. class ShuffleHandleWithMetrics[K, V, C] extends BaseShuffleHandle[K, V, C]
  330. trait ShuffleMetricsUpdater extends AnyRef
  331. abstract class StringSplitRegExpMeta[INPUT <: TernaryExpression] extends TernaryExprMeta[INPUT]
  332. class SubstringIndexMeta extends TernaryExprMeta[SubstringIndex]
  333. case class TempSpillBufferId extends RapidsBufferId with Product with Serializable
  334. class ThreadSafeShuffleWriteMetricsReporter extends ShuffleWriteMetrics

    The ShuffleWriteMetricsReporter is based on accumulators, which are not thread safe. This class is a thin wrapper that adds synchronization, since these metrics will be written by multiple threads.
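
    A minimal sketch of the thin-wrapper pattern described above, assuming a simple counter in place of the real accumulator-backed metrics. UnsafeCounter, SynchronizedCounter, and runWriters are hypothetical names used only for illustration.

```scala
// Hypothetical stand-ins; the real reporter wraps Spark shuffle write metrics.
object SynchronizedCounterExample {
  // Stand-in for an accumulator-backed metric: not safe for concurrent use.
  class UnsafeCounter {
    private var total = 0L
    def add(v: Long): Unit = { total += v }
    def value: Long = total
  }

  // Thin wrapper: same operations, each guarded by the wrapper's own lock.
  class SynchronizedCounter(underlying: UnsafeCounter) {
    def add(v: Long): Unit = synchronized { underlying.add(v) }
    def value: Long = synchronized { underlying.value }
  }

  // Start `threads` writers that each add 1 to the counter `perThread` times.
  def runWriters(threads: Int, perThread: Int): Long = {
    val counter = new SynchronizedCounter(new UnsafeCounter)
    val workers = (1 to threads).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = (1 to perThread).foreach(_ => counter.add(1L))
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    counter.value
  }
}
```

    Because every update goes through the wrapper's lock, no increments are lost when several threads write at once.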

  335. sealed trait TimeParserPolicy extends Serializable
  336. abstract class UnixTimeExprMeta[A <: BinaryExpression with TimeZoneAwareExpression] extends BinaryExprMeta[A]
  337. case class WindowStddevSamp(child: Expression, nullOnDivideByZero: Boolean) extends Expression with GpuAggregateWindowFunction with Product with Serializable
  338. case class WrappedAggFunction(aggregateFunction: GpuAggregateFunction, filter: Expression) extends Expression with GpuAggregateFunction with Product with Serializable

Value Members

  1. object AddOverflowChecks
  2. object BasicColumnarWriteJobStatsTracker extends Serializable
  3. object CorrectedTimeParserPolicy extends TimeParserPolicy
  4. object CudfAll

    Check if all values in a boolean column are true. The CUDF all aggregation does not work for reductions or group by aggregations, so we use Min as a workaround for this.

  5. object CudfAny

    Check if there is a true value in a boolean column. The CUDF any aggregation does not work for reductions or group by aggregations so we use Max as a workaround for this.
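
    The Min and Max workarounds described for CudfAll and CudfAny rest on a simple identity, sketched here on plain Scala collections rather than cuDF columns. The function names are hypothetical, and the sketch assumes a non-empty input with no nulls.

```scala
// Plain-collection illustration of the identity; not the cuDF aggregation itself.
object MinMaxBooleanExample {
  // View booleans numerically: false -> 0, true -> 1.
  private def asInts(values: Seq[Boolean]): Seq[Int] =
    values.map(b => if (b) 1 else 0)

  // "all" holds exactly when the minimum is 1, i.e. no 0 is present.
  def allViaMin(values: Seq[Boolean]): Boolean = asInts(values).min == 1

  // "any" holds exactly when the maximum is 1, i.e. at least one 1 is present.
  def anyViaMax(values: Seq[Boolean]): Boolean = asInts(values).max == 1
}
```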

  6. object CudfNthLikeAggregate extends Serializable
  7. object CudfRegexp
  8. object DecimalDivideChecks
  9. object DecimalMultiplyChecks
  10. object ExceptionTimeParserPolicy extends TimeParserPolicy
  11. object ExecutionPlanCaptureCallback extends ExecutionPlanCaptureCallbackBase
  12. object ExternalSource extends Logging

    The subclass of AvroProvider imports spark-avro classes. This file should not import spark-avro classes, because a class-not-found exception may be thrown if spark-avro is not present at runtime. For details, see: https://github.com/NVIDIA/spark-rapids/issues/5648

  13. object GpuAnsi
  14. object GpuArrayMax extends Serializable
  15. object GpuArrayMin extends Serializable
  16. object GpuAverage extends Serializable
  17. object GpuAvroScan extends Serializable
  18. object GpuCreateMap extends Serializable
  19. object GpuDataSourceBase extends Logging
  20. object GpuDataSourceScanExec extends Serializable
  21. object GpuDecimalSumOverflow

    All decimal processing in Spark has overflow detection as a part of it. Either it replaces the value with a null in non-ANSI mode, or it throws an exception in ANSI mode. Spark will also do the processing for larger values as Decimal values which are based on BigDecimal and have unbounded precision. So in most cases it is impossible to overflow/underflow so much that an incorrect value is returned. Spark will just use more and more memory to hold the value and then check for overflow at some point when the result needs to be turned back into a 128-bit value.

    We cannot do the same thing. Instead we take four strategies to detect overflow.

    1. For decimal values with a precision of 8 or under, we follow Spark and do the SUM on the unscaled value as a long, and then bit-cast the result back to a Decimal value. This means that we can SUM 174,467,442,481 maximum or minimum decimal values with a precision of 8 before overflow can no longer be detected. The count is much higher for decimal values with a smaller precision.
    2. For decimal values with a precision from 9 to 20 inclusive, we sum them as 128-bit values. This is very similar to the first strategy. The main differences are that we use a 128-bit value when doing the sum, and we check for overflow after processing each batch. For group-by and reduction aggregations, that happens after the update stage and also after each merge stage. This gives us enough room that we can always detect overflow when summing a single batch, even on a merge where the aggregation could run on a batch that has all max output values in it.
    3. For values with a precision from 21 to 28 inclusive, we have enough room to not check for overflow on the update aggregation, but for the merge aggregation we need to do some extra checks. This is done by taking the digits above 28 and summing them separately. We then check whether they would have overflowed the original limits. This lets us detect overflow in cases where the original value would have wrapped around. This works because there is a hard limit on the number of values in a single batch being processed: Int.MaxValue, or about 2.1 billion values. So we use a precision on the higher values that is large enough to handle that many values and still detect overflow. This equates to a precision of about 10 more than is needed to hold the higher digits, which effectively gives us unlimited overflow detection.
    4. For anything larger than precision 28, we do the same overflow detection as in strategy 3, but also do it on the update aggregation. This lets us fully detect overflows in any stage of an aggregation.

    Note that for Window operations either there is no merge stage or it only has a single value being merged into a batch instead of an entire batch being merged together. This lets us handle the overflow detection with what is built into GpuAdd.
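
    The "about 10 more digits" figure in the third strategy follows from the batch size limit: Int.MaxValue has 10 decimal digits, so a sum of at most Int.MaxValue values needs at most 10 digits beyond the precision of each value. The arithmetic can be checked with BigInt. This is only a worked calculation under that assumption, with hypothetical function names; it is not the plugin's overflow check.

```scala
// Worked arithmetic only; names are hypothetical and no cuDF types are involved.
object DecimalHeadroomExample {
  // Largest unscaled magnitude representable with `precision` decimal digits.
  def maxUnscaled(precision: Int): BigInt = BigInt(10).pow(precision) - 1

  // Worst-case sum of `count` values that all sit at the maximum for `precision`.
  def worstCaseSum(precision: Int, count: Long): BigInt =
    maxUnscaled(precision) * BigInt(count)

  // Number of decimal digits in a non-negative BigInt.
  def digits(v: BigInt): Int = v.toString.length

  // Extra digits needed so a full batch of Int.MaxValue rows stays in range.
  def extraDigitsForFullBatch(precision: Int): Int =
    digits(worstCaseSum(precision, Int.MaxValue.toLong)) - precision
}
```

    For any precision of at least 1, extraDigitsForFullBatch returns 10, which matches the headroom quoted in the description.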

  22. object GpuDivModLike
  23. object GpuElementAtMeta
  24. object GpuFileFormatDataWriter
  25. object GpuFileFormatWriter extends Logging

    A helper object for writing columnar data out to a location.

  26. object GpuFileSourceScanExec extends Serializable
  27. object GpuFloorCeil
  28. object GpuHypot extends Serializable
  29. object GpuLogarithm extends Serializable
  30. object GpuMax extends Serializable
  31. object GpuMin extends Serializable
  32. object GpuMurmur3Hash extends Serializable
  33. object GpuOrcFileFormat extends Logging
  34. object GpuReadAvroFileFormat extends Serializable
  35. object GpuRegExpUtils
  36. object GpuScalaUDF extends Serializable
  37. object GpuScalaUDFMeta
  38. object GpuSequenceUtil
  39. object GpuShuffleEnv extends Logging
  40. object GpuSubstringIndex extends Serializable
  41. object GpuSum extends Serializable
  42. object GpuTaskMetrics extends Logging with Serializable

    Provides task level metrics

  43. object GpuToTimestamp
  44. object GpuV1WriteUtils
  45. object GpuWriteJobStatsTracker extends Serializable
  46. object InputFileUtils
  47. object LegacyTimeParserPolicy extends TimeParserPolicy
  48. object PCBSSchemaHelper
  49. object RapidsPrivateUtil
  50. object RapidsShuffleInternalManagerBase extends Logging
  51. object ShiftHelper
  52. object TempSpillBufferId extends Serializable

Ungrouped