Packages

package shims


Type Members

  1. case class AvoidAdaptiveTransitionToRow(child: SparkPlan) extends SparkPlan with ShimUnaryExecNode with GpuExec with Product with Serializable

    This operator attempts to optimize the case where the results of an adaptive query are written to disk, by removing the redundant columnar-to-row transition inside AdaptiveSparkPlanExec followed by a row-to-columnar transition.

    Specifically, this is the plan we see in this case:

    GpuRowToColumnar(AdaptiveSparkPlanExec(GpuColumnarToRow(child)))

    We perform this optimization at runtime rather than during planning, because when the adaptive plan is being planned and executed, we don't know whether it is being called from an operation that wants rows (such as CollectTailExec) or from an operation that wants columns (such as GpuDataWritingCommandExec).

    Spark does not provide a mechanism for executing an adaptive plan and retrieving columnar results and the internal methods that we need to call are private, so we use reflection to call them.

    child

    The plan to execute
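    Because the methods this operator needs on AdaptiveSparkPlanExec are private, they are invoked via reflection. A minimal, self-contained sketch of that pattern follows; the `Planner` class and `finalPlan` method are illustrative stand-ins, not the actual Spark internals.

```scala
// Illustrative only: invoking a private method via Java reflection, the same
// pattern used to reach the private AdaptiveSparkPlanExec internals.
class Planner {
  private def finalPlan(tag: String): String = s"final:$tag"
}

object ReflectiveCall {
  def invokeFinalPlan(p: Planner, tag: String): String = {
    val m = classOf[Planner].getDeclaredMethod("finalPlan", classOf[String])
    m.setAccessible(true) // bypass the private modifier
    m.invoke(p, tag).asInstanceOf[String]
  }
}
```

    The trade-off is the usual one with reflection: the call compiles regardless of Spark version, but breaks at runtime if the private method is renamed, which is why it lives in a version-specific shim.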

  2. class BatchScanExecMeta extends SparkPlanMeta[BatchScanExec]
  3. final class CreateDataSourceTableAsSelectCommandMeta extends DataWritingCommandMeta[CreateDataSourceTableAsSelectCommand]
  4. abstract class GetMapValueMeta extends BinaryExprMeta[GetMapValue]

    We define this type in the shim layer because GetMapValue doesn't have the field failOnError since Spark 3.4.0 and it always returns null on invalid access to map column in ANSI mode.
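    The behavioral split can be illustrated in plain Scala. This is a hedged sketch of the two semantics, not the actual Spark expression code: before 3.4.0 the failOnError flag decided whether an invalid key throws, while since 3.4.0 a missing key always yields null.

```scala
// Illustrative sketch of the two GetMapValue behaviors; not Spark code.
object MapValueSemantics {
  // Pre-3.4.0 semantics: failOnError controls whether a missing key throws.
  def getPre340(m: Map[String, String], key: String, failOnError: Boolean): String =
    m.get(key) match {
      case Some(v)              => v
      case None if failOnError  => throw new NoSuchElementException(s"Key $key does not exist.")
      case None                 => null
    }

  // 3.4.0+ semantics: no failOnError field; a missing key is always null.
  def getPost340(m: Map[String, String], key: String): String =
    m.getOrElse(key, null)
}
```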

  5. class GpuAggregateInPandasExecMeta extends SparkPlanMeta[AggregateInPandasExec]
  6. case class GpuBatchScanExec(output: Seq[AttributeReference], scan: GpuScan) extends SparkPlan with DataSourceV2ScanExecBase with GpuBatchScanExecMetrics with Product with Serializable
  7. abstract class GpuBroadcastJoinMeta[INPUT <: SparkPlan] extends SparkPlanMeta[INPUT]
  8. trait GpuCreateHiveTableAsSelectBase extends LogicalPlan with GpuDataWritingCommand

    GPU version of Spark's CreateHiveTableAsSelectBase

  9. class GpuCustomShuffleReaderMeta extends SparkPlanMeta[CustomShuffleReaderExec]
  10. class GpuDataSourceRDD extends DataSourceRDD

    A replacement for DataSourceRDD that does NOT compute the bytes read input metric. DataSourceRDD assumes all reads occur on the task thread, and some GPU input sources use multithreaded readers that cannot generate proper metrics with DataSourceRDD.

    Note

    It is the responsibility of users of this RDD to generate the bytes read input metric explicitly!
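    The shape of "generate the metric explicitly" can be sketched in plain Scala. This is an assumption-laden illustration, not the plugin's implementation: the `reportBytesRead` callback stands in for Spark's InputMetrics, which is not modeled here.

```scala
import java.util.concurrent.atomic.LongAdder

// Illustrative sketch: a multithreaded reader accumulates bytes read off the
// task thread, so the caller must publish the total explicitly when done
// rather than relying on per-read accounting on the task thread.
class MultithreadedReader(reportBytesRead: Long => Unit) {
  private val bytesRead = new LongAdder

  // May be called from any reader thread.
  def readChunk(chunk: Array[Byte]): Unit = bytesRead.add(chunk.length)

  // Call from the task thread once reading finishes.
  def close(): Unit = reportBytesRead(bytesRead.sum())
}
```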

  11. trait GpuDeterministicFirstLastCollectShim extends Expression
  12. case class GpuHashPartitioning(expressions: Seq[Expression], numPartitions: Int) extends GpuHashPartitioningBase with Product with Serializable
  13. case class GpuOptimizedCreateHiveTableAsSelectCommand(tableDesc: CatalogTable, query: LogicalPlan, outputColumnNames: Seq[String], mode: SaveMode, cpuCmd: OptimizedCreateHiveTableAsSelectCommand) extends LogicalPlan with GpuCreateHiveTableAsSelectBase with Product with Serializable
  14. class GpuOrcDataReader extends DataReader

    File cache is not supported for Spark 3.1.x so this is a thin wrapper around the ORC DataReader.

  15. case class GpuRangePartitioning(gpuOrdering: Seq[SortOrder], numPartitions: Int) extends Expression with GpuExpression with ShimExpression with GpuPartitioning with Product with Serializable

    A GPU accelerated org.apache.spark.sql.catalyst.plans.physical.Partitioning that partitions sortable records by range into roughly equal ranges. The ranges are determined by sampling the content of the RDD passed in.

    Note

    The actual number of partitions created might not be the same as the numPartitions parameter, in the case where the number of sampled records is less than numPartitions. The GpuRangePartitioner is where all of the processing actually happens.
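    The sampling idea can be sketched in plain Scala. This is a simplified illustration of range partitioning, not the actual GpuRangePartitioner: pick bounds from a sorted sample, then map each record to the first range it fits. When the sample has fewer records than requested partitions, fewer bounds (and thus fewer partitions) are produced, mirroring the note above.

```scala
// Illustrative sketch of sampling-based range partitioning.
object RangePartitionSketch {
  // Choose up to numPartitions - 1 bounds from a sorted sample.
  def rangeBounds(sample: Seq[Int], numPartitions: Int): Seq[Int] = {
    val sorted = sample.sorted
    val step = math.max(1, sorted.length / numPartitions)
    sorted.drop(step).grouped(step).map(_.head).take(numPartitions - 1).toSeq
  }

  // A record lands in the first range whose upper bound it does not exceed.
  def partitionOf(value: Int, bounds: Seq[Int]): Int =
    bounds.indexWhere(value <= _) match {
      case -1 => bounds.length // larger than every bound: last partition
      case i  => i
    }
}
```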

  16. class GpuSpecifiedWindowFrameMeta extends GpuSpecifiedWindowFrameMetaBase
  17. class GpuWindowExpressionMeta extends GpuWindowExpressionMetaBase
  18. case class GpuWindowInPandasExec(windowExpression: Seq[Expression], gpuPartitionSpec: Seq[Expression], cpuOrderSpec: Seq[SortOrder], child: SparkPlan)(cpuPartitionSpec: Seq[Expression]) extends SparkPlan with GpuWindowInPandasExecBase with Product with Serializable
  19. abstract class OffsetWindowFunctionMeta[INPUT <: OffsetWindowFunction] extends ExprMeta[INPUT]

    Spark 3.1.1-specific replacement for com.nvidia.spark.rapids.OffsetWindowFunctionMeta. This is required primarily for two reasons:

    1. com.nvidia.spark.rapids.OffsetWindowFunctionMeta (compiled against Spark 3.0.x) fails class load in Spark 3.1.x (expr.input is not recognized as an Expression).
    2. The semantics of offsets in LAG() are reversed/negated in Spark 3.1.1. E.g. the expression LAG(col, 5) causes Lag.offset to be set to -5, as opposed to 5 in prior versions of Spark. This class adjusts the LAG offset to use similar semantics to Spark 3.0.x.
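    The second point reduces to a sign flip. A hedged sketch of the adjustment, with `isSpark311OrLater` standing in for a real shim version check (an assumption, not the plugin's actual API):

```scala
// Illustrative sketch: since Spark 3.1.1, LAG(col, 5) stores Lag.offset as -5,
// while 3.0.x stored 5. A shim can normalize back to the 3.0.x convention.
object LagOffsetShim {
  def normalizedOffset(storedOffset: Int, isSpark311OrLater: Boolean): Int =
    if (isSpark311OrLater) -storedOffset else storedOffset
}
```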
  20. final class OptimizedCreateHiveTableAsSelectCommandMeta extends DataWritingCommandMeta[OptimizedCreateHiveTableAsSelectCommand]
  21. class OrcProtoWriterShim extends AnyRef
  22. trait OrcShims311until320Base extends AnyRef
  23. class PlanShimsImpl extends PlanShims
  24. class RapidsOrcScanMeta extends ScanMeta[OrcScan]
  25. class RapidsParquetScanMeta extends ScanMeta[ParquetScan]
  26. trait ShimBaseSubqueryExec extends BaseSubqueryExec
  27. trait ShimBinaryExecNode extends SparkPlan with BinaryExecNode
  28. trait ShimBinaryExpression extends BinaryExpression
  29. trait ShimBroadcastExchangeLike extends Exchange with BroadcastExchangeLike

    This shim handles the completion future differences between Apache Spark and Databricks.

  30. trait ShimExpression extends Expression
  31. abstract class ShimFilePartitionReaderFactory extends FilePartitionReaderFactory
  32. trait ShimGetArrayItem extends Expression with ExtractValue
  33. trait ShimGetArrayStructFields extends Expression with ExtractValue
  34. trait ShimGetStructField extends Expression with ExtractValue
  35. trait ShimLeafExecNode extends SparkPlan with LeafExecNode
  36. trait ShimPredicateHelper extends PredicateHelper
  37. trait ShimSparkPlan extends SparkPlan
  38. trait ShimSupportsRuntimeFiltering extends AnyRef

    Shim interface for Apache Spark's SupportsRuntimeFiltering interface which was added in Spark 3.2.0.

  39. trait ShimTernaryExpression extends TernaryExpression
  40. trait ShimUnaryCommand extends LogicalPlan with Command
  41. trait ShimUnaryExecNode extends SparkPlan with UnaryExecNode
  42. trait ShimUnaryExpression extends UnaryExpression
  43. abstract class Spark31XShims extends Spark31Xuntil33XShims with Logging
  44. trait Spark31Xuntil33XShims extends SparkShims

Value Members

  1. object AQEUtils

    Utility methods for manipulating Catalyst classes involved in Adaptive Query Execution

  2. object AggregationTagging
  3. object AnsiCastShim
  4. object AnsiUtil
  5. object BloomFilterShims
  6. object CastCheckShims
  7. object CastingConfigShim
  8. object CharVarcharUtilsShims
  9. object ColumnDefaultValuesShims
  10. object DateTimeUtilsShims
  11. object DecimalArithmeticOverrides
  12. object DecimalMultiply128
  13. object DeltaLakeUtils
  14. object DistributionUtil
  15. object FileIndexOptionsShims
  16. object GetSequenceSize
  17. object GlobalLimitShims
  18. object GpuCastShims
  19. object GpuDataSourceRDD extends Serializable
  20. object GpuFileFormatDataWriterShim
  21. object GpuHashPartitioning extends Serializable
  22. object GpuIntervalUtils

    Not supported in this shim

  23. object GpuOrcDataReader
  24. object GpuParquetCrypto
  25. object GpuTypeShims
  26. object GpuWindowUtil
  27. object HashUtils
  28. object InSubqueryShims
  29. object LegacyBehaviorPolicyShim
  30. object NullOutputStreamShim
  31. object OrcCastingShims
  32. object OrcProtoWriterShim
  33. object OrcReadingShims
  34. object OrcShims extends OrcShims311until320Base
  35. object ParquetFieldIdShims
  36. object ParquetLegacyNanoAsLongShims
  37. object ParquetSchemaClipShims
  38. object ParquetStringPredShims
  39. object ParquetTimestampNTZShims
  40. object PartitionedFileUtilsShim
  41. object PythonUDFShim
  42. object RapidsFileSourceMetaUtils
  43. object ShuffleOriginUtil
  44. object SparkShimImpl extends Spark31XShims
  45. object TypeSigUtil extends TypeSigUtilBase

    TypeSig Support for [3.1.1, 3.2.0)

  46. object TypeUtilsShims

    Reimplements the function checkForNumericExpr, which was removed in Spark 3.4.0

  47. object XxHash64Shims
  48. object YearParseUtil
