package shims
Type Members
- case class AvoidAdaptiveTransitionToRow(child: SparkPlan) extends SparkPlan with ShimUnaryExecNode with GpuExec with Product with Serializable
This operator attempts to optimize the case where the results of an adaptive query are written to disk, by removing the redundant columnar-to-row transition within AdaptiveSparkPlanExec followed by a row-to-columnar transition.
Specifically, this is the plan we see in this case:
GpuRowToColumnar(AdaptiveSparkPlanExec(GpuColumnarToRow(child)))
We perform this optimization at runtime rather than during planning, because when the adaptive plan is being planned and executed, we don't know whether it is being called from an operation that wants rows (such as CollectTailExec) or from an operation that wants columns (such as GpuDataWritingCommandExec).
Spark does not provide a mechanism for executing an adaptive plan and retrieving columnar results, and the internal methods we need to call are private, so we use reflection to call them.
- child
The plan to execute
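The reflection approach described above can be sketched as follows. This is an illustrative helper only, assuming plain Java reflection; `PrivateMethodInvoker` and its method names are hypothetical, not part of the plugin:

```scala
// Hypothetical sketch: invoking a private method via Java reflection, the
// same general technique AvoidAdaptiveTransitionToRow uses to reach the
// private internals of AdaptiveSparkPlanExec.
object PrivateMethodInvoker {
  def invoke(target: AnyRef, methodName: String, args: AnyRef*): AnyRef = {
    // Search declared methods (including private ones) for the requested name.
    val method = target.getClass.getDeclaredMethods
      .find(_.getName == methodName)
      .getOrElse(throw new NoSuchMethodException(methodName))
    method.setAccessible(true) // bypass the private access modifier
    method.invoke(target, args: _*)
  }
}
```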
- class BatchScanExecMeta extends SparkPlanMeta[BatchScanExec]
- final class CreateDataSourceTableAsSelectCommandMeta extends DataWritingCommandMeta[CreateDataSourceTableAsSelectCommand]
- abstract class GetMapValueMeta extends BinaryExprMeta[GetMapValue]
We define this type in the shim layer because GetMapValue does not have the field failOnError since Spark 3.4.0, and it always returns null on invalid access to a map column in ANSI mode.
- class GpuAggregateInPandasExecMeta extends SparkPlanMeta[AggregateInPandasExec]
- case class GpuBatchScanExec(output: Seq[AttributeReference], scan: GpuScan) extends SparkPlan with DataSourceV2ScanExecBase with GpuBatchScanExecMetrics with Product with Serializable
- abstract class GpuBroadcastJoinMeta[INPUT <: SparkPlan] extends SparkPlanMeta[INPUT]
- trait GpuCreateHiveTableAsSelectBase extends LogicalPlan with GpuDataWritingCommand
GPU version of Spark's CreateHiveTableAsSelectBase
- class GpuCustomShuffleReaderMeta extends SparkPlanMeta[CustomShuffleReaderExec]
- class GpuDataSourceRDD extends DataSourceRDD
A replacement for DataSourceRDD that does NOT compute the bytes read input metric. DataSourceRDD assumes all reads occur on the task thread, and some GPU input sources use multithreaded readers that cannot generate proper metrics with DataSourceRDD.
- Note
It is the responsibility of users of this RDD to generate the bytes read input metric explicitly!
- trait GpuDeterministicFirstLastCollectShim extends Expression
- case class GpuHashPartitioning(expressions: Seq[Expression], numPartitions: Int) extends GpuHashPartitioningBase with Product with Serializable
- case class GpuOptimizedCreateHiveTableAsSelectCommand(tableDesc: CatalogTable, query: LogicalPlan, outputColumnNames: Seq[String], mode: SaveMode, cpuCmd: OptimizedCreateHiveTableAsSelectCommand) extends LogicalPlan with GpuCreateHiveTableAsSelectBase with Product with Serializable
- class GpuOrcDataReader extends DataReader
File cache is not supported for Spark 3.1.x, so this is a thin wrapper around the ORC DataReader.
- case class GpuRangePartitioning(gpuOrdering: Seq[SortOrder], numPartitions: Int) extends Expression with GpuExpression with ShimExpression with GpuPartitioning with Product with Serializable
A GPU accelerated org.apache.spark.sql.catalyst.plans.physical.Partitioning that partitions sortable records by range into roughly equal ranges. The ranges are determined by sampling the content of the RDD passed in.
- Note
The actual number of partitions created might not be the same as the numPartitions parameter, in the case where the number of sampled records is less than the value of partitions. The GpuRangePartitioner is where all of the processing actually happens.
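As a rough illustration of sampling-based range partitioning, here is a simplified sketch over integers; `RangePartitionSketch` and both methods are hypothetical, not the actual GpuRangePartitioner logic:

```scala
// Simplified sketch: derive range boundaries from a sorted sample, then
// assign each value to the partition whose range contains it.
object RangePartitionSketch {
  // Pick numPartitions - 1 boundary values at evenly spaced sample positions.
  def rangeBounds(sample: Seq[Int], numPartitions: Int): Seq[Int] = {
    val sorted = sample.sorted
    (1 until numPartitions)
      .map(i => sorted((i.toLong * sorted.length / numPartitions).toInt))
      .distinct // fewer distinct bounds => fewer partitions, as the Note says
  }

  // A value lands in the partition equal to the count of bounds <= value.
  def partitionOf(bounds: Seq[Int], value: Int): Int =
    bounds.count(_ <= value)
}
```

When the sample has fewer distinct values than requested partitions, `distinct` shrinks the boundary list, mirroring the Note above about the actual partition count differing from numPartitions.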
- class GpuSpecifiedWindowFrameMeta extends GpuSpecifiedWindowFrameMetaBase
- class GpuWindowExpressionMeta extends GpuWindowExpressionMetaBase
- case class GpuWindowInPandasExec(windowExpression: Seq[Expression], gpuPartitionSpec: Seq[Expression], cpuOrderSpec: Seq[SortOrder], child: SparkPlan)(cpuPartitionSpec: Seq[Expression]) extends SparkPlan with GpuWindowInPandasExecBase with Product with Serializable
- abstract class OffsetWindowFunctionMeta[INPUT <: OffsetWindowFunction] extends ExprMeta[INPUT]
Spark 3.1.1-specific replacement for com.nvidia.spark.rapids.OffsetWindowFunctionMeta. This is required primarily for two reasons:
1. com.nvidia.spark.rapids.OffsetWindowFunctionMeta (compiled against Spark 3.0.x) fails class load in Spark 3.1.x (expr.input is not recognized as an Expression).
2. The semantics of offsets in LAG() are reversed/negated in Spark 3.1.1. E.g. the expression LAG(col, 5) causes Lag.offset to be set to -5, as opposed to 5 in prior versions of Spark. This class adjusts the LAG offset to use similar semantics to Spark 3.0.x.
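The offset reversal described above can be illustrated with a small sketch; `LagOffsetShimSketch` and `normalizeLagOffset` are hypothetical names, not a plugin API:

```scala
// Hypothetical sketch: Spark 3.1.1 stores LAG(col, 5) with Lag.offset = -5,
// while Spark 3.0.x stored 5. Negating on 3.1.1 recovers the 3.0.x-style
// offset, which is the adjustment OffsetWindowFunctionMeta performs.
object LagOffsetShimSketch {
  def normalizeLagOffset(storedOffset: Int, isSpark311OrLater: Boolean): Int =
    if (isSpark311OrLater) -storedOffset else storedOffset
}
```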
- final class OptimizedCreateHiveTableAsSelectCommandMeta extends DataWritingCommandMeta[OptimizedCreateHiveTableAsSelectCommand]
- class OrcProtoWriterShim extends AnyRef
- trait OrcShims311until320Base extends AnyRef
- class PlanShimsImpl extends PlanShims
- class RapidsOrcScanMeta extends ScanMeta[OrcScan]
- class RapidsParquetScanMeta extends ScanMeta[ParquetScan]
- trait ShimBaseSubqueryExec extends BaseSubqueryExec
- trait ShimBinaryExecNode extends SparkPlan with BinaryExecNode
- trait ShimBinaryExpression extends BinaryExpression
- trait ShimBroadcastExchangeLike extends Exchange with BroadcastExchangeLike
This shim handles the completion future differences between Apache Spark and Databricks.
- trait ShimExpression extends Expression
- abstract class ShimFilePartitionReaderFactory extends FilePartitionReaderFactory
- trait ShimGetArrayItem extends Expression with ExtractValue
- trait ShimGetArrayStructFields extends Expression with ExtractValue
- trait ShimGetStructField extends Expression with ExtractValue
- trait ShimLeafExecNode extends SparkPlan with LeafExecNode
- trait ShimPredicateHelper extends PredicateHelper
- trait ShimSparkPlan extends SparkPlan
- trait ShimSupportsRuntimeFiltering extends AnyRef
Shim interface for Apache Spark's SupportsRuntimeFiltering interface, which was added in Spark 3.2.0.
- trait ShimTernaryExpression extends TernaryExpression
- trait ShimUnaryCommand extends LogicalPlan with Command
- trait ShimUnaryExecNode extends SparkPlan with UnaryExecNode
- trait ShimUnaryExpression extends UnaryExpression
- abstract class Spark31XShims extends Spark31Xuntil33XShims with Logging
- trait Spark31Xuntil33XShims extends SparkShims
Value Members
- object AQEUtils
Utility methods for manipulating Catalyst classes involved in Adaptive Query Execution
- object AggregationTagging
- object AnsiCastShim
- object AnsiUtil
- object BloomFilterShims
- object CastCheckShims
- object CastingConfigShim
- object CharVarcharUtilsShims
- object ColumnDefaultValuesShims
- object DecimalArithmeticOverrides
- object DecimalMultiply128
- object DeltaLakeUtils
- object DistributionUtil
- object FileIndexOptionsShims
- object GetSequenceSize
- object GlobalLimitShims
- object GpuCastShims
- object GpuDataSourceRDD extends Serializable
- object GpuFileFormatDataWriterShim
- object GpuHashPartitioning extends Serializable
- object GpuIntervalUtils
Not supported in this shim
- object GpuOrcDataReader
- object GpuParquetCrypto
- object GpuTypeShims
- object GpuWindowUtil
- object HashUtils
- object InSubqueryShims
- object LegacyBehaviorPolicyShim
- object NullOutputStreamShim
- object OrcCastingShims
- object OrcProtoWriterShim
- object OrcReadingShims
- object OrcShims extends OrcShims311until320Base
- object ParquetFieldIdShims
- object ParquetLegacyNanoAsLongShims
- object ParquetSchemaClipShims
- object ParquetStringPredShims
- object ParquetTimestampNTZShims
- object PartitionedFileUtilsShim
- object PythonUDFShim
- object RapidsFileSourceMetaUtils
- object ReaderUtils
- object ShuffleOriginUtil
- object SparkShimImpl extends Spark31XShims
- object TypeSigUtil extends TypeSigUtilBase
TypeSig support for [3.1.1, 3.2.0)
- object TypeUtilsShims
Reimplements the function checkForNumericExpr, which was removed in Spark 3.4.0
- object XxHash64Shims
- object YearParseUtil