org.apache.spark.sql.catalyst.plans.physical
KeyGroupedPartitioning
case class KeyGroupedPartitioning(expressions: Seq[Expression], numPartitions: Int, partitionValues: Seq[InternalRow] = Seq.empty) extends Partitioning with Product with Serializable
Represents a partitioning where rows are split across partitions based on transforms defined
by expressions. partitionValues, if defined, should contain the value of the partition key(s) in
ascending order, after being evaluated by the transforms in expressions, for each input partition.
In addition, its length must be the same as the number of input partitions (and thus is a 1-1
mapping). The partitionValues may contain duplicated partition values.
For example, if expressions is [years(ts_col)], then a valid value of partitionValues is
[0, 1, 2], which represents 3 input partitions with distinct partition values. All rows
in each partition have the same value for column ts_col (which is of timestamp type) after
the years transform is applied.
On the other hand, [0, 0, 1] is not a valid value for partitionValues since 0 appears
twice.
- expressions
partition expressions for the partitioning.
- numPartitions
the number of partitions
- partitionValues
the values for the cluster keys of the distribution; they must be in ascending order.
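The ordering and length requirements described above can be sketched in plain Scala with no Spark dependency. This is a minimal illustration and not Spark's own validation logic: `PartitionValuesCheck` and `isConsistent` are hypothetical names, and each partition value is modelled as a single `Int` (for example, the result of `years(ts_col)`) rather than an `InternalRow`. Whether repeated values are acceptable is left as an explicit parameter.

```scala
// Hypothetical helper: models each partition value as an Int, not an InternalRow.
object PartitionValuesCheck {

  /** True when `values` has one entry per input partition and is sorted ascending. */
  def isConsistent(values: Seq[Int], numPartitions: Int, allowDuplicates: Boolean): Boolean = {
    // 1-1 mapping: exactly one partition value per input partition.
    val oneValuePerPartition = values.length == numPartitions
    // Compare each value with its successor to check the ordering.
    val ordered = values.zip(values.drop(1)).forall { case (a, b) =>
      if (allowDuplicates) a <= b else a < b
    }
    oneValuePerPartition && ordered
  }

  def main(args: Array[String]): Unit = {
    println(isConsistent(Seq(0, 1, 2), 3, allowDuplicates = false)) // true
    println(isConsistent(Seq(0, 0, 1), 3, allowDuplicates = false)) // false
    println(isConsistent(Seq(0, 0, 1), 3, allowDuplicates = true))  // true
    println(isConsistent(Seq(2, 0, 1), 3, allowDuplicates = true))  // false
  }
}
```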
Instance Constructors
-
new
KeyGroupedPartitioning(expressions: Seq[Expression], numPartitions: Int, partitionValues: Seq[InternalRow] = Seq.empty)
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
createShuffleSpec(distribution: ClusteredDistribution): ShuffleSpec
Creates a shuffle spec for this partitioning and its required distribution. The spec is used when an operator has multiple children (e.g., a join) to decide whether this child is co-partitioned with the others, and therefore whether an extra shuffle must be introduced.
- distribution
the required clustered distribution for this partitioning
- Definition Classes
- KeyGroupedPartitioning → Partitioning
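As a rough usage sketch, the two children of a join could each build a spec and compare them. It assumes spark-catalyst is on the classpath, in a version whose constructor matches the signature documented here. These are internal APIs that change between releases. The object name `ShuffleSpecSketch`, the column name `id`, and the partition values are illustrative only; `isCompatibleWith` is the comparison method on `ShuffleSpec`.

```scala
// Sketch only: needs spark-catalyst on the classpath; internal, version-dependent APIs.
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.physical.{ClusteredDistribution, KeyGroupedPartitioning, ShuffleSpec}
import org.apache.spark.sql.types.IntegerType

object ShuffleSpecSketch {

  /** Builds one shuffle spec per join child (illustrative column and values). */
  def specs(): (ShuffleSpec, ShuffleSpec) = {
    // One join key per side; both sides report the same three partition values.
    val leftKey = AttributeReference("id", IntegerType)()
    val rightKey = AttributeReference("id", IntegerType)()
    val values = Seq(InternalRow(0), InternalRow(1), InternalRow(2))

    val left = KeyGroupedPartitioning(Seq(leftKey), 3, values)
    val right = KeyGroupedPartitioning(Seq(rightKey), 3, values)

    // Each child creates its spec against the distribution required of it.
    (left.createShuffleSpec(ClusteredDistribution(Seq(leftKey))),
      right.createShuffleSpec(ClusteredDistribution(Seq(rightKey))))
  }

  def main(args: Array[String]): Unit = {
    val (leftSpec, rightSpec) = specs()
    // Compatible specs mean the children are co-partitioned and need no extra shuffle.
    println(leftSpec.isCompatibleWith(rightSpec))
  }
}
```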
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- val expressions: Seq[Expression]
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
val
numPartitions: Int
Returns the number of partitions that the data is split across.
- Definition Classes
- KeyGroupedPartitioning → Partitioning
- val partitionValues: Seq[InternalRow]
-
final
def
satisfies(required: Distribution): Boolean
Returns true iff the guarantees made by this Partitioning are sufficient to satisfy the partitioning scheme mandated by the required Distribution, i.e. the current dataset does not need to be re-partitioned for the required Distribution (it is possible that tuples within a partition need to be reorganized).
A Partitioning can never satisfy a Distribution if its numPartitions doesn't match Distribution.requiredNumPartitions.
- Definition Classes
- Partitioning
-
def
satisfies0(required: Distribution): Boolean
The actual method that defines whether this Partitioning can satisfy the given Distribution, after the numPartitions check.
By default a Partitioning can satisfy UnspecifiedDistribution, and AllTuples if the Partitioning has only one partition. Implementations can also override this method with special logic.
- Definition Classes
- KeyGroupedPartitioning → Partitioning
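The split between the final satisfies and the overridable satisfies0 can be pictured with a small self-contained model. Every type below (`ToyDistribution`, `ToyPartitioning`, `ToyKeyGrouped` and so on) is hypothetical and only mirrors the behaviour described in the text. The clustered-distribution rule in `ToyKeyGrouped` is an illustrative assumption, not Spark's actual logic.

```scala
// Toy model of the two-step check; none of these types are Spark's real classes.
sealed trait ToyDistribution { def requiredNumPartitions: Option[Int] }
case object ToyUnspecified extends ToyDistribution {
  val requiredNumPartitions: Option[Int] = None
}
case object ToyAllTuples extends ToyDistribution {
  val requiredNumPartitions: Option[Int] = Some(1)
}
final case class ToyClustered(keys: Seq[String], requiredNumPartitions: Option[Int] = None)
  extends ToyDistribution

trait ToyPartitioning {
  def numPartitions: Int

  // Step 1 (final): reject on a partition-count mismatch, then delegate.
  final def satisfies(required: ToyDistribution): Boolean =
    required.requiredNumPartitions.forall(_ == numPartitions) && satisfies0(required)

  // Step 2 (overridable): the default rules described in the text.
  def satisfies0(required: ToyDistribution): Boolean = required match {
    case ToyUnspecified => true
    case ToyAllTuples   => numPartitions == 1
    case _              => false
  }
}

// Hypothetical subclass that adds its own rule for clustered distributions.
final case class ToyKeyGrouped(keys: Seq[String], numPartitions: Int) extends ToyPartitioning {
  override def satisfies0(required: ToyDistribution): Boolean =
    super.satisfies0(required) || (required match {
      case ToyClustered(requiredKeys, _) => keys.forall(k => requiredKeys.contains(k))
      case _                             => false
    })
}

object SatisfiesDemo {
  def main(args: Array[String]): Unit = {
    val p = ToyKeyGrouped(Seq("id"), numPartitions = 3)
    println(p.satisfies(ToyUnspecified))                   // true
    println(p.satisfies(ToyAllTuples))                     // false: 3 partitions, 1 required
    println(p.satisfies(ToyClustered(Seq("id", "ts"))))    // true
    println(p.satisfies(ToyClustered(Seq("id"), Some(4)))) // false: partition count mismatch
  }
}
```

Keeping the partition-count check in a final method means a subclass cannot accidentally skip it; only the distribution-specific logic is open to override.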
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
- lazy val uniquePartitionValues: Seq[InternalRow]
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()