object NormalizeFloatingNumbers extends Rule[LogicalPlan]
We need to take care of special floating numbers (NaN and -0.0) in several places:
1. When comparing values, different NaNs should be treated as the same, and -0.0 and 0.0 should be treated as the same.
2. In aggregate grouping keys, different NaNs should belong to the same group, and -0.0 and 0.0 should belong to the same group.
3. In join keys, different NaNs should be treated as the same, and -0.0 and 0.0 should be treated as the same.
4. In window partition keys, different NaNs should belong to the same partition, and -0.0 and 0.0 should belong to the same partition.
Case 1 is fine, as we handle NaN and -0.0 well during comparison. For complex types, we recursively compare the fields/elements, so it's also fine.
Cases 2, 3 and 4 are problematic, as Spark SQL turns grouping/join/window partition keys into
binary UnsafeRow and compares the binary data directly. Different NaNs have different binary
representations, and the same is true of -0.0 and 0.0.
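The mismatch between value comparison and binary comparison can be seen in a few lines of plain Scala. This is a minimal sketch that does not use Spark; the object name `FloatingBitsDemo` and the helper `rawBits` are hypothetical, and the two NaN bit patterns are chosen only for illustration.

```scala
import java.lang.{Double => JDouble}

object FloatingBitsDemo {
  // Two NaNs with different payloads; both are NaN when read as doubles.
  val nanA: Double = JDouble.longBitsToDouble(0x7ff8000000000000L) // canonical quiet NaN
  val nanB: Double = JDouble.longBitsToDouble(0x7ff8000000000001L) // quiet NaN, different payload

  // The raw bits, i.e. what a byte-for-byte comparison would look at.
  def rawBits(d: Double): Long = JDouble.doubleToRawLongBits(d)

  def main(args: Array[String]): Unit = {
    println(nanA.isNaN && nanB.isNaN)        // true: both are NaN
    println(rawBits(nanA) == rawBits(nanB))  // false: the bits differ
    println(0.0 == -0.0)                     // true: equal as values
    println(rawBits(0.0) == rawBits(-0.0))   // false: the sign bit differs
  }
}
```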
This rule normalizes NaN and -0.0 in window partition keys, join keys and aggregate grouping keys.
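As an illustration of what such a normalization might look like at the value level, the sketch below maps every NaN to the canonical NaN and -0.0 to 0.0. It is not the rule's actual code: `NormalizeSketch`, `normalizeDouble`, `normalizeFloat` and `doubleNormalizer` are hypothetical names, and the `Any => Any` wrapper only mirrors the shape of the `DOUBLE_NORMALIZER` and `FLOAT_NORMALIZER` members listed on this page.

```scala
import java.lang.{Double => JDouble}

object NormalizeSketch {
  // Hypothetical helper: collapse every NaN to the canonical NaN and -0.0 to 0.0.
  def normalizeDouble(d: Double): Double =
    if (d.isNaN) Double.NaN
    else if (d == 0.0d) 0.0d // the comparison is true for both 0.0 and -0.0
    else d

  // Same idea for single-precision values.
  def normalizeFloat(f: Float): Float =
    if (f.isNaN) Float.NaN
    else if (f == 0.0f) 0.0f
    else f

  // Assumed usage: a function with the (Any) => Any shape, applied to a boxed Double.
  val doubleNormalizer: Any => Any =
    (v: Any) => normalizeDouble(v.asInstanceOf[Double])

  def main(args: Array[String]): Unit = {
    val oddNaN = JDouble.longBitsToDouble(0x7ff8000000000001L)
    val keys = Seq(Double.NaN, oddNaN, 0.0, -0.0)
    // Group the keys by their raw bits, before and after normalization.
    val before = keys.groupBy(d => JDouble.doubleToRawLongBits(d)).size
    val after = keys.map(normalizeDouble).groupBy(d => JDouble.doubleToRawLongBits(d)).size
    println(s"groups before: $before, after: $after") // groups before: 4, after: 2
  }
}
```

Grouping by raw bits stands in here for the binary key comparison: after normalization the four keys fall into the two groups one would expect (NaN and zero).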
Ideally we should do the normalization in the physical operators that compare the
binary UnsafeRow directly. We don't need this normalization if the Spark SQL execution engine
is not optimized to run on binary data. This rule is created to simplify the implementation, so
that we have a single place to do normalization, which is more maintainable.
Note that this rule must be executed at the end of the optimizer, because the optimizer may create new joins (the subquery rewrite) and new join conditions (the join reorder).
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- val DOUBLE_NORMALIZER: (Any) => Any
- val FLOAT_NORMALIZER: (Any) => Any
- def apply(plan: LogicalPlan): LogicalPlan
- Definition Classes
- NormalizeFloatingNumbers → Rule
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- def conf: SQLConf
The active config object within the current scope. See SQLConf.get for more information.
- Definition Classes
- SQLConfHelper
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def log: Logger
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logName: String
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- lazy val ruleId: RuleId
- Attributes
- protected
- Definition Classes
- Rule
- val ruleName: String
Name for this rule, automatically inferred based on class name.
- Definition Classes
- Rule
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()