
class SymmetricHashJoinStateManager extends Logging

Helper class to manage state required by a single side of org.apache.spark.sql.execution.streaming.StreamingSymmetricHashJoinExec. The interface of this class is basically that of a multi-map:

  • Get: Returns an iterator of multiple values for given key
  • Append: Append a new value to the given key
  • Remove Data by predicate: Drop any state using a predicate condition on keys or values
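The multi-map contract described above can be illustrated with a small in-memory stand-in. This is only a sketch: `ToyMultiMap` is a hypothetical name, it uses ordinary Scala collections instead of UnsafeRow and state stores, and its removal methods return nothing, unlike the real class.

```scala
import scala.collection.mutable

// Toy in-memory stand-in for the multi-map contract; not Spark code.
final class ToyMultiMap[K, V] {
  private val data = mutable.LinkedHashMap.empty[K, Vector[V]]

  // Get: iterator over every value stored for the key (empty if absent).
  def get(key: K): Iterator[V] =
    data.getOrElse(key, Vector.empty[V]).iterator

  // Append: add one more value under the key.
  def append(key: K, value: V): Unit =
    data(key) = data.getOrElse(key, Vector.empty[V]) :+ value

  // Remove by predicate on keys: drop every value of each matching key.
  def removeByKeyCondition(p: K => Boolean): Unit =
    data.keys.filter(p).toList.foreach(k => data.remove(k))

  // Remove by predicate on values: drop matching values and keys left empty.
  def removeByValueCondition(p: V => Boolean): Unit =
    data.keys.toList.foreach { k =>
      val kept = data(k).filterNot(p)
      if (kept.isEmpty) data.remove(k) else data(k) = kept
    }
}
```

A key can hold any number of values, and removing the last value of a key removes the key itself.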

Linear Supertypes
Logging, AnyRef, Any

Instance Constructors

  1. new SymmetricHashJoinStateManager(joinSide: JoinSide, inputValueAttributes: Seq[Attribute], joinKeys: Seq[Expression], stateInfo: Option[StatefulOperatorStateInfo], storeConf: StateStoreConf, hadoopConf: Configuration, partitionId: Int, stateFormatVersion: Int)

    joinSide

    Defines the join side

    inputValueAttributes

    Attributes of the input row which will be stored as value

    joinKeys

    Expressions to generate rows that will be used to key the value rows

    stateInfo

    Information about how to retrieve the correct version of state

    storeConf

    Configuration for the state store.

    hadoopConf

    Hadoop configuration for reading state data from storage

    partitionId

    A partition ID of source RDD.

    stateFormatVersion

    The version of format for state.

    Internally, the key -> multiple values mapping is stored in two StateStores:

    • Store 1 (KeyToNumValuesStore) maintains the mapping key -> number of values
    • Store 2 (KeyWithIndexToValueStore) maintains a mapping that depends on the state format version:
      • version 1: [(key, index) -> value]
      • version 2: [(key, index) -> (value, matched)]

    The operations map onto the two stores as follows:

    • Put: update the count in KeyToNumValuesStore, insert new (key, count) -> value in KeyWithIndexToValueStore
    • Get: read the count from KeyToNumValuesStore, read each of the n values in KeyWithIndexToValueStore
    • Remove state by predicate on keys: scan all keys in KeyToNumValuesStore to find keys that match the predicate, delete the key from KeyToNumValuesStore, delete the values in KeyWithIndexToValueStore
    • Remove state by condition on values: scan all elements in KeyWithIndexToValueStore to find values that match the predicate, delete the corresponding (key, indexToDelete) from KeyWithIndexToValueStore by overwriting it with the value of (key, maxIndex) and removing (key, maxIndex), then decrement the corresponding number of values in KeyToNumValuesStore
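The two-store layout and the overwrite-with-maxIndex removal can be sketched with plain maps. This is a simplified model under stated assumptions, not the Spark implementation: `TwoStoreSketch` and `removeValues` are hypothetical names, indices are assumed to be zero-based, and removal is shown for a single key rather than as a scan over the whole store.

```scala
import scala.collection.mutable

// Simplified model of the version 2 layout. Plain maps stand in for the two
// StateStores; this is not Spark code.
final class TwoStoreSketch[K, V] {
  // Store 1: key -> number of values
  val keyToNumValues = mutable.Map.empty[K, Long]
  // Store 2: (key, index) -> (value, matched)
  val keyWithIndexToValue = mutable.Map.empty[(K, Long), (V, Boolean)]

  // Put: the current count doubles as the next free index (zero-based here).
  def append(key: K, value: V, matched: Boolean): Unit = {
    val n = keyToNumValues.getOrElse(key, 0L)
    keyWithIndexToValue((key, n)) = (value, matched)
    keyToNumValues(key) = n + 1
  }

  // Get: read the count, then read indices 0 until count.
  def get(key: K): Iterator[V] = {
    val n = keyToNumValues.getOrElse(key, 0L)
    (0L until n).iterator.map(i => keyWithIndexToValue((key, i))._1)
  }

  // Remove by condition on values for one key; returns the removed values.
  def removeValues(key: K)(p: V => Boolean): List[V] = {
    var n = keyToNumValues.getOrElse(key, 0L)
    var i = 0L
    val removed = List.newBuilder[V]
    while (i < n) {
      val (v, _) = keyWithIndexToValue((key, i))
      if (p(v)) {
        removed += v
        val maxIndex = n - 1
        // Overwrite the hole with the entry at maxIndex, then drop maxIndex.
        if (i != maxIndex) {
          keyWithIndexToValue((key, i)) = keyWithIndexToValue((key, maxIndex))
        }
        keyWithIndexToValue.remove((key, maxIndex))
        n -= 1 // do not advance i: the moved entry still has to be checked
      } else {
        i += 1
      }
    }
    // Decrement the count; drop the key once no values remain.
    if (n == 0) keyToNumValues.remove(key) else keyToNumValues(key) = n
    removed.result()
  }
}
```

Overwriting with the last entry keeps the indices of a key dense, so Get can keep iterating from 0 up to the count. The cost is that removal can reorder the remaining values of a key.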

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def abortIfNeeded(): Unit

    Abort any changes to the state stores if needed

  5. def append(key: UnsafeRow, value: UnsafeRow, matched: Boolean): Unit

    Append a new value to the key

  6. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  7. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native()
  8. def commit(): Unit

    Commit all the changes to all the state stores

  9. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  10. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  11. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable])
  12. def get(key: UnsafeRow): Iterator[UnsafeRow]

    Get all the values of a key

  13. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  14. def getInternalRowOfKeyWithIndex(currentKey: UnsafeRow): InternalRow

    Projects the key of unsafe row to internal row for printable log message.

  15. def getJoinedRows(key: UnsafeRow, generateJoinedRow: (InternalRow) => JoinedRow, predicate: (JoinedRow) => Boolean, excludeRowsAlreadyMatched: Boolean = false): Iterator[JoinedRow]

    Get all the matched values for given join condition, with marking matched. This method is designed to mark joined rows properly without exposing internal index of row.

    excludeRowsAlreadyMatched

    Do not join with rows already matched previously. This is used for right side of left semi join in StreamingSymmetricHashJoinExec only.

  16. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  17. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  18. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  19. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  20. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  21. val joinSide: JoinSide
  22. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  23. def logDebug(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  24. def logDebug(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  25. def logError(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  26. def logError(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  27. def logInfo(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  28. def logInfo(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  29. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  30. def logTrace(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  31. def logTrace(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  32. def logWarning(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  33. def logWarning(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  34. def metrics: StateStoreMetrics

    Get the combined metrics of all the state stores

  35. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  36. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  37. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  38. def removeByKeyCondition(removalCondition: (UnsafeRow) => Boolean): Iterator[KeyToValuePair]

    Remove using a predicate on keys.

    This produces an iterator over the (key, value, matched) tuples satisfying removalCondition(key), where the underlying store is updated as a side effect of producing the next element.

    This implies the iterator must be consumed fully without any other operations on this manager or the underlying store being interleaved.

  39. def removeByValueCondition(removalCondition: (UnsafeRow) => Boolean): Iterator[KeyToValuePair]

    Remove using a predicate on values.

    At a high level, this produces an iterator over the (key, value, matched) tuples such that value satisfies the predicate, where producing an element removes the value from the state store and producing all elements with a given key updates it accordingly.

    This implies the iterator must be consumed fully without any other operations on this manager or the underlying store being interleaved.

  40. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  41. def toString(): String
    Definition Classes
    AnyRef → Any
  42. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  43. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  44. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
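
The notes on removeByKeyCondition and removeByValueCondition say that the store is updated as a side effect of producing each element, so the returned iterator must be consumed fully. The sketch below illustrates that behaviour with a plain mutable map. `LazyRemovalSketch` is a hypothetical name, and taking a snapshot of the matching keys up front is a simplification of this sketch, not a description of how the class scans its stores.

```scala
import scala.collection.mutable

object LazyRemovalSketch {
  // Returns an iterator whose next() removes the returned entry from `store`.
  def removeByKeyCondition(store: mutable.Map[String, Int])(
      removalCondition: String => Boolean): Iterator[(String, Int)] = {
    // Snapshot the matching keys so the map is not mutated while being scanned.
    val matching = store.keys.filter(removalCondition).toList
    matching.iterator.map { k =>
      val v = store(k)
      store.remove(k) // the side effect happens only when this element is produced
      (k, v)
    }
  }
}

object LazyRemovalDemo extends App {
  val store = mutable.Map("a1" -> 1, "a2" -> 2, "b1" -> 3)
  val removed = LazyRemovalSketch.removeByKeyCondition(store)(_.startsWith("a"))

  // Creating the iterator removes nothing yet.
  println(store.size)

  // Draining the iterator is what actually applies the removals.
  removed.foreach(_ => ())
  println(store.keySet)
}
```

A caller that stops iterating early leaves the remaining matching entries in the store, which is why partial consumption or interleaved operations are unsafe.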
