class PropagateWatermarkSimulator extends WatermarkPropagator with Logging
This implementation simulates the propagation of the watermark among operators.
The simulation algorithm traverses the physical plan tree in post-order (children first) to calculate the (input watermark, output watermark) pair for every node.
For each node, the following logic is applied:
- The input watermark for a node is min(input watermarks from all children).
-- Children providing no input watermark (DEFAULT_WATERMARK_MS) are excluded.
-- If no child provides a valid input watermark, input watermark = DEFAULT_WATERMARK_MS.
- The output watermark for a node is decided as follows:
-- watermark nodes: the origin watermark value.
This could be an individual origin watermark value, but we decided to retain the global watermark
to keep the watermark model simple.
-- stateless nodes: same as input watermark
-- stateful nodes: the return value of op.produceOutputWatermark(input watermark).
- See also StateStoreWriter.produceOutputWatermark
Note that this implementation throws an exception if a watermark node sees a valid input watermark from its children, meaning that re-definition of the watermark is not supported.
Once the algorithm has traversed the physical plan tree, the association between each stateful operator and its input watermark is constructed. When Spark requests the input watermark for a specific stateful operator, this implementation returns the value from that association.
When the origin watermark is 0, the simulation is skipped and the input watermark for every operator is 0. (This may be unexpected if op.produceOutputWatermark returns a value higher than the input watermark, but that does not happen in most practical cases.)
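The rules above can be illustrated with a minimal sketch over a toy plan tree. This is not Spark's implementation: the `Node` types, the `simulate` function, the `delayMs` parameter, and the value 0 for `DefaultWatermarkMs` are all hypothetical stand-ins chosen for illustration, and the body of `produceOutputWatermark` here is an assumption rather than the behavior of `StateStoreWriter.produceOutputWatermark`.

```scala
import scala.collection.mutable

object WatermarkSimulationSketch {
  // Assumed sentinel meaning "no watermark"; stands in for DEFAULT_WATERMARK_MS.
  val DefaultWatermarkMs: Long = 0L

  // Hypothetical toy plan nodes, not Spark's SparkPlan classes.
  sealed trait Node { def children: Seq[Node] }
  final case class WatermarkNode(children: Seq[Node]) extends Node
  final case class StatelessNode(children: Seq[Node]) extends Node
  final case class StatefulNode(opId: Long, delayMs: Long, children: Seq[Node]) extends Node {
    // Assumed behavior for illustration: the output lags the input by delayMs.
    def produceOutputWatermark(inputWm: Long): Long = math.max(0L, inputWm - delayMs)
  }

  /** Returns the input watermark per stateful operator ID. */
  def simulate(plan: Node, originWatermark: Long): Map[Long, Long] = {
    val inputByOp = mutable.Map.empty[Long, Long]

    // Post-order: children are visited first; the return value is the node's output watermark.
    def visit(node: Node): Long = {
      val valid = node.children.map(visit).filter(_ != DefaultWatermarkMs)
      val input = if (valid.isEmpty) DefaultWatermarkMs else valid.min
      node match {
        case _: WatermarkNode =>
          if (input != DefaultWatermarkMs) {
            throw new IllegalStateException("Re-definition of watermark is not supported")
          }
          originWatermark
        case _: StatelessNode => input
        case s: StatefulNode =>
          inputByOp(s.opId) = input
          s.produceOutputWatermark(input)
      }
    }

    visit(plan)
    inputByOp.toMap
  }
}
```

For example, with two watermarked branches feeding stateful operator 1 (assumed delay of 1000 ms), which in turn feeds stateful operator 2, an origin watermark of 10000 gives operator 1 an input of 10000 and operator 2 an input of 9000.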
Instance Constructors
- new PropagateWatermarkSimulator()
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def getInputWatermarkForEviction(batchId: Long, stateOpId: Long): Long
Provides the calculated input watermark for eviction for the given stateful operator.
- Definition Classes
- PropagateWatermarkSimulator → WatermarkPropagator
- def getInputWatermarkForLateEvents(batchId: Long, stateOpId: Long): Long
Provides the calculated input watermark for late events for the given stateful operator.
- Definition Classes
- PropagateWatermarkSimulator → WatermarkPropagator
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def log: Logger
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logName: String
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- def propagate(batchId: Long, plan: SparkPlan, originWatermark: Long): Unit
Request to propagate the watermark among operators based on the origin watermark value. The result should be the input watermark per stateful operator, which Spark will request by calling getInputWatermarkXXX with the operator ID.
Implementations are advised to cache the result, as Spark can request the propagation multiple times with the same batch ID and origin watermark value.
- Definition Classes
- PropagateWatermarkSimulator → WatermarkPropagator
- def purge(batchId: Long): Unit
Request to clean up the cached result of propagation. Spark will call this method when the given batch ID is unlikely to be re-executed.
- Definition Classes
- PropagateWatermarkSimulator → WatermarkPropagator
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
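The caching advice for propagate and the clean-up performed by purge can be sketched as follows. `InputWatermarkCache` and its members are hypothetical names, not Spark classes, and the sketch assumes that purge drops only the given batch ID; the page does not state which entries the real implementation removes.

```scala
import scala.collection.mutable

// Hypothetical helper, not part of Spark: caches one simulation result per batch ID.
// `simulate` maps an origin watermark to (stateful operator ID -> input watermark).
final class InputWatermarkCache(simulate: Long => Map[Long, Long]) {
  private val byBatch = mutable.Map.empty[Long, Map[Long, Long]]

  // Runs the simulation only the first time a batch ID is seen.
  def propagate(batchId: Long, originWatermark: Long): Unit = synchronized {
    if (!byBatch.contains(batchId)) {
      byBatch(batchId) = simulate(originWatermark)
    }
  }

  // Looks up the cached input watermark; fails if propagate was not called for the batch.
  def inputWatermark(batchId: Long, stateOpId: Long): Long = synchronized {
    byBatch.get(batchId).flatMap(_.get(stateOpId)).getOrElse(
      throw new IllegalStateException(s"No watermark for batch $batchId, operator $stateOpId"))
  }

  // Assumption: only the given batch is dropped.
  def purge(batchId: Long): Unit = synchronized {
    byBatch.remove(batchId)
    ()
  }

  def cachedBatches: Set[Long] = synchronized(byBatch.keySet.toSet)
}
```

Keying the cache by batch ID alone relies on the statement above that repeated requests for a batch carry the same origin watermark value.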