class PropagateWatermarkSimulator extends WatermarkPropagator with Logging
This implementation simulates the propagation of watermarks among operators.
The simulation algorithm traverses the physical plan tree in post-order (children first) to calculate the (input watermark, output watermark) pair for every node.
For each node, the following logic is applied:
- The input watermark for a node is min(input watermarks from all children).
-- Children providing no input watermark (DEFAULT_WATERMARK_MS) are excluded.
-- If no child provides a valid input watermark, input watermark = DEFAULT_WATERMARK_MS.
- The output watermark for a node is decided as follows:
-- watermark nodes: the origin watermark value.
This could be an individual origin watermark value per node, but the global watermark is retained to keep the watermark model simple.
-- stateless nodes: same as the input watermark
-- stateful nodes: the return value of op.produceOutputWatermark(input watermark).
- See also StateStoreWriter.produceOutputWatermark
Note that this implementation throws an exception if a watermark node sees a valid input watermark from its children; that is, redefining the watermark is not supported.
Once the algorithm has traversed the physical plan tree, the association between each stateful operator and its input watermark is constructed. When Spark requests the input watermark for a specific stateful operator, this implementation returns the value from that association.
The simulation is skipped when the watermark value is 0; in that case the input watermark for every operator is 0. (This may be unexpected if op.produceOutputWatermark returns a value higher than the input watermark, but that does not happen in most practical cases.)
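The traversal rules described above can be illustrated with a small, self-contained sketch. It is not Spark's implementation: the node types (`Source`, `WatermarkNode`, `Stateless`, `Stateful`), the `delayMs` field, and the body of `produceOutputWatermark` are hypothetical stand-ins for `SparkPlan` operators, and `None` stands in for DEFAULT_WATERMARK_MS. The skip for an origin watermark of 0 is not modeled.

```scala
object WatermarkSimulationSketch {
  // Hypothetical, simplified plan nodes; the real algorithm walks SparkPlan trees.
  sealed trait Node { def children: Seq[Node] }
  case class Source(children: Seq[Node] = Nil) extends Node
  case class WatermarkNode(children: Seq[Node]) extends Node
  case class Stateless(children: Seq[Node]) extends Node
  case class Stateful(opId: Long, delayMs: Long, children: Seq[Node]) extends Node {
    // Stand-in for op.produceOutputWatermark: here the output lags the input by delayMs.
    def produceOutputWatermark(inputWatermarkMs: Long): Long = inputWatermarkMs - delayMs
  }

  /** Returns stateful operator ID -> input watermark (None = no valid watermark). */
  def simulate(plan: Node, originWatermarkMs: Long): Map[Long, Option[Long]] = {
    val inputByOpId = scala.collection.mutable.Map.empty[Long, Option[Long]]

    // Post-order: children are visited before the node itself.
    // The return value is the node's output watermark.
    def visit(node: Node): Option[Long] = {
      // Children without a watermark are excluded; the rest are reduced with min.
      val valid = node.children.map(visit).flatten
      val input = if (valid.isEmpty) None else Some(valid.min)
      node match {
        case _: WatermarkNode =>
          if (input.isDefined) {
            throw new IllegalStateException("Redefining the watermark is not supported")
          }
          Some(originWatermarkMs) // the global origin watermark is retained
        case op: Stateful =>
          inputByOpId(op.opId) = input
          // Assumption: without a valid input there is no output watermark either.
          input.map(op.produceOutputWatermark)
        case _ =>
          input // stateless nodes (and sources) pass the input through
      }
    }

    visit(plan)
    inputByOpId.toMap
  }
}
```

For example, with a stateful operator 1 on top of two branches, one containing stateful operator 0 with `delayMs = 2000` and one stateless, both fed by a watermark node, `simulate(plan, 10000L)` yields an input watermark of 10000 for operator 0 and min(8000, 10000) = 8000 for operator 1. Stacking one `WatermarkNode` on top of another throws, mirroring the redefinition rule.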
Instance Constructors
- new PropagateWatermarkSimulator()
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getInputWatermarkForEviction(batchId: Long, stateOpId: Long): Long
Provides the calculated input watermark for eviction for the given stateful operator.
- Definition Classes
- PropagateWatermarkSimulator → WatermarkPropagator
-
def
getInputWatermarkForLateEvents(batchId: Long, stateOpId: Long): Long
Provides the calculated input watermark for late events for the given stateful operator.
- Definition Classes
- PropagateWatermarkSimulator → WatermarkPropagator
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
propagate(batchId: Long, plan: SparkPlan, originWatermark: Long): Unit
Requests propagation of the watermark among operators, based on the origin watermark value.
The result should be an input watermark per stateful operator, which Spark retrieves by calling getInputWatermarkXXX (getInputWatermarkForEviction or getInputWatermarkForLateEvents) with the operator ID.
Implementations are encouraged to cache the result, because Spark may request propagation multiple times with the same batch ID and origin watermark value.
- Definition Classes
- PropagateWatermarkSimulator → WatermarkPropagator
-
def
purge(batchId: Long): Unit
Requests cleanup of the cached propagation result. Spark calls this method when the given batch ID is unlikely to be re-executed.
- Definition Classes
- PropagateWatermarkSimulator → WatermarkPropagator
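The contract of propagate and purge (cache the result per batch, answer lookups from the cache, drop entries once a batch will not be re-executed) might look roughly like the sketch below. The class name `CachingPropagatorSketch`, the `simulate` function parameter, and the single `getInputWatermark` lookup are hypothetical simplifications; the real propagate also receives a `SparkPlan`. Dropping every batch up to and including the purged one is an assumption, not something this page states.

```scala
import scala.collection.mutable

// Hypothetical wrapper: `simulate` stands in for the plan traversal and returns
// stateful operator ID -> input watermark for a given origin watermark.
class CachingPropagatorSketch(simulate: Long => Map[Long, Long]) {
  // batch ID -> (stateful operator ID -> input watermark)
  private val byBatch = mutable.Map.empty[Long, Map[Long, Long]]

  // Repeated requests for the same batch reuse the cached result.
  def propagate(batchId: Long, originWatermark: Long): Unit = synchronized {
    if (!byBatch.contains(batchId)) {
      byBatch(batchId) = simulate(originWatermark)
    }
  }

  // Lookups are served from the cached association only.
  def getInputWatermark(batchId: Long, stateOpId: Long): Long = synchronized {
    byBatch.get(batchId).flatMap(_.get(stateOpId)).getOrElse(
      throw new IllegalStateException(
        s"No propagated watermark for batch $batchId, operator $stateOpId"))
  }

  // Assumption: purging a batch also drops all earlier batches.
  def purge(batchId: Long): Unit = synchronized {
    byBatch --= byBatch.keys.filter(_ <= batchId).toList
  }
}
```

Keying the cache by batch ID alone relies on the statement that repeated requests carry the same batch ID and origin watermark value; an implementation that cannot rely on this would need to include the origin watermark in the key.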
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()