public class Graph extends Object
The API level Workflow object is transformed to an intermediate graph (an object of this class), and all control nodes are generated.
This graph is later further transformed to JAXB objects and to XML.
The conversion from the API level Workflow object to the intermediate graph is as follows:
We take the nodes in topological order, meaning every node is processed after all of its dependencies
have been processed. There are two main possibilities when processing a node:
- the node has zero or one parent
- the node has at least two parents.
In the first case, we simply add the converted node as a child of its parent (or of the start node if it has no parent),
possibly inserting a fork if the parent already has children, or reusing an existing fork if the parent already
has one.
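The traversal order and the parent-count distinction described above can be sketched as follows. This is a minimal, self-contained illustration that uses Kahn's algorithm on a plain map of node names. The class `TopologicalOrderSketch` and its methods are hypothetical and are not part of the Oozie code base, and the actual implementation may order and classify nodes differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these names are illustrative, not Oozie's API.
class TopologicalOrderSketch {

    // 'parents' maps every node name to the names of the nodes it depends on.
    // Assumption: every node, including those without parents, is a key of the map.
    static List<String> topologicalOrder(Map<String, List<String>> parents) {
        Map<String, Integer> unprocessedParents = new LinkedHashMap<>();
        Map<String, List<String>> children = new LinkedHashMap<>();
        for (String node : parents.keySet()) {
            unprocessedParents.put(node, parents.get(node).size());
            children.put(node, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> entry : parents.entrySet()) {
            for (String parent : entry.getValue()) {
                children.get(parent).add(entry.getKey());
            }
        }

        // Nodes without parents can be processed immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : unprocessedParents.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.remove();
            order.add(node);
            // A child becomes ready once all of its parents have been processed.
            for (String child : children.get(node)) {
                if (unprocessedParents.merge(child, -1, Integer::sum) == 0) {
                    ready.add(child);
                }
            }
        }
        if (order.size() != parents.size()) {
            throw new IllegalArgumentException("The graph contains a cycle.");
        }
        return order;
    }

    // Second case of the description: a node with at least two parents needs a join.
    static boolean needsJoin(Map<String, List<String>> parents, String node) {
        return parents.get(node).size() >= 2;
    }

    public static void main(String[] args) {
        Map<String, List<String>> parents = new LinkedHashMap<>();
        parents.put("A", Collections.<String>emptyList());
        parents.put("B", Arrays.asList("A"));
        parents.put("C", Arrays.asList("A"));
        parents.put("D", Arrays.asList("B", "C"));

        for (String node : topologicalOrder(parents)) {
            System.out.println(node + (needsJoin(parents, node) ? " needs a join" : " has zero or one parent"));
        }
    }
}
```

In this example only D has two parents, so it is the only node that would require a join; B and C would be placed under a fork below A.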
In the second case, we have to insert a join. We first check if we can join all incoming paths in a single join
node or if we have to split them up because they come from multiple embedded forks. It is also possible that some
incoming paths originate from the same fork but that fork has other outgoing paths as well. In that case we split
the fork up into multiple embedded forks.
After this, we examine all paths that we are going to join and look for side branches that lead out of the
fork / join block, violating Oozie's constraints. If these are non-conditional branches, we simply detach them
from their original parents and put them under the new join (and possibly under a fork), making them siblings
of whatever nodes originally came after the join. This way all original dependencies are preserved, as the original
parents are still (indirect) ancestors of the relocated nodes, but new dependencies are introduced.
This preserves the correctness of the workflow but decreases its parallelism. This is unfortunate but Oozie's graph
format is more restrictive than a general DAG, so we have to accept it.
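The effect of relocating a non-conditional side branch can be illustrated with a small sketch. It assumes a graph in which A forks into B and C, a node named `join` closes the fork, D follows the join, and E is a side branch hanging off B. All names here (`SideBranchSketch`, `relocateUnderJoin`, the node names) are hypothetical and only model the idea; they do not reflect how the class actually stores or rewrites nodes.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: these names are illustrative, not Oozie's API.
class SideBranchSketch {

    // Collects every direct and indirect parent of 'node'.
    static Set<String> ancestors(Map<String, Set<String>> parents, String node) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(parents.get(node));
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (seen.add(current)) {
                pending.addAll(parents.get(current));
            }
        }
        return seen;
    }

    // Detaches the side branch from its original parents and puts it under the join.
    static void relocateUnderJoin(Map<String, Set<String>> parents, String join, String sideBranch) {
        parents.put(sideBranch, new LinkedHashSet<>(Collections.singleton(join)));
    }

    // A forks into B and C, "join" closes the fork, D follows the join,
    // and E is a side branch of B that leads out of the fork / join block.
    static Map<String, Set<String>> exampleGraph() {
        Map<String, Set<String>> parents = new LinkedHashMap<>();
        parents.put("A", new LinkedHashSet<String>());
        parents.put("B", new LinkedHashSet<>(Arrays.asList("A")));
        parents.put("C", new LinkedHashSet<>(Arrays.asList("A")));
        parents.put("join", new LinkedHashSet<>(Arrays.asList("B", "C")));
        parents.put("D", new LinkedHashSet<>(Arrays.asList("join")));
        parents.put("E", new LinkedHashSet<>(Arrays.asList("B")));
        return parents;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> parents = exampleGraph();
        Set<String> before = ancestors(parents, "E");
        relocateUnderJoin(parents, "join", "E");
        Set<String> after = ancestors(parents, "E");

        // Original dependencies survive, but E now also waits for C.
        System.out.println("dependencies preserved: " + after.containsAll(before));
        System.out.println("new dependency on C: " + (!before.contains("C") && after.contains("C")));
    }
}
```

After the relocation E is a sibling of D under the join. B is still an ancestor of E, so the original dependency holds, but E can no longer run in parallel with C, which is the loss of parallelism mentioned above.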
If the side branches are conditional, we cut above the decision node and insert a join there. We reinsert the
decision node under the new join. This is very similar to the handling of non-conditional paths, but it
decreases parallelism even more (we cut one level higher).
A problem occurs if two or more decision nodes come right after the fork that we want to close. If we cut above
the decision nodes as usual we gain nothing, because we insert a join and a fork and arrive at the same situation
as before - multiple decision nodes under a fork. Currently, we are not able to handle this situation and we throw
an exception.

| Modifier and Type | Method and Description |
|---|---|
| Credentials | getCredentials() |
| End | getEnd() - Returns the end node of this graph. |
| Global | getGlobal() |
| String | getName() - Returns the name of this graph. |
| NodeBase | getNodeByName(String name) - Returns the node with the given name in this graph if it exists, null otherwise. |
| Collection<NodeBase> | getNodes() - Returns a collection of the nodes in this graph. |
| Parameters | getParameters() |
| Start | getStart() - Returns the start node of this graph. |

public String getName()
public Parameters getParameters()
public Global getGlobal()
public Start getStart()
public End getEnd()
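public NodeBase getNodeByName(String name)
Returns the node with the given name in this graph if it exists, null otherwise.
Parameters: name - The name of the node that will be returned.
Returns: The node with the given name in this graph if it exists, null otherwise.
public Collection<NodeBase> getNodes()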
public Credentials getCredentials()
Copyright © 2021 Apache Software Foundation. All rights reserved.