  • package semanticcpg

    Domain specific language for querying code property graphs

    This is the API reference for the CPG query language, a language for mining code for defects and vulnerabilities, either interactively on a code analysis shell (REPL) or using non-interactive scripts.

    Queries written in the CPG query language express graph traversals (see https://en.wikipedia.org/wiki/Graph_traversal). Similar to the standard graph traversal language "Gremlin" (see https://en.wikipedia.org/wiki/Gremlin_(programming_language)), these traversals are formulated as sequences of primitive language elements referred to as "steps". You can think of a step as a small program, similar to a Unix shell utility. However, instead of processing lines one by one, a step processes nodes of the graph.

    Starting a traversal

    All traversals begin by selecting a set of start nodes, e.g.,

    cpg.method

    will start the traversal at all methods, while

    cpg.local

    will start at all local variables. The complete list of starting points can be found at

    io.shiftleft.codepropertygraph.Cpg

    Lazy evaluation

    Queries are lazily evaluated, e.g., cpg.method creates a traversal which you can add more steps to. You can, for example, evaluate the traversal by converting it to a list:

    cpg.method.toList

    Since toList is such a common operation, we provide the shorthand l, meaning that

    cpg.method.l

    provides the same result as the former query.
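The effect of lazy evaluation can be illustrated without a code property graph. The following is a minimal sketch in plain Scala in which an Iterator stands in for a traversal. LazySketch, methodNames and visited are hypothetical names chosen for this illustration and are not part of the query language.

```scala
// Toy model only: a plain Scala Iterator stands in for a CPG traversal.
object LazySketch {
  // Counts how many elements have actually been processed so far.
  var visited = 0

  // Hypothetical stand-in for the method names stored in a graph.
  val methodNames: List[String] = List("main", "exec", "parse")

  // Building the pipeline does no work yet, because Iterator.map is lazy.
  def traversal: Iterator[String] =
    methodNames.iterator.map { name =>
      visited += 1
      name.toUpperCase
    }

  def main(args: Array[String]): Unit = {
    val t = traversal
    println(visited)  // no element has been processed at this point
    println(t.toList) // converting to a list forces evaluation
    println(visited)  // every element has now been processed
  }
}
```

As with cpg.method, constructing the traversal is cheap. Work is only done once the result is materialized, for example with toList.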

    Properties

    Nodes have "properties", key-value pairs where keys are strings and values are primitive data types such as strings, integers, or Booleans. Properties of nodes can be selected based on their key, e.g.,

    cpg.method.name

    traverses to all method names. Nodes can also be filtered based on properties, e.g.,

    cpg.method.name(".*exec.*")

    traverses to all methods where name matches the regular expression ".*exec.*". You can see a complete list of properties by browsing to the API documentation of the corresponding step. For example, you can find the properties of method nodes at io.shiftleft.semanticcpg.language.types.structure.MethodTraversal.
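The difference between selecting a property and filtering on it can be sketched on an ordinary Scala collection. ToyMethod, names and nameMatches are hypothetical names, and the sketch assumes that the regular expression has to match the whole property value, as String.matches does.

```scala
// Toy model only: a case class stands in for a method node and its properties.
object PropertyFilterSketch {
  final case class ToyMethod(name: String, lineNumber: Int)

  val methods: List[ToyMethod] = List(
    ToyMethod("execve", 10),
    ToyMethod("main", 20),
    ToyMethod("do_exec", 30)
  )

  // Selecting a property: map every node to the value stored under the key.
  def names: List[String] = methods.map(_.name)

  // Filtering on a property: keep the nodes whose name matches the whole regex.
  def nameMatches(regex: String): List[ToyMethod] =
    methods.filter(_.name.matches(regex))
}
```

Selecting returns property values, whereas filtering returns the nodes themselves, so further steps can still be applied to them.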

    Side effects

    Side effects are useful if you want to mutate something outside the traversal, or simply to debug it. The following example prints all typeDecl names as it traverses the graph and increments i for each one.

    var i = 0
    cpg.typeDecl.sideEffect{typeTemplate => println(typeTemplate.name); i = i + 1}.exec

    [advanced] Selecting multiple things from your traversal

    If you are interested in multiple things along the way of your traversal, you can label any step using the as modulator and use select at the end. Note that the compiler automatically derives the correct return type as a tuple of the labelled steps, in this case with two elements.

    cpg.method.as("method").definingTypeDecl.as("classDef").select.toList
    // return type: List[(Method, TypeDecl)]

    [advanced] For comprehensions

    You can always start a new traversal from a node, e.g.,

    val someMethod = cpg.method.head
    someMethod.start.parameter.toList

    You can use this, for example, in a for comprehension, which in this context is essentially an alternative way to select multiple intermediate things. It is more expressive, but also more computationally expensive.

    val query = for {
      method <- cpg.method
      param <- method.start.parameter
    } yield (method.name, param.name)
    
    query.toList
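A for comprehension with two generators is rewritten by the Scala compiler into a flatMap/map chain, so a nested traversal is started for every element of the outer one. The sketch below shows this equivalence on plain iterators. ToyMethod and the sample data are hypothetical and only stand in for nodes of a graph.

```scala
// Toy model only: plain iterators stand in for CPG traversals.
object ForSketch {
  final case class ToyMethod(name: String, params: List[String])

  val methods: List[ToyMethod] = List(
    ToyMethod("main", List("argc", "argv")),
    ToyMethod("exit", List("status"))
  )

  // For comprehension over two generators.
  def viaFor: List[(String, String)] =
    (for {
      method <- methods.iterator
      param  <- method.params.iterator
    } yield (method.name, param)).toList

  // The same query written as the flatMap/map chain it desugars to.
  def viaFlatMap: List[(String, String)] =
    methods.iterator
      .flatMap(method => method.params.iterator.map(param => (method.name, param)))
      .toList
}
```

Both definitions produce the same list of (method name, parameter name) pairs.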

class CfgCreator extends AnyRef

Translation of abstract syntax trees into control flow graphs

The problem of translating an abstract syntax tree into a corresponding control flow graph can be formulated as a recursive problem in which sub trees of the syntax tree are translated and their corresponding control flow graphs are connected according to the control flow semantics of the root node. For example, consider the abstract syntax tree for an if-statement:

            ( if )
           /      \
      (x < 10)  (x += 1)
        /  \      /  \
       x    10   x    1

This tree can be translated into a control flow graph, by translating the sub tree rooted in x < 10 and that of x += 1 and connecting their control flow graphs according to the semantics of if:

    [x < 10]----
       |t     f|
    [x += 1]   |
       |       |

The semantics of if dictate that the first sub tree to the left is a condition, which is connected to the CFG of the second sub tree - the body of the if statement - via a control flow edge with the true label (indicated in the illustration by t), and to the CFG of any follow-up code via a false edge (indicated by f).

A problem that becomes immediately apparent in the illustration is that the result of translating a sub tree may leave us with edges for which a source node is known but the destination node depends on parents or siblings that were not considered in the translation. For example, we know that an outgoing edge from [x<10] must exist, but we do not yet know where it should lead. We refer to the set of nodes of the control flow graph with outgoing edges for which the destination node is yet to be determined as the "fringe" of the control flow graph.
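The notion of a fringe can be made concrete with a small model. The following is a simplified sketch and not the implementation of Cfg or CfgCreator: ToyCfg, Edge, single and ifCfg are hypothetical names, nodes are represented as plain strings, and AlwaysEdge is an assumed label for unconditional edges. Appending two graphs connects every fringe node of the first graph to the entry node of the second.

```scala
// Simplified model of a CFG with a fringe. All names here are illustrative.
object FringeSketch {
  sealed trait EdgeType
  case object AlwaysEdge extends EdgeType
  case object TrueEdge   extends EdgeType
  case object FalseEdge  extends EdgeType

  final case class Edge(src: String, dst: String, edgeType: EdgeType)

  // entry:  first node of the graph, if the graph is not empty.
  // fringe: nodes with an outgoing edge whose destination is not yet known,
  //         paired with the type that edge will carry.
  final case class ToyCfg(
      entry: Option[String],
      edges: List[Edge],
      fringe: List[(String, EdgeType)]
  ) {
    // Append: connect every fringe node of this graph to the entry of `other`.
    def ++(other: ToyCfg): ToyCfg = other.entry match {
      case None => this
      case Some(target) =>
        val connecting = fringe.map { case (src, t) => Edge(src, target, t) }
        ToyCfg(entry.orElse(other.entry), edges ++ connecting ++ other.edges, other.fringe)
    }
  }

  // A single node is both the entry node and the only fringe node.
  def single(node: String): ToyCfg =
    ToyCfg(Some(node), Nil, List(node -> AlwaysEdge))

  // if (condition) body
  def ifCfg(condition: ToyCfg, body: ToyCfg): ToyCfg = {
    val trueFringe  = condition.fringe.map { case (n, _) => n -> TrueEdge }
    val falseFringe = condition.fringe.map { case (n, _) => n -> FalseEdge }
    // The true edge leads into the body.
    val withBody = condition.copy(fringe = trueFringe) ++ body
    // The false edge has no destination yet, so it stays in the fringe.
    withBody.copy(fringe = withBody.fringe ++ falseFringe)
  }
}
```

For the example above, ifCfg(single("x < 10"), single("x += 1")) yields one true edge from the condition to the body and a fringe consisting of the body node and the condition node with a pending false edge. Appending a CFG for follow-up code resolves both fringe entries.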

Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new CfgCreator(entryNode: Method)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def cfgFor(node: AstNode): Cfg

    This method dispatches AST nodes by type and calls corresponding conversion methods.

    Attributes
    protected
  6. def cfgForAndExpression(call: Call): Cfg

    The right hand side of a logical AND expression is only evaluated if the left hand side is true as the entire expression can only be true if both expressions are true. This is encoded in the corresponding control flow graph by creating control flow graphs for the left and right hand expressions and appending the two, where the fringe edge type of the left CFG is TrueEdge.

    Attributes
    protected
  7. def cfgForBreakStatement(node: ControlStructure): Cfg

    The CFG for a break or continue statement contains only that statement as a single entry node. The fringe is empty, that is, appending another CFG to the break statement will not result in the creation of an edge from the break statement to the entry point of the other CFG.

    Attributes
    protected
  8. def cfgForConditionalExpression(call: Call): Cfg

    A conditional expression is of the form condition ? trueExpr : falseExpr. We create the corresponding CFG by creating CFGs for the three expressions and adding edges between them. The new entry node is the condition entry node.

    Attributes
    protected
  9. def cfgForContinueStatement(node: ControlStructure): Cfg
    Attributes
    protected
  10. def cfgForControlStructure(node: ControlStructure): Cfg

    A second layer of dispatching for control structures. This could as well be part of cfgFor and has only been placed into a separate function to increase readability.

    Attributes
    protected
  11. def cfgForDoStatement(node: ControlStructure): Cfg

    A Do-Statement is of the form do body while(condition) where body may be empty. We again first calculate the inner CFG as bodyCfg ++ conditionCfg and then connect edges according to the semantics of do-while.

    Attributes
    protected
  12. def cfgForForStatement(node: ControlStructure): Cfg

    A for statement is of the form for(initExpr; condition; loopExpr) body and all four components may be empty. The sequence (condition - body - loopExpr) forms the inner part of the loop, and we calculate the corresponding CFG innerCfg so that it is no longer relevant which of these three actually exist and we still have an entry node for the loop and a fringe.

    Attributes
    protected
  13. def cfgForGotoStatement(node: ControlStructure): Cfg

    A CFG for a goto statement is one containing the goto node as an entry node and an empty fringe. Moreover, we store the goto for dispatching with withResolvedGotos once the CFG for the entire method has been calculated.

    Attributes
    protected
  14. def cfgForIfStatement(node: ControlStructure): Cfg

    CFG creation for if statements of the form if(condition) body, optionally followed by else body2.

    Attributes
    protected
  15. def cfgForJumpTarget(n: JumpTarget): Cfg

    Jump targets ("labels") are included in the CFG. As these should be connected to the next appended CFG, we specify that the label node is both the entry node and the only node in the fringe. This is achieved by calling cfgForSingleNode on the label node. Just like for breaks and continues, we record labels. We store case/default labels separately from other labels, but that is not a relevant implementation detail.

    Attributes
    protected
  16. def cfgForOrExpression(call: Call): Cfg

    Same construction recipe as for the AND expression, just that the fringe edge type of the left CFG is FalseEdge.

    Attributes
    protected
  17. def cfgForReturn(actualRet: Return): Cfg

    Return statements may contain expressions as return values, and therefore, the CFG for a return statement consists of the CFG for calculation of that expression, appended to a CFG containing only the return node, connected with a single edge to the method exit node. The fringe is empty.

    Attributes
    protected
  18. def cfgForSwitchStatement(node: ControlStructure): Cfg

    CFG creation for switch statements of the form switch { case condition: ... }.

    Attributes
    protected
  19. def cfgForTryStatement(node: ControlStructure): Cfg

    CFG creation for try statements of the form try { tryBody } catch { catchBody }, optionally followed by finally { finallyBody }.

    Attributes
    protected
  20. def cfgForWhileStatement(node: ControlStructure): Cfg

    CFG creation for while statements of the form while(condition) body1 else body2 where body1 and the else block are optional.

    Attributes
    protected
  21. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native() @HotSpotIntrinsicCandidate()
  22. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  23. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  24. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  25. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  26. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  27. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  28. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  29. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  30. def run(): Iterator[DiffGraph]

    We return the CFG as a sequence of Diff Graphs that is calculated by first obtaining the CFG for the method and then resolving gotos.

  31. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  32. def toString(): String
    Definition Classes
    AnyRef → Any
  33. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  34. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  35. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
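
The construction described for cfgForAndExpression, cfgForOrExpression and cfgForBreakStatement can be sketched on a simplified model. This is not the code of CfgCreator: ToyCfg, single, jump, andCfg and orCfg are hypothetical names, nodes are plain strings, and AlwaysEdge is an assumed label for unconditional edges. Keeping the left operand in the fringe with the opposite edge type, to model the short-circuit exit, is also an assumption of this sketch.

```scala
// Simplified model of short-circuit expressions and jumps. Names are illustrative.
object ShortCircuitSketch {
  sealed trait EdgeType
  case object AlwaysEdge extends EdgeType
  case object TrueEdge   extends EdgeType
  case object FalseEdge  extends EdgeType

  final case class Edge(src: String, dst: String, edgeType: EdgeType)

  final case class ToyCfg(
      entry: Option[String],
      edges: List[Edge],
      fringe: List[(String, EdgeType)]
  ) {
    // Relabel the pending edges of all fringe nodes.
    def withFringeEdgeType(t: EdgeType): ToyCfg =
      copy(fringe = fringe.map { case (n, _) => n -> t })

    // Append: connect every fringe node of this graph to the entry of `other`.
    def ++(other: ToyCfg): ToyCfg = other.entry match {
      case None => this
      case Some(target) =>
        val connecting = fringe.map { case (src, t) => Edge(src, target, t) }
        ToyCfg(entry.orElse(other.entry), edges ++ connecting ++ other.edges, other.fringe)
    }
  }

  // A single node is both the entry node and the only fringe node.
  def single(node: String): ToyCfg =
    ToyCfg(Some(node), Nil, List(node -> AlwaysEdge))

  // break, continue and goto: an entry node with an empty fringe.
  def jump(node: String): ToyCfg = ToyCfg(Some(node), Nil, Nil)

  // left && right: the right side is only reached via a true edge.
  def andCfg(left: ToyCfg, right: ToyCfg): ToyCfg = {
    val appended = left.withFringeEdgeType(TrueEdge) ++ right
    appended.copy(fringe = appended.fringe ++ left.withFringeEdgeType(FalseEdge).fringe)
  }

  // left || right: the right side is only reached via a false edge.
  def orCfg(left: ToyCfg, right: ToyCfg): ToyCfg = {
    val appended = left.withFringeEdgeType(FalseEdge) ++ right
    appended.copy(fringe = appended.fringe ++ left.withFringeEdgeType(TrueEdge).fringe)
  }
}
```

In this model, andCfg(single("a"), single("b")) contains a single true edge from a to b, while jump("break") ++ single("next") contains no edge at all, because the fringe of the break statement is empty.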

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable]) @Deprecated
    Deprecated
