Class RegexpMatcher
- java.lang.Object
  - org.apache.pinot.segment.local.utils.nativefst.utils.RegexpMatcher
-
public class RegexpMatcher extends Object
RegexpMatcher is a helper that retrieves the values matching a given regexp query. The regexp query is converted into an automaton, and the matching algorithm is run against the FST. The two main functions of this class are regexMatchOnFST(), which runs the matching on the FST (see the method comments for details), and match(input), which builds the automaton and matches the given input.
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
class | RegexpMatcher.Path | Represents a path in the FST traversal directed by the automaton
-
Constructor Summary
Constructors
Constructor | Description
RegexpMatcher(String regexQuery, FST fst, org.roaringbitmap.IntConsumer dest) |
-
Method Summary
Modifier and Type | Method | Description
boolean | match(String input) |
static void | regexMatch(String regexQuery, FST fst, org.roaringbitmap.IntConsumer dest) |
void | regexMatchOnFST() | Runs matching on the automaton built from regexQuery and the FST.
-
Method Detail
-
regexMatch
public static void regexMatch(String regexQuery, FST fst, org.roaringbitmap.IntConsumer dest)
-
match
public boolean match(String input)
-
regexMatchOnFST
public void regexMatchOnFST()
This function runs matching on the automaton built from regexQuery and the FST. The FST maps a key (String) to a value (Long). Both are state machines, and each state transition is driven by an input character. The algorithm starts with a queue containing the pair (automaton start state, FST start node). At each step an entry is popped from the queue:
1) If the automaton state is an accept state and the FST node is final (i.e. an end node), the value stored for that FST node is added to the result set.
2) Otherwise, the next set of transitions on the automaton is gathered; for each transition, the target node for that character is looked up from the current FST node, and each resulting (automaton state, FST node) pair is added to the queue.
3) The process is bound to terminate because every step makes progress through the FST (which is a DAG) towards its final nodes.
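The traversal described above can be sketched on toy data structures. This is a minimal, self-contained illustration under stated assumptions, not Pinot's implementation: FstNode (a plain trie holding a Long on final nodes), AutoState (a hand-built automaton), Pair, insert, matchAll and demo are all hypothetical names introduced here, and the automaton for the pattern ab* is wired by hand rather than compiled from a regexp. One further assumption: the sketch keeps following transitions after recording a match, so that longer keys sharing a matched prefix are still visited.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ProductTraversalSketch {
  /** Hypothetical stand-in for an FST node: a trie node with an optional value. */
  static final class FstNode {
    final Map<Character, FstNode> arcs = new HashMap<>();
    Long value; // non-null only on final nodes
  }

  /** Hypothetical stand-in for an automaton state with per-character transitions. */
  static final class AutoState {
    final Map<Character, AutoState> transitions = new HashMap<>();
    boolean accept;
  }

  /** One queue entry: a pair of (automaton state, FST node). */
  static final class Pair {
    final AutoState state;
    final FstNode node;

    Pair(AutoState state, FstNode node) {
      this.state = state;
      this.node = node;
    }
  }

  /** Adds key -> value to the toy trie. */
  static void insert(FstNode root, String key, long value) {
    FstNode cur = root;
    for (char c : key.toCharArray()) {
      cur = cur.arcs.computeIfAbsent(c, k -> new FstNode());
    }
    cur.value = value;
  }

  /** Walks the automaton and the trie in lockstep, collecting matched values. */
  static List<Long> matchAll(AutoState start, FstNode root) {
    List<Long> results = new ArrayList<>();
    Deque<Pair> queue = new ArrayDeque<>();
    queue.add(new Pair(start, root));
    while (!queue.isEmpty()) {
      Pair p = queue.remove();
      // Accepting automaton state on a final node: record the stored value.
      if (p.state.accept && p.node.value != null) {
        results.add(p.node.value);
      }
      // Follow only the characters that both machines can consume.
      for (Map.Entry<Character, AutoState> t : p.state.transitions.entrySet()) {
        FstNode next = p.node.arcs.get(t.getKey());
        if (next != null) {
          queue.add(new Pair(t.getValue(), next));
        }
      }
    }
    return results;
  }

  /** Builds a small trie and a hand-wired automaton for the pattern ab*. */
  static List<Long> demo() {
    FstNode root = new FstNode();
    insert(root, "a", 1L);
    insert(root, "ab", 2L);
    insert(root, "abb", 3L);
    insert(root, "ac", 4L);
    insert(root, "b", 5L);

    AutoState s0 = new AutoState();
    AutoState s1 = new AutoState();
    s1.accept = true;
    s0.transitions.put('a', s1); // leading 'a'
    s1.transitions.put('b', s1); // zero or more 'b'
    return matchAll(s0, root);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [1, 2, 3]
  }
}
```

The loop terminates for the same reason given in the description: every queued pair sits one character deeper in the trie than its parent, and the trie has finite depth. Keys such as "ac" and "b" are never reached because the automaton has no transition that matches their arcs.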