Class RegexpMatcher
java.lang.Object
    org.apache.pinot.segment.local.utils.fst.RegexpMatcher
-
public class RegexpMatcher extends Object
RegexpMatcher is a helper that retrieves the values matching a given regexp query. The regexp query is converted into an automaton, and the matching algorithm is run against the FST. The two main functions of this class are:
- regexMatchOnFST(): runs matching on the FST (see the function comments for more details).
- match(input): builds the automaton and matches the given input.
-
Nested Class Summary
Nested Classes
Modifier and Type    Class                    Description
static class         RegexpMatcher.Path<T>
-
Constructor Summary
Constructors
Constructor                                                                   Description
RegexpMatcher(String regexQuery, org.apache.lucene.util.fst.FST<Long> fst)
-
Method Summary
Methods
Modifier and Type    Method                                                                        Description
boolean              match(String input)
static List<Long>    regexMatch(String regexQuery, org.apache.lucene.util.fst.FST<Long> fst)
List<Long>           regexMatchOnFST()                                                             This function runs matching on the automaton built from regexQuery and the FST.
-
Method Detail
-
regexMatch
public static List<Long> regexMatch(String regexQuery, org.apache.lucene.util.fst.FST<Long> fst) throws IOException
- Throws:
IOException
-
match
public boolean match(String input)
-
regexMatchOnFST
public List<Long> regexMatchOnFST() throws IOException
This function runs matching on the automaton built from regexQuery and the FST. The FST maps a key (String) to a value (Long). Both the automaton and the FST are state machines, and each state transition is driven by an input character. The algorithm starts with a queue containing the pair (automaton start node, FST start node). At each step an entry is popped from the queue:
1) If the automaton state is an accept state and the FST node is final (i.e. an end node), the value stored for that FST node is added to the result set.
2) Otherwise, the next set of transitions on the automaton is gathered, and for each transition the target node for that character is looked up from the current FST node; the resulting (automaton state, FST node) pairs are added to the queue.
3) The process is bound to complete because every step makes progress through the FST (which is a DAG) towards its final nodes.
- Returns:
- Throws:
IOException
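The queue-based walk described for regexMatchOnFST() can be illustrated with a small self-contained sketch. It is not the Pinot implementation and does not use Lucene: `Node` (a plain trie standing in for the FST), `Transition`, `Pair`, `put`, `walk` and `demo` are all hypothetical names introduced here, and the automaton is a hand-built DFA for the pattern `hel.*` rather than one compiled from a regexp. One deliberate choice of the sketch: it keeps expanding transitions after recording a match, so that longer keys sharing a matched prefix are still reached.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class ProductWalkSketch {
  /** Hypothetical stand-in for an FST node: labelled arcs plus an optional output. */
  static final class Node {
    final TreeMap<Character, Node> arcs = new TreeMap<>();
    Long value; // non-null only when a key ends at this node (a "final" node)
  }

  /** Hypothetical automaton transition covering the character range [min, max]. */
  static final class Transition {
    final char min;
    final char max;
    final int dest;

    Transition(char min, char max, int dest) {
      this.min = min;
      this.max = max;
      this.dest = dest;
    }
  }

  /** One queue entry: an automaton state paired with an FST node. */
  static final class Pair {
    final int state;
    final Node node;

    Pair(int state, Node node) {
      this.state = state;
      this.node = node;
    }
  }

  /** Inserts key -> value into the trie that stands in for the FST. */
  static void put(Node root, String key, long value) {
    Node cur = root;
    for (char c : key.toCharArray()) {
      cur = cur.arcs.computeIfAbsent(c, k -> new Node());
    }
    cur.value = value;
  }

  /** Walks the automaton and the trie in lockstep, starting from (state 0, root). */
  static List<Long> walk(Map<Integer, List<Transition>> transitions, Set<Integer> accept, Node root) {
    List<Long> results = new ArrayList<>();
    Deque<Pair> queue = new ArrayDeque<>();
    queue.add(new Pair(0, root));
    while (!queue.isEmpty()) {
      Pair p = queue.remove();
      // Accepting automaton state + final node: record the stored value.
      if (accept.contains(p.state) && p.node.value != null) {
        results.add(p.node.value);
      }
      // For each automaton transition, follow every arc whose label lies in its range.
      for (Transition t : transitions.getOrDefault(p.state, List.of())) {
        for (Node target : p.node.arcs.subMap(t.min, true, t.max, true).values()) {
          queue.add(new Pair(t.dest, target));
        }
      }
    }
    return results;
  }

  static List<Long> demo() {
    Node root = new Node();
    put(root, "he", 4L);
    put(root, "hello", 1L);
    put(root, "help", 2L);
    put(root, "world", 3L);

    // Hand-built DFA for hel.* : 0 -h-> 1 -e-> 2 -l-> 3, and 3 loops on any character.
    Map<Integer, List<Transition>> dfa = new HashMap<>();
    dfa.put(0, List.of(new Transition('h', 'h', 1)));
    dfa.put(1, List.of(new Transition('e', 'e', 2)));
    dfa.put(2, List.of(new Transition('l', 'l', 3)));
    dfa.put(3, List.of(new Transition(Character.MIN_VALUE, Character.MAX_VALUE, 3)));
    return walk(dfa, Set.of(3), root);
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

In this sketch the values for "hello" and "help" are collected, while "he" (the automaton is not yet in an accept state) and "world" (no automaton transition on 'w') are skipped. The loop terminates even though the DFA has a self-loop, because every queued pair sits one level deeper in the acyclic trie.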
-