Class FSTBuilder
java.lang.Object
    org.apache.pinot.segment.local.utils.nativefst.builder.FSTBuilder
public final class FSTBuilder extends Object
Fast, memory-conservative finite state transducer builder. It returns an in-memory FST that trades off construction speed against memory consumption. Use serializers to compress the returned automaton into a more compact form.
See Also:
FSTSerializer
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | FSTBuilder.InfoEntry | Debug and information constants.
-
Field Summary
Fields
Modifier and Type | Field | Description
static Comparator<byte[]> | LEXICAL_ORDERING | A comparator comparing full byte arrays.
-
Constructor Summary
Constructors
Constructor | Description
FSTBuilder() |
FSTBuilder(int bufferGrowthSize) |
-
Method Summary
Modifier and Type | Method | Description
void | add(byte[] sequence, int start, int len, int outputSymbol) | Add a single sequence of bytes to the FST.
static FST | build(byte[][] input, int[] outputSymbols) | Build a minimal, deterministic automaton from a sorted list of byte sequences.
static FST | build(Iterable<byte[]> input, int[] outputSymbols) | Build a minimal, deterministic automaton from an iterable list of byte sequences.
static FST | buildFST(SortedMap<String,Integer> input) |
FST | complete() |
-
Field Detail
-
LEXICAL_ORDERING
public static final Comparator<byte[]> LEXICAL_ORDERING
A comparator comparing full byte arrays. Unsigned byte comparisons ('C'-locale).
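To make the unsigned comparison concrete, the following is a minimal sketch of a comparator with the ordering described above. It is a hypothetical stand-in written for illustration (the class name `UnsignedByteOrder` and the field `UNSIGNED_LEXICAL` are not part of Pinot) and is not claimed to be the actual implementation of `LEXICAL_ORDERING`.

```java
import java.util.Comparator;

final class UnsignedByteOrder {
  // Hypothetical stand-in for illustration; not FSTBuilder.LEXICAL_ORDERING itself.
  public static final Comparator<byte[]> UNSIGNED_LEXICAL = new Comparator<byte[]>() {
    @Override
    public int compare(byte[] a, byte[] b) {
      int n = Math.min(a.length, b.length);
      for (int i = 0; i < n; i++) {
        // Mask to 0..255 so that bytes >= 0x80 sort after ASCII bytes.
        int x = a[i] & 0xFF;
        int y = b[i] & 0xFF;
        if (x != y) {
          return x - y;
        }
      }
      // All shared positions are equal: the shorter array sorts first.
      return a.length - b.length;
    }
  };

  public static void main(String[] args) {
    byte[] ascii = {0x7F};
    byte[] high = {(byte) 0x80};
    // Unsigned order: 0x7F (127) sorts before 0x80 (128).
    System.out.println(UNSIGNED_LEXICAL.compare(ascii, high) < 0); // prints true
    // Signed order disagrees: (byte) 0x80 is -128.
    System.out.println(Byte.compare(ascii[0], high[0]) < 0); // prints false
  }
}
```

The difference matters for non-ASCII input such as UTF-8 encoded text, where lead and continuation bytes are negative when read as signed Java bytes.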
-
Method Detail
-
build
public static FST build(byte[][] input, int[] outputSymbols)
Build a minimal, deterministic automaton from a sorted list of byte sequences.
Parameters:
input - Input sequences to build automaton from.
Returns:
Returns the automaton encoding all input sequences.
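Because this overload expects sorted input, callers with unsorted data need to sort it first while keeping each output symbol attached to its sequence. The sketch below shows one way to do that. It assumes that `outputSymbols[i]` belongs to `input[i]` (the page does not state this explicitly) and that the required order is the unsigned byte order described for `LEXICAL_ORDERING`. The class `SortedFstInput` is hypothetical, and the call to `FSTBuilder.build` appears only in a comment so that the example runs without Pinot on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SortedFstInput {
  public final byte[][] sequences;
  public final int[] outputs;

  private SortedFstInput(byte[][] sequences, int[] outputs) {
    this.sequences = sequences;
    this.outputs = outputs;
  }

  /** Sorts sequences in unsigned byte order and reorders the outputs to match. */
  public static SortedFstInput of(byte[][] input, int[] outputSymbols) {
    if (input.length != outputSymbols.length) {
      throw new IllegalArgumentException("input and outputSymbols differ in length");
    }
    // Sort a permutation of indices so both arrays can be reordered together.
    Integer[] order = new Integer[input.length];
    for (int i = 0; i < order.length; i++) {
      order[i] = i;
    }
    // Arrays.compareUnsigned(byte[], byte[]) requires Java 9 or later.
    Arrays.sort(order, (i, j) -> Arrays.compareUnsigned(input[i], input[j]));

    byte[][] sequences = new byte[input.length][];
    int[] outputs = new int[input.length];
    for (int k = 0; k < order.length; k++) {
      sequences[k] = input[order[k]];
      outputs[k] = outputSymbols[order[k]];
    }
    return new SortedFstInput(sequences, outputs);
  }

  public static void main(String[] args) {
    byte[][] terms = {
        "pear".getBytes(StandardCharsets.UTF_8),
        "apple".getBytes(StandardCharsets.UTF_8),
        "fig".getBytes(StandardCharsets.UTF_8)
    };
    int[] ids = {10, 20, 30};
    SortedFstInput sorted = of(terms, ids);
    for (int i = 0; i < sorted.sequences.length; i++) {
      // Prints apple=20, fig=30, pear=10 on separate lines.
      System.out.println(new String(sorted.sequences[i], StandardCharsets.UTF_8) + "=" + sorted.outputs[i]);
    }
    // With Pinot on the classpath, the sorted arrays could then be passed on:
    // FST fst = FSTBuilder.build(sorted.sequences, sorted.outputs);
  }
}
```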
-
build
public static FST build(Iterable<byte[]> input, int[] outputSymbols)
Build a minimal, deterministic automaton from an iterable list of byte sequences.
Parameters:
input - Input sequences to build automaton from.
Returns:
Returns the automaton encoding all input sequences.
-
add
public void add(byte[] sequence, int start, int len, int outputSymbol)
Add a single sequence of bytes to the FST. The input must be lexicographically greater than any previously added sequence.
Parameters:
sequence - The array holding input sequence of bytes.
start - Starting offset (inclusive)
len - Length of the input sequence (at least 1 byte).
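The ordering and length requirements above can be checked on the caller side before each call. The following is a minimal sketch of such a check. It assumes that "lexicographically greater" means strictly greater in unsigned byte order, so duplicates are rejected. The class `OrderedAddGuard` is hypothetical and not part of Pinot, and the `FSTBuilder` calls are shown only in comments so that the example runs on its own.

```java
import java.util.Arrays;

final class OrderedAddGuard {
  // Copy of the last accepted slice; null before the first call.
  private byte[] previous;

  /** Throws unless the slice is non-empty and strictly greater than the previous slice. */
  public void check(byte[] sequence, int start, int len) {
    if (len < 1) {
      throw new IllegalArgumentException("len must be at least 1");
    }
    // Range-based Arrays.compareUnsigned requires Java 9 or later.
    if (previous != null
        && Arrays.compareUnsigned(previous, 0, previous.length, sequence, start, start + len) >= 0) {
      throw new IllegalStateException("sequence is not greater than the previous one");
    }
    previous = Arrays.copyOfRange(sequence, start, start + len);
  }

  public static void main(String[] args) {
    OrderedAddGuard guard = new OrderedAddGuard();
    byte[][] sorted = { {'a'}, {'a', 'b'}, {'b'} };
    // With Pinot on the classpath: FSTBuilder builder = new FSTBuilder();
    for (int i = 0; i < sorted.length; i++) {
      guard.check(sorted[i], 0, sorted[i].length);
      // builder.add(sorted[i], 0, sorted[i].length, i);
    }
    // FST fst = builder.complete();
    try {
      guard.check(new byte[] {'a'}, 0, 1); // out of order: "a" after "b"
    } catch (IllegalStateException e) {
      System.out.println("rejected"); // prints rejected
    }
  }
}
```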
-
complete
public FST complete()
Returns:
Finalizes the construction of the automaton and returns it.
-