Class Dictionary

java.lang.Object
opennlp.tools.dictionary.Dictionary
All Implemented Interfaces:
Iterable<StringList>, SerializableArtifact

public class Dictionary extends Object implements Iterable<StringList>, SerializableArtifact
A dictionary of entries, where each entry is a StringList of tokens.
  • Constructor Details

  • Method Details

    • put

      public void put(StringList tokens)
      Adds the tokens to the dictionary as one new entry.
      Parameters:
      tokens - the new entry
    • getMinTokenCount

      public int getMinTokenCount()
      Returns:
      minimum token count in the dictionary
    • getMaxTokenCount

      public int getMaxTokenCount()
      Returns:
      maximum token count in the dictionary
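      To make "token count" concrete: it is the number of tokens in a single entry, and the two getters report the smallest and largest such count across all entries. The sketch below is a stand-in that uses only java.util and is not OpenNLP code. The class TokenCountSketch and its methods are hypothetical, and returning 0 when there are no entries is an assumption of the sketch, since this page does not say what the real methods return in that case.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in for illustration only; this is not OpenNLP code.
final class TokenCountSketch {

    // Smallest number of tokens in any entry.
    // Assumption: 0 is returned when there are no entries.
    static int minTokenCount(List<List<String>> entries) {
        return entries.stream().mapToInt(List::size).min().orElse(0);
    }

    // Largest number of tokens in any entry (0 when there are no entries).
    static int maxTokenCount(List<List<String>> entries) {
        return entries.stream().mapToInt(List::size).max().orElse(0);
    }

    public static void main(String[] args) {
        List<List<String>> entries = Arrays.asList(
                Arrays.asList("New", "York"),
                Arrays.asList("Berlin"),
                Arrays.asList("Rio", "de", "Janeiro"));
        System.out.println("min = " + minTokenCount(entries));
        System.out.println("max = " + maxTokenCount(entries));
    }
}
```

      With the three sample entries, the shortest entry has one token and the longest has three.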
    • contains

      public boolean contains(StringList tokens)
      Checks if this dictionary has the given entry.
      Parameters:
      tokens - the entry to look up
      Returns:
      true if it contains the entry, otherwise false
    • remove

      public void remove(StringList tokens)
      Removes the given tokens from the current instance.
      Parameters:
      tokens - the entry to remove
    • iterator

      public Iterator<StringList> iterator()
      Retrieves an Iterator over all entries, each returned as a StringList.
      Specified by:
      iterator in interface Iterable<StringList>
      Returns:
      an Iterator over the entries
    • size

      public int size()
      Retrieves the number of entries in the current instance.
      Returns:
      number of entries
    • serialize

      public void serialize(OutputStream out) throws IOException
      Writes the current instance to the given OutputStream.
      Parameters:
      out - OutputStream
      Throws:
      IOException
    • equals

      public boolean equals(Object obj)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • parseOneEntryPerLine

      public static Dictionary parseOneEntryPerLine(Reader in) throws IOException
      Reads a dictionary that has one entry per line. The tokens inside an entry are whitespace-delimited.
      Parameters:
      in - Reader
      Returns:
      the parsed dictionary
      Throws:
      IOException
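      The one-entry-per-line format can be illustrated without OpenNLP on the classpath. The sketch below is a stand-in built on java.io and java.util, not the library's implementation: EntryLineSketch and parse are hypothetical names, entries are modelled as List<String> rather than StringList, and skipping blank lines is an assumption of the sketch.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for illustration only; this is not OpenNLP code.
final class EntryLineSketch {

    // Turns each line into one entry by splitting it on runs of whitespace.
    static List<List<String>> parse(Reader in) {
        List<List<String>> entries = new ArrayList<>();
        new BufferedReader(in).lines().forEach(line -> {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) { // assumption: blank lines yield no entry
                entries.add(Arrays.asList(trimmed.split("\\s+")));
            }
        });
        return entries;
    }

    public static void main(String[] args) {
        String text = "New York\nBerlin\nRio de Janeiro\n";
        for (List<String> entry : parse(new StringReader(text))) {
            System.out.println(entry.size() + " token(s): " + entry);
        }
    }
}
```

      Each line of the sample input becomes one entry, so "Rio de Janeiro" is a single three-token entry rather than three separate entries.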
    • asStringSet

      public Set<String> asStringSet()
      Gets this dictionary as a Set<String>. Only the iterator(), size() and contains(Object) methods are implemented. If an entry consists of multiple tokens, only its first token will be part of the Set.
      Returns:
      a Set containing the entries of this dictionary
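      The first-token rule is easy to overlook, so here is a small stand-in that mimics it with plain collections. It is not OpenNLP code: FirstTokenSetSketch and firstTokens are hypothetical names, entries are modelled as List<String>, and the sketch builds a copy, whereas the description above suggests the real method returns a restricted Set tied to the dictionary.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Stand-in for illustration only; this is not OpenNLP code.
final class FirstTokenSetSketch {

    // Collects the first token of every non-empty entry.
    static Set<String> firstTokens(List<List<String>> entries) {
        Set<String> view = new LinkedHashSet<>();
        for (List<String> entry : entries) {
            if (!entry.isEmpty()) {
                view.add(entry.get(0));
            }
        }
        return view;
    }

    public static void main(String[] args) {
        List<List<String>> entries = Arrays.asList(
                Arrays.asList("New", "York"),
                Arrays.asList("Berlin"),
                Arrays.asList("Rio", "de", "Janeiro"));
        Set<String> view = firstTokens(entries);
        System.out.println(view.contains("New"));  // first token of an entry
        System.out.println(view.contains("York")); // later token of an entry
    }
}
```

      In this sketch, "York" is not a member because it is never the first token of an entry, which is why a set of this kind is best suited to dictionaries with single-token entries.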
    • getArtifactSerializerClass

      public Class<?> getArtifactSerializerClass()
      Gets the serializer class for Dictionary.
      Specified by:
      getArtifactSerializerClass in interface SerializableArtifact
      Returns:
      DictionarySerializer