Package opennlp.tools.tokenize
Class TokenizerModel
- java.lang.Object
-
- opennlp.tools.util.model.BaseModel
-
- opennlp.tools.tokenize.TokenizerModel
-
- All Implemented Interfaces:
ArtifactProvider
public final class TokenizerModel extends BaseModel
The TokenizerModel is the model used by a learnable Tokenizer.
See Also:
TokenizerME
-
-
Field Summary
-
Fields inherited from class opennlp.tools.util.model.BaseModel
TRAINING_CUTOFF_PROPERTY, TRAINING_EVENTHASH_PROPERTY, TRAINING_ITERATIONS_PROPERTY
-
-
Constructor Summary
Constructor - Description
TokenizerModel(java.io.File modelFile) - Initializes the current instance.
TokenizerModel(java.io.InputStream in) - Initializes the current instance.
TokenizerModel(java.lang.String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization) - Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
TokenizerModel(java.lang.String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries) - Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
TokenizerModel(java.lang.String language, MaxentModel tokenizerMaxentModel, Dictionary abbreviations, boolean useAlphaNumericOptimization, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries) - Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
TokenizerModel(java.net.URL modelURL) - Initializes the current instance.
TokenizerModel(MaxentModel tokenizerModel, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries, TokenizerFactory tokenizerFactory) - Initializes the current instance.
-
Method Summary
Modifier and Type - Method
Dictionary - getAbbreviations()
TokenizerFactory - getFactory()
MaxentModel - getMaxentModel()
static void - main(java.lang.String[] args)
boolean - useAlphaNumericOptimization()
-
Methods inherited from class opennlp.tools.util.model.BaseModel
getArtifact, getLanguage, getManifestProperty, getVersion, isLoadedFromSerialized, serialize
-
-
-
-
Constructor Detail
-
TokenizerModel
public TokenizerModel(MaxentModel tokenizerModel, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries, TokenizerFactory tokenizerFactory)
Initializes the current instance.
Parameters:
tokenizerModel - the model
manifestInfoEntries - the manifest
tokenizerFactory - the factory
-
TokenizerModel
public TokenizerModel(java.lang.String language, MaxentModel tokenizerMaxentModel, Dictionary abbreviations, boolean useAlphaNumericOptimization, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Initializes the current instance.
Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
abbreviations - the dictionary containing the abbreviations
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is disabled
manifestInfoEntries - the additional metadata to be written into the manifest
-
TokenizerModel
public TokenizerModel(java.lang.String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Initializes the current instance.
Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is disabled
manifestInfoEntries - the additional metadata to be written into the manifest
-
TokenizerModel
public TokenizerModel(java.lang.String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Initializes the current instance.
Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is disabled
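The deprecation notes above point to TokenizerModel(MaxentModel, Map, TokenizerFactory). The following is a minimal migration sketch, not the library's own code: it bundles the arguments of a deprecated constructor into a TokenizerFactory and passes that to the recommended constructor. The class and method names (TokenizerModelMigrationExample, toFactory, build) are hypothetical. The sketch also assumes that TokenizerFactory has a four-argument constructor taking a language code, an abbreviation Dictionary, the optimization flag, and an alphanumeric Pattern (null here, on the assumption that the factory then falls back to a default), and that MaxentModel and Dictionary live in the packages imported below. Check these against the OpenNLP version in use.

```java
import java.util.Map;

import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.ml.model.MaxentModel;
import opennlp.tools.tokenize.TokenizerFactory;
import opennlp.tools.tokenize.TokenizerModel;

class TokenizerModelMigrationExample {

    /**
     * Bundles the settings that the deprecated constructors took as separate
     * arguments into a TokenizerFactory. The abbreviations argument may be null
     * when no abbreviation dictionary is used.
     */
    static TokenizerFactory toFactory(String language,
                                      Dictionary abbreviations,
                                      boolean useAlphaNumericOptimization) {
        // Assumption: passing null as the alphanumeric pattern makes the
        // factory use its default pattern for the given language.
        return new TokenizerFactory(language, abbreviations,
                useAlphaNumericOptimization, null);
    }

    /**
     * Builds a model through the non-deprecated constructor. The MaxentModel is
     * assumed to come from an earlier training step that is not shown here.
     */
    static TokenizerModel build(String language,
                                MaxentModel tokenizerMaxentModel,
                                Dictionary abbreviations,
                                boolean useAlphaNumericOptimization,
                                Map<String, String> manifestInfoEntries) {
        TokenizerFactory factory =
                toFactory(language, abbreviations, useAlphaNumericOptimization);
        return new TokenizerModel(tokenizerMaxentModel, manifestInfoEntries, factory);
    }
}
```

With this shape, call sites that used a deprecated constructor can switch to build(...) with the same arguments, while new code can construct the TokenizerFactory directly.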
-
TokenizerModel
public TokenizerModel(java.io.InputStream in) throws java.io.IOException, InvalidFormatException
Initializes the current instance.
Parameters:
in - the input stream to load the model from
Throws:
java.io.IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format
-
TokenizerModel
public TokenizerModel(java.io.File modelFile) throws java.io.IOException, InvalidFormatException
Initializes the current instance.
Parameters:
modelFile - the file containing the tokenizer model
Throws:
java.io.IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format
-
TokenizerModel
public TokenizerModel(java.net.URL modelURL) throws java.io.IOException, InvalidFormatException
Initializes the current instance.
Parameters:
modelURL - the URL pointing to the tokenizer model
Throws:
java.io.IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format
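The three loading constructors above (InputStream, File, URL) could be used roughly as in the following sketch, which loads a serialized model and hands it to TokenizerME, the class listed under See Also. The class name LoadTokenizerModelExample, the helper method names, the default file name en-token.bin, and the sample sentence are all hypothetical. The sketch assumes the OpenNLP library is on the classpath and that a trained tokenizer model file is available.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;

import opennlp.tools.tokenize.TokenizerME;
import opennlp.tools.tokenize.TokenizerModel;

class LoadTokenizerModelExample {

    /** Loads a model straight from a file on disk. */
    static TokenizerModel loadFromFile(File modelFile) throws IOException {
        return new TokenizerModel(modelFile);
    }

    /** Loads a model from a stream that this method opens and closes itself. */
    static TokenizerModel loadFromStream(File modelFile) throws IOException {
        try (InputStream in =
                 new BufferedInputStream(Files.newInputStream(modelFile.toPath()))) {
            return new TokenizerModel(in);
        }
    }

    /** Loads a model from a URL, for example a classpath resource. */
    static TokenizerModel loadFromUrl(URL modelUrl) throws IOException {
        return new TokenizerModel(modelUrl);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; pass the real model location as args[0].
        File modelFile = new File(args.length > 0 ? args[0] : "en-token.bin");
        if (!modelFile.isFile()) {
            System.err.println("Model file not found: " + modelFile);
            return;
        }

        TokenizerModel model = loadFromFile(modelFile);
        TokenizerME tokenizer = new TokenizerME(model);

        // Print one token per line for an illustrative sentence.
        for (String token : tokenizer.tokenize("Mr. Smith isn't here today.")) {
            System.out.println(token);
        }
    }
}
```

The declared InvalidFormatException is covered by the throws IOException clauses in this sketch, on the understanding that it is a subtype of IOException. Callers that need to distinguish a malformed model from a plain I/O failure can catch it separately before IOException.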
-
-
Method Detail
-
getFactory
public TokenizerFactory getFactory()
-
getMaxentModel
public MaxentModel getMaxentModel()
-
getAbbreviations
public Dictionary getAbbreviations()
-
useAlphaNumericOptimization
public boolean useAlphaNumericOptimization()
-
main
public static void main(java.lang.String[] args) throws java.io.IOException
Throws:
java.io.IOException
-
-