Package opennlp.tools.postag
Class POSTaggerCrossValidator
java.lang.Object
opennlp.tools.postag.POSTaggerCrossValidator
-
Constructor Summary
Constructors
Constructor and Description
POSTaggerCrossValidator(String languageCode, TrainingParameters trainParam, File tagDictionary, byte[] featureGeneratorBytes, Map<String, Object> resources, Integer tagdicCutoff, String factoryClass, POSTaggerEvaluationMonitor... listeners)
Creates a POSTaggerCrossValidator that builds an n-gram dictionary dynamically.
POSTaggerCrossValidator(String languageCode, TrainingParameters trainParam, POSTaggerFactory factory, POSTaggerEvaluationMonitor... listeners)
Creates a POSTaggerCrossValidator using the given POSTaggerFactory.
-
Method Summary
Modifier and Type, Method, and Description
void evaluate(ObjectStream<POSSample> samples, int nFolds)
Starts the evaluation.
double getWordAccuracy()
Retrieves the accuracy for all iterations.
long getWordCount()
Retrieves the number of words which were validated over all iterations.
-
Constructor Details
-
POSTaggerCrossValidator
public POSTaggerCrossValidator(String languageCode, TrainingParameters trainParam, File tagDictionary, byte[] featureGeneratorBytes, Map<String, Object> resources, Integer tagdicCutoff, String factoryClass, POSTaggerEvaluationMonitor... listeners)
Creates a POSTaggerCrossValidator that builds an n-gram dictionary dynamically. It instantiates a subclass of POSTaggerFactory using the tag and the n-gram dictionaries.
-
POSTaggerCrossValidator
public POSTaggerCrossValidator(String languageCode, TrainingParameters trainParam, POSTaggerFactory factory, POSTaggerEvaluationMonitor... listeners)
Creates a POSTaggerCrossValidator using the given POSTaggerFactory.
-
-
Method Details
-
evaluate
public void evaluate(ObjectStream<POSSample> samples, int nFolds) throws IOException
Starts the evaluation.
Parameters:
samples - the data to train and test
nFolds - number of folds
Throws:
IOException
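The role of nFolds can be illustrated with a small, library-independent sketch of n-fold partitioning. This is not OpenNLP's internal code: the class FoldSketch, its method names, and the round-robin assignment of samples to folds are assumptions made purely for illustration. Each fold holds out one slice of the samples for testing and uses the rest for training.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative n-fold partitioning (hypothetical helper, not part of OpenNLP). */
final class FoldSketch {

    /** Returns the samples held out for testing in the given 0-based fold. */
    static <T> List<T> testPartition(List<T> samples, int nFolds, int fold) {
        List<T> test = new ArrayList<>();
        for (int i = 0; i < samples.size(); i++) {
            // Assumption: samples are assigned to folds round-robin by index.
            if (i % nFolds == fold) {
                test.add(samples.get(i));
            }
        }
        return test;
    }

    /** Returns the samples used for training in the given 0-based fold. */
    static <T> List<T> trainPartition(List<T> samples, int nFolds, int fold) {
        List<T> train = new ArrayList<>();
        for (int i = 0; i < samples.size(); i++) {
            if (i % nFolds != fold) {
                train.add(samples.get(i));
            }
        }
        return train;
    }

    public static void main(String[] args) {
        List<String> samples = List.of("s0", "s1", "s2", "s3", "s4", "s5");
        int nFolds = 3;
        // Every sample appears in exactly one test partition across all folds.
        for (int fold = 0; fold < nFolds; fold++) {
            System.out.println("fold " + fold
                    + " train=" + trainPartition(samples, nFolds, fold)
                    + " test=" + testPartition(samples, nFolds, fold));
        }
    }
}
```

In a real cross-validation run, a model would be trained on each training partition and scored on the matching test partition, once per fold.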
-
getWordAccuracy
public double getWordAccuracy()
Retrieves the accuracy for all iterations.
Returns:
the word accuracy
-
getWordCount
public long getWordCount()
Retrieves the number of words which were validated over all iterations. The result is the number of folds multiplied by the total number of words.
Returns:
the word count
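One plausible way to combine per-fold results into a single accuracy and word count is a word-count-weighted mean, sketched below. This page does not state how the values are aggregated, so the weighting scheme, the class WeightedAccuracySketch, and the addFold method are all assumptions for illustration only.

```java
/** Illustrative aggregation of per-fold results (hypothetical, not part of OpenNLP). */
final class WeightedAccuracySketch {

    private double weightedSum; // sum of foldAccuracy * foldWordCount
    private long wordCount;     // sum of words evaluated across the folds added so far

    /** Records the accuracy of one fold and the number of words it evaluated. */
    void addFold(double foldAccuracy, long foldWordCount) {
        weightedSum += foldAccuracy * foldWordCount;
        wordCount += foldWordCount;
    }

    /** Assumption: overall accuracy is the mean weighted by words per fold. */
    double getWordAccuracy() {
        return wordCount == 0 ? 0.0 : weightedSum / wordCount;
    }

    /** Total number of words evaluated across the folds added so far. */
    long getWordCount() {
        return wordCount;
    }

    public static void main(String[] args) {
        WeightedAccuracySketch acc = new WeightedAccuracySketch();
        acc.addFold(0.75, 100); // made-up figures for illustration
        acc.addFold(0.5, 300);
        System.out.println("accuracy=" + acc.getWordAccuracy()
                + " words=" + acc.getWordCount());
    }
}
```

Weighting by word count keeps a small fold from influencing the overall figure as much as a large one, which a plain average of fold accuracies would not do.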
-