public abstract class Lexicon extends Object
Lexicon is a collection of WordElement objects. Unlike the lexicon in simplenlg V3, it does not do any morphological processing. Information about a WordElement can be obtained from a database (NIHDBLexicon) or from an XML file (XMLLexicon). Simplenlg V4 comes with a default (XML) lexicon, which is retrieved by the getDefaultLexicon method.
There are several ways of retrieving words. If in doubt, use lookupWord. More control is available from the getXXXX methods, which allow words to be retrieved in several ways, including:
- base form and LexicalCategory; for example "university" and Noun
- variant (inflected form or spelling variant) and LexicalCategory; for example "universities" and Noun
For each type of lookup, there are three methods:
- getWords: gets all matching WordElement objects in the Lexicon. For example, getWords("dog") would return a List of two WordElement objects, one for the noun "dog" and one for the verb "dog". If there are no matching entries in the lexicon, this method returns an empty collection.
- getWord: gets a single matching WordElement in the Lexicon. For example, getWord("dog") would return a WordElement for either the noun "dog" or the verb "dog" (which one is unpredictable). If there are no matching entries in the lexicon, this method will create a default WordElement based on the information specified.
- hasWord: returns true if the Lexicon contains at least one matching WordElement.
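The difference between the three methods can be sketched with a small in-memory model. This is only an illustration of the behaviour described above: ToyWord and ToyLexicon are hypothetical stand-ins, not the SimpleNLG WordElement and Lexicon classes, and categories are plain strings rather than LexicalCategory values. Whether a default word created by getWord is also stored in the lexicon is not stated here, so the sketch assumes it is not.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for WordElement: a base form plus a category label.
class ToyWord {
    final String baseForm;
    final String category; // "NOUN", "VERB", or "ANY" for a default word

    ToyWord(String baseForm, String category) {
        this.baseForm = baseForm;
        this.category = category;
    }
}

// Hypothetical in-memory lexicon mirroring the getWords / getWord / hasWord trio.
class ToyLexicon {
    private final Map<String, List<ToyWord>> byBaseForm = new HashMap<>();

    void add(String baseForm, String category) {
        byBaseForm.computeIfAbsent(baseForm, key -> new ArrayList<>())
                  .add(new ToyWord(baseForm, category));
    }

    // All matching entries; an empty list (never null) when nothing matches.
    List<ToyWord> getWords(String baseForm) {
        List<ToyWord> result = new ArrayList<>();
        List<ToyWord> matches = byBaseForm.get(baseForm);
        if (matches != null) {
            result.addAll(matches);
        }
        return result;
    }

    // One matching entry. This sketch returns the first match, but callers
    // should not rely on which one they get. If nothing matches, a default
    // word is created from the arguments (assumption: it is not stored).
    ToyWord getWord(String baseForm) {
        List<ToyWord> matches = getWords(baseForm);
        if (matches.isEmpty()) {
            return new ToyWord(baseForm, "ANY");
        }
        return matches.get(0);
    }

    // True if at least one entry matches.
    boolean hasWord(String baseForm) {
        return !getWords(baseForm).isEmpty();
    }
}
```

With "dog" registered as both a noun and a verb, getWords("dog") in this model yields two entries, getWord("dog") yields one of them, and getWord on an unknown base form yields a freshly created default word while hasWord still reports false.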
| Constructor and Description |
|---|
| Lexicon() |
| Modifier and Type | Method and Description |
|---|---|
| void | close(): closes the lexicon if necessary; does nothing if the lexicon does not need to be closed. |
| protected WordElement | createWord(String baseForm): creates a default WordElement. |
| protected WordElement | createWord(String baseForm, LexicalCategory category): creates a default WordElement. |
| static Lexicon | getDefaultLexicon(): returns the default built-in lexicon. |
| WordElement | getWord(String baseForm): gets a WordElement which has the specified base form (of any category). |
| WordElement | getWord(String baseForm, LexicalCategory category): gets a WordElement which has the specified base form and category. |
| WordElement | getWordByID(String id): gets a WordElement with the specified ID. |
| WordElement | getWordFromVariant(String variant): returns a WordElement, of any category, which has an inflected form and/or spelling variant that matches the specified variant. |
| WordElement | getWordFromVariant(String variant, LexicalCategory category): returns a WordElement of the specified category which has an inflected form and/or spelling variant that matches the specified variant. |
| List<WordElement> | getWords(String baseForm): returns all words which have the specified base form. |
| abstract List<WordElement> | getWords(String baseForm, LexicalCategory category): returns all words which have the specified base form and category. |
| abstract List<WordElement> | getWordsByID(String id): returns a List of WordElement objects which have this ID. |
| List<WordElement> | getWordsFromVariant(String variant): returns words, of any category, which have an inflected form and/or spelling variant that matches the specified variant. |
| abstract List<WordElement> | getWordsFromVariant(String variant, LexicalCategory category): returns words in the specified category which have an inflected form and/or spelling variant that matches the specified variant. |
| boolean | hasWord(String baseForm): returns true if the lexicon contains a WordElement which has the specified base form (in any category). |
| boolean | hasWord(String baseForm, LexicalCategory category): returns true if the lexicon contains a WordElement which has the specified base form and category. |
| boolean | hasWordByID(String id): returns true if the lexicon contains a WordElement which has the specified ID. |
| boolean | hasWordFromVariant(String variant): returns true if the lexicon contains a WordElement which matches the specified variant form (in any category). |
| boolean | hasWordFromVariant(String variant, LexicalCategory category): returns true if the lexicon contains a WordElement which matches the specified variant form and category. |
| WordElement | lookupWord(String baseForm): general word lookup method; tries base form, variant, and ID (in this order), and creates a new word if it cannot find an existing one. |
| WordElement | lookupWord(String baseForm, LexicalCategory category): general word lookup method; tries base form, variant, and ID (in this order), and creates a new word if it cannot find an existing one. |
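The fallback order documented for lookupWord (base form, then variant, then ID, then create a new word) can be sketched as follows. This is a simplified model under stated assumptions, not SimpleNLG source code: FallbackLookupSketch is a hypothetical class, entries are represented as plain strings, each entry has a single variant, and the ID "id-001" is made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the documented fallback order for lookupWord.
class FallbackLookupSketch {
    private final Map<String, String> byBaseForm = new HashMap<>();
    private final Map<String, String> byVariant = new HashMap<>();
    private final Map<String, String> byId = new HashMap<>();

    // Registers one entry under its base form, one variant, and an ID.
    void add(String baseForm, String variant, String id) {
        String entry = "entry:" + baseForm;
        byBaseForm.put(baseForm, entry);
        byVariant.put(variant, entry);
        byId.put(id, entry);
    }

    // Tries base form, then variant, then ID; creates a new word if all fail.
    String lookupWord(String key) {
        if (byBaseForm.containsKey(key)) {
            return byBaseForm.get(key);
        }
        if (byVariant.containsKey(key)) {
            return byVariant.get(key);
        }
        if (byId.containsKey(key)) {
            return byId.get(key);
        }
        return "created:" + key; // stands in for a newly created default word
    }
}
```

After add("be", "is", "id-001"), the keys "be", "is", and "id-001" all resolve to the same entry in this model, while an unknown key produces a newly created word instead of a failure. That is why lookupWord is the convenient choice when the caller does not know which kind of key it holds.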
public static Lexicon getDefaultLexicon()

protected WordElement createWord(String baseForm, LexicalCategory category)
Parameters:
baseForm - base form of word
category - category of word

protected WordElement createWord(String baseForm)
Parameters:
baseForm - base form of word

public WordElement lookupWord(String baseForm, LexicalCategory category)

public WordElement lookupWord(String baseForm)

public abstract List<WordElement> getWords(String baseForm, LexicalCategory category)
Parameters:
baseForm - base form of word, e.g. "be" or "dog" (not "is" or "dogs")
category - syntactic category of word (ANY for unknown)

public WordElement getWord(String baseForm, LexicalCategory category)
Parameters:
baseForm - base form of word, e.g. "be" or "dog" (not "is" or "dogs")
category - syntactic category of word (ANY for unknown)

public boolean hasWord(String baseForm, LexicalCategory category)
Returns true if the lexicon contains a WordElement which has the specified base form and category.
Parameters:
baseForm - base form of word, e.g. "be" or "dog" (not "is" or "dogs")
category - syntactic category of word (ANY for unknown)
Returns:
true if Lexicon contains such a WordElement

public List<WordElement> getWords(String baseForm)
Parameters:
baseForm - base form of word, e.g. "be" or "dog" (not "is" or "dogs")

public WordElement getWord(String baseForm)
Parameters:
baseForm - base form of word, e.g. "be" or "dog" (not "is" or "dogs")

public boolean hasWord(String baseForm)
Returns true if the lexicon contains a WordElement which has the specified base form (in any category).
Parameters:
baseForm - base form of word, e.g. "be" or "dog" (not "is" or "dogs")
Returns:
true if Lexicon contains such a WordElement

public abstract List<WordElement> getWordsByID(String id)
Parameters:
id - internal lexicon ID for a word

public WordElement getWordByID(String id)
Parameters:
id - internal lexicon ID for a word

public boolean hasWordByID(String id)
Returns true if the lexicon contains a WordElement which has the specified ID.
Parameters:
id - internal lexicon ID for a word
Returns:
true if Lexicon contains such a WordElement

public abstract List<WordElement> getWordsFromVariant(String variant, LexicalCategory category)
Parameters:
variant - base form, inflected form, or spelling variant of word
category - syntactic category of word (ANY for unknown)

public WordElement getWordFromVariant(String variant, LexicalCategory category)
Parameters:
variant - base form, inflected form, or spelling variant of word
category - syntactic category of word (ANY for unknown)

public boolean hasWordFromVariant(String variant, LexicalCategory category)
Returns true if the lexicon contains a WordElement which matches the specified variant form and category.
Parameters:
variant - base form, inflected form, or spelling variant of word
category - syntactic category of word (ANY for unknown)
Returns:
true if Lexicon contains such a WordElement

public List<WordElement> getWordsFromVariant(String variant)
Parameters:
variant - base form, inflected form, or spelling variant of word

public WordElement getWordFromVariant(String variant)
Parameters:
variant - base form, inflected form, or spelling variant of word

public boolean hasWordFromVariant(String variant)
Returns true if the lexicon contains a WordElement which matches the specified variant form (in any category).
Parameters:
variant - base form, inflected form, or spelling variant of word
Returns:
true if Lexicon contains such a WordElement

public void close()
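Only three of the lookup methods are declared abstract: getWords(String, LexicalCategory), getWordsByID(String), and getWordsFromVariant(String, LexicalCategory). One plausible reading is that the remaining lookups are convenience methods built on those primitives. The sketch below illustrates that pattern for the base-form family only. It is an assumption about the design, not the library's confirmed implementation: BaseFormLookupSketch and ArrayLexiconSketch are hypothetical classes, and words and categories are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: one abstract primitive, convenience methods derived from it.
abstract class BaseFormLookupSketch {
    static final String ANY = "ANY"; // stands in for the ANY lexical category

    // The only method a concrete lexicon must supply in this sketch.
    abstract List<String> getWords(String baseForm, String category);

    // Lookup in any category delegates to the primitive with ANY.
    List<String> getWords(String baseForm) {
        return getWords(baseForm, ANY);
    }

    // Existence checks reduce to "is the list of matches non-empty?".
    boolean hasWord(String baseForm, String category) {
        return !getWords(baseForm, category).isEmpty();
    }

    boolean hasWord(String baseForm) {
        return hasWord(baseForm, ANY);
    }
}

// Hypothetical concrete lexicon backed by a fixed array of "baseForm/CATEGORY" strings.
class ArrayLexiconSketch extends BaseFormLookupSketch {
    private final String[] entries = {"dog/NOUN", "dog/VERB", "be/VERB"};

    @Override
    List<String> getWords(String baseForm, String category) {
        List<String> result = new ArrayList<>();
        for (String entry : entries) {
            String[] parts = entry.split("/");
            boolean categoryMatches = ANY.equals(category) || parts[1].equals(category);
            if (parts[0].equals(baseForm) && categoryMatches) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

Under this design a new lexicon backend (for example one reading from a database or an XML file) only has to implement the primitives, and every getWord and hasWord overload behaves consistently across backends.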
Copyright © 2020. All Rights Reserved.