Package org.apache.tika.parser.ner
Class NamedEntityParser
java.lang.Object
  org.apache.tika.parser.AbstractParser
    org.apache.tika.parser.ner.NamedEntityParser

All Implemented Interfaces:
Serializable, org.apache.tika.parser.Parser
public class NamedEntityParser extends org.apache.tika.parser.AbstractParser

This implementation of Parser extracts entity names from text content and adds them to the metadata. All the metadata keys have the common prefix "NER_".
The named entity recogniser implementation can be changed by setting the system property "ner.impl.class" to the name of a class that implements the NERecogniser contract.

See Also:
OpenNLPNERecogniser, NERecogniser, Serialized Form
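
The property-driven selection described above can be illustrated with plain reflection. This is a minimal sketch, not Tika's actual code: Recogniser, DefaultRecogniser, CustomRecogniser and RecogniserLoader are hypothetical stand-ins for the NERecogniser contract and its implementations. Only the property name "ner.impl.class" is taken from this page.

```java
// Hypothetical stand-in for the NERecogniser contract (not Tika's interface).
interface Recogniser {
    String name();
}

// Hypothetical fallback used when the system property is not set.
class DefaultRecogniser implements Recogniser {
    public String name() {
        return "default";
    }
}

// Hypothetical alternative implementation selected through the property.
class CustomRecogniser implements Recogniser {
    public String name() {
        return "custom";
    }
}

class RecogniserLoader {
    // Property name as given in the class description.
    static final String SYS_PROP = "ner.impl.class";

    // Reads the property, loads the named class, and checks that it
    // implements the contract before instantiating it.
    static Recogniser load() {
        String className =
                System.getProperty(SYS_PROP, DefaultRecogniser.class.getName());
        try {
            Class<?> clazz = Class.forName(className);
            if (!Recogniser.class.isAssignableFrom(clazz)) {
                throw new IllegalArgumentException(
                        className + " does not implement Recogniser");
            }
            return (Recogniser) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        System.setProperty(SYS_PROP, CustomRecogniser.class.getName());
        System.out.println(load().name());
    }
}
```

With the real parser, the same effect would presumably be achieved by passing -Dner.impl.class=<class name> to the JVM, or by calling System.setProperty with SYS_PROP_NER_IMPL before the parser is created.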
-
-
Field Summary
Fields
Modifier and Type                            Field               Description
static String                                DEFAULT_NER_IMPL
static org.slf4j.Logger                      LOG
static String                                MD_KEY_PREFIX
static Set<org.apache.tika.mime.MediaType>   MEDIA_TYPES
org.apache.tika.Tika                         secondaryParser
static String                                SYS_PROP_NER_IMPL
-
Constructor Summary
Constructors
Constructor            Description
NamedEntityParser()
-
Method Summary
Methods
Modifier and Type                            Method    Description
Set<org.apache.tika.mime.MediaType>          getSupportedTypes(org.apache.tika.parser.ParseContext parseContext)
void                                         parse(InputStream inputStream, ContentHandler contentHandler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext parseContext)
-
-
-
Field Detail
-
LOG
public static final org.slf4j.Logger LOG
-
MEDIA_TYPES
public static final Set<org.apache.tika.mime.MediaType> MEDIA_TYPES
-
MD_KEY_PREFIX
public static final String MD_KEY_PREFIX
- See Also:
- Constant Field Values
-
DEFAULT_NER_IMPL
public static final String DEFAULT_NER_IMPL
-
SYS_PROP_NER_IMPL
public static final String SYS_PROP_NER_IMPL
- See Also:
- Constant Field Values
-
secondaryParser
public org.apache.tika.Tika secondaryParser
-
-
Method Detail
-
getSupportedTypes
public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext parseContext)
-
parse
public void parse(InputStream inputStream, ContentHandler contentHandler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext parseContext) throws IOException, SAXException, org.apache.tika.exception.TikaException
- Throws:
IOException
SAXException
org.apache.tika.exception.TikaException
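
Because the class description states that the extracted entity names are stored under metadata keys sharing the "NER_" marker, a caller will typically want to separate those keys from the rest of the metadata after parse returns. The sketch below shows one way to do that filtering. It uses a plain Map as a stand-in for org.apache.tika.metadata.Metadata, and the key names "NER_PERSON" and "NER_LOCATION" are hypothetical examples, not values confirmed by this page.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NerKeyFilter {
    // Marker quoted in the class description; assumed to match MD_KEY_PREFIX.
    static final String PREFIX = "NER_";

    // Keeps only entries whose key starts with the marker and strips the
    // marker, so the result maps entity type to the recognised names.
    static Map<String, List<String>> entities(Map<String, List<String>> metadata) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : metadata.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(PREFIX)) {
                result.put(key.substring(PREFIX.length()), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the metadata populated by parse (hypothetical values).
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        metadata.put("Content-Type", Arrays.asList("text/plain"));
        metadata.put("NER_PERSON", Arrays.asList("Ada Lovelace"));
        metadata.put("NER_LOCATION", Arrays.asList("London"));
        System.out.println(entities(metadata));
    }
}
```

With a real Metadata object, the same loop could presumably iterate over the key names it exposes and read the values for each key, comparing against MD_KEY_PREFIX rather than a hard-coded string.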
-
-