Class NamedEntityParser

  • All Implemented Interfaces:
    Serializable, org.apache.tika.parser.Parser

    public class NamedEntityParser
    extends org.apache.tika.parser.AbstractParser
    This implementation of Parser extracts entity names from text content and adds them to the metadata.

    All the metadata keys will have a common prefix "NER_".
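
    Because the entity keys share the "NER_" marker, they can be separated from the other metadata entries by name. The sketch below assumes the marker appears at the start of each key and that the key names are available as a String[] (for example, from Metadata.names()). The class NerKeyFilter, the method nerKeys, and the sample key names are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class NerKeyFilter {
    // Marker quoted in the class description; assumed to begin each entity key.
    static final String NER_PREFIX = "NER_";

    /** Returns only the names that start with the NER marker, in input order. */
    static List<String> nerKeys(String[] names) {
        List<String> keys = new ArrayList<>();
        for (String name : names) {
            if (name.startsWith(NER_PREFIX)) {
                keys.add(name);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Sample names are illustrative, not taken from a real parse.
        String[] names = {"Content-Type", "NER_PERSON", "NER_LOCATION"};
        System.out.println(nerKeys(names));
    }
}
```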

    The named entity recogniser implementation can be changed by setting the system property "ner.impl.class" to the name of a class that implements the NERecogniser contract.
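
    As a minimal sketch of that configuration step, the property can be set programmatically, presumably before the parser is first used; passing -Dner.impl.class=... on the JVM command line would have the same effect. The class NerImplConfig, the helper selectRecogniser, and the class name "com.example.MyRecogniser" are hypothetical placeholders for your own NERecogniser implementation.

```java
public class NerImplConfig {
    // Property name quoted in the class description.
    static final String NER_IMPL_PROPERTY = "ner.impl.class";

    /**
     * Points the property at the given recogniser class name.
     * Returns the previous value of the property, or null if it was unset.
     */
    static String selectRecogniser(String className) {
        return System.setProperty(NER_IMPL_PROPERTY, className);
    }

    public static void main(String[] args) {
        // Hypothetical class name; replace with a real NERecogniser implementation.
        selectRecogniser("com.example.MyRecogniser");
        System.out.println(System.getProperty(NER_IMPL_PROPERTY));
    }
}
```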

    See Also:
    OpenNLPNERecogniser, NERecogniser, Serialized Form
    • Field Detail

      • LOG

        public static final org.slf4j.Logger LOG
      • MEDIA_TYPES

        public static final Set<org.apache.tika.mime.MediaType> MEDIA_TYPES
      • DEFAULT_NER_IMPL

        public static final String DEFAULT_NER_IMPL
      • secondaryParser

        public org.apache.tika.Tika secondaryParser
    • Constructor Detail

      • NamedEntityParser

        public NamedEntityParser()
    • Method Detail

      • getSupportedTypes

        public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext parseContext)