Class CTAKESContentHandler

  • All Implemented Interfaces:
    ContentHandler, DTDHandler, EntityResolver, ErrorHandler

    public class CTAKESContentHandler
    extends org.apache.tika.sax.ContentHandlerDecorator
    Class used to extract biomedical information while parsing.

    This class relies on Apache cTAKES, a natural language processing system that extracts information from clinical free text in electronic medical records.

    • Field Detail

      • CTAKES_META_PREFIX

        public static String CTAKES_META_PREFIX
    • Constructor Detail

      • CTAKESContentHandler

        public CTAKESContentHandler(ContentHandler handler,
                                    org.apache.tika.metadata.Metadata metadata,
                                    CTAKESConfig config)
        Creates a new CTAKESContentHandler for the given ContentHandler and Metadata objects, configured by the given CTAKESConfig.
        Parameters:
        handler - the ContentHandler object to be decorated.
        metadata - the Metadata object that will be populated using biomedical information extracted by cTAKES.
        config - the CTAKESConfig object used to configure the handler.
      • CTAKESContentHandler

        public CTAKESContentHandler(ContentHandler handler,
                                    org.apache.tika.metadata.Metadata metadata)
        Creates a new CTAKESContentHandler for the given ContentHandler and Metadata objects.
        Parameters:
        handler - the ContentHandler object to be decorated.
        metadata - the Metadata object that will be populated using biomedical information extracted by cTAKES.
      • CTAKESContentHandler

        public CTAKESContentHandler()
        Default constructor.
    • Method Detail

      • characters

        public void characters(char[] ch,
                               int start,
                               int length)
                        throws SAXException
        Specified by:
        characters in interface ContentHandler
        Overrides:
        characters in class org.apache.tika.sax.ContentHandlerDecorator
        Throws:
        SAXException
      • getMetadata

        public org.apache.tika.metadata.Metadata getMetadata()
        Returns metadata that includes cTAKES annotations.
        Returns:
        Metadata object that includes cTAKES annotations.
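
    Example (illustrative sketch only)

    The members above describe a decorator: characters(...) sees the text passing through to the wrapped ContentHandler, and getMetadata() exposes what was extracted from it. The following is a minimal sketch of that pattern using only the JDK's SAX classes. It is not the Tika implementation. The class name AnnotatingHandlerSketch, the META_PREFIX value, the fixed vocabulary, the use of a Map in place of Tika's Metadata, and the choice to run extraction in endDocument() are all assumptions made for illustration; a real handler would delegate the extraction to cTAKES.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/**
 * Hypothetical decorator, NOT the Tika class. It buffers character data while
 * forwarding it to a wrapped handler, then records matched terms in a map.
 * Only characters() and endDocument() are forwarded, to keep the sketch short.
 */
class AnnotatingHandlerSketch extends DefaultHandler {

    /** Hypothetical key prefix, standing in for CTAKES_META_PREFIX. */
    static final String META_PREFIX = "sketch:";

    /** Toy vocabulary standing in for a real clinical NLP pipeline. */
    private static final List<String> VOCABULARY = Arrays.asList("fever", "cough");

    private final ContentHandler delegate;
    private final Map<String, List<String>> metadata;
    private final StringBuilder text = new StringBuilder();

    AnnotatingHandlerSketch(ContentHandler delegate, Map<String, List<String>> metadata) {
        this.delegate = delegate;
        this.metadata = metadata;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Keep a copy of the text for later analysis, then pass the event on unchanged.
        text.append(ch, start, length);
        delegate.characters(ch, start, length);
    }

    @Override
    public void endDocument() throws SAXException {
        // Assumption for this sketch: extraction runs once, when the document ends.
        String lower = text.toString().toLowerCase(Locale.ROOT);
        for (String term : VOCABULARY) {
            if (lower.contains(term)) {
                metadata.computeIfAbsent(META_PREFIX + "term", k -> new ArrayList<>()).add(term);
            }
        }
        delegate.endDocument();
    }

    /** Returns the map populated during parsing. */
    Map<String, List<String>> getMetadata() {
        return metadata;
    }

    public static void main(String[] args) throws SAXException {
        Map<String, List<String>> meta = new LinkedHashMap<>();
        AnnotatingHandlerSketch handler = new AnnotatingHandlerSketch(new DefaultHandler(), meta);
        char[] chunk = "Patient reports fever and cough.".toCharArray();
        handler.startDocument();
        handler.characters(chunk, 0, chunk.length);
        handler.endDocument();
        System.out.println(handler.getMetadata());
    }
}
```

    In the sketch, the caller supplies the metadata container and reads it back through getMetadata() once parsing has finished, which mirrors the two-argument constructor documented above. When extraction actually happens in CTAKESContentHandler is not stated on this page.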