Interface HtmlParser


public interface HtmlParser
The HTML parser is a service that parses HTML and either emits SAX events or builds a DOM Document from it.
  • Method Details

    • parse

      void parse(InputStream inputStream, String encoding, ContentHandler contentHandler) throws SAXException
      Parse HTML and send SAX events.
      Parameters:
      inputStream - The input stream
      encoding - Encoding of the input stream, null for default encoding.
      contentHandler - Content handler receiving the SAX events. The content handler may also implement the LexicalHandler interface.
      Throws:
      SAXException - Exception thrown when parsing fails.
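
A minimal sketch of how a caller might use the SAX variant. It assumes that "the lexical handler interface" means org.xml.sax.ext.LexicalHandler, which the JDK class DefaultHandler2 implements alongside ContentHandler. The names SaxHtmlParser, OutlineHandler and FakeParser are hypothetical: SaxHtmlParser only mirrors the signature documented above so the example compiles without the real library, and FakeParser is a stand-in that ignores its input and emits a fixed event sequence. A real HtmlParser would be obtained from the hosting framework instead.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.ext.DefaultHandler2;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.AttributesImpl;

public class SaxParseSketch {

    /** Hypothetical local mirror of the documented SAX method signature. */
    interface SaxHtmlParser {
        void parse(InputStream inputStream, String encoding, ContentHandler contentHandler)
                throws SAXException;
    }

    /** Records element names and comments; DefaultHandler2 is both a ContentHandler and a LexicalHandler. */
    static class OutlineHandler extends DefaultHandler2 {
        final List<String> elements = new ArrayList<>();
        final List<String> comments = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements.add(qName);
        }

        @Override
        public void comment(char[] ch, int start, int length) {
            comments.add(new String(ch, start, length));
        }
    }

    /** Hypothetical stand-in parser: ignores the stream and emits a fixed event sequence. */
    static class FakeParser implements SaxHtmlParser {
        @Override
        public void parse(InputStream in, String encoding, ContentHandler handler)
                throws SAXException {
            handler.startDocument();
            handler.startElement("", "html", "html", new AttributesImpl());
            // Assumption: lexical events are forwarded only if the handler supports them.
            if (handler instanceof LexicalHandler) {
                char[] text = " note ".toCharArray();
                ((LexicalHandler) handler).comment(text, 0, text.length);
            }
            handler.startElement("", "p", "p", new AttributesImpl());
            handler.endElement("", "p", "p");
            handler.endElement("", "html", "html");
            handler.endDocument();
        }
    }

    /** Runs the stand-in parser over a small in-memory document and returns the filled handler. */
    static OutlineHandler run() throws SAXException {
        byte[] html = "<html><!-- note --><p>Hi</p></html>".getBytes(StandardCharsets.UTF_8);
        OutlineHandler handler = new OutlineHandler();
        SaxHtmlParser parser = new FakeParser();
        parser.parse(new ByteArrayInputStream(html), "UTF-8", handler);
        return handler;
    }

    public static void main(String[] args) throws SAXException {
        OutlineHandler handler = run();
        System.out.println(handler.elements);
        System.out.println(handler.comments);
    }
}
```

Extending DefaultHandler2 keeps the handler short because every callback the caller does not care about already has an empty implementation.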
    • parse

      Document parse(String systemId, InputStream inputStream, String encoding) throws IOException
      Parse HTML and return a DOM Document.
      Parameters:
      systemId - The system id
      inputStream - The input stream
      encoding - Encoding of the input stream, null for default encoding.
      Returns:
      A DOM Document built from the parsed HTML, or null.
      Throws:
      IOException - Exception thrown when parsing fails.
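
Because the DOM variant may return null, callers probably want to check the result before walking the tree. The sketch below illustrates that under stated assumptions: DomHtmlParser is a hypothetical local mirror of the signature documented above, FakeParser is a stand-in that builds a fixed document with the JDK's DocumentBuilderFactory rather than parsing anything, and "sample.html" is a placeholder system id.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DomParseSketch {

    /** Hypothetical local mirror of the documented DOM method signature. */
    interface DomHtmlParser {
        Document parse(String systemId, InputStream inputStream, String encoding)
                throws IOException;
    }

    /** Returns the text of the first title element, or null if there is no document or no title. */
    static String titleOf(DomHtmlParser parser, String systemId, InputStream in)
            throws IOException {
        // Passing null as the encoding requests the parser's default encoding.
        Document doc = parser.parse(systemId, in, null);
        if (doc == null) {
            return null; // the documented contract allows a null result
        }
        NodeList titles = doc.getElementsByTagName("title");
        return titles.getLength() == 0 ? null : titles.item(0).getTextContent();
    }

    /** Hypothetical stand-in parser: ignores the stream and returns a fixed document. */
    static class FakeParser implements DomHtmlParser {
        @Override
        public Document parse(String systemId, InputStream in, String encoding)
                throws IOException {
            try {
                Document doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder().newDocument();
                Element html = doc.createElement("html");
                Element head = doc.createElement("head");
                Element title = doc.createElement("title");
                title.setTextContent("Hello");
                head.appendChild(title);
                html.appendChild(head);
                doc.appendChild(html);
                return doc;
            } catch (ParserConfigurationException e) {
                throw new IOException(e);
            }
        }
    }

    /** Runs titleOf against the stand-in parser with a small in-memory stream. */
    static String sample() throws IOException {
        byte[] html = "<html><head><title>Hello</title></head></html>"
                .getBytes(StandardCharsets.UTF_8);
        return titleOf(new FakeParser(), "sample.html", new ByteArrayInputStream(html));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sample());
    }
}
```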