Class PDFHighlighter

    • Constructor Detail

      • PDFHighlighter

        public PDFHighlighter()
                       throws IOException
        Default constructor.
        Throws:
        IOException - If there is an error constructing this class.
    • Method Detail

      • generateXMLHighlight

        public void generateXMLHighlight(PDDocument pdDocument,
                                         String highlightWord,
                                         Writer xmlOutput)
                                  throws IOException
        Generate XML highlight data for occurrences of the given word in the PDF and write it to xmlOutput.
        Parameters:
        pdDocument - The PDF to find words in.
        highlightWord - The word to search for.
        xmlOutput - The writer that receives the resulting XML highlight output.
        Throws:
        IOException - If there is an error reading from the PDF, or writing to the XML.
      • generateXMLHighlight

        public void generateXMLHighlight(PDDocument pdDocument,
                                         String[] sWords,
                                         Writer xmlOutput)
                                  throws IOException
        Generate XML highlight data for occurrences of the given words in the PDF and write it to xmlOutput.
        Parameters:
        pdDocument - The PDF to find words in.
        sWords - The words to search for.
        xmlOutput - The writer that receives the resulting XML highlight output.
        Throws:
        IOException - If there is an error reading from the PDF, or writing to the XML.
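        The general idea behind both overloads (search text for words, write match locations to a Writer) can be sketched without any PDF library. This is only an illustration under stated assumptions: the class HighlightSketch, the helper writeHighlights, the element names, and the case-insensitive matching are all hypothetical and are not taken from PDFHighlighter, whose actual output format and matching rules are not described here. Page text is assumed to have been extracted already.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.List;

// Hypothetical illustration only; not the PDFHighlighter implementation.
class HighlightSketch {

    // Writes one <loc> element per occurrence of each word in each page's text.
    // The XML element and attribute names here are assumptions made for this sketch.
    public static void writeHighlights(List<String> pageTexts, String[] words, Writer xmlOutput)
            throws IOException {
        xmlOutput.write("<highlights>\n");
        for (int page = 0; page < pageTexts.size(); page++) {
            String text = pageTexts.get(page);
            for (String word : words) {
                if (word.isEmpty()) {
                    continue; // an empty word would match at every offset
                }
                // Scan every offset; regionMatches(true, ...) compares ignoring case
                // (an assumption of this sketch) and keeps offsets in the original text.
                for (int pos = 0; pos + word.length() <= text.length(); pos++) {
                    if (text.regionMatches(true, pos, word, 0, word.length())) {
                        xmlOutput.write("  <loc pg=\"" + page + "\" pos=\"" + pos
                                + "\" len=\"" + word.length() + "\"/>\n");
                    }
                }
            }
        }
        xmlOutput.write("</highlights>\n");
        xmlOutput.flush();
    }

    public static void main(String[] args) throws IOException {
        // A StringWriter stands in for the xmlOutput parameter; any Writer would do.
        StringWriter out = new StringWriter();
        writeHighlights(List.of("The quick brown fox", "Fox and hound"),
                new String[] {"fox"}, out);
        System.out.print(out);
    }
}
```

        Accepting a Writer rather than a file path, as the documented methods do, lets the caller decide whether the XML goes to a file, a network stream, or an in-memory buffer.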
      • endPage

        protected void endPage(PDPage pdPage)
                        throws IOException
        End a page. Default implementation is to do nothing. Subclasses may provide additional information.
        Overrides:
        endPage in class PDFTextStripper
        Parameters:
        pdPage - The page that has just been processed.
        Throws:
        IOException - If there is any error writing to the stream.
      • main

        public static void main(String[] args)
                         throws IOException
        Command line application.
        Parameters:
        args - The command line arguments to the application.
        Throws:
        IOException - If there is an error generating the highlight file.
      • showText

        protected void showText(byte[] string)
                         throws IOException
        Description copied from class: PDFStreamEngine
        Process text from the PDF Stream. You should override this method if you want to perform an action when encoded text is being processed.
        Overrides:
        showText in class PDFStreamEngine
        Parameters:
        string - the encoded text
        Throws:
        IOException - if there is an error processing the string
      • showGlyph

        protected void showGlyph(Matrix textRenderingMatrix,
                                 PDFont font,
                                 int code,
                                 String unicode,
                                 Vector displacement)
                          throws IOException
        This method was originally written by Ben Litchfield for PDFStreamEngine.
        Overrides:
        showGlyph in class PDFStreamEngine
        Parameters:
        textRenderingMatrix - the current text rendering matrix, Trm
        font - the current font
        code - internal PDF character code for the glyph
        unicode - the Unicode text for this glyph, or null if the PDF does not provide it
        displacement - the displacement (i.e. advance) of the glyph in text space
        Throws:
        IOException - if the glyph cannot be processed