java.lang.Object
org.sejda.sambox.contentstream.PDFStreamEngine
org.sejda.sambox.text.PDFTextStreamEngine
- Direct Known Subclasses:
PDFMarkedContentExtractor, PDFTextStripper
PDFStreamEngine subclass for advanced processing of text via TextPosition.
- Author:
- Ben Litchfield, John Hewson
Constructor Summary
Constructors
- PDFTextStreamEngine: Constructor.
Method Summary
Modifier and Type, Method, Description:
- protected float computeFontHeight(PDFont font): Compute the font height.
- void processPage(PDPage page): This will initialise and process the contents of the stream.
- protected void processTextPosition: A method provided as an event interface to allow a subclass to perform some specific functionality when text needs to be processed.
- protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement): Called when a glyph is to be processed.
Methods inherited from class org.sejda.sambox.contentstream.PDFStreamEngine
addOperator, addOperatorIfAbsent, applyTextAdjustment, beginMarkedContentSequence, beginText, decreaseLevel, endMarkedContentSequence, endText, getAppearance, getCurrentPage, getGraphicsStackSize, getGraphicsState, getInitialMatrix, getLevel, getResources, getTextLineMatrix, getTextMatrix, increaseLevel, operatorException, processAnnotation, processChildStream, processOperator, processOperator, processSoftMask, processStream, processTilingPattern, processTilingPattern, processTransparencyGroup, processType3Stream, restoreGraphicsStack, restoreGraphicsState, saveGraphicsStack, saveGraphicsState, setLineDashPattern, setTextLineMatrix, setTextMatrix, showAnnotation, showFontGlyph, showForm, showText, showTextString, showTextStrings, showTransparencyGroup, showType3Glyph, transformedPoint, transformWidth, unsupportedOperator
-
Constructor Details
-
PDFTextStreamEngine
Constructor.
- Throws:
IOException
-
-
Method Details
-
processPage
This will initialise and process the contents of the stream.
- Overrides:
processPage in class PDFStreamEngine
- Parameters:
page - the page to process
- Throws:
IOException - if there is an error accessing the stream.
-
showGlyph
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) throws IOException
Called when a glyph is to be processed. The heuristic calculations here were originally written by Ben Litchfield for PDFStreamEngine.
- Overrides:
showGlyph in class PDFStreamEngine
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
computeFontHeight
Compute the font height. Override this if you want to use your own calculations.
- Parameters:
font - the font.
- Returns:
the font height.
- Throws:
IOException - if there is an error while getting the font bounding box.
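As a rough illustration of the kind of calculation an override might supply, the sketch below derives a height from a font bounding box. It is not the heuristic this class uses. SketchBoundingBox and FontHeightSketch are hypothetical stand-ins rather than SAMBox types, and the sketch assumes the box is given in glyph space at 1000 units per text-space unit, which does not hold for every font type.

```java
// Hypothetical stand-in for a font bounding box; NOT a SAMBox type.
final class SketchBoundingBox {
    final float lowerLeftY;
    final float upperRightY;

    SketchBoundingBox(float lowerLeftY, float upperRightY) {
        this.lowerLeftY = lowerLeftY;
        this.upperRightY = upperRightY;
    }
}

// Hypothetical helper showing one possible custom height calculation.
final class FontHeightSketch {
    // Assumption: the box is in glyph space, 1000 units per text-space unit.
    private static final float GLYPH_UNITS_PER_TEXT_UNIT = 1000f;

    // Returns the vertical extent of the box, scaled to text space.
    static float computeFontHeight(SketchBoundingBox box) {
        return (box.upperRightY - box.lowerLeftY) / GLYPH_UNITS_PER_TEXT_UNIT;
    }
}
```

In a real subclass, the same arithmetic would live in an override of computeFontHeight and read the bounding box from the font that is passed in.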
-
processTextPosition
A method provided as an event interface to allow a subclass to perform some specific functionality when text needs to be processed.
- Parameters:
text - The text to be processed.
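The event-interface pattern described here can be sketched with plain Java stand-ins. In the sketch, a base engine calls a protected hook once per piece of text, and a subclass overrides the hook to collect what it receives. GlyphEvent, SketchTextEngine, CollectingEngine and the feed method are hypothetical names invented for illustration; they are not part of SAMBox, and a real engine would derive these events from a page's content stream.

```java
import java.util.List;

// Hypothetical stand-in for a positioned piece of text; NOT the SAMBox TextPosition.
final class GlyphEvent {
    final String unicode;
    final float x;
    final float y;

    GlyphEvent(String unicode, float x, float y) {
        this.unicode = unicode;
        this.x = x;
        this.y = y;
    }
}

// Hypothetical stand-in engine that fires the hook once per glyph event.
abstract class SketchTextEngine {
    // Illustrative driver only; it simply replays a prepared list of events.
    final void feed(List<GlyphEvent> glyphs) {
        for (GlyphEvent glyph : glyphs) {
            processTextPosition(glyph);
        }
    }

    // Event hook: does nothing by default, subclasses override it.
    protected void processTextPosition(GlyphEvent text) {
    }
}

// Subclass that uses the hook to accumulate the text it is shown.
final class CollectingEngine extends SketchTextEngine {
    private final StringBuilder collected = new StringBuilder();

    @Override
    protected void processTextPosition(GlyphEvent text) {
        collected.append(text.unicode);
    }

    String text() {
        return collected.toString();
    }
}
```

Because the default hook does nothing, a subclass only has to override it when it actually needs the text events.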
-