Class PDFStreamEngine
- java.lang.Object
  - com.tom_roush.pdfbox.contentstream.PDFStreamEngine
- Direct Known Subclasses:
PDFGraphicsStreamEngine, PDFMarkedContentExtractor, PDFTextStripper
public abstract class PDFStreamEngine extends Object
Processes a PDF content stream and executes certain operations. Provides a callback interface for clients that want to do things with the stream.
-
Constructor Summary
Modifier Constructor - Description
protected PDFStreamEngine() - Creates a new PDFStreamEngine.
-
Method Summary
Modifier and Type Method - Description
void addOperator(OperatorProcessor op) - Adds an operator processor to the engine.
protected void applyTextAdjustment(float tx, float ty) - Applies a text position adjustment from the TJ operator.
void beginMarkedContentSequence(COSName tag, COSDictionary properties) - Called when a marked content group begins.
void beginText() - Called when the BT operator is encountered.
void decreaseLevel() - Decrease the level.
void endMarkedContentSequence() - Called when a marked content group ends.
void endText() - Called when the ET operator is encountered.
PDAppearanceStream getAppearance(PDAnnotation annotation) - Returns the appearance stream to process for the given annotation.
PDPage getCurrentPage()
int getGraphicsStackSize()
PDGraphicsState getGraphicsState()
Matrix getInitialMatrix() - Gets the stream's initial matrix.
int getLevel() - Get the current level.
PDResources getResources()
Matrix getTextLineMatrix()
Matrix getTextMatrix()
void increaseLevel() - Increase the level.
protected void operatorException(Operator operator, List<COSBase> operands, IOException e) - Called when an exception is thrown by an operator.
protected void processAnnotation(PDAnnotation annotation, PDAppearanceStream appearance) - Process the given annotation with the specified appearance stream.
protected void processChildStream(PDContentStream contentStream, PDPage page) - Process a child stream of the given page.
protected void processOperator(Operator operator, List<COSBase> operands) - This is used to handle an operation.
void processOperator(String operation, List<COSBase> arguments) - This is used to handle an operation.
void processPage(PDPage page) - This will initialize and process the contents of the stream.
protected void processSoftMask(PDTransparencyGroup group) - Processes a soft mask transparency group stream.
protected void processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace) - Process the given tiling pattern.
protected void processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace, Matrix patternMatrix) - Process the given tiling pattern.
protected void processTransparencyGroup(PDTransparencyGroup group) - Processes a transparency group stream.
protected void processType3Stream(PDType3CharProc charProc, Matrix textRenderingMatrix) - Processes a Type 3 character stream.
void registerOperatorProcessor(String operator, OperatorProcessor op) - Deprecated. Use addOperator(OperatorProcessor) instead.
protected void restoreGraphicsStack(Deque<PDGraphicsState> snapshot) - Restores the entire graphics stack.
void restoreGraphicsState() - Pops the current graphics state from the stack.
protected Deque<PDGraphicsState> saveGraphicsStack() - Saves the entire graphics stack.
void saveGraphicsState() - Pushes the current graphics state to the stack.
void setLineDashPattern(COSArray array, int phase)
void setTextLineMatrix(Matrix value)
void setTextMatrix(Matrix value)
void showAnnotation(PDAnnotation annotation) - Shows the given annotation.
protected void showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) - Called when a glyph is to be processed.
protected void showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) - Deprecated. Use showFontGlyph(Matrix, PDFont, int, Vector) instead.
void showForm(PDFormXObject form) - Shows a form from the content stream.
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) - Called when a glyph is to be processed.
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) - Deprecated. Use showGlyph(Matrix, PDFont, int, Vector) instead.
protected void showText(byte[] string) - Process text from the PDF Stream.
void showTextString(byte[] string) - Called when a string of text is to be shown.
void showTextStrings(COSArray array) - Called when a string of text with spacing adjustments is to be shown.
void showTransparencyGroup(PDTransparencyGroup form) - Shows a transparency group from the content stream.
protected void showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, Vector displacement) - Called when a glyph is to be processed.
protected void showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, String unicode, Vector displacement) - Deprecated. Use showType3Glyph(Matrix, PDType3Font, int, Vector) instead.
PointF transformedPoint(float x, float y) - Transforms a point using the CTM.
protected float transformWidth(float width) - Transforms a width using the CTM.
protected void unsupportedOperator(Operator operator, List<COSBase> operands) - Called when an unsupported operator is encountered.
-
Method Detail
-
registerOperatorProcessor
@Deprecated public void registerOperatorProcessor(String operator, OperatorProcessor op)
Deprecated. Use addOperator(OperatorProcessor) instead.
Register a custom operator processor with the engine.
- Parameters:
operator - The operator as a string.
op - Processor instance.
-
addOperator
public final void addOperator(OperatorProcessor op)
Adds an operator processor to the engine.
- Parameters:
op - operator processor
-
processPage
public void processPage(PDPage page) throws IOException
This will initialize and process the contents of the stream.
- Parameters:
page - the page to process
- Throws:
IOException - if there is an error accessing the stream
-
showTransparencyGroup
public void showTransparencyGroup(PDTransparencyGroup form) throws IOException
Shows a transparency group from the content stream.
- Parameters:
form - transparency group (form) XObject
- Throws:
IOException - if the transparency group cannot be processed
-
showForm
public void showForm(PDFormXObject form) throws IOException
Shows a form from the content stream.
- Parameters:
form - form XObject
- Throws:
IOException - if the form cannot be processed
-
processSoftMask
protected void processSoftMask(PDTransparencyGroup group) throws IOException
Processes a soft mask transparency group stream.
- Parameters:
group - the transparency group.
- Throws:
IOException
-
processTransparencyGroup
protected void processTransparencyGroup(PDTransparencyGroup group) throws IOException
Processes a transparency group stream.
- Parameters:
group - the transparency group.
- Throws:
IOException
-
processType3Stream
protected void processType3Stream(PDType3CharProc charProc, Matrix textRenderingMatrix) throws IOException
Processes a Type 3 character stream.
- Parameters:
charProc - Type 3 character procedure
textRenderingMatrix - the Text Rendering Matrix
- Throws:
IOException - if there is an error reading or parsing the character content stream.
-
processAnnotation
protected void processAnnotation(PDAnnotation annotation, PDAppearanceStream appearance) throws IOException
Process the given annotation with the specified appearance stream.
- Parameters:
annotation - The annotation containing the appearance stream to process.
appearance - The appearance stream to process.
- Throws:
IOException - If there is an error reading or parsing the appearance content stream.
-
processTilingPattern
protected final void processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace) throws IOException
Process the given tiling pattern.
- Parameters:
tilingPattern - the tiling pattern
color - color to use, if this is an uncoloured pattern, otherwise null.
colorSpace - color space to use, if this is an uncoloured pattern, otherwise null.
- Throws:
IOException - if there is an error reading or parsing the tiling pattern content stream.
-
processTilingPattern
protected final void processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace, Matrix patternMatrix) throws IOException
Process the given tiling pattern. Allows the pattern matrix to be overridden for custom rendering.
- Parameters:
tilingPattern - the tiling pattern
color - color to use, if this is an uncoloured pattern, otherwise null.
colorSpace - color space to use, if this is an uncoloured pattern, otherwise null.
patternMatrix - the pattern matrix, may be overridden for custom rendering.
- Throws:
IOException - if there is an error reading or parsing the tiling pattern content stream.
-
showAnnotation
public void showAnnotation(PDAnnotation annotation) throws IOException
Shows the given annotation.
- Parameters:
annotation - An annotation on the current page.
- Throws:
IOException - If an error occurred reading the annotation
-
getAppearance
public PDAppearanceStream getAppearance(PDAnnotation annotation)
Returns the appearance stream to process for the given annotation. May be used to render a specific appearance such as "hover".
- Parameters:
annotation - The current annotation.
- Returns:
- The stream to process.
-
processChildStream
protected void processChildStream(PDContentStream contentStream, PDPage page) throws IOException
Process a child stream of the given page. Cannot be used with processPage(PDPage).
- Parameters:
contentStream - the child content stream
page - the current page
- Throws:
IOException - if there is an exception while processing the stream
-
beginText
public void beginText() throws IOException
Called when the BT operator is encountered. This method is for overriding in subclasses; the default implementation does nothing.
- Throws:
IOException - if there was an error processing the text
-
endText
public void endText() throws IOException
Called when the ET operator is encountered. This method is for overriding in subclasses; the default implementation does nothing.
- Throws:
IOException - if there was an error processing the text
-
showTextString
public void showTextString(byte[] string) throws IOException
Called when a string of text is to be shown.
- Parameters:
string - the encoded text
- Throws:
IOException - if there was an error showing the text
-
showTextStrings
public void showTextStrings(COSArray array) throws IOException
Called when a string of text with spacing adjustments is to be shown.
- Parameters:
array - array of encoded text strings and adjustments
- Throws:
IOException - if there was an error showing the text
-
applyTextAdjustment
protected void applyTextAdjustment(float tx, float ty) throws IOException
Applies a text position adjustment from the TJ operator. May be overridden in subclasses.
- Parameters:
tx - x-translation
ty - y-translation
- Throws:
IOException - if something went wrong
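As a rough illustration of how a TJ array could be walked, the sketch below uses plain Java objects in place of COSArray elements (byte[] for encoded strings, Float for adjustments). The class TjAdjustmentSketch, its fields and the event list are hypothetical and not part of the library. The formula tx = -(adjustment / 1000) * fontSize * horizontalScaling is the horizontal-writing case from the PDF specification; whether the engine computes it in exactly this place is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an engine subclass; not part of the library.
class TjAdjustmentSketch {
    final List<String> events = new ArrayList<>();
    float fontSize = 12f;          // Tfs, assumed value for the example
    float horizontalScaling = 1f;  // Th, where 1.0 means 100%

    // Walks a TJ-style array: byte[] entries are shown, numbers move the text position.
    void showTextStrings(List<Object> array) {
        for (Object element : array) {
            if (element instanceof Number) {
                float adjustment = ((Number) element).floatValue();
                // Horizontal writing: a positive adjustment moves the next glyph to the left.
                float tx = -adjustment / 1000f * fontSize * horizontalScaling;
                applyTextAdjustment(tx, 0f);
            } else if (element instanceof byte[]) {
                showText((byte[]) element);
            }
        }
    }

    // Hook a subclass could override; here it only records the translation.
    void applyTextAdjustment(float tx, float ty) {
        events.add("adjust " + tx + " " + ty);
    }

    // Records the number of encoded bytes instead of decoding glyphs.
    void showText(byte[] string) {
        events.add("text " + string.length);
    }

    public static void main(String[] args) {
        TjAdjustmentSketch sketch = new TjAdjustmentSketch();
        sketch.showTextStrings(Arrays.<Object>asList(new byte[] {72, 105}, 250f, new byte[] {33}));
        System.out.println(sketch.events);
    }
}
```

Keeping the adjustment in its own overridable method lets a subclass observe or suppress spacing changes without reimplementing the array walk.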
-
showText
protected void showText(byte[] string) throws IOException
Process text from the PDF Stream. You should override this method if you want to perform an action when encoded text is being processed.
- Parameters:
string - the encoded text
- Throws:
IOException - if there is an error processing the string
-
showGlyph
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
Deprecated. Use showGlyph(Matrix, PDFont, int, Vector) instead.
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showGlyph
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) throws IOException
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showFontGlyph
protected void showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
Deprecated. Use showFontGlyph(Matrix, PDFont, int, Vector) instead.
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showFontGlyph
protected void showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) throws IOException
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showType3Glyph
protected void showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, String unicode, Vector displacement) throws IOException
Deprecated. Use showType3Glyph(Matrix, PDType3Font, int, Vector) instead.
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showType3Glyph
protected void showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, Vector displacement) throws IOException
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
beginMarkedContentSequence
public void beginMarkedContentSequence(COSName tag, COSDictionary properties)
Called when a marked content group begins.
- Parameters:
tag - indicates the role or significance of the sequence
properties - optional properties
-
endMarkedContentSequence
public void endMarkedContentSequence()
Called when a marked content group ends.
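Because marked content sequences can nest, a subclass that overrides the begin and end callbacks usually needs to remember which tags are currently open. The following is a minimal sketch under that assumption; MarkedContentTrackerSketch is a hypothetical name, and tags are plain strings here rather than COSName values.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical tracker mirroring the begin/end callbacks; tags are plain strings here.
class MarkedContentTrackerSketch {
    private final Deque<String> openTags = new ArrayDeque<>();

    // Corresponds to the start of a marked content sequence.
    void beginMarkedContentSequence(String tag) {
        openTags.push(tag);
    }

    // Corresponds to the end of a sequence; an unbalanced end is ignored rather than thrown.
    void endMarkedContentSequence() {
        if (!openTags.isEmpty()) {
            openTags.pop();
        }
    }

    // Innermost open tag, or null when outside any marked content.
    String currentTag() {
        return openTags.peek();
    }

    int depth() {
        return openTags.size();
    }

    public static void main(String[] args) {
        MarkedContentTrackerSketch tracker = new MarkedContentTrackerSketch();
        tracker.beginMarkedContentSequence("P");
        tracker.beginMarkedContentSequence("Span");
        System.out.println(tracker.currentTag() + " at depth " + tracker.depth());
        tracker.endMarkedContentSequence();
        System.out.println(tracker.currentTag() + " at depth " + tracker.depth());
    }
}
```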
-
processOperator
public void processOperator(String operation, List<COSBase> arguments) throws IOException
This is used to handle an operation.
- Parameters:
operation - The operation to perform.
arguments - The list of arguments.
- Throws:
IOException - If there is an error processing the operation.
-
processOperator
protected void processOperator(Operator operator, List<COSBase> operands) throws IOException
This is used to handle an operation.
- Parameters:
operator - The operation to perform.
operands - The list of arguments.
- Throws:
IOException - If there is an error processing the operation.
-
unsupportedOperator
protected void unsupportedOperator(Operator operator, List<COSBase> operands) throws IOException
Called when an unsupported operator is encountered.
- Parameters:
operator - The unknown operator.
operands - The list of operands.
- Throws:
IOException - if something went wrong
-
operatorException
protected void operatorException(Operator operator, List<COSBase> operands, IOException e) throws IOException
Called when an exception is thrown by an operator.
- Parameters:
operator - The operator that threw the exception.
operands - The list of operands.
e - the thrown exception.
- Throws:
IOException - if something went wrong
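The relationship between addOperator, processOperator, unsupportedOperator and operatorException can be pictured as a lookup table with two fallback hooks. The sketch below is an assumption about that flow, not the library's code: OperatorDispatchSketch, Handler and the log list are hypothetical stand-ins for the engine, OperatorProcessor and real side effects, and the "record and continue" error policy is chosen only for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; the real engine works with OperatorProcessor, Operator and COSBase.
class OperatorDispatchSketch {
    interface Handler {
        String getName();
        void process(List<Object> operands) throws IOException;
    }

    private final Map<String, Handler> handlers = new HashMap<>();
    final List<String> log = new ArrayList<>();

    // Mirrors addOperator(op): the handler supplies its own operator name.
    void addOperator(Handler handler) {
        handlers.put(handler.getName(), handler);
    }

    // Looks up the handler and routes missing handlers and failures to the hooks.
    void processOperator(String name, List<Object> operands) throws IOException {
        Handler handler = handlers.get(name);
        if (handler == null) {
            unsupportedOperator(name, operands);
            return;
        }
        try {
            handler.process(operands);
        } catch (IOException e) {
            operatorException(name, operands, e);
        }
    }

    void unsupportedOperator(String name, List<Object> operands) {
        log.add("unsupported " + name);
    }

    // Assumed policy for this sketch: record the failure and keep going.
    void operatorException(String name, List<Object> operands, IOException e) throws IOException {
        log.add("failed " + name + ": " + e.getMessage());
    }

    // Demo handler that either logs its name or fails on purpose.
    static Handler handler(final String name, final boolean fail, final List<String> log) {
        return new Handler() {
            public String getName() { return name; }
            public void process(List<Object> operands) throws IOException {
                if (fail) throw new IOException("boom");
                log.add("ran " + name);
            }
        };
    }

    public static void main(String[] args) throws IOException {
        OperatorDispatchSketch engine = new OperatorDispatchSketch();
        engine.addOperator(handler("BT", false, engine.log));
        engine.processOperator("BT", new ArrayList<Object>());
        engine.processOperator("xx", new ArrayList<Object>());
        System.out.println(engine.log);
    }
}
```

A subclass that prefers strict parsing could rethrow from operatorException instead of logging, which is why the hook is declared to throw IOException.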
-
saveGraphicsState
public void saveGraphicsState()
Pushes the current graphics state to the stack.
-
restoreGraphicsState
public void restoreGraphicsState()
Pops the current graphics state from the stack.
-
saveGraphicsStack
protected final Deque<PDGraphicsState> saveGraphicsStack()
Saves the entire graphics stack.
- Returns:
- the saved graphics state stack.
-
restoreGraphicsStack
protected final void restoreGraphicsStack(Deque<PDGraphicsState> snapshot)
Restores the entire graphics stack.
- Parameters:
snapshot - the graphics state stack to be restored.
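The difference between saving a single graphics state and saving the whole stack is easier to see in code. This is a minimal sketch with a hypothetical State class standing in for PDGraphicsState. The choice to continue with a fresh one-element stack after saveGraphicsStack is an assumption made for the example, not something this page states.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the engine's graphics stack; State replaces PDGraphicsState.
class GraphicsStackSketch {
    static final class State implements Cloneable {
        float lineWidth = 1f;

        @Override
        public State clone() {
            try {
                return (State) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: State is Cloneable
            }
        }
    }

    private Deque<State> stack = new ArrayDeque<>();

    GraphicsStackSketch() {
        stack.push(new State());
    }

    State getGraphicsState() { return stack.peek(); }

    int getGraphicsStackSize() { return stack.size(); }

    // Push a copy so that changes made until the matching restore stay local.
    void saveGraphicsState() { stack.push(stack.peek().clone()); }

    // Drop the current state and expose the previous one again.
    void restoreGraphicsState() { stack.pop(); }

    // Hand the whole stack to the caller and continue with a one-element stack.
    Deque<State> saveGraphicsStack() {
        Deque<State> saved = stack;
        stack = new ArrayDeque<>();
        stack.push(saved.peek().clone());
        return saved;
    }

    // Replace the working stack with a previously saved one.
    void restoreGraphicsStack(Deque<State> snapshot) { stack = snapshot; }

    public static void main(String[] args) {
        GraphicsStackSketch engine = new GraphicsStackSketch();
        engine.saveGraphicsState();
        Deque<State> snapshot = engine.saveGraphicsStack();
        System.out.println("inside nested stream: " + engine.getGraphicsStackSize());
        engine.restoreGraphicsStack(snapshot);
        System.out.println("after restore: " + engine.getGraphicsStackSize());
    }
}
```

Saving the full stack is useful around nested streams, where unbalanced save and restore operators inside the child should not disturb the parent's stack.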
-
getGraphicsStackSize
public int getGraphicsStackSize()
- Returns:
- Returns the size of the graphicsStack.
-
getGraphicsState
public PDGraphicsState getGraphicsState()
- Returns:
- Returns the graphicsState.
-
getTextLineMatrix
public Matrix getTextLineMatrix()
- Returns:
- Returns the textLineMatrix.
-
setTextLineMatrix
public void setTextLineMatrix(Matrix value)
- Parameters:
value - The textLineMatrix to set.
-
getTextMatrix
public Matrix getTextMatrix()
- Returns:
- Returns the textMatrix.
-
setTextMatrix
public void setTextMatrix(Matrix value)
- Parameters:
value - The textMatrix to set.
-
setLineDashPattern
public void setLineDashPattern(COSArray array, int phase)
- Parameters:
array - dash array
phase - dash phase
-
getResources
public PDResources getResources()
- Returns:
- the stream's resources. This is mainly to be used by the OperatorProcessor classes.
-
getCurrentPage
public PDPage getCurrentPage()
- Returns:
- the current page.
-
getInitialMatrix
public Matrix getInitialMatrix()
Gets the stream's initial matrix.
- Returns:
- the initial matrix.
-
transformedPoint
public PointF transformedPoint(float x, float y)
Transforms a point using the CTM.
- Parameters:
x - x-coordinate of the point to be transformed.
y - y-coordinate of the point to be transformed.
- Returns:
- the transformed point.
-
transformWidth
protected float transformWidth(float width)
Transforms a width using the CTM.
- Parameters:
width - the width value to be transformed.
- Returns:
- the transformed width value.
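To make "using the CTM" concrete, the sketch below applies a PDF-style matrix [a b c d e f] to a point and to a width. CtmSketch is a hypothetical helper, not the library's Matrix class. The point formula follows the PDF specification. The width formula, which takes a root mean square of the scaling along both axes, is only one plausible way to reduce a matrix to a single scale factor and is an assumption here.

```java
// Hypothetical helper; the six values follow the PDF matrix layout [a b c d e f].
class CtmSketch {
    final float a, b, c, d, e, f;

    CtmSketch(float a, float b, float c, float d, float e, float f) {
        this.a = a; this.b = b; this.c = c;
        this.d = d; this.e = e; this.f = f;
    }

    // x' = a*x + c*y + e and y' = b*x + d*y + f
    float[] transformedPoint(float x, float y) {
        return new float[] { a * x + c * y + e, b * x + d * y + f };
    }

    // Assumption: use one representative factor so a width still scales
    // sensibly when the matrix stretches the two axes differently.
    float transformWidth(float width) {
        float sx = a + c;
        float sy = b + d;
        return width * (float) Math.sqrt((sx * sx + sy * sy) * 0.5);
    }

    public static void main(String[] args) {
        // Scale by 2 and translate by (10, 20).
        CtmSketch ctm = new CtmSketch(2f, 0f, 0f, 2f, 10f, 20f);
        float[] p = ctm.transformedPoint(1f, 1f);
        System.out.println(p[0] + ", " + p[1]);
        System.out.println(ctm.transformWidth(3f));
    }
}
```

Note that a width is affected only by the scaling part of the matrix, while a point also picks up the translation (e, f).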
-
getLevel
public int getLevel()
Get the current level. This can be used to decide whether a recursion has gone too deep and an operation should be skipped to avoid a stack overflow.
- Returns:
- the current level.
-
increaseLevel
public void increaseLevel()
Increase the level. Call this before running a potentially recursive operation.
-
decreaseLevel
public void decreaseLevel()
Decrease the level. Call this after running a potentially recursive operation, ideally in a "finally" block so that the level is decreased even if the operation fails. A log message is shown if the level drops below 0, which indicates that increaseLevel() and decreaseLevel() calls are unbalanced.
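The three level methods are meant to be used together as a recursion guard. A minimal sketch of that pattern follows; LevelGuardSketch, processNested and the limit of 50 are hypothetical and chosen only for the example.

```java
// Hypothetical recursion guard modelled on getLevel/increaseLevel/decreaseLevel.
class LevelGuardSketch {
    static final int MAX_LEVEL = 50; // assumed limit, not taken from the library
    private int level = 0;
    int skipped = 0;

    int getLevel() { return level; }

    void increaseLevel() { ++level; }

    void decreaseLevel() {
        --level;
        if (level < 0) {
            // Unbalanced calls: more decreases than increases.
            System.err.println("level is " + level);
        }
    }

    // Processes a structure nested 'depth' levels deep, skipping work past MAX_LEVEL.
    void processNested(int depth) {
        if (depth == 0) {
            return;
        }
        if (getLevel() >= MAX_LEVEL) {
            skipped++;
            return;
        }
        increaseLevel();
        try {
            processNested(depth - 1);
        } finally {
            decreaseLevel(); // runs even if processing throws
        }
    }

    public static void main(String[] args) {
        LevelGuardSketch guard = new LevelGuardSketch();
        guard.processNested(1000);
        System.out.println("skipped=" + guard.skipped + " level=" + guard.getLevel());
    }
}
```

Pairing the increase with a decrease in "finally" keeps the counter balanced on every exit path, so one failed operation does not make later, shallow operations look too deep.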