Package com.tom_roush.pdfbox.pdfparser
Class PDFParser
- java.lang.Object
  - com.tom_roush.pdfbox.pdfparser.BaseParser
    - com.tom_roush.pdfbox.pdfparser.COSParser
      - com.tom_roush.pdfbox.pdfparser.PDFParser
-
public class PDFParser extends COSParser
-
Field Summary
-
Fields inherited from class com.tom_roush.pdfbox.pdfparser.BaseParser
A, ASCII_CR, ASCII_LF, B, D, DEF, document, E, ENDOBJ_STRING, ENDSTREAM_STRING, J, M, N, O, R, S, seqSource, STREAM_STRING, T
-
Fields inherited from class com.tom_roush.pdfbox.pdfparser.COSParser
ENDOBJ, ENDSTREAM, EOF_MARKER, fileLen, initialParseDone, OBJ_MARKER, securityHandler, source, SYSPROP_EOFLOOKUPRANGE, SYSPROP_PARSEMINIMAL, TMP_FILE_PREFIX, xrefTrailerResolver
-
Constructor Summary
Constructors
- PDFParser(RandomAccessRead source): Constructor.
- PDFParser(RandomAccessRead source, boolean useScratchFiles): Constructor.
- PDFParser(RandomAccessRead source, String decryptionPassword): Constructor.
- PDFParser(RandomAccessRead source, String decryptionPassword, boolean useScratchFiles): Constructor.
- PDFParser(RandomAccessRead source, String decryptionPassword, InputStream keyStore, String alias): Constructor.
- PDFParser(RandomAccessRead source, String decryptionPassword, InputStream keyStore, String alias, boolean useScratchFiles): Constructor.
- PDFParser(RandomAccessRead source, String decryptionPassword, InputStream keyStore, String alias, ScratchFile scratchFile): Constructor.
-
Method Summary
Methods
- PDDocument getPDDocument(): This will get the PD document that was parsed.
- protected void initialParse(): The initial parse will first parse only the trailer, the startxref pointer and all xref tables to have a pointer (offset) to all the PDF's objects.
- void parse(): This will parse the stream and populate the COSDocument object.
-
Methods inherited from class com.tom_roush.pdfbox.pdfparser.BaseParser
isClosing, isClosing, isDigit, isDigit, isEndOfName, isEOL, isEOL, isSpace, isSpace, isWhitespace, isWhitespace, parseBoolean, parseCOSArray, parseCOSDictionary, parseCOSName, parseCOSString, parseDirObject, readExpectedChar, readExpectedString, readExpectedString, readGenerationNumber, readInt, readLine, readLong, readObjectNumber, readString, readString, readStringNumber, skipSpaces, skipWhiteSpace
-
Methods inherited from class com.tom_roush.pdfbox.pdfparser.COSParser
getDocument, getStartxrefOffset, isLenient, lastIndexOf, parseCOSStream, parseDictObjects, parseFDFHeader, parseObjectDynamically, parseObjectDynamically, parsePDFHeader, parseStartXref, parseTrailer, parseTrailerValuesDynamically, parseXref, parseXrefStream, parseXrefTable, rebuildTrailer, setEOFLookupRange, setLenient
-
Constructor Detail
-
PDFParser
public PDFParser(RandomAccessRead source) throws IOException
Constructor.
- Parameters:
source - input representing the PDF.
- Throws:
IOException - If something went wrong.
-
PDFParser
public PDFParser(RandomAccessRead source, boolean useScratchFiles) throws IOException
Constructor.
- Parameters:
source - input representing the PDF.
useScratchFiles - use a file-based buffer for temporary storage.
- Throws:
IOException - If something went wrong.
-
PDFParser
public PDFParser(RandomAccessRead source, String decryptionPassword) throws IOException
Constructor.
- Parameters:
source - input representing the PDF.
decryptionPassword - password to be used for decryption.
- Throws:
IOException - If something went wrong.
-
PDFParser
public PDFParser(RandomAccessRead source, String decryptionPassword, boolean useScratchFiles) throws IOException
Constructor.
- Parameters:
source - input representing the PDF.
decryptionPassword - password to be used for decryption.
useScratchFiles - use a buffer for temporary storage.
- Throws:
IOException - If something went wrong.
-
PDFParser
public PDFParser(RandomAccessRead source, String decryptionPassword, InputStream keyStore, String alias) throws IOException
Constructor.
- Parameters:
source - input representing the PDF.
decryptionPassword - password to be used for decryption.
keyStore - key store to be used for decryption when using public key security.
alias - alias to be used for decryption when using public key security.
- Throws:
IOException - If something went wrong.
-
PDFParser
public PDFParser(RandomAccessRead source, String decryptionPassword, InputStream keyStore, String alias, boolean useScratchFiles) throws IOException
Constructor.
- Parameters:
source - input representing the PDF.
decryptionPassword - password to be used for decryption.
keyStore - key store to be used for decryption when using public key security.
alias - alias to be used for decryption when using public key security.
useScratchFiles - use a buffer for temporary storage.
- Throws:
IOException - If something went wrong.
-
PDFParser
public PDFParser(RandomAccessRead source, String decryptionPassword, InputStream keyStore, String alias, ScratchFile scratchFile) throws IOException
Constructor.
- Parameters:
source - input representing the PDF.
decryptionPassword - password to be used for decryption.
keyStore - key store to be used for decryption when using public key security.
alias - alias to be used for decryption when using public key security.
scratchFile - buffer handler for temporary storage; it will be closed on COSDocument.close().
- Throws:
IOException - If something went wrong.
-
Method Detail
-
getPDDocument
public PDDocument getPDDocument() throws IOException
This will get the PD document that was parsed. When you are done with this document you must call close() on it to release resources.
- Returns:
The document at the PD layer.
- Throws:
IOException - If there is an error getting the document.
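The close() requirement above maps naturally onto try-with-resources. The sketch below only illustrates that pattern: `StubDocument`, `countPagesAndClose` and the fixed page count are hypothetical stand-ins and are not part of this package. In real code the resource would be the value returned by getPDDocument() after parse() has run.

```java
import java.io.Closeable;
import java.io.IOException;

class CloseAfterUseSketch {
    /** Hypothetical stand-in for the parsed document; NOT a PDFBox class. */
    static class StubDocument implements Closeable {
        private boolean closed;

        int getNumberOfPages() throws IOException {
            if (closed) {
                throw new IOException("document already closed");
            }
            return 3; // fixed value, purely illustrative
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Reads one value from the document and guarantees close() runs afterwards. */
    static int countPagesAndClose(StubDocument document) throws IOException {
        // try-with-resources calls close() even if the body throws.
        try (StubDocument doc = document) {
            return doc.getNumberOfPages();
        }
    }

    public static void main(String[] args) throws IOException {
        StubDocument document = new StubDocument();
        System.out.println(countPagesAndClose(document)); // prints 3
        System.out.println(document.isClosed());          // prints true
    }
}
```

Scoping the document to the try block keeps the release of resources next to the code that uses them, so an early return or an exception cannot skip it.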
-
initialParse
protected void initialParse() throws IOException
The initial parse first reads only the trailer, the startxref pointer and all xref tables, which yields a pointer (offset) to every object in the PDF. It can handle linearized PDFs, which have an xref at the end pointing to an xref at the beginning of the file. Finally, the root object is parsed.
- Throws:
IOException - If something went wrong.
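As a rough illustration of the first step described above, the following sketch scans backward from the end of a byte array for the `startxref` keyword and reads the decimal offset that follows it. It is a simplified stand-alone example, not this class's implementation: `StartXrefSketch`, `findStartXrefOffset` and the `lookupRange` parameter are names assumed for illustration, and the value 2048 used in `main` is arbitrary.

```java
import java.nio.charset.StandardCharsets;

class StartXrefSketch {
    private static final byte[] KEYWORD =
            "startxref".getBytes(StandardCharsets.ISO_8859_1);

    /**
     * Returns the offset written after the last "startxref" keyword found in
     * the final lookupRange bytes of the data, or -1 if none is found.
     */
    static long findStartXrefOffset(byte[] pdf, int lookupRange) {
        int from = Math.max(0, pdf.length - lookupRange);
        // Scan backward so the keyword closest to the end of the file wins.
        for (int i = pdf.length - KEYWORD.length; i >= from; i--) {
            if (matchesAt(pdf, i)) {
                return readNumber(pdf, i + KEYWORD.length);
            }
        }
        return -1;
    }

    private static boolean matchesAt(byte[] data, int pos) {
        for (int k = 0; k < KEYWORD.length; k++) {
            if (data[pos + k] != KEYWORD[k]) {
                return false;
            }
        }
        return true;
    }

    private static long readNumber(byte[] data, int pos) {
        // Skip the whitespace / end-of-line bytes between keyword and number.
        while (pos < data.length
                && (data[pos] == ' ' || data[pos] == '\n'
                    || data[pos] == '\r' || data[pos] == '\t')) {
            pos++;
        }
        long value = 0;
        boolean sawDigit = false;
        while (pos < data.length && data[pos] >= '0' && data[pos] <= '9') {
            value = value * 10 + (data[pos] - '0');
            pos++;
            sawDigit = true;
        }
        return sawDigit ? value : -1;
    }

    public static void main(String[] args) {
        byte[] tail = "trailer\n<< /Size 6 >>\nstartxref\n416\n%%EOF\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(findStartXrefOffset(tail, 2048)); // prints 416
    }
}
```

The returned offset is where an xref table or xref stream would then be read. Limiting the backward scan to a window near the end of the file mirrors the idea behind the inherited setEOFLookupRange setting, though the actual parser logic may differ.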
-
parse
public void parse() throws IOException
This will parse the stream and populate the COSDocument object. The stream is closed once parsing is done.
- Throws:
IOException - If there is an error reading from the stream or corrupt data is found.
-