public class COSDocumentParser extends PDFParser
The parser creates an object representation of the PDF document using COS-level objects.
The parser is a one-pass, read-everything implementation.
| Modifier and Type | Field and Description |
|---|---|
| static int | SEARCH_BUFFER_SIZE: Use a buffer larger than specified by the spec; documents with more than 1024 bytes of whitespace padding have already been encountered. |
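
The reason for an enlarged search buffer can be illustrated with a stand-alone sketch. It does not use the jPod API and is not the parser's implementation: the class `TailSearchSketch`, the methods `findLastStartXRef` and `sampleFile`, and the 2048-byte window are all assumptions made for illustration (the actual value of SEARCH_BUFFER_SIZE is not stated here). The sketch scans only the last `window` bytes of an in-memory file image for the final `startxref` keyword, so a 1024-byte window misses the keyword when the file ends in more than 1024 bytes of padding.

```java
import java.nio.charset.StandardCharsets;

public class TailSearchSketch {
    private static final byte[] KEYWORD = "startxref".getBytes(StandardCharsets.US_ASCII);

    /** Returns the offset of the last "startxref" that starts within the final `window` bytes, or -1. */
    static int findLastStartXRef(byte[] data, int window) {
        int from = Math.max(0, data.length - window);
        // Walk backwards so the last occurrence in the window is found first.
        for (int i = data.length - KEYWORD.length; i >= from; i--) {
            if (matchesAt(data, i)) {
                return i;
            }
        }
        return -1;
    }

    private static boolean matchesAt(byte[] data, int pos) {
        for (int k = 0; k < KEYWORD.length; k++) {
            if (data[pos + k] != KEYWORD[k]) {
                return false;
            }
        }
        return true;
    }

    /** Builds a tiny fake file image followed by `padding` trailing spaces (hypothetical test data). */
    static byte[] sampleFile(int padding) {
        StringBuilder sb = new StringBuilder("%PDF-1.4\n1 0 obj\nnull\nendobj\nstartxref\n9\n%%EOF\n");
        for (int i = 0; i < padding; i++) {
            sb.append(' ');
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] padded = sampleFile(1500);
        System.out.println("window 1024: " + findLastStartXRef(padded, 1024));
        System.out.println("window 2048: " + findLastStartXRef(padded, 2048));
    }
}
```

With 1500 bytes of trailing padding, the 1024-byte window does not reach the keyword while the larger window does, which is the situation the field description refers to.
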
Fields inherited from class PDFParser: C_WARN_ARRAYSIZE, C_WARN_ENDOBJ_MISSING, C_WARN_ENDSTREAMCORRUPT, C_WARN_ENDSTREAMEOL, C_WARN_ILLEGALHEX, C_WARN_LARGE_INT, C_WARN_NAME_TOO_LONG, C_WARN_SINGLEEOL, C_WARN_SINGLEEOL_OBJ, C_WARN_SINGLESPACE, C_WARN_SINGLESPACE_OBJ, C_WARN_STREAMEOL, C_WARN_STREAMEXTERNAL, C_WARN_STREAMLENGTH, C_WARN_STRING_TOO_LONG, C_WARN_UNEVENHEX, CHAR_BS, CHAR_CR, CHAR_FF, CHAR_HT, CHAR_LF, TOKEN_def, TOKEN_endobj, TOKEN_endstream, TOKEN_EOF, TOKEN_false, TOKEN_FDFHEADER, TOKEN_ndstream, TOKEN_null, TOKEN_obj, TOKEN_PDFHEADER, TOKEN_R, TOKEN_s_tream, TOKEN_startxref, TOKEN_stream, TOKEN_trailer, TOKEN_true, TOKEN_xref

| Constructor and Description |
|---|
| COSDocumentParser(STDocument doc) |
| Modifier and Type | Method and Description |
|---|---|
| STDocument | getDoc() |
| boolean | isTokenXRefAt(de.intarsys.tools.randomaccess.IRandomAccess input, int offset) |
| COSObject | parseIndirectObject(de.intarsys.tools.randomaccess.IRandomAccess input, ISystemSecurityHandler securityHandler): Reads a PDF-style object from the input. See PDF Reference v1.4, chapter 3.2.9 Indirect Objects: COSIndirectObject ::= ObjNum GenNum "obj" Object "endobj" |
| int | parseStartXRef(de.intarsys.tools.randomaccess.IRandomAccess input): Returns the startxref value. |
| COSDictionary | parseTrailer(de.intarsys.tools.randomaccess.IRandomAccess input): Parses the trailer section from the current stream position. See PDF Reference v1.4, chapter 3.4.4 File Trailer: DocumentTrailer ::= "trailer" COSDict "startxref" COSNumber |
| int | searchLastStartXRef(de.intarsys.tools.randomaccess.IRandomAccess input): Searches for the offset of the first trailer in the last SEARCH_BUFFER_SIZE bytes of the document. |
| int | searchLinearized(de.intarsys.tools.randomaccess.IRandomAccess input): Deprecated. Don't use this anymore. Returns the offset of the dictionary with linearization parameters, if any; returns -1 otherwise. |
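
The grammar quoted for parseIndirectObject (ObjNum GenNum "obj" Object "endobj") can be sketched independently of jPod. The following is a minimal illustration, not the parser's implementation: `IndirectObjectSketch`, `Parsed`, and `parse` are hypothetical names, the object body is kept as raw text instead of being turned into a COS object, and streams, comments, strings containing the keyword, and encryption are all ignored.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IndirectObjectSketch {
    /** Object number, generation number, and the unparsed body text (hypothetical holder type). */
    static final class Parsed {
        final int objNum;
        final int genNum;
        final String body;

        Parsed(int objNum, int genNum, String body) {
            this.objNum = objNum;
            this.genNum = genNum;
            this.body = body;
        }
    }

    // ObjNum GenNum "obj" Object "endobj"; the body is captured lazily up to the first "endobj".
    private static final Pattern INDIRECT = Pattern.compile(
            "\\s*(\\d+)\\s+(\\d+)\\s+obj\\b(.*?)\\bendobj\\b", Pattern.DOTALL);

    /** Parses one indirect object at the start of `text`; throws if the grammar does not match. */
    static Parsed parse(String text) {
        Matcher m = INDIRECT.matcher(text);
        if (!m.lookingAt()) {
            throw new IllegalArgumentException("not an indirect object");
        }
        int objNum = Integer.parseInt(m.group(1));
        int genNum = Integer.parseInt(m.group(2));
        return new Parsed(objNum, genNum, m.group(3).trim());
    }

    public static void main(String[] args) {
        Parsed p = parse("12 0 obj\n<< /Type /Catalog >>\nendobj");
        System.out.println(p.objNum + " " + p.genNum + " " + p.body);
    }
}
```

A real parser would tokenize the body and honor the security handler passed to parseIndirectObject; the regular expression here only makes the shape of the grammar visible.
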
Methods inherited from class PDFParser: getExceptionHandler, handleError, handleWarning, isDelimiter, isDigit, isEOL, isNumberStart, isOctalDigit, isTokenStart, isWhitespace, parseElement, parseHeader, readInteger, readSpaces, readToken, readToken, setExceptionHandler, toCOSObject

public static final int SEARCH_BUFFER_SIZE
public COSDocumentParser(STDocument doc)
public STDocument getDoc()
public boolean isTokenXRefAt(de.intarsys.tools.randomaccess.IRandomAccess input,
int offset)
throws IOException
Throws: IOException
public COSObject parseIndirectObject(de.intarsys.tools.randomaccess.IRandomAccess input,
ISystemSecurityHandler securityHandler)
throws IOException,
COSLoadException
Throws: IOException, COSLoadException
public int parseStartXRef(de.intarsys.tools.randomaccess.IRandomAccess input)
throws IOException,
COSLoadException
Throws: IOException, COSLoadException
public COSDictionary parseTrailer(de.intarsys.tools.randomaccess.IRandomAccess input)
throws IOException,
COSLoadException
Throws: IOException, COSLoadException
public int searchLastStartXRef(de.intarsys.tools.randomaccess.IRandomAccess input)
throws IOException,
COSLoadException
Throws: IOException, COSLoadException
@Deprecated
public int searchLinearized(de.intarsys.tools.randomaccess.IRandomAccess input)
throws IOException,
COSLoadException
Parameters: input
Throws: IOException, COSLoadException

Copyright © 2013 intarsys consulting GmbH. All Rights Reserved.