Package com.tom_roush.pdfbox.pdfparser
Class BaseParser
- java.lang.Object
-
- com.tom_roush.pdfbox.pdfparser.BaseParser
-
- Direct Known Subclasses:
COSParser, PDFObjectStreamParser, PDFStreamParser, PDFXrefStreamParser
public abstract class BaseParser extends Object
This class contains the parsing logic shared by the PDFParser and the COSStreamParser.
-
-
Field Summary
Fields (modifier and type, field, description):
- protected static int A
- protected static byte ASCII_CR: ASCII code for carriage return.
- protected static byte ASCII_LF: ASCII code for line feed.
- protected static int B
- protected static int D
- static String DEF: This is a string constant that will be used for comparisons.
- protected COSDocument document: This is the document that will be parsed.
- protected static int E
- protected static String ENDOBJ_STRING: This is a string constant that will be used for comparisons.
- protected static String ENDSTREAM_STRING: This is a string constant that will be used for comparisons.
- protected static int J
- protected static int M
- protected static int N
- protected static int O
- protected static int R
- protected static int S
- protected static String STREAM_STRING: This is a string constant that will be used for comparisons.
- protected static int T
-
Method Summary
Methods (modifier and type, method, description):
- protected boolean isClosing(): This will tell if the next character is a closing bracket (close of a PDF array).
- protected boolean isClosing(int c): This will tell if the given character is a closing bracket (close of a PDF array).
- protected boolean isDigit(): This will tell if the next byte is a digit or not.
- protected static boolean isDigit(int c): This will tell if the given value is a digit or not.
- protected boolean isEndOfName(int ch): Determine if a character terminates a PDF name.
- protected boolean isEOL(): This will tell if the next byte to be read is an end of line byte.
- protected boolean isEOL(int c): This will tell if the given value is an end of line byte.
- protected boolean isSpace(): This will tell if the next byte is a space or not.
- protected boolean isSpace(int c): This will tell if the given value is a space or not.
- protected boolean isWhitespace(): This will tell if the next byte is whitespace or not.
- protected boolean isWhitespace(int c): This will tell if a character is whitespace or not.
- protected COSBoolean parseBoolean(): This will parse a boolean object from the stream.
- protected COSArray parseCOSArray(): This will parse a PDF array object.
- protected COSDictionary parseCOSDictionary(): This will parse a PDF dictionary.
- protected COSName parseCOSName(): This will parse a PDF name from the stream.
- protected COSString parseCOSString(): This will parse a PDF string.
- protected COSBase parseDirObject(): This will parse a direct object from the stream.
- protected void readExpectedChar(char ec): Read one char and throw an exception if it is not the expected value.
- protected void readExpectedString(char[] expectedString, boolean skipSpaces): Reads the given pattern from seqSource.
- protected void readExpectedString(String expectedString): Read one String and throw an exception if it is not the expected value.
- protected int readGenerationNumber(): This will read an integer from the stream and throw an IllegalArgumentException if the integer value exceeds the maximum object revision.
- protected int readInt(): This will read an integer from the stream.
- protected String readLine(): This will read bytes until the first end of line marker occurs.
- protected long readLong(): This will read a long from the stream.
- protected long readObjectNumber(): This will read a long from the stream and throw an IOException if the long value is negative or has more than 10 digits.
- protected String readString(): This will read the next string from the stream.
- protected String readString(int length): This will read the next string from the stream up to a certain length.
- protected StringBuilder readStringNumber(): This method is used to read a token by the readInt() and the readLong() method.
- protected void skipSpaces(): This will skip all spaces and comments that are present.
- protected void skipWhiteSpaces()
-
-
-
Field Detail
-
E
protected static final int E
- See Also:
- Constant Field Values
-
N
protected static final int N
- See Also:
- Constant Field Values
-
D
protected static final int D
- See Also:
- Constant Field Values
-
S
protected static final int S
- See Also:
- Constant Field Values
-
T
protected static final int T
- See Also:
- Constant Field Values
-
R
protected static final int R
- See Also:
- Constant Field Values
-
A
protected static final int A
- See Also:
- Constant Field Values
-
M
protected static final int M
- See Also:
- Constant Field Values
-
O
protected static final int O
- See Also:
- Constant Field Values
-
B
protected static final int B
- See Also:
- Constant Field Values
-
J
protected static final int J
- See Also:
- Constant Field Values
-
DEF
public static final String DEF
This is a string constant that will be used for comparisons.
- See Also:
- Constant Field Values
-
ENDOBJ_STRING
protected static final String ENDOBJ_STRING
This is a string constant that will be used for comparisons.
- See Also:
- Constant Field Values
-
ENDSTREAM_STRING
protected static final String ENDSTREAM_STRING
This is a string constant that will be used for comparisons.
- See Also:
- Constant Field Values
-
STREAM_STRING
protected static final String STREAM_STRING
This is a string constant that will be used for comparisons.
- See Also:
- Constant Field Values
-
ASCII_LF
protected static final byte ASCII_LF
ASCII code for line feed.
- See Also:
- Constant Field Values
-
ASCII_CR
protected static final byte ASCII_CR
ASCII code for carriage return.
- See Also:
- Constant Field Values
-
document
protected COSDocument document
This is the document that will be parsed.
-
-
Method Detail
-
parseCOSDictionary
protected COSDictionary parseCOSDictionary() throws IOException
This will parse a PDF dictionary.
- Returns:
- The parsed dictionary, never null.
- Throws:
- IOException - If there is an error reading the stream.
-
skipWhiteSpaces
protected void skipWhiteSpaces() throws IOException
- Throws:
- IOException
-
parseCOSString
protected COSString parseCOSString() throws IOException
This will parse a PDF string.
- Returns:
- The parsed PDF string.
- Throws:
- IOException - If there is an error reading from the stream.
-
parseCOSArray
protected COSArray parseCOSArray() throws IOException
This will parse a PDF array object.
- Returns:
- The parsed PDF array.
- Throws:
- IOException - If there is an error parsing the stream.
-
isEndOfName
protected boolean isEndOfName(int ch)
Determine if a character terminates a PDF name.
- Parameters:
- ch - The character
- Returns:
- true if the character terminates a PDF name, otherwise false.
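The page does not list which characters end a name. As a rough sketch, assuming a name ends at end of input, at a white-space byte (ISO 32000-1, Table 1) or at a delimiter character (ISO 32000-1, Table 2), the check could look like the following. The class NameTerminatorSketch and its helpers are hypothetical and are not part of the library; the real isEndOfName may accept a different set of characters.

```java
// Sketch only: a standalone check modelled on the PDF syntax rules,
// not the library's actual isEndOfName implementation.
final class NameTerminatorSketch {

    /** White-space bytes from ISO 32000-1, Table 1: NUL, HT, LF, FF, CR, SP. */
    static boolean isPdfWhitespace(int ch) {
        return ch == 0 || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32;
    }

    /** Delimiter characters from ISO 32000-1, Table 2. */
    static boolean isPdfDelimiter(int ch) {
        return "()<>[]{}/%".indexOf(ch) >= 0;
    }

    /** Assumption: a name ends at end of input (-1), white space, or a delimiter. */
    static boolean isEndOfName(int ch) {
        return ch == -1 || isPdfWhitespace(ch) || isPdfDelimiter(ch);
    }

    public static void main(String[] args) {
        System.out.println("'/' ends a name: " + isEndOfName('/'));
        System.out.println("'A' ends a name: " + isEndOfName('A'));
    }
}
```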
-
parseCOSName
protected COSName parseCOSName() throws IOException
This will parse a PDF name from the stream.
- Returns:
- The parsed PDF name.
- Throws:
- IOException - If there is an error reading from the stream.
-
parseBoolean
protected COSBoolean parseBoolean() throws IOException
This will parse a boolean object from the stream.
- Returns:
- The parsed boolean object.
- Throws:
- IOException - If an IO error occurs during parsing.
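To illustrate the kind of work a boolean parser does, here is a minimal standalone sketch that recognises the PDF keywords true and false. It is not the library code: BooleanTokenSketch is a hypothetical class, it reads from a plain InputStream, and it returns a Java boolean instead of a COSBoolean.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Sketch only: recognises the keywords "true" and "false" on a byte stream.
final class BooleanTokenSketch {

    static boolean parseBoolean(InputStream in) throws IOException {
        int first = in.read();
        String expected;
        if (first == 't') {
            expected = "true";
        } else if (first == 'f') {
            expected = "false";
        } else {
            throw new IOException("Expected 'true' or 'false', found byte " + first);
        }
        // The first character is already matched; verify the rest of the keyword.
        for (int i = 1; i < expected.length(); i++) {
            int c = in.read();
            if (c != expected.charAt(i)) {
                throw new IOException("Expected '" + expected + "', mismatch at index " + i);
            }
        }
        return first == 't';
    }

    /** Convenience overload for trying the sketch on a string. */
    static boolean parseBoolean(String text) throws IOException {
        return parseBoolean(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.ISO_8859_1)));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(parseBoolean("true"));
        System.out.println(parseBoolean("false"));
    }
}
```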
-
parseDirObject
protected COSBase parseDirObject() throws IOException
This will parse a direct object from the stream.
- Returns:
- The parsed object.
- Throws:
- IOException - If there is an error during parsing.
-
readString
protected String readString() throws IOException
This will read the next string from the stream.
- Returns:
- The string that was read from the stream, never null.
- Throws:
- IOException - If there is an error reading from the stream.
-
readExpectedString
protected void readExpectedString(String expectedString) throws IOException
Read one String and throw an exception if it is not the expected value.
- Parameters:
- expectedString - the String value that is expected.
- Throws:
- IOException - if the String read is not the expected value or if an I/O error occurs.
-
readExpectedString
protected final void readExpectedString(char[] expectedString, boolean skipSpaces) throws IOException
Reads the given pattern from seqSource, skipping whitespace at the start and end if wanted.
- Parameters:
- expectedString - pattern to be skipped
- skipSpaces - if set to true, spaces before and after the string will be skipped
- Throws:
- IOException - if the pattern could not be read
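The behaviour described above can be sketched outside the library as follows. This is an illustration under assumptions, not the actual implementation: ExpectedStringSketch is a hypothetical class, a java.io.PushbackInputStream stands in for seqSource, and only white space is skipped (the real skipSpaces also skips comments).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Sketch only: match a fixed keyword on a stream, optionally skipping white space.
final class ExpectedStringSketch {

    static PushbackInputStream open(String text) {
        return new PushbackInputStream(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.ISO_8859_1)));
    }

    static boolean isWhitespace(int c) {
        return c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
    }

    /** Consumes white space and pushes back the first byte that is not white space. */
    static void skipWhitespace(PushbackInputStream in) throws IOException {
        int c = in.read();
        while (c != -1 && isWhitespace(c)) {
            c = in.read();
        }
        if (c != -1) {
            in.unread(c);
        }
    }

    static void readExpectedString(PushbackInputStream in, char[] expected,
                                   boolean skipSpaces) throws IOException {
        if (skipSpaces) {
            skipWhitespace(in);
        }
        for (char e : expected) {
            int c = in.read();
            if (c != e) {
                throw new IOException("Expected '" + e + "' but read byte " + c);
            }
        }
        if (skipSpaces) {
            skipWhitespace(in);
        }
    }

    public static void main(String[] args) throws IOException {
        PushbackInputStream in = open("  endobj \n12 0 obj");
        readExpectedString(in, "endobj".toCharArray(), true);
        System.out.println("next byte: " + (char) in.read());
    }
}
```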
-
readExpectedChar
protected void readExpectedChar(char ec) throws IOException
Read one char and throw an exception if it is not the expected value.
- Parameters:
- ec - the char value that is expected.
- Throws:
- IOException - if the read char is not the expected value or if an I/O error occurs.
-
readString
protected String readString(int length) throws IOException
This will read the next string from the stream up to a certain length.
- Parameters:
- length - The length to stop reading at.
- Returns:
- The string that was read from the stream, of length 0 to length.
- Throws:
- IOException - If there is an error reading from the stream.
-
isClosing
protected boolean isClosing() throws IOException
This will tell if the next character is a closing bracket (close of a PDF array).
- Returns:
- true if the next byte is ']', false otherwise.
- Throws:
- IOException - If an IO error occurs.
-
isClosing
protected boolean isClosing(int c)
This will tell if the given character is a closing bracket (close of a PDF array).
- Parameters:
- c - The character to check against the closing bracket
- Returns:
- true if the character is ']', false otherwise.
-
readLine
protected String readLine() throws IOException
This will read bytes until the first end of line marker occurs. NOTE: The EOL marker may consist of 1 (CR or LF) or 2 (CR and LF) bytes, which is an important detail if one wants to unread the line.
- Returns:
- The characters between the current position and the end of the line.
- Throws:
- IOException - If there is an error reading from the stream.
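The note about one-byte and two-byte EOL markers is easier to see in code. The sketch below is a standalone illustration, not the library's readLine: ReadLineSketch is a hypothetical class, it reads from a java.io.PushbackInputStream, and it returns an empty string at end of input, which the real method may treat differently.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Sketch only: read up to the next EOL marker, which may be LF, CR, or CR LF.
final class ReadLineSketch {

    static PushbackInputStream open(String text) {
        return new PushbackInputStream(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.ISO_8859_1)));
    }

    static String readLine(PushbackInputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n') {
                break; // one-byte marker: LF
            }
            if (c == '\r') {
                // CR may be followed by LF, making a two-byte marker.
                int next = in.read();
                if (next != '\n' && next != -1) {
                    in.unread(next); // lone CR: the next byte belongs to the next line
                }
                break;
            }
            line.append((char) c);
        }
        return line.toString();
    }

    public static void main(String[] args) throws IOException {
        PushbackInputStream in = open("first\r\nsecond\rthird\n");
        System.out.println(readLine(in));
        System.out.println(readLine(in));
        System.out.println(readLine(in));
    }
}
```

A caller that wants to unread a line has to remember whether one or two marker bytes were consumed, since the returned string does not contain them.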
-
isEOL
protected boolean isEOL() throws IOException
This will tell if the next byte to be read is an end of line byte.
- Returns:
- true if the next byte is 0x0A or 0x0D.
- Throws:
- IOException - If there is an error reading from the stream.
-
isEOL
protected boolean isEOL(int c)
This will tell if the given value is an end of line byte.
- Parameters:
- c - The character to check against end of line
- Returns:
- true if the given value is 0x0A or 0x0D.
-
isWhitespace
protected boolean isWhitespace() throws IOException
This will tell if the next byte is whitespace or not.
- Returns:
- true if the next byte in the stream is a whitespace character.
- Throws:
- IOException - If there is an error reading from the stream.
-
isWhitespace
protected boolean isWhitespace(int c)
This will tell if a character is whitespace or not. These values are specified in table 1 (page 12) of ISO 32000-1:2008.
- Parameters:
- c - The character to check against whitespace
- Returns:
- true if the character is a whitespace character.
-
isSpace
protected boolean isSpace() throws IOException
This will tell if the next byte is a space or not.
- Returns:
- true if the next byte in the stream is a space character.
- Throws:
- IOException - If there is an error reading from the stream.
-
isSpace
protected boolean isSpace(int c)
This will tell if the given value is a space or not.
- Parameters:
- c - The character to check against space
- Returns:
- true if the given value is a space character.
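The three character-class predicates above (isEOL, isWhitespace, isSpace) differ only in the byte values they accept. The standalone sketch below shows one way to express them. CharClassSketch is a hypothetical class; the white-space set follows ISO 32000-1, Table 1, the EOL values come from the isEOL description, and treating "space" as the single byte 0x20 is an assumption.

```java
// Sketch only: byte classes used when tokenising PDF syntax.
final class CharClassSketch {

    /** End-of-line bytes: LF (0x0A) and CR (0x0D). */
    static boolean isEOL(int c) {
        return c == 0x0A || c == 0x0D;
    }

    /** White-space bytes from ISO 32000-1, Table 1: NUL, HT, LF, FF, CR, SP. */
    static boolean isWhitespace(int c) {
        switch (c) {
            case 0x00:
            case 0x09:
            case 0x0A:
            case 0x0C:
            case 0x0D:
            case 0x20:
                return true;
            default:
                return false;
        }
    }

    /** Assumption: "space" means only the SP byte (0x20). */
    static boolean isSpace(int c) {
        return c == 0x20;
    }

    public static void main(String[] args) {
        int tab = 0x09;
        System.out.println("tab is whitespace: " + isWhitespace(tab));
        System.out.println("tab is space: " + isSpace(tab));
        System.out.println("tab is EOL: " + isEOL(tab));
    }
}
```

Under these definitions every EOL byte and the space byte are also white space, but not the other way round.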
-
isDigit
protected boolean isDigit() throws IOException
This will tell if the next byte is a digit or not.
- Returns:
- true if the next byte in the stream is a digit.
- Throws:
- IOException - If there is an error reading from the stream.
-
isDigit
protected static boolean isDigit(int c)
This will tell if the given value is a digit or not.
- Parameters:
- c - The character to be checked
- Returns:
- true if the given value is a digit.
-
skipSpaces
protected void skipSpaces() throws IOException
This will skip all spaces and comments that are present.
- Throws:
- IOException - If there is an error reading from the stream.
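In PDF syntax a comment starts with '%' and runs to the end of the line, so skipping "spaces and comments" means alternating between the two until a significant byte appears. The sketch below is one possible standalone version. SkipSpacesSketch is a hypothetical class working on a java.io.PushbackInputStream; it is not the library's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Sketch only: skip white space and '%' comments, leaving the next token unread.
final class SkipSpacesSketch {

    static PushbackInputStream open(String text) {
        return new PushbackInputStream(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.ISO_8859_1)));
    }

    static boolean isWhitespace(int c) {
        return c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
    }

    static boolean isEOL(int c) {
        return c == 10 || c == 13;
    }

    static void skipSpaces(PushbackInputStream in) throws IOException {
        int c = in.read();
        while (isWhitespace(c) || c == '%') {
            if (c == '%') {
                // Comment: discard bytes up to the end of the line (or end of input).
                c = in.read();
                while (c != -1 && !isEOL(c)) {
                    c = in.read();
                }
            } else {
                c = in.read();
            }
        }
        if (c != -1) {
            in.unread(c); // first significant byte stays available to the caller
        }
    }

    public static void main(String[] args) throws IOException {
        PushbackInputStream in = open("  % a comment\n  /Type");
        skipSpaces(in);
        System.out.println("next byte: " + (char) in.read());
    }
}
```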
-
readObjectNumber
protected long readObjectNumber() throws IOException
This will read a long from the stream and throw an IOException if the long value is negative or has more than 10 digits (i.e. bigger than OBJECT_NUMBER_THRESHOLD).
- Returns:
- the object number being read.
- Throws:
- IOException - if an I/O error occurs
-
readGenerationNumber
protected int readGenerationNumber() throws IOException
This will read an integer from the stream and throw an IllegalArgumentException if the integer value exceeds the maximum object revision (i.e. bigger than GENERATION_NUMBER_THRESHOLD).
- Returns:
- the generation number being read.
- Throws:
- IOException - if an I/O error occurs
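The range checks described for readObjectNumber and readGenerationNumber can be sketched as below. The values of OBJECT_NUMBER_THRESHOLD and GENERATION_NUMBER_THRESHOLD are not shown on this page, so the constants here are assumptions: the smallest 11-digit number, and 65,535, which ISO 32000-1 gives as the maximum generation number. ObjectKeyLimitsSketch is a hypothetical class, and throwing IOException in both cases is also an assumption, chosen because both signatures declare it even though the readGenerationNumber description mentions IllegalArgumentException.

```java
import java.io.IOException;

// Sketch only: range checks for object and generation numbers.
final class ObjectKeyLimitsSketch {

    // Assumed values; the library's constants are not listed on this page.
    static final long OBJECT_NUMBER_THRESHOLD = 10_000_000_000L; // smallest 11-digit number
    static final int GENERATION_NUMBER_THRESHOLD = 65_535;       // max generation in ISO 32000-1

    static long checkObjectNumber(long value) throws IOException {
        if (value < 0 || value >= OBJECT_NUMBER_THRESHOLD) {
            throw new IOException(
                    "Object number '" + value + "' is negative or has more than 10 digits");
        }
        return value;
    }

    static int checkGenerationNumber(int value) throws IOException {
        if (value < 0 || value > GENERATION_NUMBER_THRESHOLD) {
            throw new IOException(
                    "Generation number '" + value + "' is outside 0.." + GENERATION_NUMBER_THRESHOLD);
        }
        return value;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(checkObjectNumber(12L));
        System.out.println(checkGenerationNumber(0));
    }
}
```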
-
readInt
protected int readInt() throws IOException
This will read an integer from the stream.
- Returns:
- The integer that was read from the stream.
- Throws:
- IOException - If there is an error reading from the stream.
-
readLong
protected long readLong() throws IOException
This will read a long from the stream.
- Returns:
- The long that was read from the stream.
- Throws:
- IOException - If there is an error reading from the stream.
-
readStringNumber
protected final StringBuilder readStringNumber() throws IOException
This method is used by the readInt() and readLong() methods to read a token. Any non-digit value is a valid delimiter.
- Returns:
- the token to be parsed as an integer or long by the calling method.
- Throws:
- IOException - thrown by the seqSource methods.
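The split between collecting a digit token and converting it can be sketched as follows. This is a standalone illustration that takes the description above literally (any non-digit byte ends the token, so signs are not handled). NumberTokenSketch is a hypothetical class reading from a java.io.PushbackInputStream, and the library's own delimiter handling may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Sketch only: collect a run of digits, then let the caller convert it.
final class NumberTokenSketch {

    static PushbackInputStream open(String text) {
        return new PushbackInputStream(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.ISO_8859_1)));
    }

    /** Reads digits until the first non-digit byte, which is pushed back. */
    static StringBuilder readStringNumber(PushbackInputStream in) throws IOException {
        StringBuilder token = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c >= '0' && c <= '9') {
            token.append((char) c);
        }
        if (c != -1) {
            in.unread(c); // the delimiter is left for the next read
        }
        return token;
    }

    /** Converts the token, reporting malformed input as an IOException. */
    static long readLong(PushbackInputStream in) throws IOException {
        StringBuilder token = readStringNumber(in);
        try {
            return Long.parseLong(token.toString());
        } catch (NumberFormatException e) {
            throw new IOException("Expected a long but found '" + token + "'", e);
        }
    }

    public static void main(String[] args) throws IOException {
        PushbackInputStream in = open("12345 0 R");
        System.out.println(readLong(in));
    }
}
```

Keeping the token as a StringBuilder lets the same reader serve both integer and long conversion, with each caller applying its own range check.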
-
-