Class ContentStreamParser

java.lang.Object
org.sejda.sambox.input.ContentStreamParser
All Implemented Interfaces:
Closeable, AutoCloseable

public class ContentStreamParser extends Object
Component responsible for parsing a content stream to extract its operators, operands and other tokens.
Author:
Andrea Vacondio
  • Constructor Details

    • ContentStreamParser

      public ContentStreamParser(PDContentStream stream) throws IOException
      Throws:
      IOException
    • ContentStreamParser

      public ContentStreamParser(org.sejda.io.SeekableSource source)
  • Method Details

    • tokens

      public List<Object> tokens() throws IOException
      Returns:
      a list of tokens retrieved parsing the source this parser was created from.
      Throws:
      IOException
    • nextParsedToken

      public Object nextParsedToken() throws IOException
      Returns:
      the next token parsed from the content stream
      Throws:
      IOException
    • close

      public void close() throws IOException
      Closes the SeekableSource this reader was created from.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException
    • source

      public org.sejda.io.SeekableSource source()
      Returns:
      the source for this reader
    • position

      public long position() throws IOException
      Returns:
      the current position
      Throws:
      IOException
    • offset

      public void offset(long offset) throws IOException
      Adds an offset to the source.
      Parameters:
      offset - the offset to add
      Throws:
      IOException
    • position

      public void position(long offset) throws IOException
      Seeks to the given offset.
      Parameters:
      offset - the new offset
      Throws:
      IOException
    • length

      public long length()
      Returns:
      the source length
    • skipExpected

      public final void skipExpected(String expected) throws IOException
      Skips the given expected String.
      Parameters:
      expected - the String value that is expected.
      Throws:
      IOException - if a char read does not match the expected String or if an I/O error occurs.
    • skipExpected

      public void skipExpected(char ec) throws IOException
      Skips one char and throws an exception if it is not the expected value.
      Parameters:
      ec - the char value that is expected.
      Throws:
      IOException - if the read char is not the expected value or if an I/O error occurs.
    • skipTokenIfValue

      public boolean skipTokenIfValue(String... values) throws IOException
      Skips the next token if its value is one of the given ones.
      Parameters:
      values - the values to skip
      Returns:
      true if the token is found and skipped, false otherwise.
      Throws:
      IOException - if there is an error reading from the stream
    • skipIndirectObjectDefinition

      public void skipIndirectObjectDefinition() throws IOException
      Skips an indirect object definition open tag (e.g. "12 0 obj") as defined in chap. 7.3.10 of PDF 32000-1:2008.
      Throws:
      IOException - if what is read is not a valid indirect object definition open tag
    • skipExpectedIndirectObjectDefinition

      public void skipExpectedIndirectObjectDefinition(COSObjectKey expected) throws IOException
      Skips an indirect object definition open tag (e.g. "12 0 obj") as defined in chap. 7.3.10 of PDF 32000-1:2008.
      Parameters:
      expected - the key of the object we are expecting to find
      Throws:
      IOException - if what is read is not a valid indirect object definition open tag, or if the object number or generation number doesn't match the expected object
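      The check described above can be illustrated with a small standalone sketch. It is not the SAMBox implementation: the class IndirectObjectHeaderSketch, its regular expression and the String-based input are all assumptions made for illustration, standing in for a parser reading from a SeekableSource.

```java
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class IndirectObjectHeaderSketch {
    // "<object number> <generation number> obj", e.g. "12 0 obj"
    private static final Pattern HEADER =
            Pattern.compile("\\s*(\\d{1,10})\\s+(\\d{1,5})\\s+obj\\b");

    /**
     * Returns the index just past the open tag. Throws if the input does not
     * start with a valid open tag or if the numbers differ from the expected ones.
     */
    static int skipExpected(String input, long expectedNumber, int expectedGeneration)
            throws IOException {
        Matcher m = HEADER.matcher(input);
        if (!m.lookingAt()) {
            throw new IOException("Not a valid indirect object definition open tag");
        }
        long number = Long.parseLong(m.group(1));
        int generation = Integer.parseInt(m.group(2));
        if (number != expectedNumber || generation != expectedGeneration) {
            throw new IOException("Expected " + expectedNumber + " " + expectedGeneration
                    + " obj but found " + number + " " + generation + " obj");
        }
        return m.end();
    }
}
```

      Returning the end index plays the role that advancing the source position plays in a stream-based parser.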
    • readToken

      public String readToken() throws IOException
      Returns:
      The next token that was read from the stream.
      Throws:
      IOException - If there is an error reading from the stream.
    • unreadSpaces

      public void unreadSpaces() throws IOException
      Unreads white spaces
      Throws:
      IOException
    • unreadUntilSpaces

      public void unreadUntilSpaces() throws IOException
      Unreads characters until it finds a white space
      Throws:
      IOException
    • isNextToken

      public boolean isNextToken(String... values) throws IOException
      Parameters:
      values - the candidate values for the next token.
      Returns:
      true if the next token is one of the given values, false otherwise.
      Throws:
      IOException - if there is an error reading from the stream
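      The difference between isNextToken and skipTokenIfValue is easiest to see side by side: one only peeks, the other consumes the token on a match. The following is a minimal standalone sketch of that contract, assuming a mark-and-rewind approach over an in-memory String. The class TokenPeekSketch and its whitespace-only tokenizer are hypothetical and much simpler than a real PDF tokenizer.

```java
import java.util.Arrays;

// Hypothetical in-memory reader, for illustration only.
class TokenPeekSketch {
    private final String data;
    private int position;

    TokenPeekSketch(String data) {
        this.data = data;
    }

    int position() {
        return position;
    }

    /** Skips leading whitespace, then reads up to the next whitespace. */
    String readToken() {
        while (position < data.length() && Character.isWhitespace(data.charAt(position))) {
            position++;
        }
        int start = position;
        while (position < data.length() && !Character.isWhitespace(data.charAt(position))) {
            position++;
        }
        return data.substring(start, position);
    }

    /** Peeks at the next token and always rewinds to where it started. */
    boolean isNextToken(String... values) {
        int mark = position;
        String token = readToken();
        position = mark;
        return Arrays.asList(values).contains(token);
    }

    /** Consumes the next token only if it matches; rewinds otherwise. */
    boolean skipTokenIfValue(String... values) {
        int mark = position;
        String token = readToken();
        if (Arrays.asList(values).contains(token)) {
            return true;
        }
        position = mark;
        return false;
    }
}
```

      In this sketch a failed skipTokenIfValue leaves the position untouched, so the caller can try another parsing branch from the same place.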
    • readLine

      public String readLine() throws IOException
      Reads bytes until the first end of line marker occurs. NOTE: The EOL marker may consist of 1 (CR or LF) or 2 (CR and LF) bytes, which is an important detail if one wants to unread the line.
      Returns:
      The characters between the current position and the end of the line.
      Throws:
      IOException - If there is an error reading from the stream.
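      To make the note about EOL length concrete, here is a standalone sketch that reads a line and also reports how many EOL bytes were consumed, which is what a caller would need in order to unread the line exactly. It uses java.io.PushbackInputStream rather than a SeekableSource, and the class EolAwareLineReader and its Line holder are hypothetical names.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

// Hypothetical helper, for illustration only.
class EolAwareLineReader {
    /** A line of text plus the number of EOL bytes that terminated it. */
    static final class Line {
        final String text;
        final int eolLength;

        Line(String text, int eolLength) {
            this.text = text;
            this.eolLength = eolLength;
        }
    }

    static Line readLine(PushbackInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n') {
                return new Line(sb.toString(), 1); // LF
            }
            if (c == '\r') {
                int next = in.read();
                if (next == '\n') {
                    return new Line(sb.toString(), 2); // CR LF
                }
                if (next != -1) {
                    in.unread(next); // lone CR: give the next byte back
                }
                return new Line(sb.toString(), 1);
            }
            sb.append((char) c);
        }
        return new Line(sb.toString(), 0); // end of input, no EOL marker
    }
}
```

      Unreading a line then means moving back by text.length() + eolLength bytes, not by a fixed text.length() + 1.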
    • readObjectNumber

      public long readObjectNumber() throws IOException
      Reads a long and throws an IOException if the value is negative or has more than 10 digits (i.e. is bigger than OBJECT_NUMBER_THRESHOLD).
      Returns:
      the object number being read.
      Throws:
      IOException - if an I/O error occurs
    • readGenerationNumber

      public int readGenerationNumber() throws IOException
      Reads an integer and throws an IOException if the value is greater than the maximum object revision (i.e. is bigger than GENERATION_NUMBER_THRESHOLD).
      Returns:
      the generation number being read.
      Throws:
      IOException - if an I/O error occurs
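      The range checks described for readObjectNumber and readGenerationNumber can be sketched as below. This is a standalone illustration, not the library code: the class ObjectKeyValidationSketch is hypothetical, and the two constant values are assumptions (10 digits at most for object numbers, and 65535 as the maximum generation number allowed by the PDF specification), since the documentation above only names the constants.

```java
import java.io.IOException;

// Hypothetical helper, for illustration only.
class ObjectKeyValidationSketch {
    // Assumed values; the documentation only names these constants.
    static final long OBJECT_NUMBER_THRESHOLD = 10_000_000_000L;
    static final int GENERATION_NUMBER_THRESHOLD = 65_535;

    static long parseObjectNumber(String token) throws IOException {
        long value = parse(token);
        // Negative or more than 10 digits is rejected.
        if (value < 0 || value >= OBJECT_NUMBER_THRESHOLD) {
            throw new IOException("Object number '" + value + "' has an invalid value");
        }
        return value;
    }

    static int parseGenerationNumber(String token) throws IOException {
        long value = parse(token);
        if (value < 0 || value > GENERATION_NUMBER_THRESHOLD) {
            throw new IOException("Generation number '" + value + "' has an invalid value");
        }
        return (int) value;
    }

    private static long parse(String token) throws IOException {
        try {
            return Long.parseLong(token);
        } catch (NumberFormatException e) {
            throw new IOException("Not a number: '" + token + "'", e);
        }
    }
}
```

      Reporting malformed numbers as IOException mirrors the signatures above, where every failure surfaces as IOException.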
    • readName

      public String readName() throws IOException
      Reads a token conforming with PDF Name Objects chap 7.3.5 PDF 32000-1:2008.
      Returns:
      the name being read.
      Throws:
      IOException - if an I/O error occurs
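      One rule of chap. 7.3.5 that is easy to overlook is the #xx escape: a number sign followed by two hexadecimal digits stands for the byte with that value. The sketch below decodes such escapes from an already isolated token. Whether and how readName applies this decoding is not stated above, so treat that as an assumption; the class NameDecodingSketch is hypothetical.

```java
// Hypothetical helper, for illustration only.
class NameDecodingSketch {
    /** Decodes a name token such as "/A#20B", returning it without the leading solidus. */
    static String decodeName(String token) {
        if (!token.startsWith("/")) {
            throw new IllegalArgumentException("A name must start with '/': " + token);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < token.length(); i++) {
            char c = token.charAt(i);
            boolean escape = c == '#'
                    && i + 2 < token.length()
                    && isHex(token.charAt(i + 1))
                    && isHex(token.charAt(i + 2));
            if (escape) {
                // Two hex digits encode one byte value.
                sb.append((char) Integer.parseInt(token.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                sb.append(c); // includes a '#' that is not a valid escape
            }
        }
        return sb.toString();
    }

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }
}
```

      In this sketch an incomplete escape is kept literally rather than rejected, which is a lenient choice and not something the documentation above prescribes.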
    • readInt

      public int readInt() throws IOException
      Returns:
      The integer that was read from the stream.
      Throws:
      IOException - If there is an error reading from the stream.
    • readLong

      public long readLong() throws IOException
      Returns:
      The long that was read from the stream.
      Throws:
      IOException - If there is an error reading from the stream.
    • readIntegerNumber

      public final String readIntegerNumber() throws IOException
      Reads a token conforming with a PDF Integer object defined in Numeric Objects chap 7.3.3 PDF 32000-1:2008.
      Returns:
      the token to parse as Integer or Long.
      Throws:
      IOException - If there is an error reading from the stream.
    • readNumber

      public final String readNumber() throws IOException
      Reads a token conforming with PDF Numeric Objects chap 7.3.3 PDF 32000-1:2008.
      Returns:
      the token to parse as integer or real number.
      Throws:
      IOException - If there is an error reading from the stream.
    • readHexString

      public final String readHexString() throws IOException
      Reads a token conforming with PDF Hexadecimal Strings chap 7.3.4.3 PDF 32000-1:2008. Any non-hexadecimal char found while parsing the token is replaced with the default '0' hex char.
      Returns:
      the token to parse as a hexadecimal string
      Throws:
      IOException - If there is an error reading from the stream.
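      The lenient replacement rule described above can be sketched as follows. This is a standalone illustration over a String, not the parser's code: the class HexStringSketch is hypothetical, and skipping whitespace inside the string is an assumption taken from the PDF specification rather than from the description above.

```java
// Hypothetical helper, for illustration only.
class HexStringSketch {
    /** Reads "<...>" and returns the hex digits, replacing invalid chars with '0'. */
    static String readHexString(String input) {
        if (input.isEmpty() || input.charAt(0) != '<') {
            throw new IllegalArgumentException("A hex string must start with '<'");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '>') {
                return sb.toString();
            }
            if (Character.isWhitespace(c)) {
                continue; // assumption: whitespace is ignored, as in the PDF spec
            }
            sb.append(isHex(c) ? c : '0'); // non-hex chars become the default '0'
        }
        throw new IllegalArgumentException("Unterminated hex string");
    }

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }
}
```

      For example, under these assumptions the input <4E6F G1> yields 4E6F01: the space is dropped and the G is replaced with 0.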
    • readLiteralString

      public String readLiteralString() throws IOException
      Reads a token conforming with PDF Literal Strings chap 7.3.4.2 PDF 32000-1:2008.
      Returns:
      the token to parse as a literal string
      Throws:
      IOException - If there is an error during parsing.
    • skipSpaces

      public void skipSpaces() throws IOException
      Skips all spaces and comments that are present.
      Throws:
      IOException - If there is an error reading from the stream.
    • unreadIfValid

      public void unreadIfValid(int c) throws IOException
      Unreads the given character if it's not -1
      Parameters:
      c - the character to unread
      Throws:
      IOException