Package java.io

Class StreamTokenizer

java.lang.Object
java.io.StreamTokenizer

public class StreamTokenizer
extends Object
Parses a stream into tokens, one at a time. The recognized token types are numbers, words (identifiers), quoted strings, and single ordinary characters; several comment styles can be recognized and skipped. The class is suitable for limited processing of source code in languages such as Java, but it is not a full parser.
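As a minimal sketch of the read loop described above, the following adds up every number token in a string. The class name NumberSum and its sum method are hypothetical names chosen for illustration; only StreamTokenizer and its members come from this class.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class NumberSum {
    // Hypothetical helper: adds up every number token; all other tokens are ignored.
    static double sum(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        double total = 0;
        try {
            // nextToken() returns the same value that it stores in ttype.
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    total += st.nval;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum("apples 3 pears 4.5"));
    }
}
```

The IOException is wrapped only to keep the sketch short; real code may prefer to propagate it.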
  • Field Summary

    Fields
    Modifier and Type Field Description
    double nval
    Contains a number if the current token is a number (ttype == TT_NUMBER).
    String sval
    Contains a string if the current token is a word (ttype == TT_WORD) or a quoted string (ttype is the quote character).
    static int TT_EOF
    The constant representing the end of the stream.
    static int TT_EOL
    The constant representing the end of the line.
    static int TT_NUMBER
    The constant representing a number token.
    static int TT_WORD
    The constant representing a word token.
    int ttype
    After calling nextToken(), ttype contains the type of token that has been read.
  • Constructor Summary

    Constructors
    Constructor Description
    StreamTokenizer​(InputStream is)
    Deprecated.
    StreamTokenizer​(Reader r)
    Constructs a new StreamTokenizer with r as source reader.
  • Method Summary

    Modifier and Type Method Description
    void commentChar​(int ch)
    Specifies that the character ch shall be treated as a comment character.
    void eolIsSignificant​(boolean flag)
    Specifies whether the end of a line is significant and should be returned as TT_EOL in ttype by this tokenizer.
    int lineno()
    Returns the current line number.
    void lowerCaseMode​(boolean flag)
    Specifies whether word tokens should be converted to lower case when they are stored in sval.
    int nextToken()
    Parses the next token from this tokenizer's source stream or reader.
    void ordinaryChar​(int ch)
    Specifies that the character ch shall be treated as an ordinary character by this tokenizer.
    void ordinaryChars​(int low, int hi)
    Specifies that the characters in the range from low to hi shall be treated as ordinary characters by this tokenizer.
    void parseNumbers()
    Specifies that this tokenizer shall parse numbers.
    void pushBack()
    Indicates that the current token should be pushed back and returned again the next time nextToken() is called.
    void quoteChar​(int ch)
    Specifies that the character ch shall be treated as a quote character.
    void resetSyntax()
    Specifies that all characters shall be treated as ordinary characters.
    void slashSlashComments​(boolean flag)
    Specifies whether "slash-slash" (C++-style) comments shall be recognized.
    void slashStarComments​(boolean flag)
    Specifies whether "slash-star" (C-style) comments shall be recognized.
    String toString()
    Returns the state of this tokenizer in a readable format.
    void whitespaceChars​(int low, int hi)
    Specifies that the characters in the range from low to hi shall be treated as whitespace characters by this tokenizer.
    void wordChars​(int low, int hi)
    Specifies that the characters in the range from low to hi shall be treated as word characters by this tokenizer.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
  • Field Details

    • nval

      public double nval
      Contains a number if the current token is a number (ttype == TT_NUMBER).
    • sval

      public String sval
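      Contains a string if the current token is a word (ttype == TT_WORD) or a quoted string (ttype is the quote character). For a quoted string, sval holds the body of the string without the enclosing quote characters.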
    • TT_EOF

      public static final int TT_EOF
      The constant representing the end of the stream.
      See Also:
      Constant Field Values
    • TT_EOL

      public static final int TT_EOL
      The constant representing the end of the line.
      See Also:
      Constant Field Values
    • TT_NUMBER

      public static final int TT_NUMBER
      The constant representing a number token.
      See Also:
      Constant Field Values
    • TT_WORD

      public static final int TT_WORD
      The constant representing a word token.
      See Also:
      Constant Field Values
    • ttype

      public int ttype
      After calling nextToken(), ttype contains the type of token that has been read. For a single-character token, ttype is the value of that character converted to an integer. For a quoted string, ttype is the quote character. Otherwise, its value is one of the following:
      • TT_WORD - the token is a word.
      • TT_NUMBER - the token is a number.
      • TT_EOL - the end of a line has been reached. This value is returned only if eolIsSignificant(true) has been called.
      • TT_EOF - the end of the stream has been reached.
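The cases listed above can be told apart with a switch on ttype. The sketch below is illustrative only: TokenDump and the "WORD:", "NUMBER:", "STRING:" and "CHAR:" labels are hypothetical, and the tokenizer is left in its default state, so only ' and " act as quote characters.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TokenDump {
    // Hypothetical helper: describes each token according to its ttype.
    static List<String> tokens(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_WORD:
                        out.add("WORD:" + st.sval);
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        out.add("NUMBER:" + st.nval);
                        break;
                    case '"':
                    case '\'':
                        // For a quoted string, ttype is the quote character
                        // and sval holds the text between the quotes.
                        out.add("STRING:" + st.sval);
                        break;
                    default:
                        // Any other value is a single ordinary character.
                        out.add("CHAR:" + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

Calling TokenDump.tokens("count = 42 \"items\"") yields one word, one ordinary character, one number and one quoted string, in that order.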
  • Constructor Details

    • StreamTokenizer

      @Deprecated public StreamTokenizer​(InputStream is)
      Deprecated.
      Constructs a new StreamTokenizer with is as the source input stream. Use the constructor that takes a Reader as an argument instead.
      Parameters:
      is - the source stream from which to parse tokens.
      Throws:
      NullPointerException - if is is null.
    • StreamTokenizer

      public StreamTokenizer​(Reader r)
      Constructs a new StreamTokenizer with r as source reader. The tokenizer's initial state is as follows:
      • All byte values 'A' through 'Z', 'a' through 'z', and '\u00A0' through '\u00FF' are considered to be alphabetic.
      • All byte values '\u0000' through '\u0020' are considered to be white space.
      • '/' is a comment character.
      • Single quote '\'' and double quote '"' are string quote characters.
      • Numbers are parsed.
      • End of lines are considered to be white space rather than separate tokens.
      • C-style and C++-style comments are not recognized.
      Parameters:
      r - the source reader from which to parse tokens.
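One consequence of the initial state above is that a lone '/' starts a comment that runs to the end of the line. The sketch below contrasts the default behaviour with a tokenizer on which ordinaryChar('/') has been called. DefaultSyntaxDemo and the slashIsOperator parameter are hypothetical names used only for this illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class DefaultSyntaxDemo {
    // Hypothetical helper: lists tokens as text, optionally treating '/' as an ordinary character.
    static List<String> tokens(String text, boolean slashIsOperator) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        if (slashIsOperator) {
            st.ordinaryChar('/'); // removes the default comment meaning of '/'
        }
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add(String.valueOf(st.nval));
                } else {
                    // Single characters; quoted strings are not handled in this sketch.
                    out.add(String.valueOf((char) st.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

With the input "x = 10 / 2\ny", the default tokenizer drops "/ 2" as a comment, while the adjusted one reports the slash and the number that follows it.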
  • Method Details

    • commentChar

      public void commentChar​(int ch)
      Specifies that the character ch shall be treated as a comment character.
      Parameters:
      ch - the character to be considered a comment character.
    • eolIsSignificant

      public void eolIsSignificant​(boolean flag)
      Specifies whether the end of a line is significant and should be returned as TT_EOL in ttype by this tokenizer.
      Parameters:
      flag - true if EOL is significant, false otherwise.
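A possible use of significant line ends is line-oriented processing. The sketch below counts the word tokens on each line; LineWordCount and wordsPerLine are hypothetical names, and the handling of a final line without a terminator is a choice made for this example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineWordCount {
    // Hypothetical helper: returns the number of word tokens on each line.
    static List<Integer> wordsPerLine(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.eolIsSignificant(true); // line ends are now reported as TT_EOL
        List<Integer> counts = new ArrayList<>();
        int current = 0;
        try {
            int t;
            while ((t = st.nextToken()) != StreamTokenizer.TT_EOF) {
                if (t == StreamTokenizer.TT_EOL) {
                    counts.add(current);
                    current = 0;
                } else if (t == StreamTokenizer.TT_WORD) {
                    current++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (current > 0) {
            counts.add(current); // last line had no line terminator
        }
        return counts;
    }
}
```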
    • lineno

      public int lineno()
      Returns the current line number.
      Returns:
      this tokenizer's current line number.
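Line numbers are mainly useful for diagnostics. As a sketch, the hypothetical helper LineFinder.lineOf below reports the line on which a given word first appears, relying on line numbering starting at 1.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class LineFinder {
    // Hypothetical helper: line number of the first matching word token, or -1 if absent.
    static int lineOf(String text, String word) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD && st.sval.equals(word)) {
                    return st.lineno();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return -1;
    }
}
```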
    • lowerCaseMode

      public void lowerCaseMode​(boolean flag)
      Specifies whether word tokens should be converted to lower case when they are stored in sval.
      Parameters:
      flag - true if sval should be converted to lower case, false otherwise.
    • nextToken

      public int nextToken() throws IOException
      Parses the next token from this tokenizer's source stream or reader. The type of the token is stored in the ttype field; additional information may be stored in the nval or sval fields.
      Returns:
      the value of ttype.
      Throws:
      IOException - if an I/O error occurs while parsing the next token.
    • ordinaryChar

      public void ordinaryChar​(int ch)
      Specifies that the character ch shall be treated as an ordinary character by this tokenizer. That is, it has no special meaning as a comment character, word component, white space, string delimiter or number.
      Parameters:
      ch - the character to be considered an ordinary character.
    • ordinaryChars

      public void ordinaryChars​(int low, int hi)
      Specifies that the characters in the range from low to hi shall be treated as ordinary characters by this tokenizer. That is, they have no special meaning as a comment character, word component, white space, string delimiter or number.
      Parameters:
      low - the first character in the range of ordinary characters.
      hi - the last character in the range of ordinary characters.
    • parseNumbers

      public void parseNumbers()
      Specifies that this tokenizer shall parse numbers.
    • pushBack

      public void pushBack()
      Indicates that the current token should be pushed back and returned again the next time nextToken() is called.
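pushBack() gives one token of lookahead. The sketch below parses entries of the form name or name=number: after reading a name it peeks at the next token and pushes it back when it is not '='. PushBackDemo, parse and the default value 1.0 for bare names are assumptions made for this illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class PushBackDemo {
    // Hypothetical helper: maps each name to its value, or to 1.0 when no "=number" follows.
    static Map<String, Double> parse(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        Map<String, Double> result = new LinkedHashMap<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype != StreamTokenizer.TT_WORD) {
                    continue; // skip anything that is not a name
                }
                String key = st.sval;
                double value = 1.0;
                if (st.nextToken() == '=') {
                    if (st.nextToken() == StreamTokenizer.TT_NUMBER) {
                        value = st.nval;
                    } else {
                        st.pushBack(); // not a number: let the main loop see it
                    }
                } else {
                    st.pushBack(); // lookahead was not '=': return it again next time
                }
                result.put(key, value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

For the input "a=1 b c=3", the token c is read once as lookahead after b and then returned again by the next call to nextToken().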
    • quoteChar

      public void quoteChar​(int ch)
      Specifies that the character ch shall be treated as a quote character.
      Parameters:
      ch - the character to be considered a quote character.
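As an illustration, a tokenizer can be told to accept an additional delimiter such as the backtick. QuoteDemo and firstQuoted are hypothetical names; the sketch returns the body of the first string delimited by the given character.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class QuoteDemo {
    // Hypothetical helper: body of the first string delimited by quote, or null if none.
    static String firstQuoted(String text, char quote) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.quoteChar(quote);
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                // For a quoted string, ttype equals the quote character itself.
                if (st.ttype == quote) {
                    return st.sval;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null;
    }
}
```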
    • resetSyntax

      public void resetSyntax()
      Specifies that all characters shall be treated as ordinary characters.
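resetSyntax() is a convenient starting point for building a custom syntax table. The sketch below, under the assumption that fields are separated by commas and never quoted, turns the tokenizer into a simple field splitter. FieldSplitter and fields are hypothetical names. Because the reset also clears the numeric meaning of digits, "10.5" comes back as a word rather than a number.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class FieldSplitter {
    // Hypothetical helper: splits text on commas, returning each field as raw text.
    static List<String> fields(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.resetSyntax();               // every character is now ordinary
        st.wordChars(0x20, 0xFF);       // printable characters, including space, form words
        st.whitespaceChars(0, 0x1F);    // control characters separate fields
        st.whitespaceChars(',', ',');   // so does the comma
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

Consecutive separators collapse, so empty fields are not reported by this sketch.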
    • slashSlashComments

      public void slashSlashComments​(boolean flag)
      Specifies whether "slash-slash" (C++-style) comments shall be recognized. This kind of comment ends at the end of the line.
      Parameters:
      flag - true if // should be recognized as the start of a comment, false otherwise.
    • slashStarComments

      public void slashStarComments​(boolean flag)
      Specifies whether "slash-star" (C-style) comments shall be recognized. Slash-star comments cannot be nested and end when a star-slash combination is found.
      Parameters:
      flag - true if /* should be recognized as the start of a comment, false otherwise.
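For Java-like input it is common to enable both comment styles and to make '/' an ordinary character so that division is still reported. The following sketch shows one such configuration; CommentSkipper and strip are hypothetical names, and the output format (tokens joined by single spaces) is chosen only for the example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CommentSkipper {
    // Hypothetical helper: returns the tokens of text, without comments, joined by spaces.
    static String strip(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.ordinaryChar('/');          // a single '/' is an operator, not a comment
        st.slashSlashComments(true);   // "//" comments run to the end of the line
        st.slashStarComments(true);    // "/*" comments run to the next "*/"
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add(String.valueOf(st.nval));
                } else {
                    out.add(String.valueOf((char) st.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return String.join(" ", out);
    }
}
```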
    • toString

      public String toString()
      Returns the state of this tokenizer in a readable format.
      Overrides:
      toString in class Object
      Returns:
      the current state of this tokenizer.
    • whitespaceChars

      public void whitespaceChars​(int low, int hi)
      Specifies that the characters in the range from low to hi shall be treated as whitespace characters by this tokenizer.
      Parameters:
      low - the first character in the range of whitespace characters.
      hi - the last character in the range of whitespace characters.
    • wordChars

      public void wordChars​(int low, int hi)
      Specifies that the characters in the range from low to hi shall be treated as word characters by this tokenizer. A word consists of a word character followed by zero or more word or number characters.
      Parameters:
      low - the first character in the range of word characters.
      hi - the last character in the range of word characters.
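By default the underscore is an ordinary character, so an identifier such as max_value is split into several tokens. The sketch below adds '_' to the word characters. IdentifierDemo and the underscoreInWords parameter are hypothetical names used for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class IdentifierDemo {
    // Hypothetical helper: collects word tokens, optionally treating '_' as a word character.
    static List<String> words(String text, boolean underscoreInWords) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        if (underscoreInWords) {
            st.wordChars('_', '_'); // a range containing a single character
        }
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```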