Class CSVParser


  • public class CSVParser
    extends Object
    A very simple CSV parser released under a commercial-friendly license. This just implements splitting a single line into fields.
    Author:
    Glen Smith, Rainer Pruy, Philip Helger
    • Constructor Detail

      • CSVParser

        public CSVParser()
        Constructs a CSVParser that uses a comma as the separator.
    • Method Detail

      • getSeparatorChar

        public char getSeparatorChar()
        Returns:
        The separator character used by this parser.
      • setSeparatorChar

        @Nonnull
        public CSVParser setSeparatorChar(char cSeparator)
        Sets the delimiter to use for separating entries.
        Parameters:
        cSeparator - the delimiter to use for separating entries
        Returns:
        this
      • getQuoteChar

        public char getQuoteChar()
        Returns:
        The quotation character used by this parser.
      • setQuoteChar

        @Nonnull
        public CSVParser setQuoteChar(char cQuoteChar)
        Sets the character to use for quoted elements.
        Parameters:
        cQuoteChar - the character to use for quoted elements
        Returns:
        this
      • getEscapeChar

        public char getEscapeChar()
        Returns:
        The escape character used by this parser.
      • setEscapeChar

        @Nonnull
        public CSVParser setEscapeChar(char cEscapeChar)
        Sets the character to use for escaping a separator or quote.
        Parameters:
        cEscapeChar - the character to use for escaping a separator or quote.
        Returns:
        this
      • isStrictQuotes

        public boolean isStrictQuotes()
        Returns:
        The strictQuotes setting of this parser.
      • setStrictQuotes

        @Nonnull
        public CSVParser setStrictQuotes(boolean bStrictQuotes)
        Sets the strict quotes setting - if true, characters outside the quotes are ignored.
        Parameters:
        bStrictQuotes - if true, characters outside the quotes are ignored
        Returns:
        this
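
The strict quotes rule described above can be illustrated on a single raw field. This is a minimal sketch under stated assumptions, not the library's implementation: `StrictQuotesSketch` and `readField` are hypothetical names, and the non-strict branch is deliberately simplified (it just drops the quote characters and does not try to reproduce how the real parser treats quotes embedded in the middle of a field).

```java
/** Hypothetical illustration of the strict quotes rule; not part of CSVParser. */
final class StrictQuotesSketch {
  private StrictQuotesSketch() {}

  /**
   * Reads one raw field. When strictQuotes is true, only characters that
   * appear between quote characters are kept; everything else is ignored.
   */
  static String readField(String rawField, char quote, boolean strictQuotes) {
    StringBuilder out = new StringBuilder();
    boolean inQuotes = false;
    for (int i = 0; i < rawField.length(); i++) {
      char c = rawField.charAt(i);
      if (c == quote) {
        // A quote character toggles the quoted state and is not copied.
        inQuotes = !inQuotes;
      } else if (inQuotes || !strictQuotes) {
        // Inside quotes: always kept. Outside quotes: kept only when not strict.
        out.append(c);
      }
    }
    return out.toString();
  }
}
```

In this sketch, a field such as `x"abc"y` is reduced to `abc` when strict quotes are enabled, because `x` and `y` lie outside the quotes.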
      • isIgnoreLeadingWhiteSpace

        public boolean isIgnoreLeadingWhiteSpace()
        Returns:
        The ignoreLeadingWhiteSpace setting of this parser.
      • setIgnoreLeadingWhiteSpace

        @Nonnull
        public CSVParser setIgnoreLeadingWhiteSpace(boolean bIgnoreLeadingWhiteSpace)
        Sets the ignore leading whitespace setting - if true, white space in front of a quote in a field is ignored.
        Parameters:
        bIgnoreLeadingWhiteSpace - if true, white space in front of a quote in a field is ignored
        Returns:
        this
      • isIgnoreQuotations

        public boolean isIgnoreQuotations()
        Returns:
        The ignoreQuotations setting of this parser.
      • setIgnoreQuotations

        @Nonnull
        public CSVParser setIgnoreQuotations(boolean bIgnoreQuotations)
        Sets the ignore quotations mode - if true, quotations are ignored.
        Parameters:
        bIgnoreQuotations - if true, quotations are ignored
        Returns:
        this
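
Because every setter above returns `this`, configuration calls can be chained. The following sketch shows the pattern with a hypothetical stand-in class (`FluentParserConfigSketch`) so that it compiles without the library; only the comma default comes from the constructor documentation, while the quote default and the strict quotes default are assumptions made for the example.

```java
/** Hypothetical stand-in that mirrors the "setter returns this" pattern. */
final class FluentParserConfigSketch {
  private char separatorChar = ',';      // documented: comma is the default separator
  private char quoteChar = '"';          // assumption for this sketch
  private boolean strictQuotes = false;  // assumption for this sketch

  char getSeparatorChar() { return separatorChar; }
  char getQuoteChar() { return quoteChar; }
  boolean isStrictQuotes() { return strictQuotes; }

  FluentParserConfigSketch setSeparatorChar(char cSeparator) {
    this.separatorChar = cSeparator;
    return this; // returning this is what makes chaining possible
  }

  FluentParserConfigSketch setQuoteChar(char cQuoteChar) {
    this.quoteChar = cQuoteChar;
    return this;
  }

  FluentParserConfigSketch setStrictQuotes(boolean bStrictQuotes) {
    this.strictQuotes = bStrictQuotes;
    return this;
  }
}
```

With the real class, the same style would presumably read `new CSVParser().setSeparatorChar(';').setQuoteChar('\'')`, using only the constructor and setters listed on this page.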
      • isPending

        public boolean isPending()
        Returns:
        true if a previous call left an unfinished element that still has to be completed by a following line
      • parseLineMulti

        @Nullable
        public ICommonsList<String> parseLineMulti(@Nullable String sNextLine) throws IOException
        Parses an incoming String and returns a list of elements. This method is used when the data spans multiple lines.
        Parameters:
        sNextLine - current line to be processed
        Returns:
        the tokenized list of elements, or null if sNextLine is null
        Throws:
        IOException - if bad things happen during the read
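
The interplay between multi-line parsing and `isPending()` can be sketched as follows. This is a simplified illustration, not the library's code: `MultiLineSketch` is a hypothetical class, the separator and quote are hard-coded to `,` and `"`, escape handling is omitted, and joining continued lines with `\n` is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a parser that carries an open quoted field across lines. */
final class MultiLineSketch {
  private String pending; // text of an unfinished quoted field, or null

  boolean isPending() {
    return pending != null;
  }

  List<String> parseLineMulti(String line) {
    if (line == null) {
      return null;
    }
    List<String> fields = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inQuotes = false;
    if (pending != null) {
      // Resume the quoted field that the previous line left open.
      current.append(pending);
      pending = null;
      inQuotes = true;
    }
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (c == '"') {
        inQuotes = !inQuotes;
      } else if (c == ',' && !inQuotes) {
        fields.add(current.toString());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    if (inQuotes) {
      // The quote is still open: keep the partial field for the next call.
      pending = current.append('\n').toString();
    } else {
      fields.add(current.toString());
    }
    return fields;
  }
}
```

In this sketch, each call returns only the elements completed so far, so a caller would concatenate the results of successive calls until `isPending()` returns false.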
      • parseLine

        @Nullable
        public ICommonsList<String> parseLine(@Nullable String sNextLine) throws IOException
        Parses an incoming String and returns a list of elements. This method is used when all data is contained in a single line.
        Parameters:
        sNextLine - Line to be parsed.
        Returns:
        the tokenized list of elements, or null if sNextLine is null
        Throws:
        IOException - if bad things happen during the read
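
To make the single-line case concrete, here is a minimal, self-contained sketch of splitting one line into fields using a separator, a quote character and an escape character. It is not the library's implementation: `SingleLineSplitSketch` and `splitLine` are hypothetical names, strict quotes and whitespace options are ignored, and an unterminated quote is reported with an unchecked `IllegalArgumentException` here, whereas the documented method declares `IOException`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single-line splitter; a sketch, not CSVParser itself. */
final class SingleLineSplitSketch {
  private SingleLineSplitSketch() {}

  static List<String> splitLine(String line, char separator, char quote, char escape) {
    if (line == null) {
      return null; // mirrors the documented "null in, null out" contract
    }
    List<String> fields = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inQuotes = false;
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      boolean hasNext = i + 1 < line.length();
      if (c == escape && inQuotes && hasNext
          && (line.charAt(i + 1) == quote || line.charAt(i + 1) == escape)) {
        // Escaped quote or escape character inside quotes: keep it literally.
        current.append(line.charAt(i + 1));
        i++;
      } else if (c == quote && inQuotes && hasNext && line.charAt(i + 1) == quote) {
        // A doubled quote inside quotes stands for one literal quote.
        current.append(quote);
        i++;
      } else if (c == quote) {
        inQuotes = !inQuotes;
      } else if (c == separator && !inQuotes) {
        fields.add(current.toString());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    if (inQuotes) {
      throw new IllegalArgumentException("Unterminated quoted field");
    }
    fields.add(current.toString());
    return fields;
  }
}
```

For example, calling `splitLine` on `a,"b,c",d` with `,`, `"` and `\` yields the three fields `a`, `b,c` and `d`, because the separator inside the quotes does not split the field.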
      • isNextCharacterEscapable

        protected boolean isNextCharacterEscapable(@Nonnull String sNextLine, boolean bInQuotes, int nIndex)
        Checks whether the character after the current index in a String is escapable, meaning that the parser is inside quotes and the next character is either the quotation character or the escape character. Precondition: the current character is the escape character.
        Parameters:
        sNextLine - the current line
        bInQuotes - true if the current context is quoted
        nIndex - current index in line
        Returns:
        true if the following character is escapable, that is, a quotation or escape character within a quoted context
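
The check described above can be sketched as a small static helper. This is an illustration under assumptions rather than the actual protected method: `EscapeCheckSketch` is a hypothetical class, and the quote and escape characters are passed in explicitly instead of being read from the parser's configuration.

```java
/** Hypothetical stand-alone version of the escapable-character check. */
final class EscapeCheckSketch {
  private EscapeCheckSketch() {}

  static boolean isNextCharacterEscapable(String line, boolean inQuotes, int index,
                                          char quoteChar, char escapeChar) {
    // Only inside quotes, and only if there is a following character at all.
    if (!inQuotes || index + 1 >= line.length()) {
      return false;
    }
    char next = line.charAt(index + 1);
    return next == quoteChar || next == escapeChar;
  }
}
```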
      • isAllWhiteSpace

        protected boolean isAllWhiteSpace(@Nonnull CharSequence sb)
        Checks if every character in the sequence is whitespace. Precondition: sb.length() is greater than 0.
        Parameters:
        sb - A sequence of characters to examine
        Returns:
        true if every character in the sequence is whitespace
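
A whitespace check of this kind can be sketched in a few lines. The class name `WhitespaceCheckSketch` is hypothetical, and using `Character.isWhitespace` as the test is an assumption of this sketch; the library may define whitespace differently.

```java
/** Hypothetical stand-alone version of the all-whitespace check. */
final class WhitespaceCheckSketch {
  private WhitespaceCheckSketch() {}

  /** Returns true if every character of the sequence is whitespace. */
  static boolean isAllWhiteSpace(CharSequence sb) {
    for (int i = 0; i < sb.length(); i++) {
      if (!Character.isWhitespace(sb.charAt(i))) {
        return false; // found a non-whitespace character
      }
    }
    return true;
  }
}
```

Note that this sketch returns true for an empty sequence, which is one reason a caller would want to respect the stated precondition that the sequence is not empty.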