Enum EUnicodeBOM

  • All Implemented Interfaces:
    Serializable, Comparable<EUnicodeBOM>

    public enum EUnicodeBOM
    extends Enum<EUnicodeBOM>
    Defines the most common byte order marks (BOMs) for Unicode-encoded text files.
    Source: http://de.wikipedia.org/wiki/Byte_Order_Mark
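    Important: Constants with longer BOMs must be declared before those with shorter BOMs to avoid false detections, because a shorter BOM can be a prefix of a longer one (the UTF-16 Little Endian BOM FF FE is a prefix of the UTF-32 Little Endian BOM FF FE 00 00).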
    Note: SCSU = A Standard Compression Scheme for Unicode: http://www.unicode.org/reports/tr6/
    Note: BOCU = Binary Ordered Compression for Unicode
    Author:
    Philip Helger
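The ordering rule above can be illustrated with a minimal, self-contained sketch. It does not use this library: the class BomOrderSketch, its table, and its method names are hypothetical stand-ins, and the insertion order of the map only imitates the declaration order of the enum constants.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the enum; not the library's implementation.
final class BomOrderSketch {
    // Insertion order imitates declaration order: longer BOMs come first.
    private static final Map<String, byte[]> BOMS = new LinkedHashMap<>();
    static {
        BOMS.put("UTF-32LE", new byte[] {(byte) 0xFF, (byte) 0xFE, 0x00, 0x00});
        BOMS.put("UTF-8", new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF});
        BOMS.put("UTF-16LE", new byte[] {(byte) 0xFF, (byte) 0xFE});
    }

    // True if data begins with all bytes of prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns the name of the first entry whose BOM prefixes data, or null.
    static String detect(byte[] data) {
        for (Map.Entry<String, byte[]> e : BOMS.entrySet()) {
            if (startsWith(data, e.getValue())) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] utf32le = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x00, 0x41, 0x00, 0x00, 0x00};
        // Prints "UTF-32LE"; with UTF-16LE listed first it would print "UTF-16LE".
        System.out.println(detect(utf32le));
    }
}
```

If the UTF-16LE entry were placed before the UTF-32LE entry, the two-byte prefix FF FE would match first and the input would be misreported.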
    • Enum Constant Detail

      • BOM_UTF_32_BIG_ENDIAN

        public static final EUnicodeBOM BOM_UTF_32_BIG_ENDIAN
        UTF-32 Big Endian
      • BOM_UTF_32_LITTLE_ENDIAN

        public static final EUnicodeBOM BOM_UTF_32_LITTLE_ENDIAN
        UTF-32 Little Endian
      • BOM_UTF_7

        public static final EUnicodeBOM BOM_UTF_7
        UTF-7
      • BOM_UTF_7_ALT2

        public static final EUnicodeBOM BOM_UTF_7_ALT2
        UTF-7 (alternative BOM byte sequence 2)
      • BOM_UTF_7_ALT3

        public static final EUnicodeBOM BOM_UTF_7_ALT3
        UTF-7 (alternative BOM byte sequence 3)
      • BOM_UTF_7_ALT4

        public static final EUnicodeBOM BOM_UTF_7_ALT4
        UTF-7 (alternative BOM byte sequence 4)
      • BOM_UTF_EBCDIC

        public static final EUnicodeBOM BOM_UTF_EBCDIC
        UTF-EBCDIC
      • BOM_BOCU_1_ALT2

        public static final EUnicodeBOM BOM_BOCU_1_ALT2
        BOCU-1 (alternative BOM byte sequence 2)
      • BOM_GB_18030

        public static final EUnicodeBOM BOM_GB_18030
        GB 18030
      • BOM_UTF_8

        public static final EUnicodeBOM BOM_UTF_8
        UTF-8
      • BOM_UTF_1

        public static final EUnicodeBOM BOM_UTF_1
        UTF-1
      • BOM_BOCU_1

        public static final EUnicodeBOM BOM_BOCU_1
        BOCU-1
      • BOM_SCSU

        public static final EUnicodeBOM BOM_SCSU
        SCSU - Single-byte mode Quote Unicode
      • BOM_SCSU_TO_UCS

        public static final EUnicodeBOM BOM_SCSU_TO_UCS
        SCSU - Single-byte mode Change to Unicode
      • BOM_SCSU_W0_TO_FE80

        public static final EUnicodeBOM BOM_SCSU_W0_TO_FE80
        SCSU - Single-byte mode Define dynamic window 0 to 0xFE80
      • BOM_SCSU_W1_TO_FE80

        public static final EUnicodeBOM BOM_SCSU_W1_TO_FE80
        SCSU - Single-byte mode Define dynamic window 1 to 0xFE80
      • BOM_SCSU_W2_TO_FE80

        public static final EUnicodeBOM BOM_SCSU_W2_TO_FE80
        SCSU - Single-byte mode Define dynamic window 2 to 0xFE80
      • BOM_SCSU_W3_TO_FE80

        public static final EUnicodeBOM BOM_SCSU_W3_TO_FE80
        SCSU - Single-byte mode Define dynamic window 3 to 0xFE80
      • BOM_SCSU_W4_TO_FE80

        public static final EUnicodeBOM BOM_SCSU_W4_TO_FE80
        SCSU - Single-byte mode Define dynamic window 4 to 0xFE80
      • BOM_SCSU_W5_TO_FE80

        public static final EUnicodeBOM BOM_SCSU_W5_TO_FE80
        SCSU - Single-byte mode Define dynamic window 5 to 0xFE80
      • BOM_SCSU_W6_TO_FE80

        public static final EUnicodeBOM BOM_SCSU_W6_TO_FE80
        SCSU - Single-byte mode Define dynamic window 6 to 0xFE80
      • BOM_SCSU_W7_TO_FE80

        public static final EUnicodeBOM BOM_SCSU_W7_TO_FE80
        SCSU - Single-byte mode Define dynamic window 7 to 0xFE80
      • BOM_UTF_16_BIG_ENDIAN

        public static final EUnicodeBOM BOM_UTF_16_BIG_ENDIAN
        UTF-16 Big Endian
      • BOM_UTF_16_LITTLE_ENDIAN

        public static final EUnicodeBOM BOM_UTF_16_LITTLE_ENDIAN
        UTF-16 Little Endian
    • Method Detail

      • values

        public static EUnicodeBOM[] values()
        Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
        for (EUnicodeBOM c : EUnicodeBOM.values())
            System.out.println(c);
        
        Returns:
        an array containing the constants of this enum type, in the order they are declared
      • valueOf

        public static EUnicodeBOM valueOf(String name)
        Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)
        Parameters:
        name - the name of the enum constant to be returned.
        Returns:
        the enum constant with the specified name
        Throws:
        IllegalArgumentException - if this enum type has no constant with the specified name
        NullPointerException - if the argument is null
      • getByteCount

        @Nonnegative
        public int getByteCount()
        Returns:
        The number of bytes defining this BOM
      • isPresent

        public boolean isPresent(@Nullable byte[] aBytes)
        Check if the passed byte array starts with this BOM's bytes.
        Parameters:
        aBytes - The byte array whose leading bytes are compared with this BOM. May be null or empty.
        Returns:
        true if the passed byte array starts with this BOM, false otherwise.
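A null-tolerant prefix check of the kind described here might look like the following sketch. It is an assumption about the behavior, not the library's source: BomPrefixSketch and its two-argument isPresent are hypothetical, with the BOM bytes passed explicitly instead of being stored in an enum constant.

```java
// Hypothetical helper; illustrates the documented contract only.
final class BomPrefixSketch {
    // Returns true if data is non-null and starts with all bytes of bom.
    static boolean isPresent(byte[] bom, byte[] data) {
        // null, empty, or too-short input can never contain the BOM
        if (data == null || data.length < bom.length) {
            return false;
        }
        for (int i = 0; i < bom.length; i++) {
            if (data[i] != bom[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Treating null and too-short arrays as "not present" lets callers pass whatever they managed to read without checking its length first.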
      • getCharsetName

        @Nullable
        public String getCharsetName()
        Returns:
        The name of the charset, or null if no charset name is known for this BOM. The name may be non-null even if getCharset() returns null, because the running JVM may not support that charset. To support e.g. "utf-7" you need to add additional JAR files.
      • getCharset

        @Nullable
        public Charset getCharset()
        Returns:
        The charset matching this BOM, or null if the charset is not part of the JDK in use or if no charset is defined for this BOM.
      • hasCharset

        public boolean hasCharset()
        Returns:
        true if this BOM has an assigned charset, false if not.
        Since:
        9.0.0
        See Also:
        getCharset()
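The relationship between a charset name and a possibly missing Charset can be sketched with the standard java.nio.charset API alone. CharsetLookupSketch and charsetOrNull are hypothetical names; how the library itself resolves names is not stated on this page.

```java
import java.nio.charset.Charset;

// Hypothetical helper; shows how a name can exist without a usable Charset.
final class CharsetLookupSketch {
    // Returns the Charset for name, or null if the name is null, illegal,
    // or not supported by the running JVM.
    static Charset charsetOrNull(String name) {
        if (name == null) {
            return null;
        }
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : null;
        } catch (IllegalArgumentException ex) {
            // IllegalCharsetNameException is a subclass of IllegalArgumentException
            return null;
        }
    }
}
```

A caller would typically check the result for null (the role hasCharset() plays here) before constructing a Reader, and fall back to a default charset otherwise.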
      • getMaximumByteCount

        @Nonnegative
        public static int getMaximumByteCount()
        Returns:
        The maximum number of bytes a BOM may have.
      • getFromBytesOrNull

        @Nullable
        public static EUnicodeBOM getFromBytesOrNull(@Nullable byte[] aBytes)
        Find the BOM that matches the start of the passed byte array.
        Parameters:
        aBytes - The bytes to be checked for the BOM. May be null. To check all BOMs, this array must have at least 4 (= getMaximumByteCount()) bytes.
        Returns:
        The matching BOM, or null if the passed bytes do not start with a known BOM.
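A typical use of such a lookup is to peek at the first four bytes of a stream, pick a charset, and skip the BOM. The sketch below shows that pattern with JDK classes only. BomSkippingReaderSketch, its small BOM table, and all method names are hypothetical; it also assumes the JVM provides the UTF-32BE and UTF-32LE charsets, which the Java specification does not guarantee.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; covers only five BOMs, not the full enum.
final class BomSkippingReaderSketch {
    private static final int MAX_BOM_BYTES = 4; // mirrors the documented maximum

    private static boolean matches(byte[] head, int n, int... bom) {
        if (n < bom.length) {
            return false;
        }
        for (int i = 0; i < bom.length; i++) {
            if ((head[i] & 0xFF) != bom[i]) {
                return false;
            }
        }
        return true;
    }

    // Opens a Reader that uses the BOM's charset if one is found, else fallback.
    static Reader open(InputStream in, Charset fallback) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, MAX_BOM_BYTES);
        byte[] head = new byte[MAX_BOM_BYTES];
        int n = 0;
        while (n < MAX_BOM_BYTES) { // a single read may return fewer bytes
            int r = pin.read(head, n, MAX_BOM_BYTES - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        Charset cs = fallback;
        int bomLen = 0;
        // Longer BOMs are tested first, as the class description requires.
        if (matches(head, n, 0x00, 0x00, 0xFE, 0xFF)) {
            cs = Charset.forName("UTF-32BE"); bomLen = 4; // assumed available
        } else if (matches(head, n, 0xFF, 0xFE, 0x00, 0x00)) {
            cs = Charset.forName("UTF-32LE"); bomLen = 4; // assumed available
        } else if (matches(head, n, 0xEF, 0xBB, 0xBF)) {
            cs = StandardCharsets.UTF_8; bomLen = 3;
        } else if (matches(head, n, 0xFE, 0xFF)) {
            cs = StandardCharsets.UTF_16BE; bomLen = 2;
        } else if (matches(head, n, 0xFF, 0xFE)) {
            cs = StandardCharsets.UTF_16LE; bomLen = 2;
        }
        if (n > bomLen) {
            pin.unread(head, bomLen, n - bomLen); // return the non-BOM bytes
        }
        return new InputStreamReader(pin, cs);
    }

    // Convenience wrapper: decodes a whole byte array, skipping any BOM.
    static String decode(byte[] data, Charset fallback) {
        try (Reader r = open(new ByteArrayInputStream(data), fallback)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[256];
            for (int c = r.read(buf); c >= 0; c = r.read(buf)) {
                sb.append(buf, 0, c);
            }
            return sb.toString();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Reading exactly four bytes up front matches the note on aBytes: with fewer bytes available, the four-byte BOMs cannot be recognized and detection falls through to the shorter ones.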