java.lang.Object
com.github.shyiko.mysql.binlog.event.deserialization.json.JsonBinary

public class JsonBinary extends Object
Utility to parse the binary-encoded value of a MySQL JSON type, translating the encoded representation into method calls on a supplied JsonFormatter implementation.

Binary Format

Each JSON value (scalar, object or array) has a one byte type identifier followed by the actual value.

Scalar

The binary value may contain a single scalar that is one of:
  • null
  • boolean
  • int16
  • int32
  • int64
  • uint16
  • uint32
  • uint64
  • double
  • string
  • DATE as a string of the form YYYY-MM-DD where YYYY can be positive or negative
  • TIME as a string of the form HH:MM:SS where HH can be positive or negative
  • DATETIME as a string of the form YYYY-MM-DD HH:MM:SS.ssssss where YYYY can be positive or negative
  • TIMESTAMP as the number of microseconds past epoch (January 1, 1970), or if negative the number of microseconds before epoch (January 1, 1970)
  • any other MySQL value encoded as an opaque binary value

JSON Object

If the value is a JSON object, its binary representation will have a header that contains:
  • the member count
  • the size of the binary value in bytes
  • a list of pointers to each key
  • a list of pointers to each value
The actual keys and values will come after the header, in the same order as in the header.
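
The header layout described above can be sketched for the small-object case as follows. This is a minimal illustration, not the library's implementation: the class name SmallObjectHeaderSketch and the sample bytes are made up for this example, and it assumes (this page does not state it) that key offsets are counted from the first byte after the type identifier and that the header integers are little-endian.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class SmallObjectHeaderSketch {
    /** Hand-built encoding of {"a": 1}; an illustrative assumption, not captured from a server. */
    static final byte[] SAMPLE = {
        0x00,             // type: small JSON object
        0x01, 0x00,       // element-count = 1
        0x0c, 0x00,       // size = 12 bytes (everything after the type byte)
        0x0b, 0x00,       // key-entry: key-offset = 11
        0x01, 0x00,       // key-entry: key-length = 1
        0x05, 0x01, 0x00, // value-entry: type int16, inlined value 1
        0x61              // key bytes: "a"
    };

    /** Reads the header of a small JSON object and summarizes each member. */
    static String describe(byte[] doc) {
        ByteBuffer buf = ByteBuffer.wrap(doc).order(ByteOrder.LITTLE_ENDIAN);
        if ((buf.get() & 0xff) != 0x00) {
            throw new IllegalArgumentException("not a small JSON object");
        }
        int base = buf.position();            // offsets are assumed relative to this position
        int count = buf.getShort() & 0xffff;  // the member count
        int size = buf.getShort() & 0xffff;   // the size of the binary value in bytes
        int[] keyOffsets = new int[count];
        int[] keyLengths = new int[count];
        for (int i = 0; i < count; i++) {     // the list of pointers to each key
            keyOffsets[i] = buf.getShort() & 0xffff;
            keyLengths[i] = buf.getShort() & 0xffff;
        }
        StringBuilder sb = new StringBuilder("count=" + count + " size=" + size);
        for (int i = 0; i < count; i++) {     // the list of pointers to each value
            int valueType = buf.get() & 0xff;
            int offsetOrInlined = buf.getShort() & 0xffff;
            String key = new String(doc, base + keyOffsets[i], keyLengths[i], StandardCharsets.UTF_8);
            sb.append(" key=").append(key)
              .append(" valueType=").append(valueType)
              .append(" offsetOrInlined=").append(offsetOrInlined);
        }
        return sb.toString();
    }
}
```

Calling SmallObjectHeaderSketch.describe(SmallObjectHeaderSketch.SAMPLE) walks the member count, the size, the key pointers, and the value pointers in that order, then resolves each key through its offset and length.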

JSON Array

If the value is a JSON array, the binary representation will have a header with
  • the element count
  • the size of the binary value in bytes
  • a list of pointers to each value
followed by the actual values, in the same order as in the header.

Grammar

The grammar of the binary representation of JSON values is defined in the MySQL codebase in the json_binary.h file:
   doc ::= type value
   type ::=
       0x00 |  // small JSON object
       0x01 |  // large JSON object
       0x02 |  // small JSON array
       0x03 |  // large JSON array
       0x04 |  // literal (true/false/null)
       0x05 |  // int16
       0x06 |  // uint16
       0x07 |  // int32
       0x08 |  // uint32
       0x09 |  // int64
       0x0a |  // uint64
       0x0b |  // double
       0x0c |  // utf8mb4 string
       0x0f    // custom data (any MySQL data type)
   value ::=
       object  |
       array   |
       literal |
       number  |
       string  |
       custom-data
   object ::= element-count size key-entry* value-entry* key* value*
   array ::= element-count size value-entry* value*
   // number of members in object or number of elements in array
   element-count ::=
       uint16 |  // if used in small JSON object/array
       uint32    // if used in large JSON object/array
   // number of bytes in the binary representation of the object or array
   size ::=
       uint16 |  // if used in small JSON object/array
       uint32    // if used in large JSON object/array
   key-entry ::= key-offset key-length
   key-offset ::=
       uint16 |  // if used in small JSON object
       uint32    // if used in large JSON object
   key-length ::= uint16    // key length must be less than 64KB
   value-entry ::= type offset-or-inlined-value
   // This field holds either the offset to where the value is stored,
   // or the value itself if it is small enough to be inlined (that is,
   // if it is a JSON literal or a small enough [u]int).
   offset-or-inlined-value ::=
       uint16 |   // if used in small JSON object/array
       uint32     // if used in large JSON object/array
   key ::= utf8mb4-data
   literal ::=
       0x00 |   // JSON null literal
       0x01 |   // JSON true literal
       0x02 |   // JSON false literal
   number ::=  ....  // little-endian format for [u]int(16|32|64), whereas
                     // double is stored in a platform-independent, eight-byte
                     // format using float8store()
   string ::= data-length utf8mb4-data
   custom-data ::= custom-type data-length binary-data
   custom-type ::= uint8   // type identifier that matches the
                           // internal enum_field_types enum
   data-length ::= uint8*  // If the high bit of a byte is 1, the length
                           // field is continued in the next byte,
                           // otherwise it is the last byte of the length
                           // field. So we need 1 byte to represent
                           // lengths up to 127, 2 bytes to represent
                           // lengths up to 16383, and so on...
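
The type identifiers in the grammar above can be captured in a small lookup, sketched here. The enum name JsonTypeId and its methods are hypothetical and are not the library's ValueType; only the byte codes and the 2-byte versus 4-byte field widths come from the grammar.

```java
/** Hypothetical enum mirroring the "type" production of the grammar. */
enum JsonTypeId {
    SMALL_OBJECT(0x00), LARGE_OBJECT(0x01), SMALL_ARRAY(0x02), LARGE_ARRAY(0x03),
    LITERAL(0x04), INT16(0x05), UINT16(0x06), INT32(0x07), UINT32(0x08),
    INT64(0x09), UINT64(0x0a), DOUBLE(0x0b), STRING(0x0c), CUSTOM(0x0f);

    final int code;

    JsonTypeId(int code) {
        this.code = code;
    }

    /** Maps a type byte to its identifier, rejecting codes the grammar does not list. */
    static JsonTypeId fromCode(int code) {
        for (JsonTypeId t : values()) {
            if (t.code == code) {
                return t;
            }
        }
        throw new IllegalArgumentException("unknown type identifier: " + code);
    }

    /** Width in bytes of element-count, size, and offset fields inside this container. */
    int fieldWidth() {
        switch (this) {
            case SMALL_OBJECT:
            case SMALL_ARRAY:
                return 2; // uint16 fields
            case LARGE_OBJECT:
            case LARGE_ARRAY:
                return 4; // uint32 fields
            default:
                throw new IllegalStateException(name() + " is not a container type");
        }
    }
}
```

For example, JsonTypeId.fromCode(0x03).fieldWidth() reports the 4-byte fields used by a large JSON array.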
 
Author:
Randall Hauch
  • Constructor Details

    • JsonBinary

      public JsonBinary(byte[] bytes)
    • JsonBinary

      public JsonBinary(ByteArrayInputStream contents)
  • Method Details

    • parseAsString

      public static String parseAsString(byte[] bytes) throws IOException
      Parse the MySQL binary representation of a JSON value and return the JSON string representation.

      This method is equivalent to parse(byte[], JsonFormatter) using the JsonStringFormatter.

      Parameters:
      bytes - the binary representation; may not be null
      Returns:
      the JSON string representation; never null
      Throws:
      IOException - if there is a problem reading or processing the binary representation
    • parse

      public static void parse(byte[] bytes, JsonFormatter formatter) throws IOException
      Parse the MySQL binary representation of a JSON value and call the supplied JsonFormatter for the various components of the value.
      Parameters:
      bytes - the binary representation; may not be null
      formatter - the formatter that will be called as the binary representation is parsed; may not be null
      Throws:
      IOException - if there is a problem reading or processing the binary representation
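
The callback style described here can be illustrated with a self-contained sketch. The MiniFormatter interface below is hypothetical and much smaller than the library's JsonFormatter, whose methods are not listed on this page. The sketch handles only a small JSON array whose values are all inlined, and it assumes an inlined literal or 16-bit integer occupies the 2-byte entry field in little-endian order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class CallbackWalkSketch {
    /** Hypothetical callback interface, standing in for a formatter. */
    interface MiniFormatter {
        void beginArray(int elementCount);
        void intValue(int value);
        void boolValue(boolean value);
        void nullValue();
        void endArray();
    }

    /** Walks a small JSON array of inlined values and reports each component. */
    static void walkSmallArray(byte[] doc, MiniFormatter out) {
        ByteBuffer buf = ByteBuffer.wrap(doc).order(ByteOrder.LITTLE_ENDIAN);
        if ((buf.get() & 0xff) != 0x02) {
            throw new IllegalArgumentException("not a small JSON array");
        }
        int count = buf.getShort() & 0xffff;
        buf.getShort(); // size in bytes; not needed when every value is inlined
        out.beginArray(count);
        for (int i = 0; i < count; i++) {
            int type = buf.get() & 0xff;
            short raw = buf.getShort();
            switch (type) {
                case 0x04: // literal: 0x00 null, 0x01 true, 0x02 false
                    if (raw == 0x00) out.nullValue();
                    else out.boolValue(raw == 0x01);
                    break;
                case 0x05: out.intValue(raw); break;          // int16
                case 0x06: out.intValue(raw & 0xffff); break; // uint16
                default:
                    throw new IllegalArgumentException("type " + type + " is not inlined in this sketch");
            }
        }
        out.endArray();
    }

    /** Example formatter that renders the callbacks as JSON text. */
    static String toJson(byte[] doc) {
        StringBuilder sb = new StringBuilder();
        walkSmallArray(doc, new MiniFormatter() {
            private boolean first = true;
            private void sep() { if (!first) sb.append(','); first = false; }
            public void beginArray(int n) { sb.append('['); }
            public void intValue(int v) { sep(); sb.append(v); }
            public void boolValue(boolean v) { sep(); sb.append(v); }
            public void nullValue() { sep(); sb.append("null"); }
            public void endArray() { sb.append(']'); }
        });
        return sb.toString();
    }
}
```

Separating the walk from the formatter is what lets the same parse drive a string renderer, as in toJson, or any other consumer of the events.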
    • getString

      public String getString()
    • parse

      public void parse(JsonFormatter formatter) throws IOException
      Throws:
      IOException
    • parse

      protected void parse(ValueType type, JsonFormatter formatter) throws IOException
      Throws:
      IOException
    • parseObject

      protected void parseObject(boolean small, JsonFormatter formatter) throws IOException
      Parse a JSON object.

      The grammar of the binary representation of JSON objects is defined in the MySQL code base in the json_binary.h file:

      Grammar

         value ::=
             object  |
             array   |
             literal |
             number  |
             string  |
             custom-data
         object ::= element-count size key-entry* value-entry* key* value*
         // number of members in object or number of elements in array
         element-count ::=
             uint16 |  // if used in small JSON object/array
             uint32    // if used in large JSON object/array
         // number of bytes in the binary representation of the object or array
         size ::=
             uint16 |  // if used in small JSON object/array
             uint32    // if used in large JSON object/array
         key-entry ::= key-offset key-length
         key-offset ::=
             uint16 |  // if used in small JSON object
             uint32    // if used in large JSON object
         key-length ::= uint16    // key length must be less than 64KB
         value-entry ::= type offset-or-inlined-value
         // This field holds either the offset to where the value is stored,
         // or the value itself if it is small enough to be inlined (that is,
         // if it is a JSON literal or a small enough [u]int).
         offset-or-inlined-value ::=
             uint16 |   // if used in small JSON object/array
             uint32     // if used in large JSON object/array
         key ::= utf8mb4-data
         literal ::=
             0x00 |   // JSON null literal
             0x01 |   // JSON true literal
             0x02 |   // JSON false literal
         number ::=  ....  // little-endian format for [u]int(16|32|64), whereas
                           // double is stored in a platform-independent, eight-byte
                           // format using float8store()
         string ::= data-length utf8mb4-data
         custom-data ::= custom-type data-length binary-data
         custom-type ::= uint8   // type identifier that matches the
                                 // internal enum_field_types enum
         data-length ::= uint8*  // If the high bit of a byte is 1, the length
                                 // field is continued in the next byte,
                                 // otherwise it is the last byte of the length
                                 // field. So we need 1 byte to represent
                                 // lengths up to 127, 2 bytes to represent
                                 // lengths up to 16383, and so on...
       
      Parameters:
      small - true if the object being read is "small", or false otherwise
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseArray

      protected void parseArray(boolean small, JsonFormatter formatter) throws IOException
      Parse a JSON array.

      The grammar of the binary representation of JSON arrays is defined in the MySQL code base in the json_binary.h file:

      Grammar

         value ::=
             object  |
             array   |
             literal |
             number  |
             string  |
             custom-data
         array ::= element-count size value-entry* value*
         // number of members in object or number of elements in array
         element-count ::=
             uint16 |  // if used in small JSON object/array
             uint32    // if used in large JSON object/array
         // number of bytes in the binary representation of the object or array
         size ::=
             uint16 |  // if used in small JSON object/array
             uint32    // if used in large JSON object/array
         value-entry ::= type offset-or-inlined-value
         // This field holds either the offset to where the value is stored,
         // or the value itself if it is small enough to be inlined (that is,
         // if it is a JSON literal or a small enough [u]int).
         offset-or-inlined-value ::=
             uint16 |   // if used in small JSON object/array
             uint32     // if used in large JSON object/array
         key ::= utf8mb4-data
         literal ::=
             0x00 |   // JSON null literal
             0x01 |   // JSON true literal
             0x02 |   // JSON false literal
         number ::=  ....  // little-endian format for [u]int(16|32|64), whereas
                           // double is stored in a platform-independent, eight-byte
                           // format using float8store()
         string ::= data-length utf8mb4-data
         custom-data ::= custom-type data-length binary-data
         custom-type ::= uint8   // type identifier that matches the
                                 // internal enum_field_types enum
         data-length ::= uint8*  // If the high bit of a byte is 1, the length
                                 // field is continued in the next byte,
                                 // otherwise it is the last byte of the length
                                 // field. So we need 1 byte to represent
                                 // lengths up to 127, 2 bytes to represent
                                 // lengths up to 16383, and so on...
       
      Parameters:
      small - true if the array being read is "small", or false otherwise
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseBoolean

      protected void parseBoolean(JsonFormatter formatter) throws IOException
      Parse a literal value that is either null, true, or false.
      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseInt16

      protected void parseInt16(JsonFormatter formatter) throws IOException
      Parse a 2 byte integer value.
      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseUInt16

      protected void parseUInt16(JsonFormatter formatter) throws IOException
      Parse a 2 byte unsigned integer value.
      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseInt32

      protected void parseInt32(JsonFormatter formatter) throws IOException
      Parse a 4 byte integer value.
      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseUInt32

      protected void parseUInt32(JsonFormatter formatter) throws IOException
      Parse a 4 byte unsigned integer value.
      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseInt64

      protected void parseInt64(JsonFormatter formatter) throws IOException
      Parse an 8 byte integer value.
      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseUInt64

      protected void parseUInt64(JsonFormatter formatter) throws IOException
      Parse an 8 byte unsigned integer value.
      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
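
Unsigned values do not fit in Java's signed primitive of the same width, which is presumably why readUInt32 returns long and readUInt64 returns BigInteger. The following sketch shows one way to read such values from little-endian bytes. The class and method signatures are illustrative (they take a byte array and offset, unlike the stream-based methods on this page).

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class UnsignedReadSketch {
    /** Reads 4 little-endian bytes as an unsigned value, widened to long. */
    static long readUInt32(byte[] data, int offset) {
        int signed = ByteBuffer.wrap(data, offset, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
        return signed & 0xffffffffL; // mask away the sign extension
    }

    /** Reads 8 little-endian bytes as an unsigned value; BigInteger covers the full range. */
    static BigInteger readUInt64(byte[] data, int offset) {
        byte[] bigEndian = new byte[8];
        for (int i = 0; i < 8; i++) {
            bigEndian[i] = data[offset + 7 - i]; // reverse into big-endian order
        }
        return new BigInteger(1, bigEndian);     // signum 1 treats the magnitude as non-negative
    }
}
```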
    • parseDouble

      protected void parseDouble(JsonFormatter formatter) throws IOException
      Parse an 8 byte double value.
      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseString

      protected void parseString(JsonFormatter formatter) throws IOException
      Parse the length and value of a string stored in MySQL's "utf8mb4" character set (which equates to Java's UTF-8 character set). The length of the string in bytes is stored as a variable-length integer that precedes the string data.
      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
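
As a rough illustration of the string layout (data-length followed by utf8mb4-data), the sketch below decodes a string from a byte array. The class StringValueSketch is hypothetical, and the sketch assumes the length bytes carry the low-order seven bits first, which this page does not spell out. It performs no bounds checking on malformed input.

```java
import java.nio.charset.StandardCharsets;

class StringValueSketch {
    /** Decodes "data-length utf8mb4-data" starting at the given offset. */
    static String readString(byte[] data, int offset) {
        int length = 0;
        int shift = 0;
        int pos = offset;
        while (true) {
            int b = data[pos++] & 0xff;
            length |= (b & 0x7f) << shift; // low seven bits carry the payload
            if ((b & 0x80) == 0) {
                break;                     // high bit clear: last byte of the length
            }
            shift += 7;
        }
        // The length counts bytes, not characters, so decode exactly that many bytes.
        return new String(data, pos, length, StandardCharsets.UTF_8);
    }
}
```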
    • parseOpaque

      protected void parseOpaque(JsonFormatter formatter) throws IOException
      Parse an opaque type. DATE, TIME, and DATETIME values are stored as opaque types but are unpacked by this method. TIMESTAMP values are also stored as opaque types, but MySQL converts them to DATETIME prior to storage. All other MySQL types are stored as opaque types and passed on to the formatter as opaque values.

      See the MySQL source code for the logic used in this method.

      Grammar

         custom-data ::= custom-type data-length binary-data
         custom-type ::= uint8   // type identifier that matches the
                                 // internal enum_field_types enum
         data-length ::= uint8*  // If the high bit of a byte is 1, the length
                                 // field is continued in the next byte,
                                 // otherwise it is the last byte of the length
                                 // field. So we need 1 byte to represent
                                 // lengths up to 127, 2 bytes to represent
                                 // lengths up to 16383, and so on...
       
      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseDate

      protected void parseDate(JsonFormatter formatter) throws IOException
      Parse a DATE value, which is stored using the same format as DATETIME: 5 bytes + fractional-seconds storage. However, the hour, minute, second, and fractional seconds are ignored.

      The non-fractional part is 40 bits:

         1 bit  sign           (1= non-negative, 0= negative)
        17 bits year*13+month  (year 0-9999, month 0-12)
         5 bits day            (0-31)
         5 bits hour           (0-23)
         6 bits minute         (0-59)
         6 bits second         (0-59)
       
      The fractional part is typically dependent upon the fsp (i.e., fractional seconds part) defined by a column, but in the case of JSON it is always 3 bytes.

      The format of all temporal values is outlined in the MySQL documentation, although since the MySQL JSON type is available only in MySQL 5.7 and later, only version 2 of the date-time formats is necessary.

      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseTime

      protected void parseTime(JsonFormatter formatter) throws IOException
      Parse a TIME value, which is stored using the same format as DATETIME: 5 bytes + fractional-seconds storage. However, the year, month, and day values are ignored.

      The non-fractional part is 40 bits:

         1 bit  sign           (1= non-negative, 0= negative)
        17 bits year*13+month  (year 0-9999, month 0-12)
         5 bits day            (0-31)
         5 bits hour           (0-23)
         6 bits minute         (0-59)
         6 bits second         (0-59)
       
      The fractional part is typically dependent upon the fsp (i.e., fractional seconds part) defined by a column, but in the case of JSON it is always 3 bytes.

      The format of all temporal values is outlined in the MySQL documentation, although since the MySQL JSON type is available only in MySQL 5.7 and later, only version 2 of the date-time formats is necessary.

      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseDatetime

      protected void parseDatetime(JsonFormatter formatter) throws IOException
      Parse a DATETIME value, which is stored as 5 bytes + fractional-seconds storage.

      The non-fractional part is 40 bits:

         1 bit  sign           (1= non-negative, 0= negative)
        17 bits year*13+month  (year 0-9999, month 0-12)
         5 bits day            (0-31)
         5 bits hour           (0-23)
         6 bits minute         (0-59)
         6 bits second         (0-59)
       
      The sign bit is always 1. A value of 0 (negative) is reserved. The fractional part is typically dependent upon the fsp (i.e., fractional seconds part) defined by a column, but in the case of JSON it is always 3 bytes. Unlike the documentation, however, the 8 byte value is in little-endian form.

      The format of all temporal values is outlined in the MySQL documentation, although since the MySQL JSON type is available only in MySQL 5.7 and later, only version 2 of the date-time formats is necessary.

      Parameters:
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
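
The bit layout above can be made concrete with a short sketch that unpacks a value already read as a little-endian 8-byte integer: the low 24 bits hold the fractional seconds and the upper 40 bits hold the fields listed in the table. The class PackedDatetimeSketch and its pack helper are hypothetical and exist only to build test input. The sketch assumes the 3 fractional bytes hold microseconds and ignores the sign bit.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Locale;

class PackedDatetimeSketch {
    /** Builds a packed value from its fields (helper for the example only). */
    static long pack(int year, int month, int day, int hour, int minute, int second, int micros) {
        long yearMonth = year * 13L + month;                // 17 bits
        long ymd = (yearMonth << 5) | day;                  // append 5 bits of day
        long hms = ((long) hour << 12) | (minute << 6) | second; // 5 + 6 + 6 bits
        return (((ymd << 17) | hms) << 24) | micros;        // 24 bits of fraction at the bottom
    }

    /** Reads the 8 bytes in little-endian order, as this page says they are stored. */
    static long fromLittleEndian(byte[] eightBytes) {
        return ByteBuffer.wrap(eightBytes).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    /** Unpacks the fields and renders them as YYYY-MM-DD HH:MM:SS.ssssss. */
    static String unpack(long raw) {
        int micros = (int) (raw & 0xFFFFFF);
        long value = raw >>> 24;                            // the 40-bit non-fractional part
        int second = (int) (value & 0x3F);
        int minute = (int) ((value >>> 6) & 0x3F);
        int hour = (int) ((value >>> 12) & 0x1F);
        int day = (int) ((value >>> 17) & 0x1F);
        int yearMonth = (int) ((value >>> 22) & 0x1FFFF);   // sign bit above this is ignored
        int year = yearMonth / 13;
        int month = yearMonth % 13;
        return String.format(Locale.ROOT, "%04d-%02d-%02d %02d:%02d:%02d.%06d",
                year, month, day, hour, minute, second, micros);
    }
}
```

A DATE reading would keep only the year, month, and day from the same unpacking, and a TIME reading only the hour, minute, second, and fraction.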
    • parseDecimal

      protected void parseDecimal(int length, JsonFormatter formatter) throws IOException
      Parse a DECIMAL value. The first two bytes are the precision and scale, followed by the binary representation of the decimal itself.
      Parameters:
      length - the length of the complete binary representation
      formatter - the formatter to be notified of the parsed value; may not be null
      Throws:
      IOException - if there is a problem reading the JSON value
    • parseOpaqueValue

      protected void parseOpaqueValue(ColumnType type, int length, JsonFormatter formatter) throws IOException
      Throws:
      IOException
    • readFractionalSecondsInMicroseconds

      protected int readFractionalSecondsInMicroseconds() throws IOException
      Throws:
      IOException
    • readBigEndianLong

      protected long readBigEndianLong(int numBytes) throws IOException
      Throws:
      IOException
    • readUnsignedIndex

      protected int readUnsignedIndex(int maxValue, boolean isSmall, String desc) throws IOException
      Throws:
      IOException
    • readInt16

      protected int readInt16() throws IOException
      Throws:
      IOException
    • readUInt16

      protected int readUInt16() throws IOException
      Throws:
      IOException
    • readInt24

      protected int readInt24() throws IOException
      Throws:
      IOException
    • readInt32

      protected int readInt32() throws IOException
      Throws:
      IOException
    • readUInt32

      protected long readUInt32() throws IOException
      Throws:
      IOException
    • readInt64

      protected long readInt64() throws IOException
      Throws:
      IOException
    • readUInt64

      protected BigInteger readUInt64() throws IOException
      Throws:
      IOException
    • readVariableInt

      protected int readVariableInt() throws IOException
      Read a variable-length integer value.

      If the high bit of a byte is 1, the length field is continued in the next byte, otherwise it is the last byte of the length field. So we need 1 byte to represent lengths up to 127, 2 bytes to represent lengths up to 16383, and so on...

      Returns:
      the integer value
      Throws:
      IOException - if we don't encounter an end-of-int marker
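
The length thresholds quoted above (127 for one byte, 16383 for two) follow from each byte carrying seven payload bits. The sketch below shows an encoder and decoder under that scheme. The class VariableIntSketch is hypothetical, it assumes the low-order seven bits are written first, and it throws an unchecked exception rather than IOException to stay compact.

```java
import java.io.ByteArrayOutputStream;

class VariableIntSketch {
    /** Encodes a non-negative value, seven bits per byte, high bit set on all but the last byte. */
    static byte[] encode(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative length");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        do {
            int group = value & 0x7f;
            value >>>= 7;
            out.write(value != 0 ? group | 0x80 : group); // continuation bit if more follows
        } while (value != 0);
        return out.toByteArray();
    }

    /** Decodes a value, failing if no byte with a clear high bit appears within 5 bytes. */
    static int decode(byte[] data) {
        int result = 0;
        for (int i = 0; i < 5 && i < data.length; i++) {
            int b = data[i] & 0xff;
            result |= (b & 0x7f) << (7 * i);
            if ((b & 0x80) == 0) {
                return result; // end-of-int marker reached
            }
        }
        throw new IllegalArgumentException("no end-of-int marker found");
    }
}
```

With this scheme 127 encodes to one byte, 128 and 16383 to two bytes, and 16384 to three.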
    • readLiteral

      protected Boolean readLiteral() throws IOException
      Throws:
      IOException
    • readValueType

      protected ValueType readValueType() throws IOException
      Throws:
      IOException
    • asHex

      protected static String asHex(byte b)
    • asHex

      protected static String asHex(int value)