Class JsonDeserializer

java.lang.Object
io.trino.hive.formats.line.json.JsonDeserializer
All Implemented Interfaces:
LineDeserializer

public class JsonDeserializer extends Object implements LineDeserializer
Deserializer that is bug-for-bug compatible with Hive JsonSerDe where possible. Known exceptions are:
  • When a scalar value is actually a JSON object, Hive processes the opening curly bracket for BOOLEAN, DECIMAL, CHAR, VARCHAR, and VARBINARY, then continues processing the fields inside the JSON object as if they were part of the outer JSON object. When the closing curly bracket is encountered, it pops a level, which can end parsing early. This is clearly a bug that results in corrupted data, so this deserializer throws an exception instead.
  • Duplicate JSON object fields are supported, as in Hive. However, Hive parses each of the duplicate values, whereas this code only processes the last value. This means that if one of the duplicates is invalid, Hive will fail and this code will not.
  • Constructor Details

  • Method Details

    • getTypes

      public List<Type> getTypes()
      Description copied from interface: LineDeserializer
      Required types for the deserialize page builder.
      Specified by:
      getTypes in interface LineDeserializer
    • deserialize

      public void deserialize(LineBuffer lineBuffer, PageBuilder builder) throws IOException
      Description copied from interface: LineDeserializer
      Deserialize the line into the page builder. The implementation will declare the added positions in the page builder. The implementation is allowed to add zero or more positions to the builder.
      Specified by:
      deserialize in interface LineDeserializer
      Parameters:
      lineBuffer - the line which may be empty
      builder - page builder for the declared types
      Throws:
      IOException - if the line cannot be decoded and processing should stop
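
      The duplicate-field exception noted in the class description can be illustrated with a minimal, self-contained sketch. It does not use Trino or Hive code: the class DuplicateFieldSketch, the methods decodeLastOnly and decodeEvery, and the representation of a line as a list of name/raw-value pairs are all hypothetical, chosen only to contrast "decode only the last duplicate" with "decode every duplicate". Integer parsing stands in for whatever type decoding a real column would need.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the JsonDeserializer implementation.
public class DuplicateFieldSketch {
    // "Last value wins": keep only the final raw value per field name,
    // then decode it. Earlier duplicates are never decoded.
    public static Map<String, Integer> decodeLastOnly(List<Map.Entry<String, String>> fields) {
        Map<String, String> lastRaw = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : fields) {
            lastRaw.put(field.getKey(), field.getValue());
        }
        Map<String, Integer> decoded = new LinkedHashMap<>();
        lastRaw.forEach((name, raw) -> decoded.put(name, Integer.parseInt(raw)));
        return decoded;
    }

    // "Decode every duplicate": each occurrence is decoded as it is seen,
    // so an invalid duplicate fails even if a later one is valid.
    public static Map<String, Integer> decodeEvery(List<Map.Entry<String, String>> fields) {
        Map<String, Integer> decoded = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : fields) {
            decoded.put(field.getKey(), Integer.parseInt(field.getValue()));
        }
        return decoded;
    }

    public static void main(String[] args) {
        // Field "a" appears twice; the first value is not a valid integer.
        List<Map.Entry<String, String>> line =
                List.of(Map.entry("a", "oops"), Map.entry("a", "7"));

        System.out.println(decodeLastOnly(line)); // prints {a=7}
        try {
            decodeEvery(line);
        } catch (NumberFormatException e) {
            System.out.println("decodeEvery failed on the invalid duplicate");
        }
    }
}
```

      In this sketch, decodeLastOnly mirrors the behavior described for this class, and decodeEvery mirrors the behavior described for Hive; the real deserializers operate on JSON text and typed columns rather than pre-split pairs.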