Class

za.co.absa.abris.avro.AvroSerDe

Serializer


implicit class Serializer extends AnyRef

This class provides methods to perform the translation from Dataframe Rows into Avro records on the fly.

Users can either provide the path to the destination Avro schema, or provide a record name and namespace, in which case the Avro schema is inferred from the Dataframe.

The methods are "storage-agnostic", which means they return Dataframes of Avro records that can be stored into any sink (e.g. Kafka, Parquet, etc.).
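
Because Serializer is an implicit class, importing the members of AvroSerDe should make toAvro available directly on a Dataframe. The following is a minimal sketch under that assumption, using a local SparkSession. The helper name, column names, record name, namespace, and output path are all hypothetical and not part of the library.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import za.co.absa.abris.avro.AvroSerDe._ // brings the implicit Serializer class into scope

object SerializerSketch {
  // Hypothetical helper: serializes rows to Avro, letting the schema be inferred
  // from the Dataframe. "Person" and "com.example" are placeholder values.
  def serialize(df: DataFrame): Dataset[Array[Byte]] =
    df.toAvro("Person", "com.example")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("serializer-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input data with a Spark schema already attached.
    val people = Seq(("Ada", 36), ("Linus", 28)).toDF("name", "age")
    val avroRecords = serialize(people)

    // The result is an ordinary Dataset, so any Spark writer can act as the sink.
    avroRecords.write.mode("overwrite").parquet("/tmp/people-avro") // hypothetical path

    spark.stop()
  }
}
```

The same Dataset could be handed to a different writer (for example a Kafka sink) without changing the serialization step, which is the sense in which the methods are storage-agnostic.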

Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new Serializer(dataframe: Dataset[Row])


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  10. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  11. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  12. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  13. final def notify(): Unit

    Definition Classes
    AnyRef
  14. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  15. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  16. def toAvro(schemaName: String, schemaNamespace: String): Dataset[Array[Byte]]


    Converts from Dataset[Row] into Dataset[Array[Byte]] containing Avro records.


    Intended to be used when there is a Spark schema present in the Dataframe from which the Avro schema will be translated.

    The API infers the Avro schema from the incoming Dataframe. The inferred schema receives the name and namespace passed as parameters.

    The API throws an exception if the Dataframe does not have a schema.

    Unlike the toAvro(schemaPath) variant, this one does not suffer from the schema-changing issue, since the final Avro schema is derived from the schema already used by Spark.
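
    Since this variant throws when no schema can be derived, callers may want to capture the failure instead of letting it propagate. The sketch below wraps the call in scala.util.Try. It assumes the exception is raised when toAvro is called rather than later, when the resulting Dataset is evaluated; the documentation does not state which. The object and method names are hypothetical.

    ```scala
    import scala.util.{Failure, Success, Try}
    import org.apache.spark.sql.{DataFrame, Dataset}
    import za.co.absa.abris.avro.AvroSerDe._ // brings the implicit Serializer class into scope

    object InferredSchemaSketch {
      // Hypothetical wrapper: attempts the schema-inferring conversion and
      // captures any non-fatal exception thrown by the call itself.
      def toAvroInferred(df: DataFrame, name: String, namespace: String): Try[Dataset[Array[Byte]]] =
        Try(df.toAvro(name, namespace))

      // Turns the outcome into a short message suitable for logging.
      def describe(result: Try[Dataset[Array[Byte]]]): String = result match {
        case Success(_) => "Avro conversion set up"
        case Failure(e) => s"could not convert to Avro: ${e.getMessage}"
      }
    }
    ```

    A caller could then write InferredSchemaSketch.describe(InferredSchemaSketch.toAvroInferred(df, "Person", "com.example")), where the record name and namespace are placeholders.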

  17. def toAvro(schemaPath: String): Dataset[Array[Byte]]


    Converts from Dataset[Row] into Dataset[Array[Byte]] containing Avro records.


    Intended to be used when there is no Spark schema available in the Dataframe but there is an expected Avro schema.

    It is important to keep in mind that the specification for a field in the schema MUST be the same at both ends, writer and reader. For some fields (e.g. strings), Spark can ignore the nullability specified in the SQL struct (SPARK-14139). This issue could lead to fields being ignored. Thus, it is important to check the final SQL schema after Spark has created the Dataframes.

    For instance, the Spark construct StructField("name", StringType, false) translates to the Avro field {"name": "name", "type": "string"}. However, if Spark changes the nullability (StructField("name", StringType, true)), the Avro field becomes a union: {"name": "name", "type": ["string", "null"]}.

    The difference in the specifications will prevent the field from being correctly loaded by Avro readers, leading to data loss.
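
    One way to follow the advice above is to inspect the Dataframe's final SQL schema before serializing. The sketch below only mirrors the string-field mapping described in this section; it is not the library's own conversion logic, and the object and method names are hypothetical. It depends only on Spark's SQL types.

    ```scala
    import org.apache.spark.sql.types.{StringType, StructField, StructType}

    object NullabilityCheck {
      // Illustrative only: renders the Avro field that the text above describes
      // for a Spark string field, depending on its nullability.
      def avroStringField(field: StructField): String = {
        require(field.dataType == StringType, "this sketch only covers string fields")
        if (field.nullable) s"""{"name": "${field.name}", "type": ["string", "null"]}"""
        else s"""{"name": "${field.name}", "type": "string"}"""
      }

      // Lists the fields that Spark currently treats as nullable, so they can be
      // compared by hand against the expected Avro schema.
      def nullableFields(schema: StructType): Seq[String] =
        schema.fields.filter(_.nullable).map(_.name).toSeq
    }
    ```

    Calling NullabilityCheck.nullableFields(df.schema) before df.toAvro(schemaPath) gives a quick list of fields whose nullability may differ from what the Avro schema file declares.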

  18. def toString(): String

    Definition Classes
    AnyRef → Any
  19. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  20. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  21. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
