object ParquetStreams

Holds factories of Akka Streams / Pekko Streams sources and sinks for reading from and writing to Parquet files.

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable])
  9. def fromParquet: FromParquet

    Creates a com.github.mjakubowski84.parquet4s.ScalaCompat.stream.scaladsl.Source that reads Parquet data from the specified path. If there are multiple files at the path, the order in which they are loaded is determined by the underlying file system.

    The path can refer to a local file, HDFS, AWS S3, Google Storage, Azure, etc. Please refer to the Hadoop client documentation or your data provider in order to learn how to configure the connection.

    Can also read partitioned directories. Filters apply to partition values as well. Partition values are set as fields in the read entities at the path defined by the partition name; such a path can be a simple column name or a dot-separated path to a nested field. Missing intermediate fields are created automatically for each read record.

    Allows turning on a projection over the original file schema in order to boost read performance when not all columns need to be read.

    Provides an explicit API for both custom data types and generic records.

    returns

    Builder of the source.
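    A minimal sketch of using this builder, assuming the Akka module (with Pekko, the `akka` imports become `org.apache.pekko`); the `User` schema and the local path are hypothetical and must match your actual files:

    ```scala
    import akka.actor.ActorSystem
    import akka.stream.scaladsl.Sink
    import com.github.mjakubowski84.parquet4s.{ParquetStreams, Path}

    object ReadExample extends App {
      implicit val system: ActorSystem = ActorSystem()

      // Hypothetical schema matching the Parquet files at the given path
      case class User(name: String, age: Int)

      ParquetStreams.fromParquet
        .as[User]                        // decode each row as a User
        .read(Path("file:///tmp/users")) // Source[User, NotUsed]
        .runWith(Sink.foreach(println))
        .andThen { case _ => system.terminate() }(system.dispatcher)
    }
    ```

    The same builder also exposes generic-record reading for cases where no case class is available at compile time.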

  10. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  12. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  13. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  14. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  15. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  16. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  17. def toParquetSingleFile: ToParquet

    Creates a com.github.mjakubowski84.parquet4s.ScalaCompat.stream.scaladsl.Sink that writes Parquet data to a single file at the specified path (including the file name).

    The path can refer to a local file, HDFS, AWS S3, Google Storage, Azure, etc. Please refer to the Hadoop client documentation or your data provider in order to learn how to configure the connection.

    Provides an explicit API for both custom data types and generic records.

    returns

    Builder of a sink that writes a Parquet file.
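    A minimal sketch of writing a stream to one file with this builder, again assuming the Akka module; the `User` schema and the output path are illustrative assumptions:

    ```scala
    import akka.actor.ActorSystem
    import akka.stream.scaladsl.Source
    import com.github.mjakubowski84.parquet4s.{ParquetStreams, Path}

    object WriteExample extends App {
      implicit val system: ActorSystem = ActorSystem()

      case class User(name: String, age: Int) // assumed schema

      Source(List(User("Alice", 34), User("Bob", 27)))
        .runWith(
          ParquetStreams.toParquetSingleFile
            .of[User]                                 // encode User rows
            .write(Path("file:///tmp/users.parquet")) // Sink[User, Future[Done]]
        )
        .andThen { case _ => system.terminate() }(system.dispatcher)
    }
    ```

    Note that this sink completes when the upstream completes; for unbounded streams, `viaParquet` is the appropriate choice.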

  18. def toString(): String
    Definition Classes
    AnyRef → Any
  19. def viaParquet: ViaParquet

    Builds a flow that:

    • is designed to write Parquet files indefinitely
    • is able to (optionally) partition data by a list of provided fields
    • flushes and rotates files after a given number of rows is written to the partition or a given time period elapses
    • outputs each incoming message after it is written, optionally applying a provided message transformation first

    Provides an explicit API for both custom data types and generic records.

    returns

    Builder of the flow.
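    A minimal sketch of an indefinite, partitioned write with this builder, assuming the Akka module; the `Event` schema, path, and rotation thresholds are illustrative assumptions:

    ```scala
    import scala.concurrent.duration._
    import akka.actor.ActorSystem
    import akka.stream.scaladsl.{Sink, Source}
    import com.github.mjakubowski84.parquet4s.{Col, ParquetStreams, Path}

    object IndefiniteWriteExample extends App {
      implicit val system: ActorSystem = ActorSystem()

      case class Event(kind: String, payload: String) // assumed schema

      Source(List(Event("click", "a"), Event("view", "b")))
        .via(
          ParquetStreams.viaParquet
            .of[Event]
            .partitionBy(Col("kind")) // one directory per event kind
            .maxCount(1024)           // rotate the file after 1024 rows...
            .maxDuration(30.seconds)  // ...or after 30 seconds
            .write(Path("file:///tmp/events"))
        )
        .runWith(Sink.ignore) // written events are emitted downstream
    }
    ```

    Because the flow emits each message after it is written, downstream stages can acknowledge or post-process records with the guarantee that they have already been persisted.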

  20. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  21. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  22. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
