object ParquetStreams
Holds factories of Akka Streams / Pekko Streams sources and sinks for reading from and writing to Parquet files.
Value Members
- def fromParquet: FromParquet
Creates a com.github.mjakubowski84.parquet4s.ScalaCompat.stream.scaladsl.Source that reads Parquet data from the specified path. If there are multiple files at the path, the order in which they are loaded is determined by the underlying filesystem.
The path can refer to a local file, HDFS, AWS S3, Google Storage, Azure, etc. Please refer to the Hadoop client documentation or your data provider to learn how to configure the connection.
It can also read partitioned directories, and the filter applies to partition values as well. Partition values are set as fields in the read entities at the path defined by the partition name; that path can be a simple column name or a dot-separated path to a nested field. Missing intermediate fields are created automatically for each read record.
A projection over the original file schema can be enabled to boost read performance when not all columns need to be read.
Provides an explicit API for both custom data types and generic records.
- returns
Builder of the source.
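A minimal sketch of reading with the typed API, assuming Akka Streams and the parquet4s builder methods (`as`, `filter`, `read`); the `User` case class and the path are hypothetical, and exact builder names may differ between parquet4s versions:

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.Sink
import com.github.mjakubowski84.parquet4s.{Col, ParquetStreams, Path}

object ReadExample extends App {
  implicit val system: ActorSystem = ActorSystem()

  // Illustrative record type; the read schema is derived from the case class.
  case class User(name: String, age: Int)

  ParquetStreams.fromParquet
    .as[User]                         // typed API for a custom data type
    .filter(Col("age") > 18)          // filter also applies to partition values
    .read(Path("file:///tmp/users"))  // hypothetical directory of Parquet files
    .runWith(Sink.foreach(println))
}
```

With a partitioned layout such as `/tmp/users/country=PL/part-0.parquet`, the `country` value would appear as a field of each read record.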
- def toParquetSingleFile: ToParquet
Creates a com.github.mjakubowski84.parquet4s.ScalaCompat.stream.scaladsl.Sink that writes Parquet data to a single file at the specified path (including the file name).
The path can refer to a local file, HDFS, AWS S3, Google Storage, Azure, etc. Please refer to the Hadoop client documentation or your data provider to learn how to configure the connection.
Provides an explicit API for both custom data types and generic records.
- returns
Builder of a sink that writes a Parquet file.
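A minimal sketch of writing a single file, assuming the parquet4s builder methods (`of`, `write`); the `User` case class and the target path are illustrative:

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.Source
import com.github.mjakubowski84.parquet4s.{ParquetStreams, Path}

object WriteExample extends App {
  implicit val system: ActorSystem = ActorSystem()

  case class User(name: String, age: Int)

  // The sink materializes when the stream runs; the path includes the file name.
  Source(List(User("Ada", 36), User("Alan", 41)))
    .runWith(
      ParquetStreams.toParquetSingleFile
        .of[User]
        .write(Path("file:///tmp/users.parquet"))
    )
}
```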
- def viaParquet: ViaParquet
Builds a flow that:
- Is designed to write Parquet files indefinitely
- Can (optionally) partition data by a list of provided fields
- Flushes and rotates files after a given number of rows is written to the partition or a given time period elapses
- Outputs each incoming message after it is written, but can instead write the effect of a provided message transformation
Provides an explicit API for both custom data types and generic records.
- returns
Builder of the flow.
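A sketch of an indefinite partitioned write, assuming the parquet4s builder methods (`of`, `partitionBy`, `maxCount`, `maxDuration`, `write`); the `Event` type, the limits, and the base path are hypothetical, and builder names may vary by version:

```scala
import scala.concurrent.duration._
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import com.github.mjakubowski84.parquet4s.{Col, ParquetStreams, Path}

object IndefiniteWriteExample extends App {
  implicit val system: ActorSystem = ActorSystem()

  case class Event(userId: String, country: String, payload: String)

  val writeFlow = ParquetStreams.viaParquet
    .of[Event]
    .partitionBy(Col("country"))        // optional partitioning field
    .maxCount(128 * 1024)               // rotate after this many rows per partition
    .maxDuration(30.seconds)            // ...or after this period elapses
    .write(Path("file:///tmp/events"))  // base directory for partitioned output

  // Each event is emitted downstream only after it has been written.
  Source(List(Event("u1", "PL", "a"), Event("u2", "DE", "b")))
    .via(writeFlow)
    .runWith(Sink.ignore)
}
```

The count/duration rotation is what makes this flow suitable for unbounded streams, where a single-file sink would never complete.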