object S3GeoTiffRDD
The S3GeoTiffRDD object allows for the creation of whole or windowed RDD[(K, V)]s from files on S3.
Type Members
- case class Options(tiffExtensions: Seq[String] = Seq(".tif", ".TIF", ".tiff", ".TIFF"), crs: Option[CRS] = None, timeTag: String = GEOTIFF_TIME_TAG_DEFAULT, timeFormat: String = GEOTIFF_TIME_FORMAT_DEFAULT, maxTileSize: Option[Int] = Some(DefaultMaxTileSize), numPartitions: Option[Int] = None, partitionBytes: Option[Long] = Some(DefaultPartitionBytes), chunkSize: Option[Int] = None, delimiter: Option[String] = None, getClient: () => S3Client = S3ClientProducer.get) extends RasterReader.Options with Product with Serializable
This case class contains the various parameters one can set when reading RDDs from S3 using Spark.
TODO: Add persistLevel option
- tiffExtensions
Read all files with an extension contained in the given list.
- crs
Override CRS of the input files. If None, the reader will use the file's original CRS.
- timeTag
Name of tiff tag containing the timestamp for the tile.
- timeFormat
Pattern for java.time.format.DateTimeFormatter to parse timeTag.
- maxTileSize
Maximum allowed size of each tile in the output RDD. A single input GeoTiff may be split among multiple records if it exceeds this size. If no maximum tile size is specified, each file is broken into 256x256 tiles. If None, then the whole file will be read in. This option is incompatible with numPartitions, and anything set for that parameter will be ignored.
- numPartitions
How many partitions Spark should create when it repartitions the data.
- partitionBytes
Desired partition size in bytes; at least one item will be assigned to each partition. If no size is specified, then 128 MB partitions will be created by default. This option is incompatible with the numPartitions option. If both are set and maxTileSize isn't, then partitionBytes will be ignored in favor of numPartitions. However, if maxTileSize is set, then partitionBytes will be retained. If None and maxTileSize is defined, then the default partitionBytes value will still be used. If maxTileSize is also None, then partitionBytes will remain None as well.
- chunkSize
How many bytes should be read in at a time when reading a file. If None, then 65536 byte chunks will be read in at a time.
- delimiter
Delimiter to use for S3 object listings. This provides a way to further define which files should be read. If None, then only the prefix will be used when determining which files to read.
- getClient
A function to instantiate an S3Client.
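As a sketch of how these settings compose (the values chosen here are hypothetical, and the import path varies between GeoTrellis versions, e.g. geotrellis.spark.io.s3 in 2.x vs geotrellis.spark.store.s3 in 3.x):

```scala
import geotrellis.proj4.CRS
import geotrellis.spark.store.s3.S3GeoTiffRDD

// Hypothetical settings: 512x512 windows, ~64 MB partitions,
// and an overridden input CRS.
val options = S3GeoTiffRDD.Options(
  crs = Some(CRS.fromEpsgCode(3857)),       // override the CRS reported for the input files
  maxTileSize = Some(512),                  // window size used to split large GeoTiffs
  partitionBytes = Some(64L * 1024 * 1024), // target partition size in bytes
  delimiter = Some("/")                     // restrict the S3 listing like a directory
)
```

Unset fields keep the defaults shown in the case-class signature above.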
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final val GEOTIFF_TIME_FORMAT_DEFAULT: String("yyyy:MM:dd HH:mm:ss")
- final val GEOTIFF_TIME_TAG_DEFAULT: String("TIFFTAG_DATETIME")
- def apply[K, V](bucket: String, prefix: String, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (K, V)]): RDD[(K, V)]
Creates an RDD[(K, V)] whose K and V depend on the type of the GeoTiff that is going to be read in.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- options
An instance of Options that contains any user defined or default settings.
- def apply[I, K, V](bucket: String, prefix: String, uriToKey: (URI, I) => K, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (I, V)]): RDD[(K, V)]
Creates an RDD[(K, V)] whose K and V depend on the type of the GeoTiff that is going to be read in.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- uriToKey
Function to transform the input key based on the URI information.
- options
An instance of Options that contains any user defined or default settings.
- def apply[I, K, V](bucket: String, prefix: String, uriToKey: (URI, I) => K, options: Options, geometry: Option[Geometry])(implicit sc: SparkContext, rr: RasterReader[Options, (I, V)]): RDD[(K, V)]
Creates an RDD[(K, V)] whose K and V depend on the type of the GeoTiff that is going to be read in.
This function has two modes of operation: when options.maxTileSize is set, windows will be read from the GeoTiffs, and their size and count will be balanced among partitions using the partitionBytes option. The resulting partitions will be grouped in relation to the GeoTiff segment layout.
When maxTileSize is None, the GeoTiffs will be read fully and balanced among partitions using either the numPartitions or the partitionBytes option.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- uriToKey
Function to transform the input key based on the URI information.
- options
An instance of Options that contains any user defined or default settings.
- geometry
An optional geometry to filter by. If this is provided, it is assumed that all GeoTiffs are in the same CRS, and that this geometry is in that CRS.
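A minimal sketch of the geometry-filtering overload, assuming the implicit RasterReader instance for (ProjectedExtent, Tile) that GeoTrellis provides is in scope; the bucket name, prefix, and extent are hypothetical:

```scala
import java.net.URI
import geotrellis.raster.Tile
import geotrellis.spark.store.s3.S3GeoTiffRDD
import geotrellis.vector.{Extent, Geometry, ProjectedExtent}
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// An area of interest, assumed to share the CRS of the GeoTiffs being read.
val aoi: Option[Geometry] = Some(Extent(-75.3, 39.8, -74.9, 40.1).toPolygon())

def readFiltered(implicit sc: SparkContext): RDD[(ProjectedExtent, Tile)] =
  S3GeoTiffRDD.apply[ProjectedExtent, ProjectedExtent, Tile](
    "my-bucket",                         // hypothetical bucket
    "rasters/2020/",                     // hypothetical key prefix
    (_: URI, pe: ProjectedExtent) => pe, // identity uriToKey
    S3GeoTiffRDD.Options(),
    aoi
  )
```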
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def multiband[K](bucket: String, prefix: String, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (K, MultibandTile)]): RDD[(K, MultibandTile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband GeoTiffs.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- def multiband[I, K](bucket: String, prefix: String, uriToKey: (URI, I) => K, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (I, MultibandTile)]): RDD[(K, MultibandTile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband GeoTiffs.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- uriToKey
Function to transform the input key based on the URI information.
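One plausible use of uriToKey is carrying file provenance into the RDD's keys. In this sketch (bucket and prefix are hypothetical) each tile is keyed by its source file name alongside its extent:

```scala
import java.net.URI
import geotrellis.raster.MultibandTile
import geotrellis.spark.store.s3.S3GeoTiffRDD
import geotrellis.vector.ProjectedExtent
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Key each tile by (source file name, extent) so provenance survives into the RDD.
def readTagged(implicit sc: SparkContext): RDD[((String, ProjectedExtent), MultibandTile)] =
  S3GeoTiffRDD.multiband[ProjectedExtent, (String, ProjectedExtent)](
    "my-bucket", // hypothetical bucket
    "imagery/",  // hypothetical prefix
    (uri: URI, pe: ProjectedExtent) => (uri.getPath.split('/').last, pe),
    S3GeoTiffRDD.Options()
  )
```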
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- def singleband[K](bucket: String, prefix: String, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (K, Tile)]): RDD[(K, Tile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband GeoTiffs. If a GeoTiff contains multiple bands, only the first will be read.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- def singleband[I, K](bucket: String, prefix: String, uriToKey: (URI, I) => K, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (I, Tile)]): RDD[(K, Tile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband GeoTiffs. If a GeoTiff contains multiple bands, only the first will be read.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- uriToKey
Function to transform the input key based on the URI information.
- def spatial(bucket: String, prefix: String, uriToKey: (URI, ProjectedExtent) => ProjectedExtent, options: Options)(implicit sc: SparkContext): RDD[(ProjectedExtent, Tile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband tiles. If a GeoTiff contains multiple bands, only the first will be read.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- uriToKey
Function to transform the input key based on the URI information.
- options
An instance of Options that contains any user defined or default settings.
- def spatial(bucket: String, prefix: String, options: Options)(implicit sc: SparkContext): RDD[(ProjectedExtent, Tile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband tiles. If a GeoTiff contains multiple bands, only the first will be read.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- options
An instance of Options that contains any user defined or default settings.
- def spatial(bucket: String, prefix: String)(implicit sc: SparkContext): RDD[(ProjectedExtent, Tile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband GeoTiffs. If a GeoTiff contains multiple bands, only the first will be read.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
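The no-options overload is the simplest entry point. A minimal sketch (bucket and prefix are hypothetical; the import path varies by GeoTrellis version):

```scala
import geotrellis.raster.Tile
import geotrellis.spark.store.s3.S3GeoTiffRDD
import geotrellis.vector.ProjectedExtent
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Read every GeoTiff under the prefix as (extent, singleband tile) pairs
// using the default Options.
def readSpatial(implicit sc: SparkContext): RDD[(ProjectedExtent, Tile)] =
  S3GeoTiffRDD.spatial("my-bucket", "elevation/")
```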
- def spatialMultiband(bucket: String, prefix: String, uriToKey: (URI, ProjectedExtent) => ProjectedExtent, options: Options)(implicit sc: SparkContext): RDD[(ProjectedExtent, MultibandTile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- uriToKey
Function to transform the input key based on the URI information.
- options
An instance of Options that contains any user defined or default settings.
- def spatialMultiband(bucket: String, prefix: String, options: Options)(implicit sc: SparkContext): RDD[(ProjectedExtent, MultibandTile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- options
An instance of Options that contains any user defined or default settings.
- def spatialMultiband(bucket: String, prefix: String)(implicit sc: SparkContext): RDD[(ProjectedExtent, MultibandTile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def temporal(bucket: String, prefix: String, uriToKey: (URI, TemporalProjectedExtent) => TemporalProjectedExtent, options: Options)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, Tile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband tiles. Will parse a timestamp from the tiff tag specified in options to associate with each tile.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- uriToKey
Function to transform the input key based on the URI information.
- options
Options for the reading process. Including the timestamp tiff tag and its pattern.
- def temporal(bucket: String, prefix: String, options: Options)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, Tile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband tiles. Will parse a timestamp from the tiff tag specified in options to associate with each tile.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- options
Options for the reading process. Including the timestamp tiff tag and its pattern.
- def temporal(bucket: String, prefix: String)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, Tile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband tiles. Will parse a timestamp from the default tiff tag to associate with each file.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
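The no-options temporal overload relies on each GeoTiff carrying the default TIFFTAG_DATETIME tag in the "yyyy:MM:dd HH:mm:ss" format. A minimal sketch (bucket and prefix are hypothetical):

```scala
import geotrellis.raster.Tile
import geotrellis.spark.TemporalProjectedExtent
import geotrellis.spark.store.s3.S3GeoTiffRDD
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Read every GeoTiff under the prefix, keying each tile by its extent
// plus the timestamp parsed from the default TIFFTAG_DATETIME tag.
def readTemporal(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, Tile)] =
  S3GeoTiffRDD.temporal("my-bucket", "timeseries/")
```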
- def temporalMultiband(bucket: String, prefix: String, uriToKey: (URI, TemporalProjectedExtent) => TemporalProjectedExtent, options: Options)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, MultibandTile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles. Will parse a timestamp from the tiff tag specified in options to associate with each tile.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- uriToKey
Function to transform the input key based on the URI information.
- options
Options for the reading process. Including the timestamp tiff tag and its pattern.
- def temporalMultiband(bucket: String, prefix: String, options: Options)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, MultibandTile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles. Will parse a timestamp from the tiff tag specified in options to associate with each tile.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
- options
Options for the reading process. Including the timestamp tiff tag and its pattern.
- def temporalMultiband(bucket: String, prefix: String)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, MultibandTile)]
Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles. Will parse a timestamp from the tiff tag specified in options to associate with each tile.
- bucket
Name of the bucket on S3 where the files are kept.
- prefix
Prefix of all of the keys on S3 that are to be read in.
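When the timestamp lives in a non-default tag, the timeTag and timeFormat options steer the parse. A sketch with an entirely hypothetical tag name and pattern (bucket and prefix are also hypothetical):

```scala
import geotrellis.raster.MultibandTile
import geotrellis.spark.TemporalProjectedExtent
import geotrellis.spark.store.s3.S3GeoTiffRDD
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// ACQUISITION_DATE and its pattern are hypothetical; substitute whatever
// tag your GeoTiff producer actually writes.
val sceneOptions = S3GeoTiffRDD.Options(
  timeTag = "ACQUISITION_DATE",
  timeFormat = "yyyy-MM-dd"
)

def readScenes(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, MultibandTile)] =
  S3GeoTiffRDD.temporalMultiband("my-bucket", "scenes/", sceneOptions)
```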
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- object Options extends Serializable