| Package | Description |
|---|---|
| `org.canova.spark.functions.pairdata` | |
| `org.canova.spark.util` | |
| Modifier and Type | Method and Description |
|---|---|
| `scala.Tuple2<org.apache.hadoop.io.Text,BytesPairWritable>` | `MapToBytesPairWritableFunction.call(scala.Tuple2<String,Iterable<scala.Tuple3<String,Integer,org.apache.spark.input.PortableDataStream>>> in)` |
| Modifier and Type | Method and Description |
|---|---|
| `scala.Tuple2<Collection<Collection<Writable>>,Collection<Collection<Writable>>>` | `PairSequenceRecordReaderBytesFunction.call(scala.Tuple2<org.apache.hadoop.io.Text,BytesPairWritable> v1)` |
| Modifier and Type | Method and Description |
|---|---|
| `static org.apache.spark.api.java.JavaPairRDD<org.apache.hadoop.io.Text,BytesPairWritable>` | `CanovaSparkUtil.combineFilesForSequenceFile(org.apache.spark.api.java.JavaSparkContext sc, String path1, String path2, PathToKeyConverter converter)` — Same as `CanovaSparkUtil.combineFilesForSequenceFile(JavaSparkContext, String, String, PathToKeyConverter, PathToKeyConverter)`, but with a single `PathToKeyConverter` used for both file sources. |
| `static org.apache.spark.api.java.JavaPairRDD<org.apache.hadoop.io.Text,BytesPairWritable>` | `CanovaSparkUtil.combineFilesForSequenceFile(org.apache.spark.api.java.JavaSparkContext sc, String path1, String path2, PathToKeyConverter converter1, PathToKeyConverter converter2)` — A convenience method to combine data from separate files (intended for writing to a sequence file via `JavaPairRDD.saveAsNewAPIHadoopFile(String, Class, Class, Class)`). A typical use case is combining input and label data from different files, for later parsing by a `RecordReader` or `SequenceRecordReader`. |
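The combine-then-save workflow described above can be sketched as follows. This is a minimal, hedged example, not code from the library's own documentation: the HDFS paths and the `PathToKeyConverter` implementation are placeholders, and the class/package names are taken from the tables on this page.

```java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.canova.spark.functions.pairdata.BytesPairWritable;
import org.canova.spark.functions.pairdata.PathToKeyConverter;
import org.canova.spark.util.CanovaSparkUtil;

public class CombineFilesExample {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setAppName("combineFilesForSequenceFile"));

        // Placeholder: supply a PathToKeyConverter implementation that maps
        // each file path to a common key, so matching input and label files
        // end up paired under the same key.
        PathToKeyConverter converter = null; // replace with a real converter

        // Combine the two file sources into one pair RDD, keyed by Text,
        // using the single-converter overload (same converter for both sources).
        JavaPairRDD<Text, BytesPairWritable> combined =
                CanovaSparkUtil.combineFilesForSequenceFile(
                        sc, "hdfs:///data/features", "hdfs:///data/labels", converter);

        // Write a Hadoop sequence file, as the description above suggests,
        // for later parsing by a RecordReader or SequenceRecordReader.
        combined.saveAsNewAPIHadoopFile("hdfs:///data/combined", Text.class,
                BytesPairWritable.class, SequenceFileOutputFormat.class);

        sc.stop();
    }
}
```

The paired sequence file can then be read back and deserialized with `PairSequenceRecordReaderBytesFunction`, whose `call` method (listed above) turns each `(Text, BytesPairWritable)` entry into a tuple of feature and label record collections.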
Copyright © 2016. All rights reserved.