| Package | Description |
|---|---|
| org.canova.spark.functions.pairdata | |
| org.canova.spark.util | |
| Modifier and Type | Class and Description |
|---|---|
| class | PathToKeyConverterFilename: converts a path to a key by taking the full file name (excluding the file extension and directories) |
| class | PathToKeyConverterNumber: a PathToKeyConverter that generates a key based on the number in the file name |
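A minimal, Spark-free sketch of the conversion that PathToKeyConverterFilename describes: strip the directories and the file extension, keeping only the base name. The class and method names below (`PathToKeyExample`, `pathToKey`) are illustrative, not Canova's own source.

```java
// Sketch of the filename-to-key conversion described above:
// strip directories and the file extension, keep the base name.
public class PathToKeyExample {

    static String pathToKey(String path) {
        // Take everything after the last path separator ('/' or '\')...
        int sep = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        String name = path.substring(sep + 1);
        // ...and drop the extension, if there is one.
        int dot = name.lastIndexOf('.');
        return dot > 0 ? name.substring(0, dot) : name;
    }

    public static void main(String[] args) {
        System.out.println(pathToKey("/data/inputs/file0.csv")); // prints "file0"
    }
}
```

With such a converter, `/data/inputs/file0.csv` and `/data/labels/file0.txt` both map to the key `file0`, which is what allows records from different directories to be matched up later.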
| Constructor and Description |
|---|
| PathToKeyFunction(int index, PathToKeyConverter converter) |
| Modifier and Type | Method and Description |
|---|---|
| static org.apache.spark.api.java.JavaPairRDD<org.apache.hadoop.io.Text,BytesPairWritable> | CanovaSparkUtil.combineFilesForSequenceFile(JavaSparkContext sc, String path1, String path2, PathToKeyConverter converter): same as CanovaSparkUtil.combineFilesForSequenceFile(JavaSparkContext, String, String, PathToKeyConverter, PathToKeyConverter), but with the same PathToKeyConverter used for both file sources. |
| static org.apache.spark.api.java.JavaPairRDD<org.apache.hadoop.io.Text,BytesPairWritable> | CanovaSparkUtil.combineFilesForSequenceFile(JavaSparkContext sc, String path1, String path2, PathToKeyConverter converter1, PathToKeyConverter converter2): a convenience method to combine data from separate files (intended for writing to a sequence file, using JavaPairRDD.saveAsNewAPIHadoopFile(String, Class, Class, Class)). A typical use case is to combine input and label data from different files, for later parsing by a RecordReader or SequenceRecordReader. |
Copyright © 2016. All rights reserved.