B C F G M O P R S W 

B

BytesPairWritable - Class in org.canova.spark.functions.pairdata
A Hadoop writable class for a pair of byte arrays, plus the original URIs (as Strings) of the files they came from
BytesPairWritable() - Constructor for class org.canova.spark.functions.pairdata.BytesPairWritable
 
BytesPairWritable(byte[], byte[], String, String) - Constructor for class org.canova.spark.functions.pairdata.BytesPairWritable
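As a rough illustration of what a writable pair of byte arrays with source URIs involves, the sketch below round-trips such a pair through `java.io.DataOutput` and `DataInput`. The class name `BytesPairSketch`, its fields, the `roundTrip` helper, and the wire format (length-prefixed arrays followed by two UTF strings) are all assumptions made for this example. They are not taken from the BytesPairWritable source, and the sketch assumes non-null arrays and URIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical stand-in for BytesPairWritable; the wire format is an assumption.
class BytesPairSketch {
    byte[] first;
    byte[] second;
    String uriFirst;
    String uriSecond;

    BytesPairSketch() {}

    BytesPairSketch(byte[] first, byte[] second, String uriFirst, String uriSecond) {
        this.first = first;
        this.second = second;
        this.uriFirst = uriFirst;
        this.uriSecond = uriSecond;
    }

    // Write each array as a length prefix followed by its bytes, then both URIs.
    void write(DataOutput out) throws IOException {
        out.writeInt(first.length);
        out.write(first);
        out.writeInt(second.length);
        out.write(second);
        out.writeUTF(uriFirst);
        out.writeUTF(uriSecond);
    }

    // Read the fields back in exactly the order write() emitted them.
    void readFields(DataInput in) throws IOException {
        first = new byte[in.readInt()];
        in.readFully(first);
        second = new byte[in.readInt()];
        in.readFully(second);
        uriFirst = in.readUTF();
        uriSecond = in.readUTF();
    }

    // Serialize to an in-memory buffer and deserialize into a fresh instance.
    static BytesPairSketch roundTrip(BytesPairSketch original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            BytesPairSketch copy = new BytesPairSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BytesPairSketch pair = new BytesPairSketch(
                new byte[] {1, 2, 3}, new byte[] {4, 5}, "file:/features_0.csv", "file:/labels_0.csv");
        BytesPairSketch copy = roundTrip(pair);
        System.out.println(Arrays.equals(pair.first, copy.first) + " " + copy.uriSecond);
    }
}
```

The no-argument constructor is there because the deserializing side needs an empty instance to call `readFields` on, which is presumably also why BytesPairWritable lists one.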
 

C

call(Tuple2<String, PortableDataStream>) - Method in class org.canova.spark.functions.data.FilesAsBytesFunction
 
call(Tuple2<Text, BytesWritable>) - Method in class org.canova.spark.functions.data.RecordReaderBytesFunction
 
call(Tuple2<Text, BytesWritable>) - Method in class org.canova.spark.functions.data.SequenceRecordReaderBytesFunction
 
call(Tuple2<String, Iterable<Tuple3<String, Integer, PortableDataStream>>>) - Method in class org.canova.spark.functions.pairdata.MapToBytesPairWritableFunction
 
call(Tuple2<Text, BytesPairWritable>) - Method in class org.canova.spark.functions.pairdata.PairSequenceRecordReaderBytesFunction
 
call(Tuple2<String, PortableDataStream>) - Method in class org.canova.spark.functions.pairdata.PathToKeyFunction
 
call(Tuple2<String, PortableDataStream>) - Method in class org.canova.spark.functions.RecordReaderFunction
 
call(Tuple2<String, PortableDataStream>) - Method in class org.canova.spark.functions.SequenceRecordReaderFunction
 
CanovaSparkUtil - Class in org.canova.spark.util
Utilities for using Canova with Spark
CanovaSparkUtil() - Constructor for class org.canova.spark.util.CanovaSparkUtil
 
combineFilesForSequenceFile(JavaSparkContext, String, String, PathToKeyConverter) - Static method in class org.canova.spark.util.CanovaSparkUtil
combineFilesForSequenceFile(JavaSparkContext, String, String, PathToKeyConverter, PathToKeyConverter) - Static method in class org.canova.spark.util.CanovaSparkUtil
Convenience method that combines data from separate files, intended for writing to a sequence file with JavaPairRDD.saveAsNewAPIHadoopFile(String, Class, Class, Class).
A typical use case is to combine input and label data from different files, for later parsing by a RecordReader or SequenceRecordReader.
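The pairing idea behind this use case (input files and label files matched through a shared key) can be sketched in plain Java without Spark. Everything here is hypothetical and for illustration only: the class `PairFilesByKeySketch`, the `numberKey` rule (text between the last underscore and the extension), and the choice to silently drop unmatched files are assumptions, not the behavior of combineFilesForSequenceFile.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical, Spark-free sketch of matching two sets of files by key.
class PairFilesByKeySketch {

    // Assumed key rule: the text between the last '_' and the file extension.
    static String numberKey(String path) {
        int underscore = path.lastIndexOf('_');
        int dot = path.lastIndexOf('.');
        if (dot < 0 || dot < underscore) {
            dot = path.length(); // no extension after the underscore
        }
        return path.substring(underscore + 1, dot);
    }

    // Join the two path lists on their keys; one key function per list,
    // mirroring the overload that accepts two converters.
    static Map<String, String[]> pairByKey(List<String> firstPaths, List<String> secondPaths,
                                           Function<String, String> firstKey,
                                           Function<String, String> secondKey) {
        Map<String, String> secondByKey = new TreeMap<>();
        for (String path : secondPaths) {
            secondByKey.put(secondKey.apply(path), path);
        }
        Map<String, String[]> pairs = new TreeMap<>();
        for (String path : firstPaths) {
            String key = firstKey.apply(path);
            String match = secondByKey.get(key);
            if (match != null) { // assumption: unmatched files are skipped
                pairs.put(key, new String[] {path, match});
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> features = Arrays.asList("/data/features_0.csv", "/data/features_1.csv");
        List<String> labels = Arrays.asList("/data/labels_1.csv", "/data/labels_0.csv");
        Map<String, String[]> pairs = pairByKey(features, labels,
                PairFilesByKeySketch::numberKey, PairFilesByKeySketch::numberKey);
        for (Map.Entry<String, String[]> e : pairs.entrySet()) {
            System.out.println(e.getKey() + " -> " + Arrays.toString(e.getValue()));
        }
    }
}
```

In a real pipeline each matched pair would carry file contents rather than paths, which is the role BytesPairWritable plays in this package.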

F

FilesAsBytesFunction - Class in org.canova.spark.functions.data
A PairFunction that simply loads a byte[] from a PortableDataStream, and wraps the String key and the bytes in Text and BytesWritable respectively.
FilesAsBytesFunction() - Constructor for class org.canova.spark.functions.data.FilesAsBytesFunction
 

G

getFirst() - Method in class org.canova.spark.functions.pairdata.BytesPairWritable
 
getKey(String) - Method in interface org.canova.spark.functions.pairdata.PathToKeyConverter
Determine the key from the file path
getKey(String) - Method in class org.canova.spark.functions.pairdata.PathToKeyConverterFilename
 
getKey(String) - Method in class org.canova.spark.functions.pairdata.PathToKeyConverterNumber
 
getSecond() - Method in class org.canova.spark.functions.pairdata.BytesPairWritable
 
getUriFirst() - Method in class org.canova.spark.functions.pairdata.BytesPairWritable
 
getUriSecond() - Method in class org.canova.spark.functions.pairdata.BytesPairWritable
 

M

MapToBytesPairWritableFunction - Class in org.canova.spark.functions.pairdata
A function to read files (assuming exactly 2 per input) from a PortableDataStream and combine the contents into a BytesPairWritable
MapToBytesPairWritableFunction() - Constructor for class org.canova.spark.functions.pairdata.MapToBytesPairWritableFunction
 

O

org.canova.spark.functions - package org.canova.spark.functions
 
org.canova.spark.functions.data - package org.canova.spark.functions.data
 
org.canova.spark.functions.pairdata - package org.canova.spark.functions.pairdata
 
org.canova.spark.util - package org.canova.spark.util
 

P

PairSequenceRecordReaderBytesFunction - Class in org.canova.spark.functions.pairdata
PairSequenceRecordReaderBytesFunction: Converts two sets of binary data (in the form of a BytesPairWritable) to Canova format data (Tuple2<Collection<Collection<Writable>>, Collection<Collection<Writable>>>) using two SequenceRecordReaders.
PairSequenceRecordReaderBytesFunction(SequenceRecordReader, SequenceRecordReader) - Constructor for class org.canova.spark.functions.pairdata.PairSequenceRecordReaderBytesFunction
 
PathToKeyConverter - Interface in org.canova.spark.functions.pairdata
PathToKeyConverter: Used to match up files based on their file names, for PairSequenceRecordReaderBytesFunction. For example, given the files "/features_0.csv" and "/labels_0.csv", both map to the same key: "0".
PathToKeyConverterFilename - Class in org.canova.spark.functions.pairdata
Convert the path to a key by taking the full file name (excluding the file extension and directories)
PathToKeyConverterFilename() - Constructor for class org.canova.spark.functions.pairdata.PathToKeyConverterFilename
 
PathToKeyConverterNumber - Class in org.canova.spark.functions.pairdata
A PathToKeyConverter that generates a key based on the file name.
PathToKeyConverterNumber() - Constructor for class org.canova.spark.functions.pairdata.PathToKeyConverterNumber
 
PathToKeyFunction - Class in org.canova.spark.functions.pairdata
Given a Tuple2, where the first value is the full path, map this to a Tuple3 where the first value is a key (using a PathToKeyConverter), second is an index, and third is the original data stream
PathToKeyFunction(int, PathToKeyConverter) - Constructor for class org.canova.spark.functions.pairdata.PathToKeyFunction
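The filename-based conversion described for PathToKeyConverterFilename above (full file name, without directories or extension) can be sketched as follows. The class and method names are hypothetical, and the handling of edge cases such as multiple dots or leading-dot file names is an assumption rather than the library's documented behavior.

```java
// Hypothetical sketch of a filename-based path-to-key rule.
class FilenameKeySketch {

    // Strip everything up to the last path separator, then drop the final extension.
    static String filenameKey(String path) {
        int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        String name = path.substring(slash + 1);
        int dot = name.lastIndexOf('.');
        // Assumption: a leading dot (e.g. ".hidden") is part of the name, not an extension.
        return dot > 0 ? name.substring(0, dot) : name;
    }

    public static void main(String[] args) {
        System.out.println(filenameKey("/data/train/features_0.csv"));
        System.out.println(filenameKey("labels_0"));
    }
}
```

Note that under this rule "/features_0.csv" and "/labels_0.csv" produce different keys ("features_0" and "labels_0"), so a filename-based key only matches files whose names are identical across the two source directories.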
 

R

readFields(DataInput) - Method in class org.canova.spark.functions.pairdata.BytesPairWritable
 
recordReader - Variable in class org.canova.spark.functions.RecordReaderFunction
 
RecordReaderBytesFunction - Class in org.canova.spark.functions.data
RecordReaderBytesFunction: Converts binary data (in the form of a BytesWritable) to Canova format data (Collection<Writable>) using a RecordReader
RecordReaderBytesFunction(RecordReader) - Constructor for class org.canova.spark.functions.data.RecordReaderBytesFunction
 
RecordReaderFunction - Class in org.canova.spark.functions
RecordReaderFunction: Given a RecordReader and a file (via Spark PortableDataStream), load and parse the data into a Collection<Writable>
RecordReaderFunction(RecordReader) - Constructor for class org.canova.spark.functions.RecordReaderFunction
 

S

sequenceRecordReader - Variable in class org.canova.spark.functions.SequenceRecordReaderFunction
 
SequenceRecordReaderBytesFunction - Class in org.canova.spark.functions.data
SequenceRecordReaderBytesFunction: Converts binary data (in the form of a BytesWritable) to Canova format data (Collection<Collection<Writable>>) using a SequenceRecordReader
SequenceRecordReaderBytesFunction(SequenceRecordReader) - Constructor for class org.canova.spark.functions.data.SequenceRecordReaderBytesFunction
 
SequenceRecordReaderFunction - Class in org.canova.spark.functions
SequenceRecordReaderFunction: Given a SequenceRecordReader and a file (via Spark PortableDataStream), load and parse the sequence data into a Collection<Collection<Writable>>
SequenceRecordReaderFunction(SequenceRecordReader) - Constructor for class org.canova.spark.functions.SequenceRecordReaderFunction
 
setFirst(byte[]) - Method in class org.canova.spark.functions.pairdata.BytesPairWritable
 
setSecond(byte[]) - Method in class org.canova.spark.functions.pairdata.BytesPairWritable
 
setUriFirst(String) - Method in class org.canova.spark.functions.pairdata.BytesPairWritable
 
setUriSecond(String) - Method in class org.canova.spark.functions.pairdata.BytesPairWritable
 

W

write(DataOutput) - Method in class org.canova.spark.functions.pairdata.BytesPairWritable
 

Copyright © 2016. All rights reserved.