A C D F G H I M N O P R S T V W 

A

AvroFileCheckpoint(String) - Constructor for class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader.AvroFileCheckpoint
 
AvroFileCheckpoint(long, long) - Constructor for class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader.AvroFileCheckpoint
 
AvroFileHdfsReader - Class in org.apache.samza.system.hdfs.reader
An implementation of the HdfsReader that reads and processes Avro-format files.
AvroFileHdfsReader(SystemStreamPartition) - Constructor for class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader
 
AvroFileHdfsReader.AvroFileCheckpoint - Class in org.apache.samza.system.hdfs.reader
An Avro file looks something like this:
Byte offset: 0        103            271         391
             ┌────────┬──────────────┬───────────┬───────────┐
Avro file:   │ Header │   Block 1    │  Block 2  │  Block 3  │ ...
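The accessors listed in this index (getBlockStart, getRecordOffset, compareTo, generateCheckpointStr) suggest that a checkpoint pairs the byte offset of a block with a record position inside that block. The following is a minimal sketch of that idea only. The class name CheckpointSketch and the "blockStart:recordOffset" string layout are assumptions made for illustration, not the format that AvroFileCheckpoint actually uses.

```java
/** Hypothetical stand-in for a block-based checkpoint; not the Samza class. */
final class CheckpointSketch implements Comparable<CheckpointSketch> {
    // Assumed separator; the real checkpoint string format is not shown in this index.
    private static final String SEPARATOR = ":";

    private final long blockStart;   // byte offset at which the block begins
    private final long recordOffset; // position of the record inside that block

    CheckpointSketch(long blockStart, long recordOffset) {
        this.blockStart = blockStart;
        this.recordOffset = recordOffset;
    }

    /** Parses a string produced by generateCheckpointStr. */
    CheckpointSketch(String checkpoint) {
        String[] parts = checkpoint.split(SEPARATOR);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Malformed checkpoint: " + checkpoint);
        }
        this.blockStart = Long.parseLong(parts[0]);
        this.recordOffset = Long.parseLong(parts[1]);
    }

    static String generateCheckpointStr(long blockStart, long recordOffset) {
        return blockStart + SEPARATOR + recordOffset;
    }

    long getBlockStart() {
        return blockStart;
    }

    long getRecordOffset() {
        return recordOffset;
    }

    /** Orders by block first, then by record position within the block. */
    @Override
    public int compareTo(CheckpointSketch other) {
        int byBlock = Long.compare(blockStart, other.blockStart);
        return byBlock != 0 ? byBlock : Long.compare(recordOffset, other.recordOffset);
    }

    public static void main(String[] args) {
        CheckpointSketch inBlock1 = new CheckpointSketch("103:2");
        CheckpointSketch inBlock2 = new CheckpointSketch(271L, 0L);
        System.out.println(inBlock1.compareTo(inBlock2) < 0); // true: block 1 precedes block 2
        System.out.println(generateCheckpointStr(271L, 4L));  // 271:4
    }
}
```

Comparing on the block start before the record position means any checkpoint in an earlier block sorts ahead of every checkpoint in a later one, which matches the left-to-right layout in the diagram.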

C

close() - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader
 
close() - Method in class org.apache.samza.system.hdfs.reader.MultiFileHdfsReader
 
close() - Method in interface org.apache.samza.system.hdfs.reader.SingleFileHdfsReader
Close the reader.
compareTo(AvroFileHdfsReader.AvroFileCheckpoint) - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader.AvroFileCheckpoint
 

D

DirectoryPartitioner - Class in org.apache.samza.system.hdfs.partitioner
The partitioner that takes a directory as an input and filters and groups the files in it.
DirectoryPartitioner(String, String, String, FileSystemAdapter) - Constructor for class org.apache.samza.system.hdfs.partitioner.DirectoryPartitioner
 

F

FileMetadata(String, long) - Constructor for class org.apache.samza.system.hdfs.partitioner.FileSystemAdapter.FileMetadata
 
FileSystemAdapter - Interface in org.apache.samza.system.hdfs.partitioner
An adapter between the directory partitioner and the actual file systems or file-system-like systems.
FileSystemAdapter.FileMetadata - Class in org.apache.samza.system.hdfs.partitioner
 

G

generateCheckpointStr(long, long) - Static method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader.AvroFileCheckpoint
 
generateOffset(int, String) - Static method in class org.apache.samza.system.hdfs.reader.MultiFileHdfsReader
Generate the offset based on file index and offset within single file
getAllFiles(String) - Method in interface org.apache.samza.system.hdfs.partitioner.FileSystemAdapter
Return the list of all files given the stream name
getAllFiles(String) - Method in class org.apache.samza.system.hdfs.partitioner.HdfsFileSystemAdapter
 
getBlockStart() - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader.AvroFileCheckpoint
 
getCheckpointStr() - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader.AvroFileCheckpoint
 
getCurFileIndex(String) - Static method in class org.apache.samza.system.hdfs.reader.MultiFileHdfsReader
Get the current file index from the offset string
getCurSingleFileOffset(String) - Static method in class org.apache.samza.system.hdfs.reader.MultiFileHdfsReader
Get the offset within file from the offset string
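The three static helpers above (generateOffset, getCurFileIndex, getCurSingleFileOffset) imply an offset string that combines a file index with the offset inside that file. A minimal sketch of such an encoding is shown below. The class name MultiFileOffsetSketch and the ":" delimiter are assumptions for illustration; this index does not state the delimiter MultiFileHdfsReader really uses.

```java
/** Hypothetical sketch of a "file index + offset within file" encoding. */
final class MultiFileOffsetSketch {
    // Assumed delimiter; not taken from the Samza source.
    private static final String DELIMITER = ":";

    /** Combines the file index and the offset within that single file. */
    static String generateOffset(int fileIndex, String singleFileOffset) {
        return fileIndex + DELIMITER + singleFileOffset;
    }

    /** Extracts the file index, i.e. everything before the first delimiter. */
    static int getCurFileIndex(String offset) {
        return Integer.parseInt(offset.substring(0, delimiterPosition(offset)));
    }

    /** Extracts the single-file offset, i.e. everything after the first delimiter. */
    static String getCurSingleFileOffset(String offset) {
        return offset.substring(delimiterPosition(offset) + 1);
    }

    private static int delimiterPosition(String offset) {
        int position = offset.indexOf(DELIMITER);
        if (position < 0) {
            throw new IllegalArgumentException("Not a multi-file offset: " + offset);
        }
        return position;
    }

    public static void main(String[] args) {
        // The single-file part may itself contain the delimiter.
        String offset = generateOffset(2, "103:5");
        System.out.println(offset);                         // 2:103:5
        System.out.println(getCurFileIndex(offset));        // 2
        System.out.println(getCurSingleFileOffset(offset)); // 103:5
    }
}
```

Splitting on the first delimiter only keeps the single-file offset opaque, so a reader-specific offset format can be carried through unchanged.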
getDescriptorMapFromJson(String) - Static method in class org.apache.samza.system.hdfs.PartitionDescriptorUtil
 
getHdfsReader(HdfsReaderFactory.ReaderType, SystemStreamPartition) - Static method in class org.apache.samza.system.hdfs.reader.HdfsReaderFactory
 
getInputDescriptor(String) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
Gets an HdfsInputDescriptor for the input stream of this system.
getJsonFromDescriptorMap(Map<Partition, List<String>>) - Static method in class org.apache.samza.system.hdfs.PartitionDescriptorUtil
 
getLen() - Method in class org.apache.samza.system.hdfs.partitioner.FileSystemAdapter.FileMetadata
 
getMetricsRegistry() - Method in class org.apache.samza.system.hdfs.HdfsSystemConsumer.HdfsSystemConsumerMetrics
 
getOffsetsAfter(Map<SystemStreamPartition, String>) - Method in class org.apache.samza.system.hdfs.HdfsSystemAdmin
 
getOutputDescriptor(String) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
Gets an HdfsOutputDescriptor for the output stream of this system.
getPartitionDescriptor(String) - Method in class org.apache.samza.system.hdfs.partitioner.DirectoryPartitioner
Get partition descriptors for a stream
getPartitionDescriptorPath(String, String) - Static method in class org.apache.samza.system.hdfs.PartitionDescriptorUtil
 
getPartitionMetadataMap(String, Map<Partition, List<String>>) - Method in class org.apache.samza.system.hdfs.partitioner.DirectoryPartitioner
Get partition metadata for a stream
getPath() - Method in class org.apache.samza.system.hdfs.partitioner.FileSystemAdapter.FileMetadata
 
getRecordOffset() - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader.AvroFileCheckpoint
 
getSystemStreamMetadata(Set<String>) - Method in class org.apache.samza.system.hdfs.HdfsSystemAdmin
Fetch metadata from hdfs system for a set of streams.
getSystemStreamPartition() - Method in class org.apache.samza.system.hdfs.reader.MultiFileHdfsReader
 
getType(String) - Static method in class org.apache.samza.system.hdfs.reader.HdfsReaderFactory
 

H

hasNext() - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader
 
hasNext() - Method in class org.apache.samza.system.hdfs.reader.MultiFileHdfsReader
 
hasNext() - Method in interface org.apache.samza.system.hdfs.reader.SingleFileHdfsReader
Whether there are still records to be read
HdfsFileSystemAdapter - Class in org.apache.samza.system.hdfs.partitioner
 
HdfsFileSystemAdapter() - Constructor for class org.apache.samza.system.hdfs.partitioner.HdfsFileSystemAdapter
 
HdfsInputDescriptor - Class in org.apache.samza.system.hdfs.descriptors
An HdfsInputDescriptor can be used for specifying Samza and HDFS-specific properties of HDFS input streams.
HdfsOutputDescriptor - Class in org.apache.samza.system.hdfs.descriptors
An HdfsOutputDescriptor can be used for specifying Samza and HDFS-specific properties of HDFS output streams.
HdfsReaderFactory - Class in org.apache.samza.system.hdfs.reader
 
HdfsReaderFactory() - Constructor for class org.apache.samza.system.hdfs.reader.HdfsReaderFactory
 
HdfsReaderFactory.ReaderType - Enum in org.apache.samza.system.hdfs.reader
 
HdfsSystemAdmin - Class in org.apache.samza.system.hdfs
The HDFS system admin for HdfsSystemConsumer and HdfsSystemProducer.
A high level overview of the HDFS producer/consumer architecture:
                  ┌──────────────────────────────────────────────────────────────────────────────┐
                  │                                                                              │
┌─────────────────┤                                     HDFS                                     │
│   Obtain        │                                                                              │
│  Partition      └──────┬──────────────────────▲──────┬─────────────────────────────────▲───────┘
│ Descriptors            │                      │      │                                 │
│                        │                      │      │                                 │
│          ┌─────────────▼───────┐              │      │       Filtering/                │
│          │                     │              │      └───┐    Grouping                 └─────┐
│          │ HDFSAvroFileReader  │              │          │                                   │
│          │                     │    Persist   │          │                                   │
│          └─────────┬───────────┘   Partition  │          │                                   │
│                    │              Descriptors │   ┌──────▼──────────────┐         ┌──────────┴──────────┐
│                    │                          │   │                     │         │                     │
│          ┌─────────┴───────────┐              │   │Directory Partitioner│         │   HDFSAvroWriter    │
│          │     IFileReader     │              │   │                     │         │                     │
│          │                     │              │   └──────┬──────────────┘         └──────────┬──────────┘
│          └─────────┬───────────┘              │          │                                   │
│                    │                          │          │                                   │
│                    │                          │          │                                   │
│          ┌─────────┴───────────┐            ┌─┴──────────┴────────┐               ┌──────────┴──────────┐
│          │                     │            │                     │               │                     │
│          │ HDFSSystemConsumer  │            │   HDFSSystemAdmin   │               │ HDFSSystemProducer  │
└──────────▶                     │            │                     │               │                     │
           └─────────┬───────────┘            └───────────┬─────────┘               └──────────┬──────────┘
                     │                                    │                                    │
                     └────────────────────────────────────┼────────────────────────────────────┘
                                                          │
                  ┌───────────────────────────────────────┴──────────────────────────────────────┐
                  │                                                                              │
                  │                              HDFSSystemFactory                               │
                  │                                                                              │
                  └──────────────────────────────────────────────────────────────────────────────┘
HdfsSystemAdmin(String, Config) - Constructor for class org.apache.samza.system.hdfs.HdfsSystemAdmin
 
HdfsSystemConsumer - Class in org.apache.samza.system.hdfs
The system consumer for HDFS, extending the BlockingEnvelopeMap.
HdfsSystemConsumer(String, Config, HdfsSystemConsumer.HdfsSystemConsumerMetrics) - Constructor for class org.apache.samza.system.hdfs.HdfsSystemConsumer
 
HdfsSystemConsumer.HdfsSystemConsumerMetrics - Class in org.apache.samza.system.hdfs
 
HdfsSystemConsumerMetrics(MetricsRegistry) - Constructor for class org.apache.samza.system.hdfs.HdfsSystemConsumer.HdfsSystemConsumerMetrics
 
HdfsSystemDescriptor - Class in org.apache.samza.system.hdfs.descriptors
An HdfsSystemDescriptor can be used for specifying Samza and HDFS-specific properties of an HDFS input/output system.
HdfsSystemDescriptor(String) - Constructor for class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
 

I

incNumEvents(SystemStreamPartition) - Method in class org.apache.samza.system.hdfs.HdfsSystemConsumer.HdfsSystemConsumerMetrics
 
incTotalNumEvents() - Method in class org.apache.samza.system.hdfs.HdfsSystemConsumer.HdfsSystemConsumerMetrics
 
isStartingOffset() - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader.AvroFileCheckpoint
 

M

MultiFileHdfsReader - Class in org.apache.samza.system.hdfs.reader
A wrapper on top of SingleFileHdfsReader that handles multiple files per partition.
MultiFileHdfsReader(HdfsReaderFactory.ReaderType, SystemStreamPartition, List<String>, String) - Constructor for class org.apache.samza.system.hdfs.reader.MultiFileHdfsReader
 
MultiFileHdfsReader(HdfsReaderFactory.ReaderType, SystemStreamPartition, List<String>, String, int) - Constructor for class org.apache.samza.system.hdfs.reader.MultiFileHdfsReader
 

N

newBlockingQueue() - Method in class org.apache.samza.system.hdfs.HdfsSystemConsumer
 
nextOffset() - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader
 
nextOffset() - Method in interface org.apache.samza.system.hdfs.reader.SingleFileHdfsReader
Get the next offset, which is the offset for the next message that will be returned by readNext

O

offsetComparator(String, String) - Method in class org.apache.samza.system.hdfs.HdfsSystemAdmin
Compare two multi-file style offsets.
offsetComparator(String, String) - Static method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader
 
offsetComparator(HdfsReaderFactory.ReaderType, String, String) - Static method in class org.apache.samza.system.hdfs.reader.HdfsReaderFactory
 
open(String, String) - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader
 
open(String, String) - Method in interface org.apache.samza.system.hdfs.reader.SingleFileHdfsReader
Open the file and seek to specific offset for reading.
org.apache.samza.system.hdfs - package org.apache.samza.system.hdfs
 
org.apache.samza.system.hdfs.descriptors - package org.apache.samza.system.hdfs.descriptors
 
org.apache.samza.system.hdfs.partitioner - package org.apache.samza.system.hdfs.partitioner
 
org.apache.samza.system.hdfs.reader - package org.apache.samza.system.hdfs.reader
 

P

PartitionDescriptorUtil - Class in org.apache.samza.system.hdfs
Utility class with methods for working with partition descriptors.
poll(Set<SystemStreamPartition>, long) - Method in class org.apache.samza.system.hdfs.HdfsSystemConsumer

R

readNext() - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader
 
readNext() - Method in class org.apache.samza.system.hdfs.reader.MultiFileHdfsReader
 
readNext() - Method in interface org.apache.samza.system.hdfs.reader.SingleFileHdfsReader
Construct and return the next message envelope
reconnect() - Method in class org.apache.samza.system.hdfs.reader.MultiFileHdfsReader
Reconnect to the file systems in case of failure.
reconnect(String) - Method in class org.apache.samza.system.hdfs.reader.MultiFileHdfsReader
Reconnect to the file systems in case of failures.
register(SystemStreamPartition, String) - Method in class org.apache.samza.system.hdfs.HdfsSystemConsumer
registerSystemStreamPartition(SystemStreamPartition) - Method in class org.apache.samza.system.hdfs.HdfsSystemConsumer.HdfsSystemConsumerMetrics
 

S

seek(String) - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader
 
seek(String) - Method in interface org.apache.samza.system.hdfs.reader.SingleFileHdfsReader
Seek to a specific offset
SingleFileHdfsReader - Interface in org.apache.samza.system.hdfs.reader
 
start() - Method in class org.apache.samza.system.hdfs.HdfsSystemConsumer
stop() - Method in class org.apache.samza.system.hdfs.HdfsSystemConsumer

T

toConfig() - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
 
toString() - Method in class org.apache.samza.system.hdfs.partitioner.FileSystemAdapter.FileMetadata
 
toString() - Method in class org.apache.samza.system.hdfs.reader.AvroFileHdfsReader.AvroFileCheckpoint
 

V

valueOf(String) - Static method in enum org.apache.samza.system.hdfs.reader.HdfsReaderFactory.ReaderType
Returns the enum constant of this type with the specified name.
values() - Static method in enum org.apache.samza.system.hdfs.reader.HdfsReaderFactory.ReaderType
Returns an array containing the constants of this enum type, in the order they are declared.

W

withConsumerBlackList(String) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
Black list used by the directory partitioner to filter out unwanted files in an HDFS directory.
withConsumerBufferCapacity(long) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
The capacity of the hdfs consumer buffer - the blocking queue used for storing messages.
withConsumerGroupPattern(String) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
Group pattern used by directory partitioner for advanced partitioning.
withConsumerNumMaxRetries(long) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
Maximum number of retries for the HDFS consumer readers per partition.
withConsumerWhiteList(String) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
White list used by the directory partitioner to select the files to keep in an HDFS directory.
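The white list here and the black list under withConsumerBlackList both feed the directory partitioner's filtering step. A rough sketch of how such a pair could be applied follows. It assumes that both values are Java regular expressions matched against the full file path, which this index does not confirm, and the class FileFilterSketch is purely illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical white list / black list filter; not the DirectoryPartitioner code. */
final class FileFilterSketch {

    /**
     * Keeps a path only if it fully matches the white list and does not
     * fully match the black list. Treating both as regexes is an assumption.
     */
    static List<String> filter(List<String> paths, String whiteList, String blackList) {
        Pattern keep = Pattern.compile(whiteList);
        Pattern drop = Pattern.compile(blackList);
        List<String> result = new ArrayList<>();
        for (String path : paths) {
            if (keep.matcher(path).matches() && !drop.matcher(path).matches()) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
            "/data/a.avro", "/data/tmp-b.avro", "/data/_SUCCESS");
        // Keep .avro files, but drop anything whose file name starts with "tmp-".
        System.out.println(filter(paths, ".*\\.avro", ".*/tmp-.*")); // [/data/a.avro]
    }
}
```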
withDatePathFormat(String) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
In an HdfsWriter implementation that performs time-based output bucketing, the user may configure a date format (suitable for inclusion in a file path) using SimpleDateFormat formatting that the Bucketer implementation will use to generate HDFS paths and filenames.
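To make the date format concrete, the sketch below shows how a SimpleDateFormat pattern can be turned into a path segment. The pattern "yyyy_MM_dd-HH", the base directory, the UTC time zone, and the helper name bucketPath are all assumptions chosen for the example; how the Bucketer implementation actually combines the pieces is not described in this index.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical illustration of time-based path bucketing. */
final class DatePathSketch {

    /** Appends a date-derived segment to the base directory. */
    static String bucketPath(String baseDir, String datePathFormat, Date when) {
        SimpleDateFormat format = new SimpleDateFormat(datePathFormat, Locale.ROOT);
        // A fixed time zone keeps the example deterministic; this is an assumption.
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return baseDir + "/" + format.format(when);
    }

    public static void main(String[] args) {
        // The epoch (1970-01-01T00:00Z) is used so the output is stable.
        System.out.println(bucketPath("/user/me/out", "yyyy_MM_dd-HH", new Date(0L)));
        // /user/me/out/1970_01_01-00
    }
}
```

A pattern that avoids characters such as ":" and spaces keeps the generated segment safe to use inside a file path.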
withOutputBaseDir(String) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
The base output directory into which all HDFS output for this job will be written.
withReaderType(String) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
The type of the file reader for consumer (avro, plain, etc.)
withStagingDirectory(String) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
Staging directory for storing partition descriptors.
withWriteBatchSizeBytes(long) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
Split output files from all writer tasks based on # of bytes written to optimize MapReduce utilization for Hadoop jobs that will process the data later.
withWriteBatchSizeRecords(long) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
Split output files from all writer tasks based on # of records written to optimize MapReduce utilization for Hadoop jobs that will process the data later.
withWriteCompressionType(String) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
Simple, human-readable label for various compression options.
withWriterClassName(String) - Method in class org.apache.samza.system.hdfs.descriptors.HdfsSystemDescriptor
The fully-qualified class name of the HdfsWriter subclass that will write for this system.