| Class Summary | |
|---|---|
| ChunkedWriter | |
| MailArchivesClusteringAnalyzer | Custom Lucene Analyzer designed for aggressive feature reduction when clustering the ASF Mail Archives. It uses an extended set of stop words, excludes non-alphanumeric tokens, and applies Porter stemming. |
| PrefixAdditionFilter | Default parser for parsing text into sequence files. |
| SequenceFilesFromCsvFilter | Implements an example csv to sequence file parser. |
| SequenceFilesFromDirectory | Converts a directory of text documents into SequenceFiles of specified chunkSize. |
| SequenceFilesFromDirectoryFilter | Implement this interface if you wish to extend SequenceFilesFromDirectory with your own parsing logic. |
| SequenceFilesFromMailArchives | Converts a directory of gzipped mail archives into SequenceFiles of specified chunkSize. |
| SequenceFilesFromMailArchives.ChunkedWriter | |
| TextParagraphSplittingJob | |
| TextParagraphSplittingJob.SplitMap | |
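
Several classes above write their output in chunks of a specified size. The following is a minimal sketch of that rollover idea in plain Java, not the actual implementation of `ChunkedWriter` or any class in this package. The class name `ChunkRolloverSketch`, the method `chunk`, and the rule of rolling over before a document would overflow the current chunk are all assumptions made for illustration. No Hadoop or Lucene APIs are used.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: groups documents into size-bounded chunks.
public class ChunkRolloverSketch {

    /**
     * Groups documents into chunks whose combined UTF-8 size stays within
     * maxChunkBytes. A single document larger than the limit still gets
     * a chunk of its own.
     */
    public static List<List<String>> chunk(List<String> documents, int maxChunkBytes) {
        List<List<String>> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentBytes = 0;
        for (String doc : documents) {
            int docBytes = doc.getBytes(StandardCharsets.UTF_8).length;
            // Roll over before adding a document that would overflow a non-empty chunk.
            if (!current.isEmpty() && currentBytes + docBytes > maxChunkBytes) {
                chunks.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(doc);
            currentBytes += docBytes;
        }
        // Flush the final, possibly partial, chunk.
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Three small documents with an 8-byte limit yield two chunks.
        System.out.println(chunk(List.of("aaaa", "bbbb", "cc"), 8));
    }
}
```

In a real job each chunk would correspond to one output file rather than an in-memory list; the grouping logic is the only part this sketch is meant to show.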