| Class | Description |
|---|---|
| CachingCVB0Mapper | Runs ensemble learning by loading the ModelTrainer with two TopicModel instances: one from the previous iteration, the other empty. |
| CachingCVB0PerplexityMapper | |
| CVB0DocInferenceMapper | |
| CVB0Driver | See CachingCVB0Mapper for more details on scalability and room for improvement. |
| CVB0Driver.DualDoubleSumReducer | Sums keys and values independently. |
| CVB0TopicTermVectorNormalizerMapper | Performs L1 normalization of input vectors. |
| InMemoryCollapsedVariationalBayes0 | Runs the same algorithm as CVB0Driver, but sequentially, in memory. |
| ModelTrainer | Multithreaded LDA model trainer class, which primarily operates by running a "map/reduce" operation entirely in local memory (i.e., not a Hadoop job). The "map" operation takes the "read-only" TopicModel and uses it to iteratively learn the p(topic\|term, doc) distribution for documents; this can be done in parallel across many documents because the "read-only" model is only ever read. |
| TopicModel | Thin wrapper around a Matrix of counts of occurrences of (topic, term) pairs. |
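
The ModelTrainer entry describes a "map" step that reads a shared, read-only model to estimate p(topic|term, doc) for many documents in parallel. The following is a minimal sketch of that idea in plain Java, not the Mahout implementation: the class `ReadOnlyModelMapSketch`, both method names, the fixed per-document topic weights, and the single-pass (non-iterative) update are all assumptions made for illustration. It also assumes every topic has a positive total count, so no division by zero occurs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: none of these names come from Mahout.
public class ReadOnlyModelMapSketch {

    /**
     * Estimates p(topic | term, doc) for one term, taken here to be
     * proportional to docTopic[k] * topicTerm[k][term] / (total count of topic k).
     * This simplified form is an assumption, not the library's exact update.
     */
    static double[] topicGivenTermAndDoc(double[][] topicTerm, double[] docTopic, int term) {
        int numTopics = topicTerm.length;
        double[] p = new double[numTopics];
        double sum = 0.0;
        for (int k = 0; k < numTopics; k++) {
            double topicTotal = Arrays.stream(topicTerm[k]).sum();
            p[k] = docTopic[k] * topicTerm[k][term] / topicTotal;
            sum += p[k];
        }
        for (int k = 0; k < numTopics; k++) {
            p[k] /= sum; // normalize so the topic probabilities sum to 1
        }
        return p;
    }

    /**
     * "Map" step: every document reads the shared model concurrently.
     * Nothing writes to readOnlyModel, so no locking is needed.
     * Returns one vector of expected topic counts per document.
     */
    static List<double[]> mapDocuments(double[][] readOnlyModel, List<int[]> docs, double[] docTopic) {
        return docs.parallelStream()
            .map(doc -> {
                double[] expectedCounts = new double[readOnlyModel.length];
                for (int term : doc) {
                    double[] p = topicGivenTermAndDoc(readOnlyModel, docTopic, term);
                    for (int k = 0; k < expectedCounts.length; k++) {
                        expectedCounts[k] += p[k];
                    }
                }
                return expectedCounts;
            })
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        double[][] model = {{3.0, 1.0}, {1.0, 3.0}};  // (topic, term) counts
        double[] docTopic = {0.5, 0.5};               // assumed uniform topic weights
        List<int[]> docs = Arrays.asList(new int[]{0, 0}, new int[]{0, 1});
        for (double[] counts : mapDocuments(model, docs, docTopic)) {
            System.out.println(Arrays.toString(counts));
        }
        // prints [1.5, 0.5] then [1.0, 1.0]
    }
}
```

Because each document produces its own result array, the per-document work shares no mutable state. How the results are then folded into a second, writable model is not shown here.
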

| Enum | Description |
|---|---|
| CachingCVB0PerplexityMapper.Counters | Hadoop counters for CachingCVB0PerplexityMapper, to aid in debugging. |
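
For readers unfamiliar with the L1 normalization attributed to CVB0TopicTermVectorNormalizerMapper in the class table, here is a minimal sketch on a plain `double[]`. The class and method names are hypothetical, and the handling of an all-zero vector (returned unchanged) is an assumption rather than documented library behavior.

```java
import java.util.Arrays;

// Hypothetical sketch: plain arrays stand in for the library's vector type.
public class L1NormalizeSketch {

    /** Returns a copy of v scaled so the absolute values of its entries sum to 1. */
    static double[] l1Normalize(double[] v) {
        double norm = 0.0;
        for (double x : v) {
            norm += Math.abs(x);
        }
        if (norm == 0.0) {
            return v.clone(); // assumption: leave an all-zero vector unchanged
        }
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] topicTermCounts = {2.0, 1.0, 1.0};
        System.out.println(Arrays.toString(l1Normalize(topicTermCounts)));
        // prints [0.5, 0.25, 0.25]
    }
}
```

Applied to a row of non-negative (topic, term) counts, this turns the counts into a probability distribution over terms.
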

Copyright © 2008–2017 The Apache Software Foundation. All rights reserved.