public class ComputeCooccurrenceMatrixStripes
extends org.apache.hadoop.conf.Configured
implements org.apache.hadoop.util.Tool
Implementation of the "stripes" algorithm for computing co-occurrence matrices from a large text collection. This algorithm is described in Chapter 3 of "Data-Intensive Text Processing with MapReduce" by Lin & Dyer, as well as the following paper:
Jimmy Lin. Scalable Language Processing Algorithms for the Masses: A Case Study in Computing Word Co-occurrence Matrices with MapReduce. Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP 2008), pages 419-428.
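The idea behind the stripes approach can be sketched in plain Java without Hadoop. This is an illustrative, in-memory sketch and not the implementation of this tool: the class `StripesSketch`, its method names, and the `window` parameter are hypothetical. Each term maps to a "stripe" (an associative array of neighbor counts), and stripes for the same term are summed element-wise, which is the role a reducer would play in a MapReduce job.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, in-memory illustration of the stripes idea (no Hadoop involved).
final class StripesSketch {

    // "Map" step: build one stripe (neighbor -> count) per term in a sentence.
    // Neighbors are the tokens within `window` positions of the term (an assumption).
    public static Map<String, Map<String, Integer>> stripesFor(List<String> tokens, int window) {
        Map<String, Map<String, Integer>> stripes = new HashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            Map<String, Integer> stripe =
                stripes.computeIfAbsent(tokens.get(i), k -> new HashMap<>());
            int lo = Math.max(0, i - window);
            int hi = Math.min(tokens.size() - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (j != i) {
                    stripe.merge(tokens.get(j), 1, Integer::sum);
                }
            }
        }
        return stripes;
    }

    // "Reduce" step: element-wise sum of partial stripes into a running total.
    public static void mergeInto(Map<String, Map<String, Integer>> total,
                                 Map<String, Map<String, Integer>> partial) {
        partial.forEach((term, stripe) -> {
            Map<String, Integer> acc = total.computeIfAbsent(term, k -> new HashMap<>());
            stripe.forEach((neighbor, count) -> acc.merge(neighbor, count, Integer::sum));
        });
    }

    // Runs both steps over a small collection of tokenized sentences.
    public static Map<String, Map<String, Integer>> cooccurrences(List<List<String>> sentences,
                                                                  int window) {
        Map<String, Map<String, Integer>> total = new HashMap<>();
        for (List<String> sentence : sentences) {
            mergeInto(total, stripesFor(sentence, window));
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("a", "b"));
        // Prints each term with its stripe; HashMap iteration order is unspecified.
        cooccurrences(sentences, 1).forEach((term, stripe) ->
            System.out.println(term + " -> " + stripe));
    }
}
```

With the two sample sentences and a window of 1, the stripe for `b` counts `a` twice and `c` once. Compared with emitting one record per word pair, grouping counts by term produces fewer, larger intermediate records, at the cost of holding a whole stripe in memory.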
| Constructor and Description |
|---|
| ComputeCooccurrenceMatrixStripes(): Creates an instance of this tool. |
| Modifier and Type | Method and Description |
|---|---|
| static void | main(String[] args): Dispatches command-line arguments to the tool via the ToolRunner. |
| int | run(String[] args): Runs this tool. |
Copyright © 2015. All rights reserved.