public class TfIdf
extends java.lang.Object
Concepts: joining data; side inputs; logging
To execute this pipeline locally, specify a local output file or output prefix on GCS:
--output=[YOUR_LOCAL_FILE | gs://YOUR_OUTPUT_PREFIX]
To change the runner, specify:
--runner=YOUR_SELECTED_RUNNER
See examples/java/README.md for instructions about how to configure different runners.
The default input is gs://apache-beam-samples/shakespeare/ and can be overridden with
--input.
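For readers unfamiliar with the score this pipeline produces, the following is a minimal, Beam-free sketch of one common TF-IDF definition. The class name TfIdfSketch, the method compute, and the weighting itself (term count divided by document length, multiplied by the natural log of total documents over documents containing the term) are assumptions made for illustration; they are not taken from TfIdf.ComputeTfIdf.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, Beam-free illustration of TF-IDF scoring; not the TfIdf.ComputeTfIdf code. */
class TfIdfSketch {

  /**
   * Scores every (term, document) pair.
   *
   * @param docs map from document name to its words (assumed already tokenized)
   * @return map from term to (document name to TF-IDF score)
   */
  static Map<String, Map<String, Double>> compute(Map<String, List<String>> docs) {
    // Document frequency: the number of documents that contain each term.
    Map<String, Integer> docFreq = new HashMap<>();
    for (List<String> words : docs.values()) {
      Set<String> distinct = new HashSet<>(words);
      for (String term : distinct) {
        docFreq.merge(term, 1, Integer::sum);
      }
    }

    int totalDocs = docs.size();
    Map<String, Map<String, Double>> scores = new HashMap<>();
    for (Map.Entry<String, List<String>> doc : docs.entrySet()) {
      List<String> words = doc.getValue();
      // Raw count of each term within this document.
      Map<String, Integer> counts = new HashMap<>();
      for (String term : words) {
        counts.merge(term, 1, Integer::sum);
      }
      for (Map.Entry<String, Integer> c : counts.entrySet()) {
        // Assumed definitions: tf = count / words in document, idf = ln(N / df).
        double tf = (double) c.getValue() / words.size();
        double idf = Math.log((double) totalDocs / docFreq.get(c.getKey()));
        scores.computeIfAbsent(c.getKey(), k -> new HashMap<>()).put(doc.getKey(), tf * idf);
      }
    }
    return scores;
  }

  public static void main(String[] args) {
    Map<String, List<String>> docs = new HashMap<>();
    docs.put("a.txt", Arrays.asList("to", "be", "or", "not", "to", "be"));
    docs.put("b.txt", Arrays.asList("to", "sleep"));
    System.out.println(compute(docs));
  }
}
```

Under these assumed definitions, a term that occurs in every document receives a score of 0, because ln(N / N) = 0; terms concentrated in few documents score highest.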
| Modifier and Type | Class and Description |
|---|---|
| static class | TfIdf.ComputeTfIdf: A transform containing a basic TF-IDF pipeline. |
| static interface | TfIdf.Options: Options supported by TfIdf. |
| static class | TfIdf.ReadDocuments: Reads the documents at the provided URIs and returns all lines from the documents, tagged with the document they are from. |
| static class | TfIdf.WriteTfIdf: A PTransform to write, in CSV format, a mapping from term and URI to score. |
| Constructor and Description |
|---|
| TfIdf() |
| Modifier and Type | Method and Description |
|---|---|
| static java.util.Set<java.net.URI> | listInputDocuments(TfIdf.Options options): Lists documents contained beneath the options.input prefix/directory. |
| static void | main(java.lang.String[] args) |
public static java.util.Set<java.net.URI> listInputDocuments(TfIdf.Options options) throws java.net.URISyntaxException, java.io.IOException
Lists documents contained beneath the options.input prefix/directory.
Throws:
java.net.URISyntaxException
java.io.IOException

public static void main(java.lang.String[] args) throws java.lang.Exception
Throws:
java.lang.Exception
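
The directory-listing behaviour described for listInputDocuments can be illustrated for the local-filesystem case with standard java.nio calls. This is only a sketch: the class ListDocumentsSketch and the method listLocalDocuments are hypothetical names, the sketch does not handle gs:// prefixes, and it makes no claim about how TfIdf.listInputDocuments is actually implemented.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Stream;

/** Hypothetical local-filesystem illustration; not the body of TfIdf.listInputDocuments. */
class ListDocumentsSketch {

  /** Returns the URIs of the regular files directly beneath {@code inputDir}. */
  static Set<URI> listLocalDocuments(String inputDir) throws IOException {
    Set<URI> uris = new HashSet<>();
    // Files.list holds an open directory handle, so close it with try-with-resources.
    try (Stream<Path> entries = Files.list(Paths.get(inputDir))) {
      // Skip subdirectories; only plain files are treated as documents here.
      entries.filter(Files::isRegularFile).forEach(p -> uris.add(p.toUri()));
    }
    return uris;
  }

  public static void main(String[] args) throws IOException {
    // Lists the directory given as the first argument, or the working directory.
    String dir = args.length > 0 ? args[0] : ".";
    for (URI uri : listLocalDocuments(dir)) {
      System.out.println(uri);
    }
  }
}
```

Returning a Set of URIs mirrors the documented return type, so each document is identified once regardless of listing order.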