public class WindowedWordCount
extends java.lang.Object
This class, WindowedWordCount, is the last in a series of four successively more
detailed 'word count' examples. First take a look at MinimalWordCount, WordCount,
and DebuggingWordCount.
Basic concepts, also in the MinimalWordCount, WordCount, and DebuggingWordCount examples: Reading text files; counting a PCollection; writing to GCS; executing a Pipeline both locally and using a selected runner; defining DoFns; user-defined PTransforms; defining PipelineOptions.
New Concepts:
1. Unbounded and bounded pipeline input modes
2. Adding timestamps to data
3. Windowing
4. Re-using PTransforms over windowed PCollections
5. Accessing the window of an element
6. Writing data to per-window text files
By default, the examples will run with the DirectRunner. To change the runner,
specify:
--runner=YOUR_SELECTED_RUNNER
See examples/java/README.md for instructions about how to configure different runners.
To execute this pipeline locally, specify a local output file (if using the DirectRunner) or an output prefix on a supported distributed file system:
--output=[YOUR_LOCAL_FILE | YOUR_OUTPUT_PREFIX]
The input file defaults to a public data set containing the text of King Lear, by William
Shakespeare. You can override it and choose your own input with --inputFile.
By default, the pipeline applies fixed windowing with 10-minute windows. You can change this
interval by setting the --windowSize parameter (in minutes), e.g. --windowSize=15 for
15-minute windows.
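As a rough illustration of what fixed windowing means, the sketch below computes the window that a timestamp falls into for a given window size. It is plain Java and does not use the Beam API. The class and method names (FixedWindowSketch, windowStart, windowEnd) are hypothetical, and aligning windows to the epoch is an assumption made for this example.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of WindowedWordCount or the Beam SDK.
final class FixedWindowSketch {

    /** Returns the inclusive start of the fixed window containing the timestamp. */
    static Instant windowStart(Instant timestamp, Duration windowSize) {
        long sizeMillis = windowSize.toMillis();
        long ts = timestamp.toEpochMilli();
        // Assumption: windows are aligned to the epoch. floorMod keeps the
        // result correct for timestamps before the epoch as well.
        return Instant.ofEpochMilli(ts - Math.floorMod(ts, sizeMillis));
    }

    /** Returns the exclusive end of the fixed window containing the timestamp. */
    static Instant windowEnd(Instant timestamp, Duration windowSize) {
        return windowStart(timestamp, windowSize).plus(windowSize);
    }

    public static void main(String[] args) {
        Duration size = Duration.ofMinutes(10); // mirrors the default window size
        Instant ts = Instant.parse("2024-01-01T12:07:30Z");
        System.out.println(windowStart(ts, size) + " .. " + windowEnd(ts, size));
    }
}
```

With a 10-minute window, every element stamped between 12:00:00 (inclusive) and 12:10:00 (exclusive) lands in the same window, so word counts are computed separately for each such interval.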
When the process receives a termination signal (CTRL-C), the example tries to cancel the pipeline.
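This page does not say how the cancellation is wired up. One common way to react to CTRL-C in Java is a JVM shutdown hook, sketched below under that assumption. The Cancellable interface and the CancelOnShutdownSketch class are hypothetical stand-ins for a running pipeline handle, not Beam types.

```java
// Hypothetical sketch: cancel a running job when the JVM is asked to exit.
final class CancelOnShutdownSketch {

    /** Stand-in for a handle to a running pipeline (assumption, not a Beam type). */
    interface Cancellable {
        void cancel() throws Exception;
    }

    /** Builds a thread that makes a best-effort attempt to cancel the job. */
    static Thread cancelHook(Cancellable job) {
        return new Thread(() -> {
            try {
                job.cancel();
            } catch (Exception e) {
                // Cancellation is best effort; report the failure and let the JVM exit.
                System.err.println("Failed to cancel: " + e.getMessage());
            }
        }, "cancel-on-shutdown");
    }

    public static void main(String[] args) {
        Cancellable job = () -> System.out.println("Cancelling pipeline");
        // The JVM runs registered hooks on normal exit and on SIGINT (CTRL-C).
        Runtime.getRuntime().addShutdownHook(cancelHook(job));
        System.out.println("Pipeline running; exiting will trigger the hook");
    }
}
```

A shutdown hook only gets a short time to run before the JVM exits, which fits the "try to cancel" wording: the cancellation request is sent, but completion is not guaranteed.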
| Modifier and Type | Class and Description |
|---|---|
| static class | WindowedWordCount.DefaultToCurrentSystemTime: A DefaultValueFactory that returns the current system time. |
| static class | WindowedWordCount.DefaultToMinTimestampPlusOneHour: A DefaultValueFactory that returns the minimum timestamp plus one hour. |
| static interface | WindowedWordCount.Options: Options for WindowedWordCount. |
| Constructor and Description |
|---|
| WindowedWordCount() |
| Modifier and Type | Method and Description |
|---|---|
| static void | main(java.lang.String[] args) |