Class WindowedWordCount


  • public class WindowedWordCount
    extends java.lang.Object
    An example that counts words in text, and can run over either unbounded or bounded input collections.

    This class, WindowedWordCount, is the last in a series of four successively more detailed 'word count' examples. Before reading it, take a look at MinimalWordCount, WordCount, and DebuggingWordCount.

    Basic concepts, also covered in the MinimalWordCount, WordCount, and DebuggingWordCount examples: reading text files; counting a PCollection; writing to GCS; executing a Pipeline both locally and using a selected runner; defining DoFns; user-defined PTransforms; defining PipelineOptions.

    New Concepts:

       1. Unbounded and bounded pipeline input modes
       2. Adding timestamps to data
       3. Windowing
       4. Re-using PTransforms over windowed PCollections
       5. Accessing the window of an element
       6. Writing data to per-window text files
     

    By default, the examples will run with the DirectRunner. To change the runner, specify:

    
     --runner=YOUR_SELECTED_RUNNER
     
    See examples/java/README.md for instructions about how to configure different runners.

    To execute this pipeline, specify a local output file (if using the DirectRunner) or an output prefix on a supported distributed file system.

    
     --output=[YOUR_LOCAL_FILE | YOUR_OUTPUT_PREFIX]
     

    The input file defaults to a public data set containing the text of King Lear, by William Shakespeare. You can override it and choose your own input with --inputFile.
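    The flags described above all follow the --name=value form. As a rough illustration of that convention only, the following plain-Java sketch collects such flags into a map. It is not how Beam parses its PipelineOptions; the class FlagSketch and the method parse are hypothetical names introduced here, and the only default it fills in is the DirectRunner default stated above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Beam or of this example class.
class FlagSketch {

    /** Collects "--name=value" arguments into a map, starting from the documented default runner. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> flags = new LinkedHashMap<>();
        // Default taken from the documentation: examples run with the DirectRunner.
        flags.put("runner", "DirectRunner");
        for (String arg : args) {
            int eq = arg.indexOf('=');
            // Require the form --name=value with a non-empty name.
            if (!arg.startsWith("--") || eq < 3) {
                throw new IllegalArgumentException("Expected --name=value but got: " + arg);
            }
            flags.put(arg.substring(2, eq), arg.substring(eq + 1));
        }
        return flags;
    }

    public static void main(String[] args) {
        // Prints the effective flags, one per line.
        parse(args).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

    Later flags overwrite earlier ones and the default, so passing --runner replaces the DirectRunner entry.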

    By default, the pipeline applies fixed windowing with 10-minute windows. You can change this interval with the --windowSize parameter, which is given in minutes, e.g. --windowSize=15 for 15-minute windows.
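    To make the effect of the window size concrete, the following plain-Java sketch computes which fixed window a timestamp falls into. It does not use the Beam API; the class FixedWindowSketch and its methods are hypothetical, and aligning window boundaries to the Unix epoch is an assumption of this sketch.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration of fixed windowing; not the Beam implementation.
class FixedWindowSketch {

    /** Returns the inclusive start of the fixed window that contains the timestamp. */
    static Instant windowStart(Instant timestamp, Duration size) {
        long sizeMillis = size.toMillis();
        long ts = timestamp.toEpochMilli();
        // floorMod keeps the result correct for timestamps before the epoch.
        return Instant.ofEpochMilli(ts - Math.floorMod(ts, sizeMillis));
    }

    /** Returns the exclusive end of the fixed window that contains the timestamp. */
    static Instant windowEnd(Instant timestamp, Duration size) {
        return windowStart(timestamp, size).plus(size);
    }

    public static void main(String[] args) {
        Instant ts = Instant.parse("2024-01-01T12:44:00Z");
        // Compare the default 10-minute windows with 15-minute windows.
        for (long minutes : new long[] {10, 15}) {
            Duration size = Duration.ofMinutes(minutes);
            System.out.println(minutes + " min: [" + windowStart(ts, size)
                    + ", " + windowEnd(ts, size) + ")");
        }
    }
}
```

    With these assumptions, an element stamped 12:44:00 falls into the window starting at 12:40:00 for 10-minute windows and into the window starting at 12:30:00 for 15-minute windows, so the same input is grouped differently when the window size changes.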

    The example attempts to cancel the pipeline when the process receives a termination signal (CTRL-C).
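    One common way for a JVM program to react to CTRL-C is a shutdown hook registered with Runtime.addShutdownHook. The sketch below shows that general technique; whether this example uses exactly this mechanism is an assumption, and FakePipelineJob is a hypothetical stand-in for a running pipeline handle, not a Beam type.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of cancel-on-shutdown; not taken from the example's source.
class CancelOnShutdownSketch {

    /** Stand-in for a handle to a running pipeline (an assumption of this sketch). */
    static final class FakePipelineJob {
        private final AtomicBoolean cancelled = new AtomicBoolean(false);

        void cancel() {
            // Only report the first cancellation request.
            if (cancelled.compareAndSet(false, true)) {
                System.out.println("Cancelling pipeline...");
            }
        }

        boolean isCancelled() {
            return cancelled.get();
        }
    }

    /** Builds the thread that the JVM runs during shutdown to cancel the job. */
    static Thread cancelHook(FakePipelineJob job) {
        return new Thread(job::cancel, "cancel-on-shutdown");
    }

    public static void main(String[] args) throws InterruptedException {
        FakePipelineJob job = new FakePipelineJob();
        // The JVM runs registered hooks on CTRL-C (SIGINT) and also on normal exit.
        Runtime.getRuntime().addShutdownHook(cancelHook(job));
        System.out.println("Pipeline running; press CTRL-C to cancel.");
        Thread.sleep(200); // Simulates a short period of pipeline work.
    }
}
```

    Shutdown hooks are best effort: they do not run if the JVM is killed forcibly, which is consistent with the documentation saying the example only tries to cancel the pipeline.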

    • Constructor Detail

      • WindowedWordCount

        public WindowedWordCount()
    • Method Detail

      • main

        public static void main(java.lang.String[] args)
                         throws java.io.IOException
        Throws:
        java.io.IOException