Class DebuggingWordCount
- java.lang.Object
  - org.apache.beam.examples.DebuggingWordCount
public class DebuggingWordCount extends java.lang.Object

An example that verifies word counts in Shakespeare and includes Beam best practices.

This class, DebuggingWordCount, is the third in a series of four successively more detailed 'word count' examples. You may first want to take a look at MinimalWordCount and WordCount. After you've looked at this example, see the WindowedWordCount pipeline for an introduction to additional concepts.

Basic concepts, also in the MinimalWordCount and WordCount examples: reading text files; counting a PCollection; executing a Pipeline both locally and using a selected runner; defining DoFns.

New Concepts:
1. Logging using SLF4J, even in a distributed environment
2. Creating a custom metric (runners have varying levels of support)
3. Testing your Pipeline via PAssert

To execute this pipeline locally, specify general pipeline configuration:

    --project=YOUR_PROJECT_ID

To change the runner, specify:

    --runner=YOUR_SELECTED_RUNNER

The input file defaults to a public data set containing the text of King Lear, by William Shakespeare. You can override it and choose your own input with --inputFile.
Nested Class Summary
Nested Classes
Modifier and Type   Class                                 Description
static class        DebuggingWordCount.FilterTextFn       A DoFn that filters for a specific key based upon a regular expression.
static interface    DebuggingWordCount.WordCountOptions   Options supported by DebuggingWordCount.
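
The summary describes FilterTextFn as a DoFn that filters for a specific key based upon a regular expression. The following is a minimal plain-Java sketch of that idea only. It does not use the Beam API and is not the class's actual implementation. The class name FilterSketch, the two counter fields (hypothetical stand-ins for custom metrics), the choice of full-string matching, and the sample pattern and counts are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Plain-Java sketch of regex-based key filtering; not the Beam DoFn itself.
class FilterSketch {
    // Hypothetical stand-ins for custom metric counters.
    long matched = 0;
    long unmatched = 0;

    private final Pattern filter;

    FilterSketch(String regex) {
        this.filter = Pattern.compile(regex);
    }

    // Keeps only entries whose key fully matches the pattern,
    // counting both matching and non-matching keys.
    Map<String, Long> apply(Map<String, Long> wordCounts) {
        Map<String, Long> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : wordCounts.entrySet()) {
            if (filter.matcher(entry.getKey()).matches()) {
                matched++;
                kept.put(entry.getKey(), entry.getValue());
            } else {
                unmatched++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative word counts, not taken from any real data set.
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("Flourish", 3L);
        counts.put("stomach", 1L);
        counts.put("king", 42L);

        FilterSketch sketch = new FilterSketch("Flourish|stomach");
        Map<String, Long> kept = sketch.apply(counts);
        System.out.println(kept + " matched=" + sketch.matched
            + " unmatched=" + sketch.unmatched);
    }
}
```

In a real pipeline the same check would presumably run once per element inside a DoFn, with the counters reported through the runner's metrics support rather than held in fields.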
Constructor Summary
Constructors
Constructor            Description
DebuggingWordCount()
Method Summary
Modifier and Type   Method                          Description
static void         main(java.lang.String[] args)