public class GameStats extends LeaderBoard
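This class is the fourth in a series of four pipelines that tell a story in a 'gaming' domain,
following UserScore, HourlyTeamScore, and LeaderBoard. New concepts:
session windows and finding session duration; use of both singleton and non-singleton side
inputs.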
This pipeline builds on the LeaderBoard functionality and adds some "business
intelligence" analysis: abuse detection and usage patterns. The pipeline derives the mean of the
per-user score sums for a window and uses that mean to identify likely spammers/robots. (The robots
have a higher click rate than the human users.) The 'robot' users are then filtered out when
calculating the team scores.
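The mean-based filter described above can be illustrated without Beam. The following is a minimal plain-Java sketch, not the pipeline's actual transform: the class name `SpammyUserSketch`, the method `findSpammyUsers`, and the multiplier `SCORE_WEIGHT = 2.5` are assumptions chosen for illustration. It flags any user whose score sum for a window exceeds the window mean by that multiplier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration of a mean-based spam filter; not the Beam transform itself. */
class SpammyUserSketch {
    // Hypothetical multiplier: users scoring above mean * SCORE_WEIGHT are treated as spammy.
    static final double SCORE_WEIGHT = 2.5;

    /** Returns the users (sorted by name) whose score sum exceeds the weighted mean. */
    static List<String> findSpammyUsers(Map<String, Integer> scoreSumByUser) {
        // Mean of the per-user score sums for one window; 0.0 if there are no users.
        double mean = scoreSumByUser.values().stream()
                .mapToInt(Integer::intValue)
                .average()
                .orElse(0.0);

        List<String> spammy = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : scoreSumByUser.entrySet()) {
            if (entry.getValue() > mean * SCORE_WEIGHT) {
                spammy.add(entry.getKey());
            }
        }
        Collections.sort(spammy);
        return spammy;
    }
}
```

For example, with score sums of 10, 12, and 11 for three human users and 200 for a fourth user, the mean is 58.25 and the threshold is 145.625, so only the fourth user is flagged. In the pipeline, the resulting set of flagged users would then be used to exclude their events before team scores are computed.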
Additionally, user sessions are tracked: that is, we find bursts of user activity using session windows. The mean session duration is then recorded in the context of subsequent fixed windowing. (This could be used to tell us which games give us greater user retention.)
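To make the session idea concrete, here is a minimal plain-Java sketch of gap-based sessionization for a single user. It is an illustration only, not the pipeline's code: the names `SessionDurationSketch`, `sessionDurations`, and `meanDuration` are hypothetical. The sketch assumes that each event opens a window of length `gapMillis` starting at its timestamp, that overlapping windows merge into one session, and that a session's duration is the span of the merged window (so it includes the trailing gap).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration of gap-based sessions for one user; not the Beam windowing code. */
class SessionDurationSketch {

    /** Groups event timestamps (ms) into sessions and returns each session's duration in ms. */
    static List<Long> sessionDurations(long[] eventTimesMillis, long gapMillis) {
        long[] times = eventTimesMillis.clone();
        Arrays.sort(times);

        List<Long> durations = new ArrayList<>();
        if (times.length == 0) {
            return durations;
        }

        // Each event opens a window [t, t + gap); overlapping windows merge.
        long start = times[0];
        long end = times[0] + gapMillis;
        for (int i = 1; i < times.length; i++) {
            if (times[i] < end) {
                // Event falls inside the current session: extend the session's end.
                end = times[i] + gapMillis;
            } else {
                // Gap exceeded: close the current session and start a new one.
                durations.add(end - start);
                start = times[i];
                end = times[i] + gapMillis;
            }
        }
        durations.add(end - start);
        return durations;
    }

    /** Mean of the session durations, or 0.0 when there are none. */
    static double meanDuration(List<Long> durations) {
        return durations.stream().mapToLong(Long::longValue).average().orElse(0.0);
    }
}
```

With events at 0, 1000, 2000, and 10000 ms and a 5000 ms gap, the first three events merge into one session of 7000 ms and the last event forms a session of 5000 ms, giving a mean of 6000 ms. In the pipeline, such per-session durations would be re-windowed into fixed windows before the mean is taken.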
Run org.apache.beam.examples.complete.game.injector.Injector to generate Pub/Sub data
for this pipeline. The Injector documentation provides more detail.
To execute this pipeline, specify the pipeline configuration like this:
--project=YOUR_PROJECT_ID
--tempLocation=gs://YOUR_TEMP_DIRECTORY
--runner=YOUR_RUNNER
--dataset=YOUR-DATASET
--topic=projects/YOUR-PROJECT/topics/YOUR-TOPIC
The BigQuery dataset you specify must already exist. The Pub/Sub topic you specify should be the same topic to which the Injector is publishing.
| Modifier and Type | Class and Description |
|---|---|
| static class | GameStats.CalculateSpammyUsers: Filter out all users but those with a high clickrate, which we will consider as 'spammy' users. |
| static interface | GameStats.Options: Options supported by GameStats. |

Nested classes inherited from UserScore: UserScore.ExtractAndSumScore

| Constructor and Description |
|---|
| GameStats() |

| Modifier and Type | Method and Description |
|---|---|
| protected static java.util.Map<java.lang.String,WriteToBigQuery.FieldInfo<java.lang.Double>> | configureSessionWindowWrite(): Create a map of information that describes how to write pipeline output to BigQuery. |
| protected static java.util.Map<java.lang.String,WriteToBigQuery.FieldInfo<org.apache.beam.sdk.values.KV<java.lang.String,java.lang.Integer>>> | configureWindowedWrite(): Create a map of information that describes how to write pipeline output to BigQuery. |
| static void | main(java.lang.String[] args) |

Inherited methods: configureBigQueryWrite, configureGlobalWindowBigQueryWrite, configureWindowedTableWrite, configureOutput

protected static java.util.Map<java.lang.String,WriteToBigQuery.FieldInfo<org.apache.beam.sdk.values.KV<java.lang.String,java.lang.Integer>>> configureWindowedWrite()

protected static java.util.Map<java.lang.String,WriteToBigQuery.FieldInfo<java.lang.Double>> configureSessionWindowWrite()

public static void main(java.lang.String[] args) throws java.lang.Exception