Class HourlyTeamScore

  • Direct Known Subclasses:
    LeaderBoard

    public class HourlyTeamScore
    extends UserScore
    This class is the second in a series of four pipelines that tell a story in a 'gaming' domain, following UserScore. In addition to the concepts introduced in UserScore, it introduces windowing, element timestamps, and the use of Filter.by().

    This pipeline processes data collected from gaming events in batch, building on UserScore but using fixed windows. It calculates the sum of scores per team for each window. Optionally, you can specify two timestamps: data timestamped before the first or after the second is filtered out. This supports a model in which late data collected after the intended analysis window can be included, while late-arriving data from before the beginning of the analysis window can be removed. By using windowing and adding element timestamps, we can do finer-grained analysis than with the UserScore pipeline. However, batch processing is high-latency: results for plays at the beginning of the batch's time period are not available until the batch is processed.
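The filtering and fixed-window summing described above can be illustrated without Beam. The following is a minimal plain-Java sketch, not the pipeline's actual implementation: the class WindowedTeamScoreSketch, the GameEvent type, and both method names are hypothetical, windows are assumed to be aligned to the epoch, and treating the start and stop timestamps as exclusive bounds is an assumption.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration only; every name here is hypothetical, not Beam API. */
class WindowedTeamScoreSketch {

  /** One parsed game event: which team scored, how much, and when. */
  static final class GameEvent {
    final String team;
    final int score;
    final Instant timestamp;

    GameEvent(String team, int score, Instant timestamp) {
      this.team = team;
      this.score = score;
      this.timestamp = timestamp;
    }
  }

  /** Start of the fixed window containing the timestamp (assumes epoch-aligned windows). */
  static Instant windowStart(Instant timestamp, Duration windowSize) {
    long sizeMillis = windowSize.toMillis();
    long ts = timestamp.toEpochMilli();
    return Instant.ofEpochMilli(ts - Math.floorMod(ts, sizeMillis));
  }

  /**
   * Sums scores per (window start, team), keeping only events whose timestamp
   * lies strictly between start and stop (exclusive bounds are an assumption).
   */
  static Map<Instant, Map<String, Integer>> sumPerWindow(
      List<GameEvent> events, Instant start, Instant stop, Duration windowSize) {
    Map<Instant, Map<String, Integer>> sums = new TreeMap<>();
    for (GameEvent e : events) {
      boolean inRange = e.timestamp.isAfter(start) && e.timestamp.isBefore(stop);
      if (!inRange) {
        continue; // filtered out, like data outside the start/stop timestamps
      }
      sums.computeIfAbsent(windowStart(e.timestamp, windowSize), k -> new TreeMap<>())
          .merge(e.team, e.score, Integer::sum);
    }
    return sums;
  }
}
```

With a one-hour window size (suggested by the class name, but still an assumption here), two scores for the same team at 16:15 and 16:45 land in the same window and are summed, while a score at 17:05 starts a new window.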

    To execute this pipeline, specify the pipeline configuration like this:

    
     --tempLocation=YOUR_TEMP_DIRECTORY
     --runner=YOUR_RUNNER
     --output=YOUR_OUTPUT_DIRECTORY
     (possibly options specific to your runner or permissions for your temp/output locations)
     

    Optionally include --input to specify the batch input file path. To indicate a time after which data should be filtered out, include the --stopMin argument. For example, --stopMin=2015-10-18-23-59 indicates that any data timestamped after 23:59 PST on 2015-10-18 should not be included in the analysis. To indicate a time before which data should be filtered out, include the --startMin argument. If you're using the default input specified in UserScore, "gs://apache-beam-samples/game/gaming_data*.csv", then --startMin=2015-11-16-16-10 --stopMin=2015-11-17-16-10 are good values.
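Judging by the example values, the --startMin and --stopMin arguments use a year-month-day-hour-minute layout. The sketch below shows one way such a value could be turned into an instant using java.time. It is an illustration under stated assumptions, not the pipeline's own parsing code: the class MinuteArgSketch and the method parseMinuteArg are hypothetical, and "PST" is assumed to mean the America/Los_Angeles time zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

/** Illustration only; the class and method names are hypothetical. */
class MinuteArgSketch {

  // Pattern inferred from example values such as 2015-10-18-23-59.
  static final DateTimeFormatter MINUTE_FORMAT =
      DateTimeFormatter.ofPattern("uuuu-MM-dd-HH-mm");

  // Assumption: "PST" in the description refers to US Pacific time.
  static final ZoneId PACIFIC = ZoneId.of("America/Los_Angeles");

  /** Parses a minute-resolution argument value into an instant on the time line. */
  static Instant parseMinuteArg(String value) {
    return LocalDateTime.parse(value, MINUTE_FORMAT).atZone(PACIFIC).toInstant();
  }
}
```

Under these assumptions, parseMinuteArg("2015-11-16-16-10") corresponds to 2015-11-17T00:10:00Z, because US Pacific time is eight hours behind UTC in mid-November.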

    • Constructor Summary

      Constructors 
      Constructor Description
      HourlyTeamScore()  
    • Method Summary

      Modifier and Type Method Description
      protected static java.util.Map<java.lang.String,WriteToText.FieldFn<org.apache.beam.sdk.values.KV<java.lang.String,java.lang.Integer>>> configureOutput()
      Create a map of information that describes how to write pipeline output to text.
      static void main(java.lang.String[] args)
      Run a batch pipeline to do windowed analysis of the data.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • HourlyTeamScore

        public HourlyTeamScore()
    • Method Detail

      • configureOutput

        protected static java.util.Map<java.lang.String,WriteToText.FieldFn<org.apache.beam.sdk.values.KV<java.lang.String,java.lang.Integer>>> configureOutput()
        Create a map of information that describes how to write pipeline output to text. This map is passed to the WriteToText constructor to write team score sums and includes information about window start time.
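The idea of a map from output column name to a function that extracts that column's value can be sketched in plain Java. This is a stand-in under stated assumptions, not the real method: BiFunction replaces WriteToText.FieldFn, Map.Entry replaces KV, the window is reduced to its start instant, and the column names and the formatRow helper are hypothetical.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.stream.Collectors;

/** Plain-Java stand-in; types and column names are assumptions, not Beam API. */
class OutputConfigSketch {

  /** Maps each (hypothetical) column name to a function of (team score, window start). */
  static Map<String, BiFunction<Map.Entry<String, Integer>, Instant, Object>> configureOutput() {
    Map<String, BiFunction<Map.Entry<String, Integer>, Instant, Object>> config =
        new LinkedHashMap<>(); // LinkedHashMap keeps columns in insertion order
    config.put("team", (teamScore, windowStart) -> teamScore.getKey());
    config.put("total_score", (teamScore, windowStart) -> teamScore.getValue());
    config.put("window_start", (teamScore, windowStart) -> windowStart.toString());
    return config;
  }

  /** Renders one output line by applying every column function to the element. */
  static String formatRow(Map.Entry<String, Integer> teamScore, Instant windowStart) {
    return configureOutput().entrySet().stream()
        .map(col -> col.getKey() + ": " + col.getValue().apply(teamScore, windowStart))
        .collect(Collectors.joining(", "));
  }
}
```

Keeping the column definitions in a map means a writer can stay generic: it iterates over the entries and needs no knowledge of what a team score or a window is.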
      • main

        public static void main(java.lang.String[] args)
                         throws java.lang.Exception
        Run a batch pipeline to do windowed analysis of the data.
        Throws:
        java.lang.Exception