- All Implemented Interfaces:
- java.io.Serializable, org.apache.beam.sdk.transforms.display.HasDisplayData
- Enclosing class:
- TfIdf
public static class TfIdf.ComputeTfIdf
extends org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<java.net.URI,java.lang.String>>,org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<java.lang.String,org.apache.beam.sdk.values.KV<java.net.URI,java.lang.Double>>>>
A transform containing a basic TF-IDF pipeline. The input consists of KV objects where the key
is the document's URI and the value is a piece of the document's content. The output maps each
term to a KV pairing a document URI with that term's TF-IDF score in the document.
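The computation described above can be illustrated without Beam. The following is a minimal in-memory sketch, not the transform's actual implementation. It assumes the common definitions tf = (occurrences of the term in the document) / (total terms in the document) and idf = ln(number of documents / number of documents containing the term); the transform's exact formula and tokenization may differ. The class name TfIdfSketch, the method computeTfIdf, and the "doc://" URIs are hypothetical names chosen for illustration.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class TfIdfSketch {

    /**
     * Hypothetical helper (not part of Beam): maps each term to its TF-IDF
     * score per document URI, mirroring the shape of the transform's output.
     */
    public static Map<String, Map<URI, Double>> computeTfIdf(List<Map.Entry<URI, String>> input) {
        Set<URI> documents = new HashSet<>();
        Map<URI, Map<String, Integer>> termCounts = new HashMap<>();
        Map<URI, Integer> totalTerms = new HashMap<>();

        // Several input pieces may share a URI; their counts accumulate per document.
        for (Map.Entry<URI, String> piece : input) {
            URI uri = piece.getKey();
            documents.add(uri);
            // Assumed tokenization: lowercase, split on runs of non-word characters.
            for (String term : piece.getValue().toLowerCase().split("\\W+")) {
                if (term.isEmpty()) {
                    continue;
                }
                termCounts.computeIfAbsent(uri, k -> new HashMap<>()).merge(term, 1, Integer::sum);
                totalTerms.merge(uri, 1, Integer::sum);
            }
        }

        // Document frequency: the number of documents in which each term appears.
        Map<String, Integer> documentFrequency = new HashMap<>();
        for (Map<String, Integer> counts : termCounts.values()) {
            for (String term : counts.keySet()) {
                documentFrequency.merge(term, 1, Integer::sum);
            }
        }

        // Score every (term, document) pair as tf * idf under the assumed definitions.
        Map<String, Map<URI, Double>> scores = new TreeMap<>();
        for (Map.Entry<URI, Map<String, Integer>> doc : termCounts.entrySet()) {
            URI uri = doc.getKey();
            for (Map.Entry<String, Integer> tc : doc.getValue().entrySet()) {
                double tf = tc.getValue() / (double) totalTerms.get(uri);
                double idf = Math.log((double) documents.size() / documentFrequency.get(tc.getKey()));
                scores.computeIfAbsent(tc.getKey(), k -> new HashMap<>()).put(uri, tf * idf);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        List<Map.Entry<URI, String>> input = List.of(
            Map.entry(URI.create("doc://a"), "apple banana"),
            Map.entry(URI.create("doc://b"), "apple cherry"));
        computeTfIdf(input).forEach((term, byDoc) -> System.out.println(term + " -> " + byDoc));
    }
}
```

Under these assumed definitions, a term that occurs in every document gets idf = ln(1) = 0 and therefore a score of 0, while a term confined to a single document scores highest there.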
- See Also:
- Serialized Form