public class DataSampler
extends java.lang.Object
| Constructor and Description |
|---|
| DataSampler() Creates a DataSampler to sample every 1000 elements while keeping a maximum of 10 in memory. |
| DataSampler(int maxSamples, int sampleEveryN) |
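
The interaction between the two constructor parameters can be illustrated with a small standalone sketch. It is not Beam's implementation: the class name EveryNthSamplerSketch, its methods, and the choice to evict the oldest sample when the buffer is full are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; this is not Beam's OutputSampler.
class EveryNthSamplerSketch<T> {
  private final int maxSamples;
  private final int sampleEveryN;
  private final Deque<T> buffer = new ArrayDeque<>();
  private long seen = 0;

  EveryNthSamplerSketch(int maxSamples, int sampleEveryN) {
    if (maxSamples <= 0 || sampleEveryN <= 0) {
      throw new IllegalArgumentException("both parameters must be positive");
    }
    this.maxSamples = maxSamples;
    this.sampleEveryN = sampleEveryN;
  }

  // Keeps every sampleEveryN-th element that is observed.
  // Assumption: when the buffer is full, the oldest sample is evicted.
  synchronized void observe(T element) {
    seen++;
    if (seen % sampleEveryN != 0) {
      return;
    }
    if (buffer.size() == maxSamples) {
      buffer.removeFirst();
    }
    buffer.addLast(element);
  }

  // Returns the buffered samples in arrival order and clears the buffer.
  synchronized List<T> drain() {
    List<T> out = new ArrayList<>(buffer);
    buffer.clear();
    return out;
  }
}
```

Under these assumptions, a sketch built with the default values (maxSamples = 10, sampleEveryN = 1000) would retain at most ten elements no matter how many elements pass through it.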
| Modifier and Type | Method and Description |
|---|---|
| org.apache.beam.model.fnexecution.v1.BeamFnApi.InstructionResponse.Builder | handleDataSampleRequest(org.apache.beam.model.fnexecution.v1.BeamFnApi.InstructionRequest request) Returns all collected samples. |
| <T> OutputSampler<T> | sampleOutput(java.lang.String pcollectionId, org.apache.beam.sdk.coders.Coder<T> coder) Creates and returns a class to sample the given PCollection in the given ProcessBundleDescriptor. |
public DataSampler()
public DataSampler(int maxSamples,
int sampleEveryN)
maxSamples - Sets the maximum number of samples held in memory at once.
sampleEveryN - Sets how often to sample.
public <T> OutputSampler<T> sampleOutput(java.lang.String pcollectionId, org.apache.beam.sdk.coders.Coder<T> coder)
Invoked by multiple bundle processing threads in parallel when a new bundle processor is being instantiated.
T - The type of element contained in the PCollection.
pcollectionId - The PCollection to take intermittent samples from.
coder - The coder associated with the PCollection. Coder may be from a nested context.
public org.apache.beam.model.fnexecution.v1.BeamFnApi.InstructionResponse.Builder handleDataSampleRequest(org.apache.beam.model.fnexecution.v1.BeamFnApi.InstructionRequest request)
request - The instruction request from the FnApi. The returned samples are filtered based on the given SampleDataRequest.
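
Because sampleOutput is invoked from several bundle processing threads in parallel and handleDataSampleRequest filters what it returns, a per-PCollection registry needs thread-safe creation and filtered reads. The following is a minimal standalone sketch of that pattern, not Beam's implementation. SamplerRegistrySketch, samplerFor, and collect are hypothetical names, samples are plain strings rather than encoded elements, and treating an empty filter as "all PCollections" is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a per-PCollection sampler registry; not Beam's code.
class SamplerRegistrySketch {
  private final Map<String, List<String>> samplesById = new ConcurrentHashMap<>();

  // Safe to call from several threads at once: computeIfAbsent creates
  // at most one sample list per PCollection id.
  List<String> samplerFor(String pcollectionId) {
    return samplesById.computeIfAbsent(
        pcollectionId, id -> Collections.synchronizedList(new ArrayList<>()));
  }

  // Returns a snapshot of the samples for the requested ids.
  // Assumption: an empty filter means "return samples for every PCollection".
  Map<String, List<String>> collect(Set<String> pcollectionIds) {
    Map<String, List<String>> out = new HashMap<>();
    for (Map.Entry<String, List<String>> entry : samplesById.entrySet()) {
      if (pcollectionIds.isEmpty() || pcollectionIds.contains(entry.getKey())) {
        List<String> samples = entry.getValue();
        synchronized (samples) { // hold the list's lock while copying it
          out.put(entry.getKey(), new ArrayList<>(samples));
        }
      }
    }
    return out;
  }
}
```

In this sketch, repeated calls to samplerFor with the same id return the same list, so threads that instantiate bundle processors concurrently share one sampler per PCollection.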