public abstract static class PubsubIO.Read<T> extends PTransform<PBegin,PCollection<T>>
Implementation of PubsubIO.read().

Fields inherited from class PTransform: name

| Constructor and Description |
|---|
| Read() |

| Modifier and Type | Method and Description |
|---|---|
| PCollection<T> | expand(PBegin input) |
| PubsubIO.Read<T> | fromSubscription(String subscription)<br>Reads from the given subscription. |
| PubsubIO.Read<T> | fromSubscription(ValueProvider<String> subscription)<br>Like fromSubscription(String) but with a ValueProvider. |
| PubsubIO.Read<T> | fromTopic(String topic)<br>Creates and returns a transform for reading from a Cloud Pub/Sub topic. |
| PubsubIO.Read<T> | fromTopic(ValueProvider<String> topic)<br>Like fromTopic(String) but with a ValueProvider. |
| void | populateDisplayData(DisplayData.Builder builder) |
| PubsubIO.Read<T> | withIdAttribute(String idAttribute)<br>When reading from Cloud Pub/Sub where unique record identifiers are provided as Pub/Sub message attributes, specifies the name of the attribute containing the unique identifier. |
| PubsubIO.Read<T> | withTimestampAttribute(String timestampAttribute)<br>When reading from Cloud Pub/Sub where record timestamps are provided as Pub/Sub message attributes, specifies the name of the attribute that contains the timestamp. |

Methods inherited from class PTransform: getAdditionalInputs, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, toString, validate

public PubsubIO.Read<T> fromSubscription(String subscription)
Reads from the given subscription.
See PubsubIO.PubsubSubscription.fromPath(String) for more details on the format
of the subscription string.
Multiple readers reading from the same subscription will each receive some arbitrary portion of the data. Most likely, separate readers should use their own subscriptions.
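
As a rough illustration of the subscription string, Cloud Pub/Sub names subscriptions with paths of the form projects/<project>/subscriptions/<subscription>. The sketch below only checks that shape. The class and method names are hypothetical, and this is not the validation that PubsubIO.PubsubSubscription.fromPath(String) performs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Beam.
final class SubscriptionPathSketch {
  // Assumed shape: projects/<project>/subscriptions/<subscription>.
  private static final Pattern PATH =
      Pattern.compile("projects/([^/]+)/subscriptions/([^/]+)");

  /** Returns {project, subscription}, or throws if the path has a different shape. */
  static String[] split(String path) {
    Matcher m = PATH.matcher(path);
    if (!m.matches()) {
      throw new IllegalArgumentException("Unexpected subscription path: " + path);
    }
    return new String[] {m.group(1), m.group(2)};
  }
}
```

Real project and subscription names are subject to further naming rules that this pattern does not enforce.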
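
public PubsubIO.Read<T> fromSubscription(ValueProvider<String> subscription)
Like fromSubscription(String) but with a ValueProvider.

public PubsubIO.Read<T> fromTopic(String topic)
Creates and returns a transform for reading from a Cloud Pub/Sub topic. Mutually exclusive with fromSubscription(String).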
See PubsubIO.PubsubTopic.fromPath(String) for more details on the format
of the topic string.
The Beam runner will start reading data published on this topic from the time the pipeline is started. Any data published on the topic before the pipeline is started will not be read by the runner.
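
public PubsubIO.Read<T> fromTopic(ValueProvider<String> topic)
Like fromTopic(String) but with a ValueProvider.

public PubsubIO.Read<T> withTimestampAttribute(String timestampAttribute)
When reading from Cloud Pub/Sub where record timestamps are provided as Pub/Sub message attributes, specifies the name of the attribute that contains the timestamp.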
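The timestamp value is expected to be represented in the attribute as either:
- A numerical value representing the number of milliseconds since the Unix epoch. For example, if using the Joda time classes, Instant.getMillis() returns the correct value for this attribute.
- A String in RFC 3339 format, for example 2015-10-29T23:41:41.123Z. The sub-second component of the timestamp is optional, and digits beyond the first three (i.e., time units smaller than milliseconds) will be ignored.
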
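
The two accepted representations can be handled roughly as in the sketch below, which converts an attribute value to milliseconds. It uses java.time rather than Joda time, and the class and method names are hypothetical; it is an illustration of the rules above, not Beam's own parsing code. It assumes timestamps written in UTC with a trailing Z, as in the example.

```java
import java.time.Instant;

// Hypothetical helper, not part of Beam.
final class TimestampAttributeSketch {
  /** Converts an attribute value to milliseconds since the Unix epoch. */
  static long toEpochMillis(String value) {
    try {
      // First form: a plain count of milliseconds.
      return Long.parseLong(value);
    } catch (NumberFormatException notNumeric) {
      // Second form: a UTC timestamp such as 2015-10-29T23:41:41.123Z.
      // toEpochMilli() drops anything finer than milliseconds.
      return Instant.parse(value).toEpochMilli();
    }
  }

  public static void main(String[] args) {
    // Both values differ only in digits beyond the first three.
    System.out.println(toEpochMillis("2015-10-29T23:41:41.123Z"));
    System.out.println(toEpochMillis("2015-10-29T23:41:41.123456Z"));
  }
}
```

A value that is neither numeric nor a parseable timestamp makes Instant.parse throw DateTimeParseException, which this sketch lets propagate.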
If timestampAttribute is not provided, the system will generate record timestamps
the first time it sees each record. All windowing will be done relative to these
timestamps.
By default, windows are emitted based on an estimate of when this source is likely
done producing data for a given timestamp (referred to as the Watermark; see
AfterWatermark for more details). Any late data will be handled by the trigger
specified with the windowing strategy – by default it will be output immediately.
Note that the system can guarantee that no late data will ever be seen when it assigns
timestamps by arrival time (i.e. timestampAttribute is not provided).
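
To make the notion of late data concrete, here is a deliberately simplified model in which an element counts as late when its timestamp is behind the current watermark. The class is hypothetical and is not how any Beam runner tracks watermarks; real runners estimate the watermark from the source.

```java
// Hypothetical, simplified model of a watermark; not part of Beam.
final class LateDataSketch {
  private long watermarkMillis = Long.MIN_VALUE;

  /** Moves the watermark forward; it never moves backwards. */
  void advanceWatermark(long millis) {
    watermarkMillis = Math.max(watermarkMillis, millis);
  }

  /** An element is late when its timestamp is behind the current watermark. */
  boolean isLate(long elementTimestampMillis) {
    return elementTimestampMillis < watermarkMillis;
  }
}
```

In this model, an element stamped on arrival always carries a timestamp at or after the watermark, so isLate never returns true for it. An element carrying an older timestamp in an attribute can arrive after the watermark has already passed that time.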
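
public PubsubIO.Read<T> withIdAttribute(String idAttribute)
When reading from Cloud Pub/Sub where unique record identifiers are provided as Pub/Sub message attributes, specifies the name of the attribute containing the unique identifier.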
Pub/Sub cannot guarantee that no duplicate data will be delivered on the Pub/Sub stream.
If idAttribute is not provided, Beam cannot guarantee that no duplicate data will
be delivered, and deduplication of the stream will be strictly best effort.
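
The idea of deduplicating on an identifier attribute can be sketched as follows. Messages are modeled here as plain attribute maps, which is an assumption made for illustration; the class and method names are hypothetical, and an actual runner would bound how long it remembers identifiers rather than keep an ever-growing set.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Beam.
final class IdAttributeDedupSketch {
  /** Keeps the first message seen for each value of the id attribute. */
  static List<Map<String, String>> dedup(
      List<Map<String, String>> messages, String idAttribute) {
    Set<String> seen = new HashSet<>();
    List<Map<String, String>> unique = new ArrayList<>();
    for (Map<String, String> attributes : messages) {
      String id = attributes.get(idAttribute);
      // Messages without the attribute cannot be matched, so they pass through.
      if (id == null || seen.add(id)) {
        unique.add(attributes);
      }
    }
    return unique;
  }
}
```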

public PCollection<T> expand(PBegin input)
Specified by: expand in class PTransform<PBegin,PCollection<T>>

public void populateDisplayData(DisplayData.Builder builder)
Specified by: populateDisplayData in interface HasDisplayData
Overrides: populateDisplayData in class PTransform<PBegin,PCollection<T>>

Copyright © 2016–2017 The Apache Software Foundation. All rights reserved.