Class AvroIO.TypedWrite<UserT,​DestinationT,​OutputT>

  • All Implemented Interfaces:
    java.io.Serializable, org.apache.beam.sdk.transforms.display.HasDisplayData
    Enclosing class:
    AvroIO

    public abstract static class AvroIO.TypedWrite<UserT,​DestinationT,​OutputT>
    extends org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<UserT>,​org.apache.beam.sdk.io.WriteFilesResult<DestinationT>>
    See Also:
    Serialized Form
    • Constructor Detail

      • TypedWrite

        public TypedWrite()
    • Method Detail

      • to

        public AvroIO.TypedWrite<UserT,​DestinationT,​OutputT> to​(java.lang.String outputPrefix)
        Writes to file(s) with the given output prefix. See FileSystems for information on supported file systems.

        The name of the output files will be determined by the FileBasedSink.FilenamePolicy used.

        By default, a DefaultFilenamePolicy will build output filenames using the specified prefix, a shard name template (see withShardNameTemplate(String)), and a common suffix (if supplied using withSuffix(String)). This default can be overridden using to(FilenamePolicy).

      • withSyncInterval

        public AvroIO.TypedWrite<UserT,​DestinationT,​OutputT> withSyncInterval​(int syncInterval)
        Sets the approximate number of uncompressed bytes to write in each block of the Avro container format.
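
        To make the effect of the sync interval concrete, the following is a simplified model, not Avro or Beam code. It assumes a writer that buffers records and closes a block as soon as the buffered uncompressed bytes reach the interval. The class name, method name, and all numbers are hypothetical and chosen only for illustration.

```java
/** Simplified, hypothetical model of block formation; not the Avro writer itself. */
final class SyncIntervalSketch {

  /**
   * Counts the blocks produced when recordCount records of recordBytes each are
   * appended and a block is closed once the buffered bytes reach syncInterval.
   */
  static int countBlocks(int recordCount, int recordBytes, int syncInterval) {
    int blocks = 0;
    int buffered = 0;
    for (int i = 0; i < recordCount; i++) {
      buffered += recordBytes;
      if (buffered >= syncInterval) {
        blocks++;      // block is full: close it and start a new one
        buffered = 0;
      }
    }
    if (buffered > 0) {
      blocks++;        // remaining records are flushed as a final, smaller block
    }
    return blocks;
  }

  public static void main(String[] args) {
    // 1000 records of 100 bytes each (assumed sizes).
    System.out.println(countBlocks(1000, 100, 64000)); // prints 2
    System.out.println(countBlocks(1000, 100, 16000)); // prints 7
  }
}
```

        Under this model, a smaller interval yields more, smaller blocks, and a larger interval yields fewer, larger blocks. Because a block closes only after a whole record is appended, the block size is approximate rather than exact.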
      • withTempDirectory

        public AvroIO.TypedWrite<UserT,​DestinationT,​OutputT> withTempDirectory​(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId> tempDirectory)
        Sets the base directory used to generate temporary files.
      • withTempDirectory

        public AvroIO.TypedWrite<UserT,​DestinationT,​OutputT> withTempDirectory​(org.apache.beam.sdk.io.fs.ResourceId tempDirectory)
        Sets the base directory used to generate temporary files.
      • withShardNameTemplate

        public AvroIO.TypedWrite<UserT,​DestinationT,​OutputT> withShardNameTemplate​(java.lang.String shardTemplate)
        Uses the given ShardNameTemplate for naming output files. This option may be used only with one of the default filename-prefix to() overloads.

        See DefaultFilenamePolicy for how the prefix, shard name template, and suffix are used.
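
        As a rough illustration of how a prefix, shard name template, and suffix combine into a filename, the sketch below models template expansion in plain Java. It is not Beam's DefaultFilenamePolicy implementation. It assumes a template such as "-SSSSS-of-NNNNN", in which a run of S characters stands for the zero-padded shard index and a run of N characters for the zero-padded shard count, and it ignores windowed writes. The class name, method names, and paths are hypothetical.

```java
import java.util.Locale;

/** Illustrative model only; not Beam's DefaultFilenamePolicy. */
final class ShardNameSketch {

  /**
   * Replaces each run of 'S' with the shard index and each run of 'N' with the
   * shard count, zero-padded to the length of the run. Other characters are copied.
   */
  static String expand(String template, int shard, int numShards) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    while (i < template.length()) {
      char c = template.charAt(i);
      if (c == 'S' || c == 'N') {
        int j = i;
        while (j < template.length() && template.charAt(j) == c) {
          j++;
        }
        int value = (c == 'S') ? shard : numShards;
        out.append(String.format(Locale.ROOT, "%0" + (j - i) + "d", value));
        i = j;
      } else {
        out.append(c);
        i++;
      }
    }
    return out.toString();
  }

  /** Concatenates prefix, expanded shard template, and suffix. */
  static String fileName(String prefix, String template, String suffix, int shard, int numShards) {
    return prefix + expand(template, shard, numShards) + suffix;
  }

  public static void main(String[] args) {
    // Three shards with the assumed "-SSSSS-of-NNNNN" template.
    for (int shard = 0; shard < 3; shard++) {
      System.out.println(fileName("/tmp/out/events", "-SSSSS-of-NNNNN", ".avro", shard, 3));
    }
    // One shard with an empty template: the name is just prefix + suffix.
    System.out.println(fileName("/tmp/out/events", "", ".avro", 0, 1));
  }
}
```

        With these assumed inputs, the first call produces /tmp/out/events-00000-of-00003.avro, and the empty-template call produces /tmp/out/events.avro, which is the shape of name expected from a single shard with an empty shard name template.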

      • withSuffix

        public AvroIO.TypedWrite<UserT,​DestinationT,​OutputT> withSuffix​(java.lang.String filenameSuffix)
        Configures the filename suffix for written files. This option may be used only with one of the default filename-prefix to() overloads.

        See DefaultFilenamePolicy for how the prefix, shard name template, and suffix are used.

      • withNumShards

        public AvroIO.TypedWrite<UserT,​DestinationT,​OutputT> withNumShards​(int numShards)
        Configures the number of output shards produced overall (when using unwindowed writes) or per-window (when using windowed writes).

        For unwindowed writes, constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.

        Parameters:
        numShards - the number of shards to use, or 0 to let the system decide.
      • withoutSharding

        public AvroIO.TypedWrite<UserT,​DestinationT,​OutputT> withoutSharding()
        Forces a single output file and an empty shard name template. This option is only compatible with unwindowed writes.

        Constraining the output to a single shard is likely to reduce the performance of a pipeline. Using this option is not recommended unless you require a single output file.

        This is equivalent to .withNumShards(1).withShardNameTemplate("")

      • withWindowedWrites

        public AvroIO.TypedWrite<UserT,​DestinationT,​OutputT> withWindowedWrites()
        Preserves windowing of input elements and writes them to files based on the element's window.

        If using to(FilenamePolicy), filenames will be generated using FileBasedSink.FilenamePolicy.windowedFilename(int, int, org.apache.beam.sdk.transforms.windowing.BoundedWindow, org.apache.beam.sdk.transforms.windowing.PaneInfo, org.apache.beam.sdk.io.FileBasedSink.OutputFileHints). See also WriteFiles.withWindowedWrites().

      • withMetadata

        public AvroIO.TypedWrite<UserT,​DestinationT,​OutputT> withMetadata​(java.util.Map<java.lang.String,​java.lang.Object> metadata)
        Writes to Avro file(s) with the specified metadata.

        Supported value types are String, Long, and byte[].
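
        Because the map is declared as Map<String, Object>, the compiler will not catch an unsupported value type. The sketch below shows one way to check a map against the three supported types before passing it on. It is a standalone, hypothetical helper (class name, method names, keys, and error message are all assumptions), not the check Beam performs internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical pre-check for metadata maps; not part of Beam. */
final class MetadataSketch {

  /** Returns true for the value types listed as supported: String, Long, byte[]. */
  static boolean isSupported(Object value) {
    return value instanceof String || value instanceof Long || value instanceof byte[];
  }

  /** Throws IllegalArgumentException on the first entry with an unsupported value type. */
  static void requireSupported(Map<String, Object> metadata) {
    for (Map.Entry<String, Object> entry : metadata.entrySet()) {
      Object value = entry.getValue();
      if (!isSupported(value)) {
        String type = (value == null) ? "null" : value.getClass().getSimpleName();
        throw new IllegalArgumentException(
            "Unsupported metadata value type for key '" + entry.getKey() + "': " + type);
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Object> ok = new LinkedHashMap<>();
    ok.put("pipeline", "nightly-export");        // String
    ok.put("exportedAtMillis", 1700000000000L);  // Long (note the L suffix)
    ok.put("checksum", new byte[] {0x0a, 0x0b}); // byte[]
    requireSupported(ok);                        // passes silently

    Map<String, Object> bad = new LinkedHashMap<>();
    bad.put("retries", 3);                       // autoboxes to Integer, not Long
    try {
      requireSupported(bad);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

        The Integer case is the easy one to miss: an int literal such as 3 autoboxes to Integer, so a numeric value needs the L suffix (or an explicit Long) to fall within the supported types.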

      • withBadRecordErrorHandler

        public AvroIO.TypedWrite<UserT,​DestinationT,​OutputT> withBadRecordErrorHandler​(org.apache.beam.sdk.transforms.errorhandling.ErrorHandler<org.apache.beam.sdk.transforms.errorhandling.BadRecord,​?> errorHandler)
        See FileIO.Write.withBadRecordErrorHandler(ErrorHandler) for details on usage.
      • expand

        public org.apache.beam.sdk.io.WriteFilesResult<DestinationT> expand​(org.apache.beam.sdk.values.PCollection<UserT> input)
        Specified by:
        expand in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<UserT>,​org.apache.beam.sdk.io.WriteFilesResult<DestinationT>>
      • populateDisplayData

        public void populateDisplayData​(org.apache.beam.sdk.transforms.display.DisplayData.Builder builder)
        Specified by:
        populateDisplayData in interface org.apache.beam.sdk.transforms.display.HasDisplayData
        Overrides:
        populateDisplayData in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<UserT>,​org.apache.beam.sdk.io.WriteFilesResult<DestinationT>>