Class AvroIO.TypedWrite<UserT,DestinationT,OutputT>
- java.lang.Object
-
- org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<UserT>,org.apache.beam.sdk.io.WriteFilesResult<DestinationT>>
-
- org.apache.beam.sdk.extensions.avro.io.AvroIO.TypedWrite<UserT,DestinationT,OutputT>
-
- All Implemented Interfaces:
java.io.Serializable, org.apache.beam.sdk.transforms.display.HasDisplayData
- Enclosing class:
- AvroIO
public abstract static class AvroIO.TypedWrite<UserT,DestinationT,OutputT> extends org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<UserT>,org.apache.beam.sdk.io.WriteFilesResult<DestinationT>>
Implementation of AvroIO.write(java.lang.Class<T>).
- See Also:
- Serialized Form
-
-
Constructor Summary
Constructor
- TypedWrite()
-
Method Summary
Modifier and Type, Method, Description
- org.apache.beam.sdk.io.WriteFilesResult<DestinationT> expand(org.apache.beam.sdk.values.PCollection<UserT> input)
- void populateDisplayData(org.apache.beam.sdk.transforms.display.DisplayData.Builder builder)
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(java.lang.String outputPrefix)
Writes to file(s) with the given output prefix.
- <NewDestinationT> AvroIO.TypedWrite<UserT,NewDestinationT,OutputT> to(DynamicAvroDestinations<UserT,NewDestinationT,OutputT> dynamicDestinations)
Deprecated. Use FileIO.write() or FileIO.writeDynamic() instead.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(org.apache.beam.sdk.io.FileBasedSink.FilenamePolicy filenamePolicy)
Writes to files named according to the given FileBasedSink.FilenamePolicy.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(org.apache.beam.sdk.io.fs.ResourceId outputPrefix)
Writes to file(s) with the given output prefix.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(org.apache.beam.sdk.options.ValueProvider<java.lang.String> outputPrefix)
Like to(String).
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> toResource(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId> outputPrefix)
Like to(ResourceId).
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withBadRecordErrorHandler(org.apache.beam.sdk.transforms.errorhandling.ErrorHandler<org.apache.beam.sdk.transforms.errorhandling.BadRecord,?> errorHandler)
See FileIO.Write.withBadRecordErrorHandler(ErrorHandler) for details on usage.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withCodec(org.apache.avro.file.CodecFactory codec)
Writes to Avro file(s) compressed using the specified codec.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withDatumWriterFactory(AvroSink.DatumWriterFactory<OutputT> datumWriterFactory)
Specifies an AvroSink.DatumWriterFactory to use for creating DatumWriter instances.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withFormatFunction(@Nullable org.apache.beam.sdk.transforms.SerializableFunction<UserT,OutputT> formatFunction)
Specifies a format function to convert UserT to the output type.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withMetadata(java.util.Map<java.lang.String,java.lang.Object> metadata)
Writes to Avro file(s) with the specified metadata.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withNoSpilling()
See WriteFiles.withNoSpilling().
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withNumShards(int numShards)
Configures the number of output shards produced overall (when using unwindowed writes) or per-window (when using windowed writes).
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withoutSharding()
Forces a single file as output and empty shard name template.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withSchema(org.apache.avro.Schema schema)
Sets the output schema.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withShardNameTemplate(java.lang.String shardTemplate)
Uses the given ShardNameTemplate for naming output files.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withSuffix(java.lang.String filenameSuffix)
Configures the filename suffix for written files.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withSyncInterval(int syncInterval)
Sets the approximate number of uncompressed bytes to write in each block for the AVRO container format.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withTempDirectory(org.apache.beam.sdk.io.fs.ResourceId tempDirectory)
Set the base directory used to generate temporary files.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId> tempDirectory)
Set the base directory used to generate temporary files.
- AvroIO.TypedWrite<UserT,DestinationT,OutputT> withWindowedWrites()
Preserves windowing of input elements and writes them to files based on the element's window.
-
Methods inherited from class org.apache.beam.sdk.transforms.PTransform
addAnnotation, compose, compose, getAdditionalInputs, getAnnotations, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, setDisplayData, setResourceHints, toString, validate, validate
-
Method Detail
-
to
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(java.lang.String outputPrefix)
Writes to file(s) with the given output prefix. See FileSystems for information on supported file systems.
The name of the output files will be determined by the FileBasedSink.FilenamePolicy used.
By default, a DefaultFilenamePolicy will build output filenames using the specified prefix, a shard name template (see withShardNameTemplate(String)), and a common suffix (if supplied using withSuffix(String)). This default can be overridden using to(FilenamePolicy).
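The way a prefix, a shard name template, and a suffix combine into a filename can be sketched in plain Java. This is a simplified stand-in, not Beam's DefaultFilenamePolicy: the class ShardNameSketch and the method buildName are hypothetical, only runs of S (shard index) and N (shard count) are expanded, and the template -SSSSS-of-NNNNN is assumed here to match the usual default.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified stand-in for a default-style filename policy; not Beam's implementation. */
public class ShardNameSketch {
    // Runs of 'S' stand for the shard index, runs of 'N' for the total shard count.
    private static final Pattern PLACEHOLDER = Pattern.compile("S+|N+");

    /** Builds prefix + expanded shard template + suffix (hypothetical helper). */
    public static String buildName(String prefix, String template, String suffix,
                                   int shard, int numShards) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder expanded = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the literal text between placeholders unchanged.
            expanded.append(template, last, m.start());
            int value = m.group().charAt(0) == 'S' ? shard : numShards;
            // Zero-pad the number to the width of the placeholder run.
            expanded.append(String.format(Locale.ROOT, "%0" + m.group().length() + "d", value));
            last = m.end();
        }
        expanded.append(template.substring(last));
        return prefix + expanded + suffix;
    }

    public static void main(String[] args) {
        for (int shard = 0; shard < 3; shard++) {
            System.out.println(
                buildName("/tmp/out/events", "-SSSSS-of-NNNNN", ".avro", shard, 3));
        }
    }
}
```

With an empty template the result is simply the prefix followed by the suffix, which is why an empty shard name template only makes sense when a single file is written.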
-
to
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(org.apache.beam.sdk.io.fs.ResourceId outputPrefix)
Writes to file(s) with the given output prefix. See FileSystems for information on supported file systems. This prefix is used by the DefaultFilenamePolicy to generate filenames.
By default, a DefaultFilenamePolicy will build output filenames using the specified prefix, a shard name template (see withShardNameTemplate(String)), and a common suffix (if supplied using withSuffix(String)).
This default policy can be overridden using to(FilenamePolicy), in which case withShardNameTemplate(String) and withSuffix(String) should not be set. Custom filename policies do not automatically see this prefix; pass the prefix explicitly into your FileBasedSink.FilenamePolicy object if you need it.
If withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>) has not been called, this filename prefix will be used to infer a directory for temporary files.
-
to
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(org.apache.beam.sdk.options.ValueProvider<java.lang.String> outputPrefix)
Like to(String).
-
toResource
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> toResource(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId> outputPrefix)
Like to(ResourceId).
-
to
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(org.apache.beam.sdk.io.FileBasedSink.FilenamePolicy filenamePolicy)
Writes to files named according to the given FileBasedSink.FilenamePolicy. A directory for temporary files must be specified using withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>).
-
to
@Deprecated public <NewDestinationT> AvroIO.TypedWrite<UserT,NewDestinationT,OutputT> to(DynamicAvroDestinations<UserT,NewDestinationT,OutputT> dynamicDestinations)
Deprecated. Use FileIO.write() or FileIO.writeDynamic() instead.
Use a DynamicAvroDestinations object to vend FileBasedSink.FilenamePolicy objects. These objects can examine the input record when creating a FileBasedSink.FilenamePolicy. A directory for temporary files must be specified using withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>).
-
withSyncInterval
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withSyncInterval(int syncInterval)
Sets the approximate number of uncompressed bytes to write in each block for the AVRO container format.
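One way to see why the byte count is only approximate is a toy model of block flushing. The sketch below assumes a block is closed as soon as the buffered, uncompressed bytes reach the interval, so a block can overshoot by up to one record. SyncIntervalSketch and blockSizes are hypothetical names, and this is not the Avro container writer itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of block flushing by sync interval; not Avro's implementation. */
public class SyncIntervalSketch {
    /** Returns the uncompressed size of each block for the given record sizes. */
    public static List<Integer> blockSizes(int[] recordSizes, int syncInterval) {
        List<Integer> blocks = new ArrayList<>();
        int buffered = 0;
        for (int size : recordSizes) {
            buffered += size;
            // Assumed rule: close the block once the buffer reaches the interval.
            if (buffered >= syncInterval) {
                blocks.add(buffered);
                buffered = 0;
            }
        }
        if (buffered > 0) {
            blocks.add(buffered); // final, partially filled block
        }
        return blocks;
    }

    public static void main(String[] args) {
        int[] sizes = {400, 400, 400, 400, 400};
        // Five 400-byte records with a 1000-byte interval.
        System.out.println(blockSizes(sizes, 1000)); // prints [1200, 800]
    }
}
```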
-
withSchema
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withSchema(org.apache.avro.Schema schema)
Sets the output schema. Can only be used when the output type is GenericRecord and when not using to(DynamicAvroDestinations).
-
withFormatFunction
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withFormatFunction(@Nullable org.apache.beam.sdk.transforms.SerializableFunction<UserT,OutputT> formatFunction)
Specifies a format function to convert UserT to the output type. If to(DynamicAvroDestinations) is used, FileBasedSink.DynamicDestinations.formatRecord(UserT) must be used instead.
-
withTempDirectory
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId> tempDirectory)
Set the base directory used to generate temporary files.
-
withTempDirectory
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withTempDirectory(org.apache.beam.sdk.io.fs.ResourceId tempDirectory)
Set the base directory used to generate temporary files.
-
withShardNameTemplate
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withShardNameTemplate(java.lang.String shardTemplate)
Uses the given ShardNameTemplate for naming output files. This option may only be used when using one of the default filename-prefix to() overrides.
See DefaultFilenamePolicy for how the prefix, shard name template, and suffix are used.
-
withSuffix
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withSuffix(java.lang.String filenameSuffix)
Configures the filename suffix for written files. This option may only be used when using one of the default filename-prefix to() overrides.
See DefaultFilenamePolicy for how the prefix, shard name template, and suffix are used.
-
withNumShards
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withNumShards(int numShards)
Configures the number of output shards produced overall (when using unwindowed writes) or per-window (when using windowed writes).
For unwindowed writes, constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
- Parameters:
numShards - the number of shards to use, or 0 to let the system decide.
-
withoutSharding
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withoutSharding()
Forces a single file as output and empty shard name template. This option is only compatible with unwindowed writes.
For unwindowed writes, constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
This is equivalent to
.withNumShards(1).withShardNameTemplate("")
-
withWindowedWrites
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withWindowedWrites()
Preserves windowing of input elements and writes them to files based on the element's window.
If using to(FilenamePolicy), filenames will be generated using FileBasedSink.FilenamePolicy.windowedFilename(int, int, org.apache.beam.sdk.transforms.windowing.BoundedWindow, org.apache.beam.sdk.transforms.windowing.PaneInfo, org.apache.beam.sdk.io.FileBasedSink.OutputFileHints). See also WriteFiles.withWindowedWrites().
-
withNoSpilling
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withNoSpilling()
See WriteFiles.withNoSpilling().
-
withCodec
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withCodec(org.apache.avro.file.CodecFactory codec)
Writes to Avro file(s) compressed using the specified codec.
-
withDatumWriterFactory
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withDatumWriterFactory(AvroSink.DatumWriterFactory<OutputT> datumWriterFactory)
Specifies an AvroSink.DatumWriterFactory to use for creating DatumWriter instances.
-
withMetadata
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withMetadata(java.util.Map<java.lang.String,java.lang.Object> metadata)
Writes to Avro file(s) with the specified metadata.
Supported value types are String, Long, and byte[].
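Because the parameter type is Map<String,Object>, the compiler will not catch an unsupported value type. A small pre-check along the following lines may help. MetadataSketch, isSupported, and check are hypothetical helpers written for illustration and are not part of AvroIO; the keys and values are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative pre-check for metadata values; not part of AvroIO. */
public class MetadataSketch {
    /** Returns true if the value is a String, Long, or byte[]. */
    public static boolean isSupported(Object value) {
        return value instanceof String || value instanceof Long || value instanceof byte[];
    }

    /** Throws if any entry has a value of an unsupported type. */
    public static void check(Map<String, Object> metadata) {
        for (Map.Entry<String, Object> entry : metadata.entrySet()) {
            if (!isSupported(entry.getValue())) {
                throw new IllegalArgumentException(
                    "Unsupported metadata value type for key '" + entry.getKey() + "'");
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("pipeline", "nightly-export");  // String
        metadata.put("recordVersion", 3L);           // Long, note the L suffix
        metadata.put("checksum", new byte[] {1, 2}); // byte[]
        check(metadata); // passes: all three value types are supported
        System.out.println("metadata ok: " + metadata.keySet());
    }
}
```

Note that a plain integer literal such as 3 is boxed to Integer, not Long, so it would fail this check; write 3L instead.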
-
withBadRecordErrorHandler
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withBadRecordErrorHandler(org.apache.beam.sdk.transforms.errorhandling.ErrorHandler<org.apache.beam.sdk.transforms.errorhandling.BadRecord,?> errorHandler)
See FileIO.Write.withBadRecordErrorHandler(ErrorHandler) for details on usage.
-
expand
public org.apache.beam.sdk.io.WriteFilesResult<DestinationT> expand(org.apache.beam.sdk.values.PCollection<UserT> input)
- Specified by:
expand in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<UserT>,org.apache.beam.sdk.io.WriteFilesResult<DestinationT>>
-
populateDisplayData
public void populateDisplayData(org.apache.beam.sdk.transforms.display.DisplayData.Builder builder)
- Specified by:
populateDisplayData in interface org.apache.beam.sdk.transforms.display.HasDisplayData
- Overrides:
populateDisplayData in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<UserT>,org.apache.beam.sdk.io.WriteFilesResult<DestinationT>>
-
-