Class AvroIO.ReadFiles<T>
- java.lang.Object
  - org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.io.FileIO.ReadableFile>,org.apache.beam.sdk.values.PCollection<T>>
    - org.apache.beam.sdk.extensions.avro.io.AvroIO.ReadFiles<T>
- All Implemented Interfaces:
java.io.Serializable, org.apache.beam.sdk.transforms.display.HasDisplayData
- Enclosing class:
- AvroIO
public abstract static class AvroIO.ReadFiles<T>
extends org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.io.FileIO.ReadableFile>,org.apache.beam.sdk.values.PCollection<T>>
Implementation of AvroIO.readFiles(java.lang.Class<T>).
- See Also:
- Serialized Form
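As an illustration of how this transform is typically obtained and configured, the sketch below builds a ReadFiles<GenericRecord> through AvroIO.readFilesGenericRecords and chains three of the builder methods documented on this page. It assumes the Beam Java SDK and its Avro extension are on the classpath; the class name ReadFilesSketch, the Event schema, and the file pattern argument are hypothetical and chosen only for the example.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.avro.io.AvroIO;
import org.apache.beam.sdk.io.FileIO;
import org.apache.beam.sdk.values.PCollection;

public class ReadFilesSketch {
    // Hypothetical Avro record schema, used only for illustration.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
            + "{\"name\":\"id\",\"type\":\"long\"},"
            + "{\"name\":\"message\",\"type\":\"string\"}]}");

    /** Builds a configured transform; nothing is read until a pipeline runs. */
    static AvroIO.ReadFiles<GenericRecord> configure() {
        return AvroIO.readFilesGenericRecords(SCHEMA)
            .withDesiredBundleSizeBytes(1L << 20) // 1 MB, the lower value suggested for streaming
            .withUsesReshuffle(true)              // run a Reshuffle before the file reads
            .withBeamSchemas(true);               // infer a Beam schema from the Avro schema
    }

    /** Applies the transform to files matched by a (hypothetical) file pattern. */
    static PCollection<GenericRecord> read(Pipeline pipeline, String filePattern) {
        return pipeline
            .apply("MatchFiles", FileIO.match().filepattern(filePattern))
            .apply("ReadMatches", FileIO.readMatches())
            .apply("ReadAvro", configure());
    }
}
```

Each with* method returns a new AvroIO.ReadFiles<T>, so the calls can be chained in any order before the transform is applied.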
-
Constructor Summary
Constructors
- ReadFiles()
-
Method Summary
Modifier and Type, Method: Description
- org.apache.beam.sdk.values.PCollection<T> expand(org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.io.FileIO.ReadableFile> input)
- void populateDisplayData(org.apache.beam.sdk.transforms.display.DisplayData.Builder builder)
- AvroIO.ReadFiles<T> withBeamSchemas(boolean withBeamSchemas): If set to true, a Beam schema will be inferred from the AVRO schema.
- AvroIO.ReadFiles<T> withCoder(org.apache.beam.sdk.coders.Coder<T> coder): Sets a coder for the result of the read function.
- AvroIO.ReadFiles<T> withDatumReaderFactory(AvroSource.DatumReaderFactory<T> factory): Sets a custom AvroSource.DatumReaderFactory for reading.
- AvroIO.ReadFiles<T> withDesiredBundleSizeBytes(long desiredBundleSizeBytes): Set a value for the bundle size for parallel reads.
- AvroIO.ReadFiles<T> withFileExceptionHandler(org.apache.beam.sdk.io.ReadAllViaFileBasedSource.ReadFileRangesFnExceptionHandler exceptionHandler): Specifies if exceptions should be logged only for streaming pipelines.
- AvroIO.ReadFiles<T> withUsesReshuffle(boolean usesReshuffle): Specifies if a Reshuffle should run before file reads occur.
-
Methods inherited from class org.apache.beam.sdk.transforms.PTransform
addAnnotation, compose, compose, getAdditionalInputs, getAnnotations, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, setDisplayData, setResourceHints, toString, validate, validate
-
Method Detail
-
withDesiredBundleSizeBytes
public AvroIO.ReadFiles<T> withDesiredBundleSizeBytes(long desiredBundleSizeBytes)
Set a value for the bundle size for parallel reads. Default is 64 MB. You may want to use a lower value (e.g. 1 MB) for streaming applications.
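To give a feel for the effect of this setting, the sketch below estimates how many read ranges a single file would be divided into for a given desired bundle size. It is a simplified model, assuming the count is the ceiling of file size divided by bundle size and ignoring Avro block boundaries and any runner-specific splitting; BundleEstimate, estimateRanges, and the 256 MB file size are hypothetical and not part of Beam.

```java
public class BundleEstimate {
    /** Rough number of read ranges for one file: ceil(fileSizeBytes / desiredBundleSizeBytes). */
    static long estimateRanges(long fileSizeBytes, long desiredBundleSizeBytes) {
        if (fileSizeBytes < 0 || desiredBundleSizeBytes <= 0) {
            throw new IllegalArgumentException(
                "file size must be non-negative and bundle size must be positive");
        }
        // Integer ceiling division, avoiding floating point.
        return (fileSizeBytes + desiredBundleSizeBytes - 1) / desiredBundleSizeBytes;
    }

    public static void main(String[] args) {
        long fileSize = 256L * 1024 * 1024;      // assumed 256 MB input file
        long defaultBundle = 64L * 1024 * 1024;  // the documented 64 MB default
        long streamingBundle = 1024L * 1024;     // the 1 MB value suggested for streaming

        System.out.println(estimateRanges(fileSize, defaultBundle));   // prints 4
        System.out.println(estimateRanges(fileSize, streamingBundle)); // prints 256
    }
}
```

Under this model a smaller bundle size yields more, smaller units of work per file, which is the trade-off behind the lower value suggested for streaming applications.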
-
withUsesReshuffle
public AvroIO.ReadFiles<T> withUsesReshuffle(boolean usesReshuffle)
Specifies if a Reshuffle should run before file reads occur.
-
withFileExceptionHandler
public AvroIO.ReadFiles<T> withFileExceptionHandler(org.apache.beam.sdk.io.ReadAllViaFileBasedSource.ReadFileRangesFnExceptionHandler exceptionHandler)
Specifies if exceptions should be logged only for streaming pipelines.
-
withBeamSchemas
public AvroIO.ReadFiles<T> withBeamSchemas(boolean withBeamSchemas)
If set to true, a Beam schema will be inferred from the AVRO schema. This allows the output to be used by SQL and by the schema-transform library.
-
withCoder
public AvroIO.ReadFiles<T> withCoder(org.apache.beam.sdk.coders.Coder<T> coder)
Sets a coder for the result of the read function.
-
withDatumReaderFactory
public AvroIO.ReadFiles<T> withDatumReaderFactory(AvroSource.DatumReaderFactory<T> factory)
Sets a custom AvroSource.DatumReaderFactory for reading. Pass an AvroDatumFactory to also use the factory for the default output AvroCoder.
-
expand
public org.apache.beam.sdk.values.PCollection<T> expand(org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.io.FileIO.ReadableFile> input)
- Specified by:
expand in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.io.FileIO.ReadableFile>,org.apache.beam.sdk.values.PCollection<T>>
-
populateDisplayData
public void populateDisplayData(org.apache.beam.sdk.transforms.display.DisplayData.Builder builder)
- Specified by:
populateDisplayData in interface org.apache.beam.sdk.transforms.display.HasDisplayData
- Overrides:
populateDisplayData in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.io.FileIO.ReadableFile>,org.apache.beam.sdk.values.PCollection<T>>
-
-