Type Parameters:
SI - input schema type
SO - output schema type
DI - input data type
DO - output data type

public abstract class Converter<SI,SO,DI,DO> extends Object implements Closeable, FinalState, RecordStreamProcessor<SI,SO,DI,DO>
This abstract class is responsible for converting both schemas and data records. Subclasses are composable and can be chained together to achieve more complex data transformations.
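As a rough illustration of chaining, the sketch below composes two converters so that the schema and records produced by the first feed the second. It does not use the Gobblin API: SimpleConverter, TrimConverter, ParseIntConverter, chain, and parseAll are hypothetical names invented for this example, and the WorkUnitState argument is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for illustration only; this is NOT Gobblin's Converter class. */
class ConverterChainSketch {

  /** Simplified contract: one schema conversion plus one record conversion. */
  interface SimpleConverter<SI, SO, DI, DO> {
    SO convertSchema(SI inputSchema);
    Iterable<DO> convertRecord(SO outputSchema, DI inputRecord);
  }

  /** Keeps the schema unchanged and trims whitespace from each record. */
  static class TrimConverter implements SimpleConverter<String, String, String, String> {
    public String convertSchema(String inputSchema) {
      return inputSchema;
    }
    public Iterable<String> convertRecord(String outputSchema, String inputRecord) {
      return Collections.singletonList(inputRecord.trim());
    }
  }

  /** Changes the schema name to "int" and parses each record as an Integer. */
  static class ParseIntConverter implements SimpleConverter<String, String, String, Integer> {
    public String convertSchema(String inputSchema) {
      return "int";
    }
    public Iterable<Integer> convertRecord(String outputSchema, String inputRecord) {
      return Collections.singletonList(Integer.parseInt(inputRecord));
    }
  }

  /**
   * Chains two converters: the output of first becomes the input of second.
   * convertSchema must be called before convertRecord so the intermediate schema is known.
   */
  static <SI, SM, SO, DI, DM, DO> SimpleConverter<SI, SO, DI, DO> chain(
      final SimpleConverter<SI, SM, DI, DM> first,
      final SimpleConverter<SM, SO, DM, DO> second) {
    return new SimpleConverter<SI, SO, DI, DO>() {
      private SM middleSchema;

      public SO convertSchema(SI inputSchema) {
        middleSchema = first.convertSchema(inputSchema);
        return second.convertSchema(middleSchema);
      }

      public Iterable<DO> convertRecord(SO outputSchema, DI inputRecord) {
        List<DO> out = new ArrayList<>();
        // Every intermediate record from the first converter is fed to the second.
        for (DM middle : first.convertRecord(middleSchema, inputRecord)) {
          for (DO converted : second.convertRecord(outputSchema, middle)) {
            out.add(converted);
          }
        }
        return out;
      }
    };
  }

  /** Runs the trim-then-parse chain over a batch of raw records. */
  static List<Integer> parseAll(List<String> rawRecords) {
    SimpleConverter<String, String, String, Integer> chained =
        chain(new TrimConverter(), new ParseIntConverter());
    String outputSchema = chained.convertSchema("string");
    List<Integer> results = new ArrayList<>();
    for (String raw : rawRecords) {
      for (Integer value : chained.convertRecord(outputSchema, raw)) {
        results.add(value);
      }
    }
    return results;
  }

  public static void main(String[] args) {
    System.out.println(parseAll(Arrays.asList(" 1 ", "22 "))); // prints [1, 22]
  }
}
```

The chained converter exposes the same two-method contract as its parts, so in this sketch a chain can itself be chained again.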
Nested classes/interfaces inherited from interface RecordStreamProcessor: RecordStreamProcessor.StreamProcessingException

| Constructor and Description |
|---|
| Converter() |

| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| abstract Iterable<DO> | convertRecord(SO outputSchema, DI inputRecord, WorkUnitState workUnit) Convert an input data record to an Iterable of output records conforming to the output schema of convertSchema(SI, org.apache.gobblin.configuration.WorkUnitState). |
| protected io.reactivex.Flowable<RecordEnvelope<DO>> | convertRecordEnvelope(SO outputSchema, RecordEnvelope<DI> inputRecordEnvelope, WorkUnitState workUnitState) Converts a RecordEnvelope. |
| abstract SO | convertSchema(SI inputSchema, WorkUnitState workUnit) Convert an input schema to an output schema. |
| State | getFinalState() Get final state for this object. |
| ConverterInitializer | getInitializer(State state, WorkUnitStream workUnits, int branches, int branchId) |
| ControlMessageHandler | getMessageHandler() |
| Converter<SI,SO,DI,DO> | init(WorkUnitState workUnit) Initialize this Converter. |
| RecordStreamWithMetadata<DO,SO> | processStream(RecordStreamWithMetadata<DI,SI> inputStream, WorkUnitState workUnitState) Apply conversions to the input RecordStreamWithMetadata. |

public Converter<SI,SO,DI,DO> init(WorkUnitState workUnit)
Initialize this Converter.
Parameters:
workUnit - a WorkUnitState object carrying configuration properties
Returns:
Converter instance

public void close() throws IOException
Specified by:
close in interface Closeable
Specified by:
close in interface AutoCloseable
Throws:
IOException

public abstract SO convertSchema(SI inputSchema, WorkUnitState workUnit) throws SchemaConversionException
Convert an input schema to an output schema.
Schema conversion is limited to a 1-to-1 mapping between the input and output schema.
When converting an Avro schema, make sure to preserve the schema creationTime.
Parameters:
inputSchema - input schema to be converted
workUnit - a WorkUnitState object carrying configuration properties
Throws:
SchemaConversionException - if it fails to convert the input schema

public abstract Iterable<DO> convertRecord(SO outputSchema, DI inputRecord, WorkUnitState workUnit) throws DataConversionException
Convert an input data record to an Iterable of output records conforming to the output schema of convertSchema(SI, org.apache.gobblin.configuration.WorkUnitState).
Record conversion can have a 1-to-0, 1-to-1, or 1-to-many mapping between the input and output records, as long as all the output records conform to the same converted schema.
This method covers both record type conversion and record manipulation.
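The three mapping cardinalities can be sketched with a method that only mimics the shape of convertRecord. The example assumes plain String records and omits the WorkUnitState argument; RecordCardinalitySketch and toList are hypothetical names, not part of Gobblin.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; it mirrors the shape of convertRecord without any Gobblin types. */
class RecordCardinalitySketch {

  /**
   * Splits a comma-separated input record into trimmed, non-empty tokens.
   * Every output is a plain String, so all outputs share the same schema.
   */
  static Iterable<String> convertRecord(String outputSchema, String inputRecord) {
    String trimmed = inputRecord.trim();
    if (trimmed.isEmpty()) {
      // 1-to-0: a blank record is dropped.
      return Collections.emptyList();
    }
    if (!trimmed.contains(",")) {
      // 1-to-1: a single token passes through.
      return Collections.singletonList(trimmed);
    }
    // 1-to-many: each non-empty token becomes its own output record.
    List<String> parts = new ArrayList<>();
    for (String part : trimmed.split(",")) {
      String token = part.trim();
      if (!token.isEmpty()) {
        parts.add(token);
      }
    }
    return parts;
  }

  /** Copies an Iterable into a List so results are easy to inspect. */
  static List<String> toList(Iterable<String> records) {
    List<String> list = new ArrayList<>();
    for (String record : records) {
      list.add(record);
    }
    return list;
  }

  public static void main(String[] args) {
    System.out.println(toList(convertRecord("string", "   ")));       // prints []
    System.out.println(toList(convertRecord("string", "a")));         // prints [a]
    System.out.println(toList(convertRecord("string", "a, b ,c")));   // prints [a, b, c]
  }
}
```

Returning an empty Iterable rather than null is the choice made here for the 1-to-0 case, so callers can iterate over the result without a null check.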
Parameters:
outputSchema - output schema converted using the convertSchema(SI, org.apache.gobblin.configuration.WorkUnitState) method
inputRecord - input data record to be converted
workUnit - a WorkUnitState object carrying configuration properties
Throws:
DataConversionException - if it fails to convert the input data record

protected io.reactivex.Flowable<RecordEnvelope<DO>> convertRecordEnvelope(SO outputSchema, RecordEnvelope<DI> inputRecordEnvelope, WorkUnitState workUnitState) throws DataConversionException
Converts a RecordEnvelope. This method can be overridden by implementations that need to manipulate the RecordEnvelope, such as to set watermarks or metadata.
Parameters:
outputSchema - output schema converted using the convertSchema(SI, org.apache.gobblin.configuration.WorkUnitState) method
inputRecordEnvelope - input record envelope with data record to be converted
workUnitState - a WorkUnitState object carrying configuration properties
Returns:
Flowable emitting the converted RecordEnvelopes
Throws:
DataConversionException

public State getFinalState()
Get final state for this object. By default this returns an empty State, but concrete subclasses can add information that will be added to the task state.
Specified by:
getFinalState in interface FinalState
Returns:
State.

public RecordStreamWithMetadata<DO,SO> processStream(RecordStreamWithMetadata<DI,SI> inputStream, WorkUnitState workUnitState) throws SchemaConversionException
Apply conversions to the input RecordStreamWithMetadata.
Specified by:
processStream in interface RecordStreamProcessor<SI,SO,DI,DO>
Throws:
SchemaConversionException

public ControlMessageHandler getMessageHandler()
Returns:
ControlMessageHandler to call for each ControlMessage received.

public ConverterInitializer getInitializer(State state, WorkUnitStream workUnits, int branches, int branchId)