public class SourceState extends State
Properties can be overwritten at runtime and persisted upon job completion. Persisted
properties are loaded in the next run and made available to the
Source.
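The persist-and-reload cycle described above can be illustrated with a small, self-contained sketch. It does not use the Gobblin state store: java.util.Properties is assumed here as a stand-in for the persisted state, and the class name PersistedPropertiesSketch and the key example.watermark are hypothetical, chosen only for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustration only: java.util.Properties stands in for persisted job state.
class PersistedPropertiesSketch {
    // Hypothetical property key, not a Gobblin configuration key.
    static final String WATERMARK_KEY = "example.watermark";

    // Simulates one job run: load what the previous run persisted,
    // overwrite a property at runtime, and return the newly persisted form.
    static String runOnce(String persisted, String newWatermark) {
        try {
            Properties props = new Properties();
            props.load(new StringReader(persisted));
            props.setProperty(WATERMARK_KEY, newWatermark);
            StringWriter out = new StringWriter();
            props.store(out, null);
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the watermark a later run would see after loading the persisted state.
    static String readWatermark(String persisted) {
        try {
            Properties props = new Properties();
            props.load(new StringReader(persisted));
            return props.getProperty(WATERMARK_KEY);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String afterFirstRun = runOnce("", "100");
        System.out.println(readWatermark(afterFirstRun));  // prints 100
        String afterSecondRun = runOnce(afterFirstRun, "200");
        System.out.println(readWatermark(afterSecondRun)); // prints 200
    }
}
```

Each simulated run starts from whatever the previous run persisted, which is the same contract the class description gives for SourceState properties.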
| Constructor and Description |
|---|
| SourceState()<br>Default constructor. |
| SourceState(State properties)<br>Constructor. |
| SourceState(State properties, Iterable<WorkUnitState> previousWorkUnitStates)<br>Constructor. |
| SourceState(State properties, Map<String,? extends SourceState> previousDatasetStatesByUrns, Iterable<WorkUnitState> previousWorkUnitStates)<br>Constructor. |
| Modifier and Type | Method and Description |
|---|---|
| Extract | createExtract(Extract.TableType type, String namespace, String table)<br>Deprecated. |
| WorkUnit | createWorkUnit(Extract extract)<br>Deprecated. Properties in SourceState should not be added to a WorkUnit. Having each WorkUnit contain a copy of SourceState is a waste of memory. Use WorkUnit.create(Extract). |
| boolean | equals(Object object) |
| SourceState | getPreviousDatasetState(String datasetUrn)<br>Get the state (in the form of a SourceState) from the previous job run for the dataset identified by the given dataset URN. |
| Map<String,SourceState> | getPreviousDatasetStatesByUrns() |
| SourceState | getPreviousSourceState()<br>Get the SourceState of the previous job run. |
| List<WorkUnitState> | getPreviousWorkUnitStates()<br>Get a List of previous WorkUnitStates. |
| List<WorkUnitState> | getPreviousWorkUnitStates(String datasetUrn)<br>Get a List of previous WorkUnitStates for a given datasetUrn. |
| Map<String,Iterable<WorkUnitState>> | getPreviousWorkUnitStatesByDatasetUrns()<br>Get a Map from dataset URNs (as specified by ConfigurationKeys.DATASET_URN_KEY) to the WorkUnitStates with those dataset URNs. |
| int | hashCode() |
| void | readFields(DataInput in) |
| void | write(DataOutput out) |
| void | write(DataOutput out, boolean writePreviousWorkUnitStates) |
Methods inherited from class State:
addAll, addAll, addAllIfNotExist, addAllIfNotExist, appendToListProp, appendToSetProp, contains, getId, getProp, getProp, getPropAsBoolean, getPropAsBoolean, getPropAsCaseInsensitiveSet, getPropAsCaseInsensitiveSet, getPropAsDouble, getPropAsDouble, getPropAsInt, getPropAsInt, getPropAsJsonArray, getPropAsList, getPropAsList, getPropAsLong, getPropAsLong, getPropAsSet, getPropAsSet, getPropAsShort, getPropAsShort, getPropAsShortWithRadix, getPropAsShortWithRadix, getProperties, getProperty, getProperty, getPropertyNames, overrideWith, overrideWith, removeProp, removePropsWithPrefix, setId, setProp, setProps, toString

public SourceState()

public SourceState(State properties)
Parameters:
properties - job configuration properties

public SourceState(State properties, Iterable<WorkUnitState> previousWorkUnitStates)
Parameters:
properties - job configuration properties
previousWorkUnitStates - an Iterable of WorkUnitStates of the previous job run

public SourceState(State properties, Map<String,? extends SourceState> previousDatasetStatesByUrns, Iterable<WorkUnitState> previousWorkUnitStates)
Parameters:
properties - job configuration properties
previousDatasetStatesByUrns - SourceState of the previous job run
previousWorkUnitStates - an Iterable of WorkUnitStates of the previous job run

public SourceState getPreviousSourceState()
Get the SourceState of the previous job run.
This is a convenience method for existing jobs that do not use the new feature that allows output data to
be committed on a per-dataset basis. Use of this method assumes that the job deals with a single dataset,
which uses the default dataset URN defined by ConfigurationKeys.DEFAULT_DATASET_URN.
Returns:
the SourceState of the previous job run, or null if no previous SourceState is found

public SourceState getPreviousDatasetState(String datasetUrn)
Get the state (in the form of a SourceState) from the previous job run for the dataset identified by
the given dataset URN. Useful when the dataset state store is enabled and we want to load the latest
state of a global watermark dataset.
Parameters:
datasetUrn - the dataset URN
Returns:
the dataset state (in the form of a SourceState) of the previous job run,
or null if no previous dataset state is found for the given dataset URN

public Map<String,SourceState> getPreviousDatasetStatesByUrns()
Get a Map from dataset URNs (as specified by ConfigurationKeys.DATASET_URN_KEY)
to the SourceStates with those dataset URNs. The map is materialized upon the first invocation of the method
by the source. Subsequent calls to this method return the previously materialized map.
SourceStates that do not have ConfigurationKeys.DATASET_URN_KEY set are added
to the dataset state belonging to ConfigurationKeys.DEFAULT_DATASET_URN.
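The two behaviors described above, materializing the map once on first use and falling back to a default URN when none is set, can be sketched as follows. This is a minimal illustration under stated assumptions, not the Gobblin implementation: plain property maps stand in for state objects, each URN maps to a list of them, and the class name and the two constant values are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: Map<String, String> stands in for a state object.
class DatasetStateGroupingSketch {
    // Assumed stand-ins for ConfigurationKeys.DATASET_URN_KEY and
    // ConfigurationKeys.DEFAULT_DATASET_URN; the real values may differ.
    static final String DATASET_URN_KEY = "dataset.urn";
    static final String DEFAULT_DATASET_URN = "";

    private final List<Map<String, String>> previousStates;
    // Stays null until the grouped view is first requested.
    private Map<String, List<Map<String, String>>> statesByUrn;

    DatasetStateGroupingSketch(List<Map<String, String>> previousStates) {
        this.previousStates = previousStates;
    }

    // Builds the grouped map on the first call and returns the same
    // instance on later calls. Not thread-safe.
    Map<String, List<Map<String, String>>> getStatesByUrn() {
        if (statesByUrn == null) {
            Map<String, List<Map<String, String>>> grouped = new HashMap<>();
            for (Map<String, String> state : previousStates) {
                // States without a URN are filed under the default URN.
                String urn = state.getOrDefault(DATASET_URN_KEY, DEFAULT_DATASET_URN);
                grouped.computeIfAbsent(urn, k -> new ArrayList<>()).add(state);
            }
            statesByUrn = grouped;
        }
        return statesByUrn;
    }

    public static void main(String[] args) {
        List<Map<String, String>> states = new ArrayList<>();
        states.add(Collections.singletonMap(DATASET_URN_KEY, "urn:a"));
        states.add(Collections.singletonMap(DATASET_URN_KEY, "urn:a"));
        states.add(Collections.<String, String>emptyMap());

        DatasetStateGroupingSketch sketch = new DatasetStateGroupingSketch(states);
        System.out.println(sketch.getStatesByUrn().get("urn:a").size());             // prints 2
        System.out.println(sketch.getStatesByUrn().get(DEFAULT_DATASET_URN).size()); // prints 1
    }
}
```

Caching the grouped map means repeated calls are cheap, at the cost of not reflecting states added after the first call.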
public List<WorkUnitState> getPreviousWorkUnitStates()
Get a List of previous WorkUnitStates. The list is lazily materialized upon invocation of the
method by the Source. Subsequent calls to this method return the previously
materialized list.

public List<WorkUnitState> getPreviousWorkUnitStates(String datasetUrn)
Get a List of previous WorkUnitStates for a given datasetUrn.
Parameters:
datasetUrn - the dataset URN
Returns:
List of WorkUnitStates.

public Map<String,Iterable<WorkUnitState>> getPreviousWorkUnitStatesByDatasetUrns()
Get a Map from dataset URNs (as specified by ConfigurationKeys.DATASET_URN_KEY)
to the WorkUnitStates with those dataset URNs.
WorkUnitStates that do not have ConfigurationKeys.DATASET_URN_KEY set are added
to the dataset state belonging to ConfigurationKeys.DEFAULT_DATASET_URN.
Returns:
a Map from dataset URNs to the WorkUnitStates with those dataset URNs

@Deprecated public Extract createExtract(Extract.TableType type, String namespace, String table)
Create a new Extract instance.
This method should always return a new unique Extract instance.
Parameters:
type - Extract.TableType
namespace - namespace of the table this extract belongs to
table - name of the table this extract belongs to
Returns:
a new Extract instance

@Deprecated public WorkUnit createWorkUnit(Extract extract)
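Deprecated. Properties in SourceState should not be added to a WorkUnit. Having each WorkUnit contain a copy of
SourceState is a waste of memory. Use WorkUnit.create(Extract).

public void write(DataOutput out) throws IOException
Specified by:
write in interface WritableShim
Overrides:
write in class State
Throws:
IOException

public void write(DataOutput out, boolean writePreviousWorkUnitStates) throws IOException
Throws:
IOException

public void readFields(DataInput in) throws IOException
Specified by:
readFields in interface WritableShim
Overrides:
readFields in class State
Throws:
IOException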
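The write and readFields methods serialize to a java.io.DataOutput and read back from a java.io.DataInput. The sketch below shows that general pattern on a hypothetical class, WritableStateSketch. Two details are assumptions, since this page does not document them: that the one-argument write delegates with the flag set to true, and that passing false writes an empty list of previous states rather than omitting the field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustration only: a hypothetical state object, not the Gobblin class.
class WritableStateSketch {
    String id = "";
    List<String> previousStates = new ArrayList<>();

    // Assumption: the short form writes everything.
    void write(DataOutput out) throws IOException {
        write(out, true);
    }

    // Assumption: when the flag is false, a zero count is written so that
    // readFields can always read the same layout.
    void write(DataOutput out, boolean writePreviousStates) throws IOException {
        out.writeUTF(id);
        if (writePreviousStates) {
            out.writeInt(previousStates.size());
            for (String state : previousStates) {
                out.writeUTF(state);
            }
        } else {
            out.writeInt(0);
        }
    }

    // Reads fields in exactly the order write emitted them.
    void readFields(DataInput in) throws IOException {
        id = in.readUTF();
        int count = in.readInt();
        previousStates = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            previousStates.add(in.readUTF());
        }
    }

    // Serializes to a byte array and deserializes into a fresh instance.
    static WritableStateSketch roundTrip(WritableStateSketch source, boolean writePreviousStates) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            source.write(new DataOutputStream(bytes), writePreviousStates);
            WritableStateSketch copy = new WritableStateSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Writing a count even when the previous states are skipped keeps the read side free of branching on how the data was written.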