Class AbstractCollectionEntityMergeStrategy<KEY extends Serializable>
java.lang.Object
org.onebusaway.gtfs_merge.strategies.AbstractEntityMergeStrategy
org.onebusaway.gtfs_merge.strategies.AbstractCollectionEntityMergeStrategy<KEY>
- Type Parameters:
KEY - the type for the id object class that is used to uniquely identify a collection entity
- All Implemented Interfaces:
EntityMergeStrategy
- Direct Known Subclasses:
ServiceCalendarMergeStrategy, ShapePointMergeStrategy
public abstract class AbstractCollectionEntityMergeStrategy<KEY extends Serializable>
extends AbstractEntityMergeStrategy
Abstract base class that defines common methods and properties for merging collection-like GTFS
entities. Collection-like entities are entity types where a collection of entries is identified
by a common identifier. That includes entities like
ShapePoint entries in shapes.txt,
where one shapeId identifies a series of shape points. It also includes entries like
ServiceCalendar and ServiceCalendarDate entries from calendar.txt and
calendar_dates.txt, where one service_id potentially covers multiple calendar entries.
- Author:
- bdferris
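To illustrate what "collection-like" means here, the following is a minimal, self-contained sketch that groups rows by a shared identifier. It does not use the OneBusAway classes: `ShapePointRow` and `groupByShapeId` are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CollectionEntityGrouping {

  // Hypothetical stand-in for one row of shapes.txt; not the OneBusAway ShapePoint class.
  static final class ShapePointRow {
    final String shapeId;
    final int sequence;

    ShapePointRow(String shapeId, int sequence) {
      this.shapeId = shapeId;
      this.sequence = sequence;
    }
  }

  // Groups rows by their shared identifier, keeping ids in first-seen order.
  static Map<String, List<ShapePointRow>> groupByShapeId(List<ShapePointRow> rows) {
    Map<String, List<ShapePointRow>> byId = new LinkedHashMap<>();
    for (ShapePointRow row : rows) {
      byId.computeIfAbsent(row.shapeId, id -> new ArrayList<>()).add(row);
    }
    return byId;
  }

  public static void main(String[] args) {
    List<ShapePointRow> rows = new ArrayList<>();
    rows.add(new ShapePointRow("s1", 1));
    rows.add(new ShapePointRow("s1", 2));
    rows.add(new ShapePointRow("s2", 1));

    // Each key identifies one collection entity made up of several rows.
    Map<String, List<ShapePointRow>> byId = groupByShapeId(rows);
    System.out.println(byId.keySet() + " " + byId.get("s1").size());
  }
}
```

In this picture, the map keys play the role of the KEY type parameter, and each map value is the collection that gets merged as a unit.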
-
Field Summary
Fields inherited from class org.onebusaway.gtfs_merge.strategies.AbstractEntityMergeStrategy
_duplicateDetectionStrategy, _logDuplicatesStrategy, _minElementDuplicateScoreForFuzzyMatch, _minElementsDuplicateScoreForAutoDetect, _minElementsInCommonScoreForAutoDetect
-
Constructor Summary
Constructors
-
Method Summary
Modifier and Type, Method, Description
protected String getDescription()
protected abstract Collection<KEY> getKeys(org.onebusaway.gtfs.services.GtfsRelationalDao dao)
An entity-specific method to determine the set of unique identifiers used by collection entities in the specified GTFS feed.
protected String getRawKey(KEY key)
Converts the entity collection identifier into a raw GTFS identifier string.
void merge(GtfsMergeContext context)
Perform a merge operation for the entities specified in the GtfsMergeContext.
protected EDuplicateDetectionStrategy pickBestDuplicateDetectionStrategy(GtfsMergeContext context)
Determines the best EDuplicateDetectionStrategy to use for merging entities from the current source feed into the merged output feed.
protected abstract void renameKey(GtfsMergeContext context, KEY oldId, KEY newId)
If we detect that an entity collection in the source input feed duplicates an entity collection in the merged output feed, we rename all references to the old id in the source feed to use the id of the entity in the merged feed.
protected abstract void saveElementsForKey(GtfsMergeContext context, KEY key)
Writes the specified entity collection to the merged output feed.
protected abstract double scoreDuplicateKey(GtfsMergeContext context, KEY key)
Given an id identifying an entity collection in both the source input feed and the merged output feed, produce a score between 0.0 and 1.0 identifying how likely it is that the two entity collections are one and the same, where 0.0 means they have nothing in common and 1.0 means they are exactly the same.
Methods inherited from class org.onebusaway.gtfs_merge.strategies.AbstractEntityMergeStrategy
determineDuplicateDetectionStrategy, getDuplicateRenamingStrategy, setDuplicateDetectionStrategy, setDuplicateRenamingStrategy, setLogDuplicatesStrategy
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.onebusaway.gtfs_merge.strategies.EntityMergeStrategy
getEntityTypes
-
Constructor Details
-
AbstractCollectionEntityMergeStrategy
-
-
Method Details
-
merge
Description copied from interface: EntityMergeStrategy
Perform a merge operation for the entities specified in the GtfsMergeContext. This method will be called repeatedly by the GtfsMerger, once for each input feed.
- Parameters:
context - the merge state for the current merge operation
-
getKeys
An entity-specific method to determine the set of unique identifiers used by collection entities in the specified GTFS feed.
- Parameters:
dao -
- Returns:
- the set of unique identifiers
-
pickBestDuplicateDetectionStrategy
Description copied from class: AbstractEntityMergeStrategy
Determines the best EDuplicateDetectionStrategy to use for merging entities from the current source feed into the merged output feed. Sub-classes are required to provide the most appropriate strategy for merging their particular entity type.
- Specified by:
pickBestDuplicateDetectionStrategy in class AbstractEntityMergeStrategy
- Parameters:
context -
- Returns:
-
scoreDuplicateKey
Given an id identifying an entity collection in both the source input feed and the merged output feed, produce a score between 0.0 and 1.0 identifying how likely it is that the two entity collections are one and the same, where 0.0 means they have nothing in common and 1.0 means they are exactly the same.
- Parameters:
context -
key -
- Returns:
-
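As an illustration of the 0.0 to 1.0 contract described for scoreDuplicateKey, here is a minimal sketch that scores two collections by the fraction of elements they share (the Jaccard index). This is an assumption made for the example, not the scoring used by any OneBusAway subclass, and `DuplicateScoreSketch` and `score` are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

class DuplicateScoreSketch {

  // Hypothetical scoring: shared elements divided by all distinct elements.
  // Returns 0.0 when nothing is shared and 1.0 when the two sets are equal.
  static double score(Set<String> sourceElements, Set<String> targetElements) {
    if (sourceElements.isEmpty() && targetElements.isEmpty()) {
      return 1.0; // two empty collections are treated as identical
    }
    Set<String> common = new HashSet<>(sourceElements);
    common.retainAll(targetElements);

    Set<String> union = new HashSet<>(sourceElements);
    union.addAll(targetElements);

    return (double) common.size() / union.size();
  }
}
```

A concrete strategy would derive the element sets from the entries stored under the same key in the source feed and in the merged output feed.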
getRawKey
Converts the entity collection identifier into a raw GTFS identifier string. This is what we actually use for identity duplicate detection.
- Parameters:
key -
- Returns:
-
renameKey
If we detect that an entity collection in the source input feed duplicates an entity collection in the merged output feed, we rename all references to the old id in the source feed to use the id of the entity in the merged feed. That way, when comparing other entities in the source feed that referenced the original entity collection with entities in the target feed that reference the duplicate entity, both sets of entities will appear to reference the same thing. This can be useful for similarity detection.
- Parameters:
context -
oldId - the original id in the source input feed
newId - the new id, which replaces the old in the source input feed
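The renaming step can be pictured with a small sketch that rewrites references from an old id to a new one. It is self-contained and does not touch the OneBusAway API: `TripRow` and `renameShapeReferences` are hypothetical names, and using trips that reference a shape id is only an assumed example of "references to the old id".

```java
import java.util.ArrayList;
import java.util.List;

class RenameKeySketch {

  // Hypothetical stand-in for a source-feed row that references a shape by id.
  static final class TripRow {
    final String tripId;
    String shapeId;

    TripRow(String tripId, String shapeId) {
      this.tripId = tripId;
      this.shapeId = shapeId;
    }
  }

  // Points every reference to oldId at newId and returns how many rows changed.
  static int renameShapeReferences(List<TripRow> sourceTrips, String oldId, String newId) {
    int changed = 0;
    for (TripRow trip : sourceTrips) {
      if (oldId.equals(trip.shapeId)) {
        trip.shapeId = newId;
        changed++;
      }
    }
    return changed;
  }

  public static void main(String[] args) {
    List<TripRow> trips = new ArrayList<>();
    trips.add(new TripRow("t1", "shape_old"));
    trips.add(new TripRow("t2", "shape_other"));

    int changed = renameShapeReferences(trips, "shape_old", "shape_merged");
    System.out.println(changed + " " + trips.get(0).shapeId);
  }
}
```

After such a rename, source rows and merged rows that refer to the same collection carry the same id, which is what makes later similarity comparisons line up.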
-
saveElementsForKey
Writes the specified entity collection to the merged output feed.
- Parameters:
context -
key - the identifier for the entity collection to save
-
getDescription
- Specified by:
getDescription in class AbstractEntityMergeStrategy
- Returns:
- a string description of the current entity merge strategy, typically identifying the entity-type to be merged
-