Class AbstractCollectionEntityMergeStrategy<KEY extends Serializable>

java.lang.Object
org.onebusaway.gtfs_merge.strategies.AbstractEntityMergeStrategy
org.onebusaway.gtfs_merge.strategies.AbstractCollectionEntityMergeStrategy<KEY>
Type Parameters:
KEY - the type for the id object class that is used to uniquely identify a collection entity
All Implemented Interfaces:
EntityMergeStrategy
Direct Known Subclasses:
ServiceCalendarMergeStrategy, ShapePointMergeStrategy

public abstract class AbstractCollectionEntityMergeStrategy<KEY extends Serializable> extends AbstractEntityMergeStrategy
Abstract base class that defines common methods and properties for merging collection-like GTFS entities. Collection-like entities are entity types where a collection of entries is identified by a common identifier. That includes entities like ShapePoint entries in shapes.txt, where one shapeId identifies a series of shape points. It also includes ServiceCalendar and ServiceCalendarDate entries from calendar.txt and calendar_dates.txt, where one service_id potentially covers multiple calendar entries.
Author:
bdferris
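To make the notion of a collection-like entity concrete, the following minimal sketch groups shape points by their shared shapeId. The Point class and the groupByShapeId method are hypothetical stand-ins invented for illustration; they are not part of the onebusaway API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ShapePointGrouping {

  /** Hypothetical stand-in for one row of shapes.txt (not the library's ShapePoint). */
  static final class Point {
    final String shapeId;
    final int sequence;

    Point(String shapeId, int sequence) {
      this.shapeId = shapeId;
      this.sequence = sequence;
    }
  }

  /** Groups points by shapeId, so that each id identifies one collection of points. */
  static Map<String, List<Point>> groupByShapeId(List<Point> points) {
    Map<String, List<Point>> byId = new LinkedHashMap<>();
    for (Point p : points) {
      byId.computeIfAbsent(p.shapeId, k -> new ArrayList<>()).add(p);
    }
    return byId;
  }

  public static void main(String[] args) {
    List<Point> points = Arrays.asList(
        new Point("s1", 1), new Point("s1", 2), new Point("s2", 1));
    Map<String, List<Point>> byId = groupByShapeId(points);
    System.out.println(byId.keySet());          // prints [s1, s2]
    System.out.println(byId.get("s1").size());  // prints 2
  }
}
```

In this picture, the key set of the map plays the role of the unique identifiers, and each map value is the collection that is merged or renamed as a unit.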
  • Constructor Details

    • AbstractCollectionEntityMergeStrategy

      public AbstractCollectionEntityMergeStrategy(String keyDescription)
  • Method Details

    • merge

      public void merge(GtfsMergeContext context)
      Description copied from interface: EntityMergeStrategy
      Perform a merge operation for the entities specified in the GtfsMergeContext. This method will be called repeatedly by the GtfsMerger, once for each input feed.
      Parameters:
      context - the merge state for the current merge operation
    • getKeys

      protected abstract Collection<KEY> getKeys(org.onebusaway.gtfs.services.GtfsRelationalDao dao)
      An entity-specific method to determine the set of unique identifiers used by collection entities in the specified GTFS feed.
      Parameters:
      dao - provides access to the entities of the specified GTFS feed
      Returns:
      the set of unique identifiers
    • pickBestDuplicateDetectionStrategy

      protected EDuplicateDetectionStrategy pickBestDuplicateDetectionStrategy(GtfsMergeContext context)
      Description copied from class: AbstractEntityMergeStrategy
      Determines the best EDuplicateDetectionStrategy to use for merging entities from the current source feed into the merged output feed. Sub-classes are required to provide the most appropriate strategy for merging their particular entity type.
      Specified by:
      pickBestDuplicateDetectionStrategy in class AbstractEntityMergeStrategy
      Parameters:
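      context - the merge state for the current merge operation
      Returns:
      the EDuplicateDetectionStrategy to use for the current source feed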
    • scoreDuplicateKey

      protected abstract double scoreDuplicateKey(GtfsMergeContext context, KEY key)
      Given an id identifying an entity collection in both the source input feed and the merged output feed, produces a score between 0.0 and 1.0 indicating how likely it is that the two entity collections are one and the same, where 0.0 means they have nothing in common and 1.0 means they are exactly the same.
      Parameters:
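      context - the merge state for the current merge operation
      key - the identifier shared by the two entity collections
      Returns:
      a similarity score between 0.0 and 1.0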
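      As an illustration of a score with these endpoints, the sketch below computes the Jaccard overlap of two element sets. This is an assumed technique chosen for the example, not necessarily how any subclass scores duplicates; the class and method names are hypothetical, and treating two empty collections as identical is likewise an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class CollectionOverlapScore {

  /**
   * Jaccard overlap: |intersection| / |union|.
   * Returns 0.0 for disjoint sets and 1.0 for equal sets.
   */
  static <T> double score(Set<T> source, Set<T> merged) {
    if (source.isEmpty() && merged.isEmpty()) {
      return 1.0; // assumption: two empty collections count as identical
    }
    Set<T> intersection = new HashSet<>(source);
    intersection.retainAll(merged);
    Set<T> union = new HashSet<>(source);
    union.addAll(merged);
    return (double) intersection.size() / union.size();
  }

  public static void main(String[] args) {
    Set<String> source = new HashSet<>(Arrays.asList("a", "b", "c"));
    Set<String> merged = new HashSet<>(Arrays.asList("b", "c", "d"));
    System.out.println(score(source, merged)); // prints 0.5 (2 shared of 4 total)
    System.out.println(score(source, source)); // prints 1.0
  }
}
```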
    • getRawKey

      protected String getRawKey(KEY key)
      Converts the entity collection identifier into a raw GTFS identifier string. This is what we actually use for identity duplicate detection.
      Parameters:
      key - the entity collection identifier to convert
      Returns:
      the raw GTFS identifier string
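      One way such a conversion could look is sketched below, assuming keys are either agency-scoped ids or plain values whose string form is already the raw id. ScopedId and rawKey are hypothetical names used only for this example; the library's actual conversion may differ.

```java
import java.io.Serializable;

final class RawKeySketch {

  /** Hypothetical agency-scoped id (a stand-in, not a library class). */
  static final class ScopedId implements Serializable {
    final String agencyId;
    final String id;

    ScopedId(String agencyId, String id) {
      this.agencyId = agencyId;
      this.id = id;
    }
  }

  /** Drops the agency scope when present; otherwise falls back to toString(). */
  static String rawKey(Serializable key) {
    if (key instanceof ScopedId) {
      return ((ScopedId) key).id;
    }
    return key.toString();
  }

  public static void main(String[] args) {
    System.out.println(rawKey(new ScopedId("agencyA", "shape_1"))); // prints shape_1
    System.out.println(rawKey("weekday"));                          // prints weekday
  }
}
```

      Comparing raw strings in this way would let two feeds that use different agency scopes still match on the underlying GTFS id.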
    • renameKey

      protected abstract void renameKey(GtfsMergeContext context, KEY oldId, KEY newId)
      If an entity collection in the source input feed is found to duplicate an entity collection in the merged output feed, all references to the old id in the source feed are renamed to use the id of the entity in the merged feed. That way, when other entities in the source feed that referenced the original entity collection are compared with entities in the target feed that reference the duplicate, both sets of entities appear to reference the same thing. This can be useful for similarity detection.
      Parameters:
      context - the merge state for the current merge operation
      oldId - the original id in the source input feed
      newId - the new id, which replaces the old in the source input feed
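      The renaming described above can be sketched as follows, using trips that reference a shape id. The Trip class and renameShapeId method are hypothetical and exist only to illustrate the idea of rewriting references in the source feed.

```java
import java.util.Arrays;
import java.util.List;

final class RenameReferences {

  /** Hypothetical source-feed entity that references a shape collection by id. */
  static final class Trip {
    final String tripId;
    String shapeId;

    Trip(String tripId, String shapeId) {
      this.tripId = tripId;
      this.shapeId = shapeId;
    }
  }

  /** Points every trip that referenced oldId at newId; returns how many changed. */
  static int renameShapeId(List<Trip> sourceTrips, String oldId, String newId) {
    int updated = 0;
    for (Trip trip : sourceTrips) {
      if (oldId.equals(trip.shapeId)) {
        trip.shapeId = newId;
        updated++;
      }
    }
    return updated;
  }

  public static void main(String[] args) {
    List<Trip> trips = Arrays.asList(
        new Trip("t1", "s1"), new Trip("t2", "s1"), new Trip("t3", "s2"));
    System.out.println(renameShapeId(trips, "s1", "shape_A")); // prints 2
    System.out.println(trips.get(0).shapeId);                  // prints shape_A
    System.out.println(trips.get(2).shapeId);                  // prints s2
  }
}
```

      After the rename, a source trip and a merged-feed trip that follow the same shape carry the same shape id, which makes later comparisons between them simpler.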
    • saveElementsForKey

      protected abstract void saveElementsForKey(GtfsMergeContext context, KEY key)
      Writes the specified entity collection to the merged output feed.
      Parameters:
      context - the merge state for the current merge operation
      key - the identifier for the entity collection to save
    • getDescription

      protected String getDescription()
      Specified by:
      getDescription in class AbstractEntityMergeStrategy
      Returns:
      a string description of the current entity merge strategy, typically identifying the entity-type to be merged