Class GriDBSCAN.Instance<V extends elki.data.NumberVector>

  • Type Parameters:
    V - Vector type
    Enclosing class:
    GriDBSCAN<V extends elki.data.NumberVector>

    protected static class GriDBSCAN.Instance<V extends elki.data.NumberVector>
    extends java.lang.Object
    Instance for a single run.
    Author:
    Erich Schubert
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private Border[] borders
      Border identifier objects (shared to conserve memory).
      protected int[] cells
      Number of cells per dimension.
      private elki.database.datastore.WritableDataStore<Assignment> clusterids
      Cluster assignments.
      private Core[] cores
      Core identifier objects (shared to conserve memory).
      protected int dim
      Dimensionality.
      protected elki.distance.Distance<? super V> distance
      Distance function used.
      protected double[][] domain
      Value domain.
      protected double epsilon
      Holds the epsilon radius threshold.
      (package private) it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap<elki.database.ids.ModifiableDBIDs> grid
      Data grid partitioning.
      protected double gridwidth
      Width of the grid cells.
      protected int minpts
      Minimum number of points in the epsilon-neighborhood required for a core point (minPts).
      protected static int NOISE
      Noise IDs.
      protected double[] offset
      Grid offset.
      private boolean overflown
      Indicates that the number of grid cells has overflown.
      private elki.database.datastore.WritableIntegerDataStore temporary
      Temporary assignments of a single run.
      protected static int UNPROCESSED
      Unprocessed IDs.
    • Constructor Summary

      Constructors 
      Constructor Description
      Instance​(elki.distance.Distance<? super V> distance, double epsilon, int minpts, double gridwidth)
      Constructor.
    • Method Summary

      Modifier and Type Method Description
      protected void buildGrid​(elki.database.relation.Relation<V> relation, int numcells, double[] offset)
      Build the data grid.
      protected Clustering<Model> buildResult​(elki.database.ids.DBIDs ids, int clusterid)
      Assemble the clustering result.
      protected int checkGridCellSizes​(int size, long numcell)
      Perform some sanity checks on the grid cells.
      private long computeGridBaseOffsets​(int size)
      Compute the grid base offset.
      protected int expandCluster​(elki.database.ids.DBIDRef seed, int clusterid, elki.database.datastore.WritableIntegerDataStore clusterids, elki.database.ids.ModifiableDoubleDBIDList neighbors, elki.database.ids.ArrayModifiableDBIDs activeSet, elki.database.query.range.RangeSearcher<elki.database.ids.DBIDRef> rq, elki.logging.progress.FiniteProgress pprog)
      Set-based expand cluster implementation.
      private void insertIntoGrid​(elki.database.ids.DBIDRef id, V obj, int d, int v)
      Insert a single object into the grid, recursing over the dimensions; the object may be stored in multiple cells (at most 2^dim, where dim is the dimensionality).
      protected void mergeClusterInformation​(elki.database.ids.ModifiableDBIDs cellids, elki.database.datastore.WritableIntegerDataStore temporary, elki.database.datastore.WritableDataStore<Assignment> clusterids)
      Merge cluster information.
      protected int processCorePoint​(elki.database.ids.DBIDRef seed, elki.database.ids.DoubleDBIDList newneighbors, int clusterid, elki.database.datastore.WritableIntegerDataStore clusterids, elki.database.ids.ArrayModifiableDBIDs activeSet)
      Process a single core point.
      Clustering<Model> run​(elki.database.relation.Relation<V> relation)
      Runs the GriDBSCAN algorithm on the given relation.
      private int runDBSCANOnCell​(elki.database.ids.DBIDs cellids, elki.database.relation.Relation<V> relation, elki.database.ids.ModifiableDoubleDBIDList neighbors, elki.database.ids.ArrayModifiableDBIDs activeSet, int clusterid)  
      private void updateCoreBorderObjects​(int clusterid)
      Update the shared arrays of core and border objects (to conserve memory).
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • distance

        protected elki.distance.Distance<? super V> distance
        Distance function used.
      • epsilon

        protected double epsilon
        Holds the epsilon radius threshold.
      • minpts

        protected int minpts
        Minimum number of points in the epsilon-neighborhood required for a core point (minPts).
      • gridwidth

        protected double gridwidth
        Width of the grid cells. Must be at least 2 * epsilon.
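The width constraint can be illustrated along a single axis. The sketch below is not ELKI code: the class `GridWidthSketch` and its method are hypothetical, and it assumes that every cell is extended by epsilon on both sides so that neighborhoods near a cell border stay complete. Under that assumption, a grid width of at least 2 * epsilon means a coordinate lies in at most two extended cells per dimension.

```java
/** Hypothetical illustration; not part of ELKI. */
class GridWidthSketch {
  /**
   * Counts the cells along one axis whose epsilon-extended interval
   * [lo - epsilon, hi + epsilon) contains the coordinate x.
   */
  static int overlappingCells(double x, double min, int numCells, double gridwidth, double epsilon) {
    int count = 0;
    for (int c = 0; c < numCells; c++) {
      double lo = min + c * gridwidth - epsilon;       // extended lower bound of cell c
      double hi = min + (c + 1) * gridwidth + epsilon; // extended upper bound of cell c
      if (x >= lo && x < hi) {
        count++;
      }
    }
    return count;
  }
}
```

With a grid width of 1.0 and four cells starting at 0.0, the coordinate 1.5 falls into one extended cell for epsilon = 0.25, two for epsilon = 0.5, and three for epsilon = 0.75, where the constraint is violated.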
      • domain

        protected double[][] domain
        Value domain.
      • dim

        protected int dim
        Dimensionality.
      • offset

        protected double[] offset
        Grid offset.
      • cells

        protected int[] cells
        Number of cells per dimension.
      • grid

        it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap<elki.database.ids.ModifiableDBIDs> grid
        Data grid partitioning.
      • cores

        private Core[] cores
        Core identifier objects (shared to conserve memory).
      • borders

        private Border[] borders
        Border identifier objects (shared to conserve memory).
      • clusterids

        private elki.database.datastore.WritableDataStore<Assignment> clusterids
        Cluster assignments.
      • temporary

        private elki.database.datastore.WritableIntegerDataStore temporary
        Temporary assignments of a single run.
      • overflown

        private boolean overflown
        Indicates that the number of grid cells has overflown.
    • Constructor Detail

      • Instance

        public Instance​(elki.distance.Distance<? super V> distance,
                        double epsilon,
                        int minpts,
                        double gridwidth)
        Constructor.
        Parameters:
        distance - Distance function
        epsilon - Epsilon
        minpts - MinPts
        gridwidth - Grid width
    • Method Detail

      • run

        public Clustering<Model> run​(elki.database.relation.Relation<V> relation)
        Runs the GriDBSCAN algorithm on the given relation.
        Parameters:
        relation - Relation to process
        Returns:
        Clustering result
      • runDBSCANOnCell

        private int runDBSCANOnCell​(elki.database.ids.DBIDs cellids,
                                    elki.database.relation.Relation<V> relation,
                                    elki.database.ids.ModifiableDoubleDBIDList neighbors,
                                    elki.database.ids.ArrayModifiableDBIDs activeSet,
                                    int clusterid)
      • updateCoreBorderObjects

        private void updateCoreBorderObjects​(int clusterid)
        Update the shared arrays of core and border objects (to conserve memory).
        Parameters:
        clusterid - Number of clusters
      • computeGridBaseOffsets

        private long computeGridBaseOffsets​(int size)
        Compute the grid base offset.
        Parameters:
        size - Data set size
        Returns:
        Total number of grid cells
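How the cell counts relate to the `domain`, `cells`, and `overflown` fields can be sketched as follows. This is a minimal illustration under assumptions, not the ELKI implementation: the class `GridSizeSketch` is hypothetical, `domain[d]` is assumed to hold `{min, max}` for dimension `d`, and an overflow is reported as `-1` rather than through a flag.

```java
/** Hypothetical illustration; not part of ELKI. */
class GridSizeSketch {
  /** Number of cells per dimension, assuming domain[d] = {min, max}. */
  static int[] cellsPerDimension(double[][] domain, double gridwidth) {
    int[] cells = new int[domain.length];
    for (int d = 0; d < domain.length; d++) {
      double extent = domain[d][1] - domain[d][0];
      // At least one cell, even for a constant attribute.
      cells[d] = Math.max(1, (int) Math.ceil(extent / gridwidth));
    }
    return cells;
  }

  /** Total number of cells, or -1 if the product does not fit into a long. */
  static long totalCells(int[] cells) {
    long total = 1L;
    for (int c : cells) {
      try {
        total = Math.multiplyExact(total, (long) c);
      } catch (ArithmeticException e) {
        return -1L; // the cell count has overflown
      }
    }
    return total;
  }
}
```

A caller could treat the `-1` result the way the `overflown` field is described: as a signal that cells can no longer be enumerated with a single linear index.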
      • buildGrid

        protected void buildGrid​(elki.database.relation.Relation<V> relation,
                                 int numcells,
                                 double[] offset)
        Build the data grid.
        Parameters:
        relation - Data relation
        numcells - Total number of cells
        offset - Offset
      • insertIntoGrid

        private void insertIntoGrid​(elki.database.ids.DBIDRef id,
                                    V obj,
                                    int d,
                                    int v)
        Insert a single object into the grid, recursing over the dimensions; the object may be stored in multiple cells (at most 2^dim, where dim is the dimensionality).
        Parameters:
        id - Object ID
        obj - Object
        d - Current dimension
        v - Current cell value
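The recursion over dimensions can be sketched as below. This is an illustration under assumptions rather than the ELKI code: the class `GridInsertSketch` is hypothetical, the linear cell id is assumed to be built as `v * cells[d] + c` while descending through the dimensions, and a point is assumed to be copied into a neighboring cell whenever it lies within epsilon of the shared border. The parameters `d` and `v` mirror the roles documented above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not part of ELKI. */
class GridInsertSketch {
  final double[] min;     // assumed lower bound of the value domain per dimension
  final int[] cells;      // number of cells per dimension
  final double gridwidth; // cell width
  final double epsilon;   // neighborhood radius

  GridInsertSketch(double[] min, int[] cells, double gridwidth, double epsilon) {
    if (gridwidth < 2 * epsilon) {
      throw new IllegalArgumentException("gridwidth must be at least 2 * epsilon");
    }
    this.min = min;
    this.cells = cells;
    this.gridwidth = gridwidth;
    this.epsilon = epsilon;
  }

  /** Linear ids of all cells that receive a copy of the point. */
  List<Long> cellsFor(double[] point) {
    List<Long> out = new ArrayList<>();
    collect(point, 0, 0L, out);
    return out;
  }

  private void collect(double[] point, int d, long v, List<Long> out) {
    if (d == point.length) {
      out.add(v); // all dimensions fixed: v is the final cell id
      return;
    }
    // Home cell in dimension d, clamped to the valid range.
    int c = (int) Math.floor((point[d] - min[d]) / gridwidth);
    c = Math.min(cells[d] - 1, Math.max(0, c));
    long base = v * cells[d];
    collect(point, d + 1, base + c, out);
    double within = point[d] - min[d] - c * gridwidth; // position inside the home cell
    // Cells are half-open, so the lower test is strict and the upper one is not.
    if (within < epsilon && c > 0) {
      collect(point, d + 1, base + c - 1, out);
    }
    if (gridwidth - within <= epsilon && c + 1 < cells[d]) {
      collect(point, d + 1, base + c + 1, out);
    }
  }
}
```

Because the width is at least 2 * epsilon, at most one of the two border tests succeeds per dimension, so the recursion branches at most twice per level and yields at most 2^dim cells.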
      • checkGridCellSizes

        protected int checkGridCellSizes​(int size,
                                         long numcell)
        Perform some sanity checks on the grid cells.
        Parameters:
        size - Relation size
        numcell - Number of cells
        Returns:
        Number of cells with at least minPts points
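The documentation does not state which sanity checks are performed. The sketch below assumes two plausible ones: counting the cells that hold at least minPts points, and detecting a single cell that holds half of the data or more (which would defeat the purpose of partitioning). The class `CellSizeCheckSketch` and the use of `Map<Long, List<Integer>>` in place of the fastutil grid are hypothetical simplifications.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; not part of ELKI. */
class CellSizeCheckSketch {
  /** Number of cells that contain at least minpts points. */
  static int countDenseCells(Map<Long, List<Integer>> grid, int minpts) {
    int dense = 0;
    for (List<Integer> cell : grid.values()) {
      if (cell.size() >= minpts) {
        dense++;
      }
    }
    return dense;
  }

  /** True if some cell holds at least half of the size objects in the relation. */
  static boolean hasDominantCell(Map<Long, List<Integer>> grid, int size) {
    for (List<Integer> cell : grid.values()) {
      if (2L * cell.size() >= size) {
        return true;
      }
    }
    return false;
  }
}
```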
      • expandCluster

        protected int expandCluster​(elki.database.ids.DBIDRef seed,
                                    int clusterid,
                                    elki.database.datastore.WritableIntegerDataStore clusterids,
                                    elki.database.ids.ModifiableDoubleDBIDList neighbors,
                                    elki.database.ids.ArrayModifiableDBIDs activeSet,
                                    elki.database.query.range.RangeSearcher<elki.database.ids.DBIDRef> rq,
                                    elki.logging.progress.FiniteProgress pprog)
        Set-based expand cluster implementation.
        Parameters:
        seed - Seed point to expand the cluster from.
        clusterid - ID of the current cluster.
        clusterids - Current object to cluster mapping.
        neighbors - Neighbors acquired by initial getNeighbors call.
        activeSet - Set to manage active candidates.
        rq - Range query
        pprog - Object progress
        Returns:
        cluster size
      • processCorePoint

        protected int processCorePoint​(elki.database.ids.DBIDRef seed,
                                       elki.database.ids.DoubleDBIDList newneighbors,
                                       int clusterid,
                                       elki.database.datastore.WritableIntegerDataStore clusterids,
                                       elki.database.ids.ArrayModifiableDBIDs activeSet)
        Process a single core point.
        Parameters:
        seed - Point to process
        newneighbors - New neighbors
        clusterid - Cluster to add to
        clusterids - Cluster assignment storage.
        activeSet - Active set of cluster seeds
        Returns:
        Number of new points added to cluster
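The interplay of `expandCluster` and `processCorePoint` can be sketched with a generic set-based DBSCAN expansion. This is not the ELKI code: it works on one-dimensional data with a brute-force range query, uses plain `int` indexes instead of DBIDs, and the sentinel values chosen for `UNPROCESSED` and `NOISE` are assumptions. Cluster ids passed in must differ from both sentinels.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not part of ELKI. */
class ExpandClusterSketch {
  static final int UNPROCESSED = 0; // assumed sentinel value
  static final int NOISE = -1;      // assumed sentinel value

  /** Brute-force range query on 1-d data; includes the query point itself. */
  static List<Integer> range(double[] data, int q, double eps) {
    List<Integer> result = new ArrayList<>();
    for (int i = 0; i < data.length; i++) {
      if (Math.abs(data[i] - data[q]) <= eps) {
        result.add(i);
      }
    }
    return result;
  }

  /** Expands a cluster from seed; returns the cluster size, or 0 if seed is not a core point. */
  static int expandCluster(double[] data, int seed, int clusterid, int[] labels, double eps, int minpts) {
    List<Integer> neighbors = range(data, seed, eps);
    if (neighbors.size() < minpts) {
      labels[seed] = NOISE; // may later be relabeled as a border point
      return 0;
    }
    ArrayDeque<Integer> activeSet = new ArrayDeque<>();
    labels[seed] = clusterid;
    int size = 1 + processCorePoint(neighbors, clusterid, labels, activeSet);
    while (!activeSet.isEmpty()) {
      int cur = activeSet.poll();
      List<Integer> curNeighbors = range(data, cur, eps);
      if (curNeighbors.size() >= minpts) { // only core points extend the cluster
        size += processCorePoint(curNeighbors, clusterid, labels, activeSet);
      }
    }
    return size;
  }

  /** Adds the neighbors of a core point to the cluster; returns the number of new points. */
  static int processCorePoint(List<Integer> neighbors, int clusterid, int[] labels, ArrayDeque<Integer> activeSet) {
    int added = 0;
    for (int n : neighbors) {
      if (labels[n] == UNPROCESSED) {
        labels[n] = clusterid;
        activeSet.add(n); // still needs its own range query
        added++;
      } else if (labels[n] == NOISE) {
        labels[n] = clusterid; // border point: known not to be core, so not expanded
        added++;
      }
    }
    return added;
  }
}
```

The active set holds only points whose neighborhood has not been queried yet, so every point is queried at most once per run.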
      • mergeClusterInformation

        protected void mergeClusterInformation​(elki.database.ids.ModifiableDBIDs cellids,
                                               elki.database.datastore.WritableIntegerDataStore temporary,
                                               elki.database.datastore.WritableDataStore<Assignment> clusterids)
        Merge cluster information.
        Parameters:
        cellids - IDs in current cell
        temporary - Temporary assignments
        clusterids - Merged cluster assignment
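Merging per-cell results can be pictured with a union-find structure over cluster ids. This is a swapped-in technique for illustration and may differ from how ELKI tracks merges through its shared `Core` and `Border` objects. The class `MergeSketch`, the `isCore` array, and the use of `-1` for "noise or not yet assigned" are all assumptions. The idea: a point stored in several overlapping cells links the clusters it was assigned to, but only if it is a core point, since a border point may legitimately touch two separate clusters.

```java
/** Hypothetical illustration; not part of ELKI. */
class MergeSketch {
  final int[] parent; // union-find forest over cluster ids

  MergeSketch(int numClusters) {
    parent = new int[numClusters];
    for (int i = 0; i < numClusters; i++) {
      parent[i] = i;
    }
  }

  /** Representative cluster id, with path halving. */
  int find(int x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
    return x;
  }

  /** Joins two clusters; the smaller id becomes the representative. */
  void union(int a, int b) {
    int ra = find(a), rb = find(b);
    if (ra != rb) {
      parent[Math.max(ra, rb)] = Math.min(ra, rb);
    }
  }

  /**
   * Folds the assignments of one cell into the merged assignment.
   * temporary[id] is the cluster found in the current cell (-1 for noise),
   * merged[id] the assignment kept so far (-1 if none).
   */
  void mergeCell(int[] cellids, int[] temporary, boolean[] isCore, int[] merged) {
    for (int id : cellids) {
      int local = temporary[id];
      if (local < 0) {
        continue; // noise in this cell carries no cluster information
      }
      if (merged[id] < 0) {
        merged[id] = local;
      } else if (isCore[id]) {
        union(merged[id], local); // a shared core point connects both clusters
      }
    }
  }
}
```

After all cells are processed, `find(merged[id])` gives the final cluster of each assigned point.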
      • buildResult

        protected Clustering<Model> buildResult​(elki.database.ids.DBIDs ids,
                                                int clusterid)
        Assemble the clustering result.
        Parameters:
        ids - Object IDs
        clusterid - Largest valid cluster number
        Returns:
        Clustering