Class COF<O>

  • Type Parameters:
    O - Object type
    All Implemented Interfaces:
    elki.Algorithm, OutlierAlgorithm

    @Title("COF: Connectivity-based Outlier Factor")
    @Reference(authors="J. Tang, Z. Chen, A. W. C. Fu, D. W. Cheung",
               title="Enhancing effectiveness of outlier detections for low density patterns",
               booktitle="In Advances in Knowledge Discovery and Data Mining",
               url="https://doi.org/10.1007/3-540-47887-6_53",
               bibkey="DBLP:conf/pakdd/TangCFC02")
    public class COF<O>
    extends java.lang.Object
    implements OutlierAlgorithm
    Connectivity-based Outlier Factor (COF).

    Reference:

    J. Tang, Z. Chen, A. W. C. Fu, D. W. Cheung
    Enhancing effectiveness of outlier detections for low density patterns.
    Advances in Knowledge Discovery and Data Mining.

    Since:
    0.7.0
    Author:
    Erich Schubert
    • Nested Class Summary

      • Nested classes/interfaces inherited from interface elki.Algorithm

        elki.Algorithm.Utils
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected elki.distance.Distance<? super O> distance
      Distance function used.
      protected int k
      The number of neighbors to query, including the query point itself.
      private static elki.logging.Logging LOG
      The logger for this class.
    • Constructor Summary

      Constructors 
      Constructor Description
      COF​(elki.distance.Distance<? super O> distance, int k)
      Constructor.
    • Method Summary

      Modifier and Type Method Description
      protected void computeAverageChainingDistances​(elki.database.query.knn.KNNSearcher<elki.database.ids.DBIDRef> knnq, elki.database.query.distance.DistanceQuery<O> dq, elki.database.ids.DBIDs ids, elki.database.datastore.WritableDoubleDataStore acds)
      Computes the average chaining distance, the average length of a path through the given set of points to each target.
      private void computeCOFScores​(elki.database.query.knn.KNNSearcher<elki.database.ids.DBIDRef> knnq, elki.database.ids.DBIDs ids, elki.database.datastore.DoubleDataStore acds, elki.database.datastore.WritableDoubleDataStore cofs, elki.math.DoubleMinMax cofminmax)
      Computes the connectivity-based outlier factors.
      elki.data.type.TypeInformation[] getInputTypeRestriction()  
      OutlierResult run​(elki.database.relation.Relation<O> relation)
      Runs the COF algorithm on the given relation.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • LOG

        private static final elki.logging.Logging LOG
        The logger for this class.
      • distance

        protected elki.distance.Distance<? super O> distance
        Distance function used.
      • k

        protected int k
        The number of neighbors to query, including the query point itself (in contrast to the constructor parameter k, which excludes it).
    • Constructor Detail

      • COF

        public COF​(elki.distance.Distance<? super O> distance,
                   int k)
        Constructor.
        Parameters:
        distance - the neighborhood distance function
        k - the number of neighbors to use for comparison (excluding the query point)
    • Method Detail

      • run

        public OutlierResult run​(elki.database.relation.Relation<O> relation)
        Runs the COF algorithm on the given relation.
        Parameters:
        relation - Data to process
        Returns:
        COF outlier result
      • computeAverageChainingDistances

        protected void computeAverageChainingDistances​(elki.database.query.knn.KNNSearcher<elki.database.ids.DBIDRef> knnq,
                                                       elki.database.query.distance.DistanceQuery<O> dq,
                                                       elki.database.ids.DBIDs ids,
                                                       elki.database.datastore.WritableDoubleDataStore acds)
        Computes the average chaining distance, the average length of a path through the given set of points to each target. The authors of COF approximate this value with a weighted mean that assumes every object is reached from the previous point in the chain. In practice, however, every point could be most cheaply reachable from the first point, in which case this weighting makes little sense.

        TODO: can we accelerate this by using the kNN of the neighbors?

        Parameters:
        knnq - KNN query
        dq - Distance query
        ids - IDs to process
        acds - Storage for average chaining distances
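
        The weighted mean described above can be illustrated with a minimal, self-contained sketch. This is not the ELKI implementation: it assumes the definition commonly given for the cited paper, ac-dist(p) = sum over i = 1..r-1 of 2(r-i) / (r(r-1)) * dist(e_i), where r is the size of the point set including p and e_i is the i-th link added to the chain. Euclidean distance, plain arrays, and the names CofSketch, dist, and averageChainingDistance are assumptions made for illustration only.

```java
/** Hypothetical illustration; not part of ELKI. */
final class CofSketch {
  /** Euclidean distance between two points of equal dimension. */
  static double dist(double[] a, double[] b) {
    double s = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      s += d * d;
    }
    return Math.sqrt(s);
  }

  /**
   * Average chaining distance of pts[0] within the set pts (query point first).
   * The chain grows greedily: each step links the unvisited point that is
   * closest to any point already in the chain. The i-th link is weighted
   * with (r - i), so early links count more than late ones.
   */
  static double averageChainingDistance(double[][] pts) {
    final int r = pts.length;
    if (r < 2) {
      return 0.;
    }
    double[] best = new double[r]; // cheapest known link from the chain to each point
    boolean[] done = new boolean[r];
    done[0] = true;
    for (int i = 1; i < r; i++) {
      best[i] = dist(pts[0], pts[i]);
    }
    double sum = 0;
    for (int step = 1; step < r; step++) {
      // Pick the unvisited point with the cheapest link into the chain.
      int min = -1;
      for (int i = 1; i < r; i++) {
        if (!done[i] && (min < 0 || best[i] < best[min])) {
          min = i;
        }
      }
      sum += best[min] * (r - step); // decreasing weights
      done[min] = true;
      // The newly added point may offer cheaper links to the remaining points.
      for (int i = 1; i < r; i++) {
        if (!done[i]) {
          best[i] = Math.min(best[i], dist(pts[min], pts[i]));
        }
      }
    }
    return sum / (0.5 * r * (r - 1)); // normalize by the sum of weights
  }
}
```

        For the one-dimensional points 0, 1, 3 with query point 0, the links cost 1 and 2 and receive weights 2 and 1, so the sketch yields (2 * 1 + 1 * 2) / 3 = 4/3.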
      • computeCOFScores

        private void computeCOFScores​(elki.database.query.knn.KNNSearcher<elki.database.ids.DBIDRef> knnq,
                                      elki.database.ids.DBIDs ids,
                                      elki.database.datastore.DoubleDataStore acds,
                                      elki.database.datastore.WritableDoubleDataStore cofs,
                                      elki.math.DoubleMinMax cofminmax)
        Computes the connectivity-based outlier factors.
        Parameters:
        knnq - KNN query
        ids - IDs to process
        acds - Average chaining distances
        cofs - Connectivity outlier factor storage
        cofminmax - Score minimum/maximum tracker
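
        As a rough illustration of the score itself, the sketch below assumes the usual definition of COF as the ratio of a point's average chaining distance to the mean average chaining distance of its neighbors. It is not the ELKI code; the class CofScoreSketch, the method cofScore, and the handling of a zero denominator are assumptions made for this example.

```java
/** Hypothetical illustration; not part of ELKI. */
final class CofScoreSketch {
  /**
   * COF of a point, given its own average chaining distance and those of its
   * neighbors (query point excluded). Values near 1 indicate an inlier,
   * larger values an outlier.
   */
  static double cofScore(double acdOfPoint, double[] acdOfNeighbors) {
    double sum = 0;
    for (double a : acdOfNeighbors) {
      sum += a;
    }
    if (sum > 0) {
      // acd(p) divided by the mean acd of the neighbors
      return acdOfPoint * acdOfNeighbors.length / sum;
    }
    // Assumed convention for duplicates: all-zero neighborhoods score 1,
    // unless the point itself has a positive chaining distance.
    return acdOfPoint > 0 ? Double.POSITIVE_INFINITY : 1.0;
  }
}
```

        With an average chaining distance of 2.0 for the point and 1.0 for each of two neighbors, the ratio is 2.0, meaning the point is connected to its neighborhood half as tightly as its neighbors are to theirs.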
      • getInputTypeRestriction

        public elki.data.type.TypeInformation[] getInputTypeRestriction()
        Specified by:
        getInputTypeRestriction in interface elki.Algorithm