Class HiCS

  • All Implemented Interfaces:
    elki.Algorithm, OutlierAlgorithm

    @Title("HiCS: High Contrast Subspaces for Density-Based Outlier Ranking")
    @Description("Algorithm to compute High Contrast Subspaces in a database as a pre-processing step for density-based outlier ranking methods.")
    @Reference(authors="F. Keller, E. M\u00fcller, K. B\u00f6hm",
               title="HiCS: High Contrast Subspaces for Density-Based Outlier Ranking",
               booktitle="Proc. IEEE 28th Int. Conf. on Data Engineering (ICDE 2012)",
               url="https://doi.org/10.1109/ICDE.2012.88",
               bibkey="DBLP:conf/icde/KellerMB12")
    public class HiCS
    extends java.lang.Object
    implements OutlierAlgorithm
    Algorithm to compute High Contrast Subspaces for Density-Based Outlier Ranking.

    Reference:

    F. Keller, E. Müller, K. Böhm
    HiCS: High Contrast Subspaces for Density-Based Outlier Ranking
    Proc. IEEE 28th Int. Conf. on Data Engineering (ICDE 2012)

    Since:
    0.5.0
    Author:
    Jan Brusis, Erich Schubert
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  HiCS.HiCSSubspace
      BitSet that holds a contrast value as field.
      • Nested classes/interfaces inherited from interface elki.Algorithm

        elki.Algorithm.Utils
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private double alpha
      Alpha threshold.
      private int cutoff
      Candidates limit.
      private static elki.logging.Logging LOG
      The Logger for this class.
      private int m
      Monte-Carlo iterations.
      private static int MAX_RETRIES
      Maximum number of retries.
      private OutlierAlgorithm outlierAlgorithm
      Outlier detection algorithm.
      private elki.utilities.random.RandomFactory rnd
      Random generator.
      private elki.math.statistics.tests.GoodnessOfFitTest statTest
      Statistical test to use.
    • Constructor Summary

      Constructors 
      Constructor Description
      HiCS​(int m, double alpha, OutlierAlgorithm outlierAlgorithm, elki.math.statistics.tests.GoodnessOfFitTest statTest, int cutoff, elki.utilities.random.RandomFactory rnd)
      Constructor.
    • Method Summary

      Methods
      Modifier and Type Method Description
      private java.util.ArrayList<elki.database.ids.ArrayDBIDs> buildOneDimIndexes​(elki.database.relation.Relation<? extends elki.data.NumberVector> relation)
      Builds an "index structure" for every attribute: for each dimension, a ModifiableArray of all DBIDs in the database is sorted by that dimension, and the sorted arrays are stored in a list.
      private void calculateContrast​(elki.database.relation.Relation<? extends elki.data.NumberVector> relation, HiCS.HiCSSubspace subspace, java.util.ArrayList<elki.database.ids.ArrayDBIDs> subspaceIndex, java.util.Random random)
      Calculates the actual contrast of a given subspace.
      private java.util.Set<HiCS.HiCSSubspace> calculateSubspaces​(elki.database.relation.Relation<? extends elki.data.NumberVector> relation, java.util.ArrayList<elki.database.ids.ArrayDBIDs> subspaceIndex, java.util.Random random)
      Identifies high contrast subspaces in a given full-dimensional database.
      elki.data.type.TypeInformation[] getInputTypeRestriction()  
      OutlierResult run​(elki.database.relation.Relation<? extends elki.data.NumberVector> relation)
      Perform HiCS on a given database.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • LOG

        private static final elki.logging.Logging LOG
        The Logger for this class.
      • MAX_RETRIES

        private static final int MAX_RETRIES
        Maximum number of retries.
        See Also:
        Constant Field Values
      • m

        private int m
        Monte-Carlo iterations.
      • alpha

        private double alpha
        Alpha threshold.
      • outlierAlgorithm

        private OutlierAlgorithm outlierAlgorithm
        Outlier detection algorithm.
      • statTest

        private elki.math.statistics.tests.GoodnessOfFitTest statTest
        Statistical test to use.
      • cutoff

        private int cutoff
        Candidates limit.
      • rnd

        private elki.utilities.random.RandomFactory rnd
        Random generator.
    • Constructor Detail

      • HiCS

        public HiCS​(int m,
                    double alpha,
                    OutlierAlgorithm outlierAlgorithm,
                    elki.math.statistics.tests.GoodnessOfFitTest statTest,
                    int cutoff,
                    elki.utilities.random.RandomFactory rnd)
        Constructor.
        Parameters:
        m - Number of Monte-Carlo iterations
        alpha - Alpha threshold
        outlierAlgorithm - Inner outlier detection algorithm
        statTest - Test to use
        cutoff - Candidate limit
        rnd - Random generator
    • Method Detail

      • getInputTypeRestriction

        public elki.data.type.TypeInformation[] getInputTypeRestriction()
        Specified by:
        getInputTypeRestriction in interface elki.Algorithm
      • run

        public OutlierResult run​(elki.database.relation.Relation<? extends elki.data.NumberVector> relation)
        Perform HiCS on a given database.
        Parameters:
        relation - the database
        Returns:
        The aggregated resulting scores that were assigned by the given outlier detection algorithm
      • buildOneDimIndexes

        private java.util.ArrayList<elki.database.ids.ArrayDBIDs> buildOneDimIndexes​(elki.database.relation.Relation<? extends elki.data.NumberVector> relation)
        Builds an "index structure" for every attribute: for each dimension, a ModifiableArray of all DBIDs in the database is sorted by that dimension, and the sorted arrays are stored in a list.
        Parameters:
        relation - Relation to index
        Returns:
        List of sorted objects
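
The per-attribute "index structures" amount to one sorted list of object ids per dimension. A minimal sketch in plain Java, assuming the data is a double[][] and the ids are row numbers; the class OneDimIndexSketch and its method are hypothetical and not part of ELKI:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not the ELKI implementation.
final class OneDimIndexSketch {
  /** Returns, for each dimension, the row ids sorted by that dimension's value. */
  static List<int[]> buildOneDimIndexes(double[][] data) {
    int dim = data.length == 0 ? 0 : data[0].length;
    List<int[]> indexes = new ArrayList<>(dim);
    for (int d = 0; d < dim; d++) {
      final int col = d;
      Integer[] ids = new Integer[data.length];
      for (int i = 0; i < ids.length; i++) {
        ids[i] = i;
      }
      // Sort the ids (not the values) by the value stored in column `col`.
      Arrays.sort(ids, Comparator.comparingDouble((Integer id) -> data[id][col]));
      int[] sorted = new int[ids.length];
      for (int i = 0; i < ids.length; i++) {
        sorted[i] = ids[i];
      }
      indexes.add(sorted);
    }
    return indexes;
  }

  public static void main(String[] args) {
    double[][] data = { { 3.0, 1.0 }, { 1.0, 2.0 }, { 2.0, 0.0 } };
    for (int[] index : buildOneDimIndexes(data)) {
      System.out.println(Arrays.toString(index));
    }
  }
}
```

Sorting ids rather than values keeps the link back to the original objects, so a contiguous range of a sorted array selects the objects whose value in that dimension falls into one interval.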
      • calculateSubspaces

        private java.util.Set<HiCS.HiCSSubspace> calculateSubspaces​(elki.database.relation.Relation<? extends elki.data.NumberVector> relation,
                                                                    java.util.ArrayList<elki.database.ids.ArrayDBIDs> subspaceIndex,
                                                                    java.util.Random random)
        Identifies high contrast subspaces in a given full-dimensional database.
        Parameters:
        relation - the relation HiCS should be evaluated for
        subspaceIndex - Subspace indexes
        random - Random generator
        Returns:
        a set of high contrast subspaces
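
A rough sketch of how a bottom-up search with a candidate limit could look, assuming an Apriori-style scheme: score all two-dimensional subspaces, keep the best `cutoff` candidates, and merge pairs of survivors into candidates with one more dimension. The class SubspaceSearchSketch, the `contrast` callback, and the merge rule are assumptions made for illustration, and any pruning of redundant subspaces is left out; this is not the ELKI code.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.ToDoubleFunction;

// Hypothetical illustration only; not the ELKI implementation.
final class SubspaceSearchSketch {
  /** Maps each retained subspace (as a BitSet of dimensions) to its contrast. */
  static Map<BitSet, Double> search(int dim, int cutoff, ToDoubleFunction<BitSet> contrast) {
    Map<BitSet, Double> result = new HashMap<>();
    // Level 2: every pair of dimensions is a candidate.
    Set<BitSet> level = new HashSet<>();
    for (int i = 0; i < dim; i++) {
      for (int j = i + 1; j < dim; j++) {
        BitSet pair = new BitSet(dim);
        pair.set(i);
        pair.set(j);
        level.add(pair);
      }
    }
    for (int card = 2; !level.isEmpty(); card++) {
      Map<BitSet, Double> scored = new HashMap<>();
      for (BitSet candidate : level) {
        scored.put(candidate, contrast.applyAsDouble(candidate));
      }
      // Keep only the `cutoff` candidates with the highest contrast.
      List<BitSet> best = new ArrayList<>(level);
      best.sort(Comparator.comparingDouble((BitSet x) -> scored.get(x)).reversed());
      best = best.subList(0, Math.max(0, Math.min(cutoff, best.size())));
      // Merge surviving pairs whose union has exactly one more dimension.
      Set<BitSet> next = new HashSet<>();
      for (int a = 0; a < best.size(); a++) {
        result.put(best.get(a), scored.get(best.get(a)));
        for (int b = a + 1; b < best.size(); b++) {
          BitSet union = (BitSet) best.get(a).clone();
          union.or(best.get(b));
          if (union.cardinality() == card + 1) {
            next.add(union);
          }
        }
      }
      level = next;
    }
    return result;
  }

  public static void main(String[] args) {
    // Toy contrast function: larger subspaces score higher.
    Map<BitSet, Double> found = search(4, 3, s -> (double) s.cardinality());
    found.forEach((s, c) -> System.out.println(s + " -> " + c));
  }
}
```

The candidate limit bounds the work per level: at most `cutoff` survivors are merged, so the number of candidates on the next level is at most quadratic in `cutoff` rather than exponential in the dimensionality.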
      • calculateContrast

        private void calculateContrast​(elki.database.relation.Relation<? extends elki.data.NumberVector> relation,
                                       HiCS.HiCSSubspace subspace,
                                       java.util.ArrayList<elki.database.ids.ArrayDBIDs> subspaceIndex,
                                       java.util.Random random)
        Calculates the actual contrast of a given subspace.
        Parameters:
        relation - Relation to process
        subspace - Subspace
        subspaceIndex - Subspace indexes
        random - Random generator
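
A sketch of a Monte-Carlo contrast estimate in the spirit of the HiCS paper, under several assumptions: in each of `m` iterations one attribute of the subspace is picked for comparison, a random window of sorted ids is drawn on every other attribute, and the marginal values of the comparison attribute are compared with the values of the objects inside all windows. The window size n * alpha^(1/|S|), the use of a two-sample Kolmogorov-Smirnov statistic as the deviation, and all names in ContrastSketch are assumptions for illustration. The class declares a MAX_RETRIES constant, which suggests the real code retries failed draws; this sketch only skips empty samples.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

// Hypothetical illustration only; not the ELKI implementation.
final class ContrastSketch {
  /** Two-sample Kolmogorov-Smirnov statistic; both arrays must be sorted ascending. */
  static double ksStatistic(double[] a, double[] b) {
    int i = 0, j = 0;
    double max = 0.0;
    while (i < a.length && j < b.length) {
      double x = Math.min(a[i], b[j]);
      while (i < a.length && a[i] <= x) i++;
      while (j < b.length && b[j] <= x) j++;
      // Distance between the two empirical CDFs at x.
      max = Math.max(max, Math.abs((double) i / a.length - (double) j / b.length));
    }
    return max;
  }

  /** Row ids sorted by the value of attribute `attr`. */
  static int[] sortedIds(double[][] data, int attr) {
    Integer[] ids = new Integer[data.length];
    for (int i = 0; i < ids.length; i++) ids[i] = i;
    Arrays.sort(ids, Comparator.comparingDouble((Integer id) -> data[id][attr]));
    int[] out = new int[ids.length];
    for (int i = 0; i < out.length; i++) out[i] = ids[i];
    return out;
  }

  /** Average deviation over m random slices; assumes data is non-empty and 0 < alpha <= 1. */
  static double contrast(double[][] data, int[] subspace, int m, double alpha, Random rnd) {
    int n = data.length, card = subspace.length;
    int window = Math.min(n, Math.max(1, (int) (n * Math.pow(alpha, 1.0 / card))));
    int[][] order = new int[card][];
    for (int k = 0; k < card; k++) order[k] = sortedIds(data, subspace[k]);
    double sum = 0.0;
    int valid = 0;
    for (int it = 0; it < m; it++) {
      int cmp = rnd.nextInt(card); // position of the comparison attribute
      int[] hits = new int[n];     // number of windows each object falls into
      for (int k = 0; k < card; k++) {
        if (k == cmp) continue;
        int start = rnd.nextInt(n - window + 1);
        for (int p = start; p < start + window; p++) hits[order[k][p]]++;
      }
      double[] marginal = new double[n];
      double[] conditional = new double[n];
      int c = 0;
      for (int i = 0; i < n; i++) {
        marginal[i] = data[i][subspace[cmp]];
        if (hits[i] == card - 1) conditional[c++] = marginal[i];
      }
      if (c == 0) continue; // empty slice: skipped in this sketch
      conditional = Arrays.copyOf(conditional, c);
      Arrays.sort(marginal);
      Arrays.sort(conditional);
      sum += ksStatistic(marginal, conditional);
      valid++;
    }
    return valid > 0 ? sum / valid : 0.0;
  }

  public static void main(String[] args) {
    double[][] data = new double[100][2];
    for (int i = 0; i < data.length; i++) {
      data[i][0] = i;
      data[i][1] = i; // perfectly correlated attributes
    }
    System.out.println(contrast(data, new int[] { 0, 1 }, 50, 0.1, new Random(42)));
  }
}
```

On strongly dependent attributes the conditional sample is concentrated in a narrow value range, so its distribution differs clearly from the marginal one and the averaged deviation is high; on independent attributes the two distributions should look alike and the estimate stays low.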