Class AbstractAggarwalYuOutlier

  • All Implemented Interfaces:
    elki.Algorithm, OutlierAlgorithm
    Direct Known Subclasses:
    AggarwalYuEvolutionary, AggarwalYuNaive

    @Reference(authors="C. C. Aggarwal, P. S. Yu",
               title="Outlier detection for high dimensional data",
               booktitle="Proc. ACM SIGMOD Int. Conf. on Management of Data (SIGMOD 2001)",
               url="https://doi.org/10.1145/375663.375668",
               bibkey="DBLP:conf/sigmod/AggarwalY01")
    public abstract class AbstractAggarwalYuOutlier
    extends java.lang.Object
    implements OutlierAlgorithm
    Abstract base class for the sparse-grid-cell based outlier detection of Aggarwal and Yu.

    Reference:

    Outlier detection for high dimensional data
    C. C. Aggarwal, P. S. Yu
    Proc. 2001 ACM SIGMOD international conference on Management of data

    Since:
    0.4.0
    Author:
    Ahmed Hettab, Erich Schubert
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  AbstractAggarwalYuOutlier.Par
      Parameterization class.
      • Nested classes/interfaces inherited from interface elki.Algorithm

        elki.Algorithm.Utils
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static short DONT_CARE
      Symbolic value for subspaces not in use.
      static short GENE_OFFSET
      The first bucket.
      protected int k
      The target dimensionality.
      protected int phi
      The number of partitions for each dimension.
    • Method Summary

      Modifier and Type Method Description
      protected java.util.ArrayList<java.util.ArrayList<elki.database.ids.DBIDs>> buildRanges​(elki.database.relation.Relation<? extends elki.data.NumberVector> relation)
      Grid discretization of the data:
      Each attribute of the data is divided into phi equi-depth ranges.
      Each range contains a fraction f=1/phi of the records.
      protected elki.database.ids.DBIDs computeSubspace​(int[] subspace, java.util.ArrayList<java.util.ArrayList<elki.database.ids.DBIDs>> ranges)
      Method to get the ids in the given subspace.
      protected elki.database.ids.DBIDs computeSubspaceForGene​(short[] gene, java.util.ArrayList<java.util.ArrayList<elki.database.ids.DBIDs>> ranges)
      Get the DBIDs in the current subspace.
      elki.data.type.TypeInformation[] getInputTypeRestriction()  
      protected static double sparsity​(int setsize, int dbsize, int k, double phi)
      Calculates the sparsity coefficient of a subset of the database.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • DONT_CARE

        public static final short DONT_CARE
        Symbolic value for subspaces not in use.
        See Also:
        Constant Field Values
      • phi

        protected int phi
        The number of partitions for each dimension.
      • k

        protected int k
        The target dimensionality.
    • Constructor Detail

      • AbstractAggarwalYuOutlier

        public AbstractAggarwalYuOutlier​(int k,
                                         int phi)
        Constructor.
        Parameters:
        k - The target dimensionality
        phi - The number of partitions for each dimension
    • Method Detail

      • buildRanges

        protected java.util.ArrayList<java.util.ArrayList<elki.database.ids.DBIDs>> buildRanges​(elki.database.relation.Relation<? extends elki.data.NumberVector> relation)
        Grid discretization of the data:
        Each attribute of the data is divided into phi equi-depth ranges.
        Each range contains a fraction f=1/phi of the records.
        Parameters:
        relation - Relation to process
        Returns:
        range map
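
The equi-depth discretization described for buildRanges can be illustrated with a minimal stand-alone sketch. It is not the ELKI implementation: the class name EquiDepthSketch and the method equiDepthRanges are hypothetical, a plain double[] stands in for one attribute of the relation, and integer record indices stand in for DBIDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-alone sketch; not the ELKI implementation. */
final class EquiDepthSketch {
  /**
   * Splits the record indices of one attribute into phi equi-depth ranges,
   * so that each range holds roughly a fraction f = 1/phi of the records.
   * Plain int indices stand in for ELKI DBIDs (an assumption of this sketch).
   */
  static List<List<Integer>> equiDepthRanges(double[] values, int phi) {
    // Sort record indices by attribute value.
    Integer[] order = new Integer[values.length];
    for (int i = 0; i < order.length; i++) {
      order[i] = i;
    }
    Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));

    // Cut the sorted order into phi consecutive slices of (nearly) equal size.
    List<List<Integer>> ranges = new ArrayList<>(phi);
    for (int r = 0; r < phi; r++) {
      int start = (int) ((long) r * values.length / phi);
      int end = (int) ((long) (r + 1) * values.length / phi);
      ranges.add(new ArrayList<>(Arrays.asList(order).subList(start, end)));
    }
    return ranges;
  }

  public static void main(String[] args) {
    double[] attribute = { 5.0, 1.0, 9.0, 3.0, 7.0, 2.0 };
    // prints [[1, 5], [3, 0], [4, 2]]
    System.out.println(equiDepthRanges(attribute, 3));
  }
}
```

Repeating this for every attribute would yield one list of ranges per dimension, which matches the nested list shape of the documented return type.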
      • sparsity

        protected static double sparsity​(int setsize,
                                         int dbsize,
                                         int k,
                                         double phi)
        Calculates the sparsity coefficient of a subset of the database.
        Parameters:
        setsize - Size of subset
        dbsize - Size of database
        k - Dimensionality
        phi - Number of partitions for each dimension
        Returns:
        sparsity coefficient
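
As a hedged illustration, the sparsity coefficient defined in the referenced paper compares the observed number of records in a k-dimensional grid cell with the number expected if each of the phi ranges per dimension held a fraction f = 1/phi of the records independently, normalized by the standard deviation. The sketch below follows that definition; the class name SparsitySketch is hypothetical, and it is an assumption that the ELKI method computes the same quantity.

```java
/** Hypothetical stand-alone sketch; not the ELKI implementation. */
final class SparsitySketch {
  /**
   * Sparsity coefficient following Aggarwal and Yu (2001):
   * (setsize - dbsize * f^k) / sqrt(dbsize * f^k * (1 - f^k)), with f = 1/phi.
   * Strongly negative values indicate cells that hold far fewer records
   * than expected.
   */
  static double sparsity(int setsize, int dbsize, int k, double phi) {
    double fK = Math.pow(1.0 / phi, k); // expected fraction of records per cell
    return (setsize - dbsize * fK) / Math.sqrt(dbsize * fK * (1.0 - fK));
  }

  public static void main(String[] args) {
    // 100 records, phi = 2, k = 2: 25 records are expected per cell.
    System.out.println(sparsity(25, 100, 2, 2)); // prints 0.0
    System.out.println(sparsity(5, 100, 2, 2));  // negative: sparser than expected
  }
}
```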
      • computeSubspace

        protected elki.database.ids.DBIDs computeSubspace​(int[] subspace,
                                                          java.util.ArrayList<java.util.ArrayList<elki.database.ids.DBIDs>> ranges)
        Method to get the ids in the given subspace.
        Parameters:
        subspace - Subspace to process
        ranges - List of DBID ranges
        Returns:
        ids
      • computeSubspaceForGene

        protected elki.database.ids.DBIDs computeSubspaceForGene​(short[] gene,
                                                                 java.util.ArrayList<java.util.ArrayList<elki.database.ids.DBIDs>> ranges)
        Get the DBIDs in the current subspace.
        Parameters:
        gene - gene data
        ranges - Database ranges
        Returns:
        resulting DBIDs
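
One plausible reading of computeSubspaceForGene is sketched below: every gene position corresponds to one dimension and either holds DONT_CARE or selects one range of that dimension, and the result is the intersection of the selected ranges. The constant values (0 and 1), the class name GeneSubspaceSketch, the helper names, and the use of Set<Integer> in place of DBIDs are all assumptions made for illustration; the actual values are listed under Constant Field Values.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-alone sketch; not the ELKI implementation. */
final class GeneSubspaceSketch {
  static final short DONT_CARE = 0;   // assumed value for this sketch
  static final short GENE_OFFSET = 1; // assumed value for this sketch

  /** Intersects the ranges selected by the gene; skips DONT_CARE positions. */
  static Set<Integer> idsForGene(short[] gene, List<List<Set<Integer>>> ranges) {
    Set<Integer> ids = null;
    for (int dim = 0; dim < gene.length; dim++) {
      if (gene[dim] == DONT_CARE) {
        continue; // this dimension is not part of the subspace
      }
      Set<Integer> bucket = ranges.get(dim).get(gene[dim] - GENE_OFFSET);
      if (ids == null) {
        ids = new HashSet<>(bucket); // copy, so the input ranges stay unchanged
      } else {
        ids.retainAll(bucket);
      }
      if (ids.isEmpty()) {
        break; // the intersection cannot grow again
      }
    }
    // Sketch-only choice: a gene without any selected dimension yields an empty set.
    return ids == null ? new HashSet<>() : ids;
  }

  /** Small helper to build a set of record indices. */
  static Set<Integer> setOf(Integer... ids) {
    return new HashSet<>(Arrays.asList(ids));
  }

  public static void main(String[] args) {
    // Two dimensions, two ranges each, six records with indices 0..5.
    List<Set<Integer>> dim0 = Arrays.asList(setOf(0, 1, 2), setOf(3, 4, 5));
    List<Set<Integer>> dim1 = Arrays.asList(setOf(0, 3, 4), setOf(1, 2, 5));
    List<List<Set<Integer>>> ranges = Arrays.asList(dim0, dim1);
    // First range of dimension 0 and second range of dimension 1: the set {1, 2}.
    System.out.println(idsForGene(new short[] { 1, 2 }, ranges));
  }
}
```

The size of the returned set, together with the database size, the number of selected dimensions, and phi, is the kind of input the sparsity method takes.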
      • getInputTypeRestriction

        public elki.data.type.TypeInformation[] getInputTypeRestriction()
        Specified by:
        getInputTypeRestriction in interface elki.Algorithm