Class GeneratorXMLDatabaseConnection

  • All Implemented Interfaces:
    elki.datasource.DatabaseConnection

    public class GeneratorXMLDatabaseConnection
    extends elki.datasource.AbstractDatabaseConnection
    Data source from an XML specification.

    This data source generates pseudo-random data sets that satisfy a given specification file. Fixed seeds are supported, so a data set can be reproduced.
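    The effect of a fixed seed can be illustrated with a minimal sketch. It does not use this class or any ELKI API: the class SeededGenerationSketch and its method sampleNormal are hypothetical names, and java.util.Random stands in for whatever random source the generator actually uses. The only point shown is that the same seed yields the same data.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch, not part of ELKI: reproducible sampling with a fixed seed. */
class SeededGenerationSketch {
  /** Draw n normally distributed values from a generator initialized with the given seed. */
  static double[] sampleNormal(long seed, int n, double mean, double stddev) {
    Random rnd = new Random(seed); // assumption: java.util.Random as the random source
    double[] out = new double[n];
    for (int i = 0; i < n; i++) {
      out[i] = mean + stddev * rnd.nextGaussian();
    }
    return out;
  }

  public static void main(String[] args) {
    double[] a = sampleNormal(42L, 5, 0.0, 1.0);
    double[] b = sampleNormal(42L, 5, 0.0, 1.0);
    // Both runs use the same seed, so the generated values are identical.
    System.out.println(Arrays.equals(a, b)); // prints "true"
  }
}
```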

    Since:
    0.2
    Author:
    Erich Schubert
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static java.lang.String ATTR_ANGLE
      Rotation angle
      static java.lang.String ATTR_AXIS1
      First axis for rotation plane
      static java.lang.String ATTR_AXIS2
      Second axis for rotation plane
      static java.lang.String ATTR_DENSITY
      Density correction factor
      static java.lang.String ATTR_K
      Gamma k
      static java.lang.String ATTR_MAX
      Maximum value
      static java.lang.String ATTR_MEAN
      Mean
      static java.lang.String ATTR_MIN
      Minimum value
      static java.lang.String ATTR_NAME
      Cluster name
      static java.lang.String ATTR_SEED
      Random seed
      static java.lang.String ATTR_SIZE
      Cluster size
      static java.lang.String ATTR_STDDEV
      Standard deviation
      static java.lang.String ATTR_TEST
      Attribute to control model testing
      static java.lang.String ATTR_THETA
      Gamma theta
      static java.lang.String ATTR_VECTOR
      Vector
      private elki.utilities.random.RandomFactory clusterRandom
      Random generator used for initializing cluster generators.
      static java.lang.String GENERATOR_SCHEMA_FILE
      File name of the generators XML Schema file.
      private static elki.logging.Logging LOG
      Logger
      (package private) java.util.regex.Pattern reassign
      Pattern for clusters to reassign.
      private boolean reassignByDistance
      Reassign objects by distance instead of density
      (package private) double sizescale
      Parameter for scaling the cluster sizes.
      (package private) java.net.URI specfile
      The configuration file.
      static java.lang.String TAG_CLIP
      Clipping
      static java.lang.String TAG_CLUSTER
      Cluster tag
      static java.lang.String TAG_DATASET
      Dataset tag
      static java.lang.String TAG_GAMMA
      Gamma distribution
      static java.lang.String TAG_HALTON
      Halton pseudo uniform distribution.
      static java.lang.String TAG_NORMAL
      Normal distribution
      static java.lang.String TAG_POINT
      Point in static cluster
      static java.lang.String TAG_ROTATE
      Rotation
      static java.lang.String TAG_STATIC
      Static cluster
      static java.lang.String TAG_TRANSLATE
      Translation
      static java.lang.String TAG_UNIFORM
      Uniform distribution
      private java.lang.Boolean testAgainstModel
      Set testAgainstModel flag
      static java.util.regex.Pattern WHITESPACE_PATTERN
      A pattern defining whitespace.
      • Fields inherited from class elki.datasource.AbstractDatabaseConnection

        filters, LABEL_CONCATENATION
    • Constructor Summary

      Constructors 
      Constructor Description
      GeneratorXMLDatabaseConnection​(java.util.List<? extends elki.datasource.filter.ObjectFilter> filters, java.net.URI specfile, double sizescale, java.util.regex.Pattern reassign, boolean reassignByDistance, elki.utilities.random.RandomFactory clusterRandom)
      Constructor.
    • Field Detail

      • TAG_DATASET

        public static final java.lang.String TAG_DATASET
        Dataset tag
        See Also:
        Constant Field Values
      • TAG_CLUSTER

        public static final java.lang.String TAG_CLUSTER
        Cluster tag
        See Also:
        Constant Field Values
      • TAG_UNIFORM

        public static final java.lang.String TAG_UNIFORM
        Uniform distribution
        See Also:
        Constant Field Values
      • TAG_NORMAL

        public static final java.lang.String TAG_NORMAL
        Normal distribution
        See Also:
        Constant Field Values
      • TAG_GAMMA

        public static final java.lang.String TAG_GAMMA
        Gamma distribution
        See Also:
        Constant Field Values
      • TAG_HALTON

        public static final java.lang.String TAG_HALTON
        Halton pseudo uniform distribution.
        See Also:
        Constant Field Values
      • TAG_TRANSLATE

        public static final java.lang.String TAG_TRANSLATE
        Translation
        See Also:
        Constant Field Values
      • TAG_STATIC

        public static final java.lang.String TAG_STATIC
        Static cluster
        See Also:
        Constant Field Values
      • TAG_POINT

        public static final java.lang.String TAG_POINT
        Point in static cluster
        See Also:
        Constant Field Values
      • ATTR_TEST

        public static final java.lang.String ATTR_TEST
        Attribute to control model testing
        See Also:
        Constant Field Values
      • ATTR_DENSITY

        public static final java.lang.String ATTR_DENSITY
        Density correction factor
        See Also:
        Constant Field Values
      • ATTR_STDDEV

        public static final java.lang.String ATTR_STDDEV
        Standard deviation
        See Also:
        Constant Field Values
      • ATTR_THETA

        public static final java.lang.String ATTR_THETA
        Gamma theta
        See Also:
        Constant Field Values
      • ATTR_AXIS1

        public static final java.lang.String ATTR_AXIS1
        First axis for rotation plane
        See Also:
        Constant Field Values
      • ATTR_AXIS2

        public static final java.lang.String ATTR_AXIS2
        Second axis for rotation plane
        See Also:
        Constant Field Values
      • ATTR_ANGLE

        public static final java.lang.String ATTR_ANGLE
        Rotation angle
        See Also:
        Constant Field Values
      • LOG

        private static final elki.logging.Logging LOG
        Logger
      • WHITESPACE_PATTERN

        public static final java.util.regex.Pattern WHITESPACE_PATTERN
        A pattern defining whitespace.
      • GENERATOR_SCHEMA_FILE

        public static final java.lang.String GENERATOR_SCHEMA_FILE
        File name of the generators XML Schema file.
      • specfile

        java.net.URI specfile
        The configuration file.
      • sizescale

        double sizescale
        Parameter for scaling the cluster sizes.
      • reassign

        java.util.regex.Pattern reassign
        Pattern for clusters to reassign.
      • clusterRandom

        private elki.utilities.random.RandomFactory clusterRandom
        Random generator used for initializing cluster generators.
      • testAgainstModel

        private java.lang.Boolean testAgainstModel
        Set testAgainstModel flag
      • reassignByDistance

        private boolean reassignByDistance
        Reassign objects by distance instead of density
    • Constructor Detail

      • GeneratorXMLDatabaseConnection

        public GeneratorXMLDatabaseConnection​(java.util.List<? extends elki.datasource.filter.ObjectFilter> filters,
                                              java.net.URI specfile,
                                              double sizescale,
                                              java.util.regex.Pattern reassign,
                                              boolean reassignByDistance,
                                              elki.utilities.random.RandomFactory clusterRandom)
        Constructor.
        Parameters:
        filters - Filters.
        specfile - Specification file
        sizescale - Size scaling
        reassign - Reassignment pattern
        reassignByDistance - Reassign objects by distance instead of density
        clusterRandom - Random number generator
    • Method Detail

      • loadData

        public elki.datasource.bundle.MultipleObjectsBundle loadData()
      • loadXMLSpecification

        private GeneratorMain loadXMLSpecification()
        Load the XML configuration file.
        Returns:
        Generator
      • processElementDataset

        private void processElementDataset​(GeneratorMain gen,
                                           org.w3c.dom.Node cur)
        Process a 'dataset' Element in the XML stream.
        Parameters:
        gen - Generator
        cur - Current document node
      • processElementCluster

        private void processElementCluster​(GeneratorMain gen,
                                           org.w3c.dom.Node cur)
        Process a 'cluster' Element in the XML stream.
        Parameters:
        gen - Generator
        cur - Current document node
      • processElementUniform

        private void processElementUniform​(GeneratorSingleCluster cluster,
                                           org.w3c.dom.Node cur)
        Process a 'uniform' Element in the XML stream.
        Parameters:
        cluster - Cluster generator
        cur - Current document node
      • processElementNormal

        private void processElementNormal​(GeneratorSingleCluster cluster,
                                          org.w3c.dom.Node cur)
        Process a 'normal' Element in the XML stream.
        Parameters:
        cluster - Cluster generator
        cur - Current document node
      • processElementGamma

        private void processElementGamma​(GeneratorSingleCluster cluster,
                                         org.w3c.dom.Node cur)
        Process a 'gamma' Element in the XML stream.
        Parameters:
        cluster - Cluster generator
        cur - Current document node
      • processElementHalton

        private void processElementHalton​(GeneratorSingleCluster cluster,
                                          org.w3c.dom.Node cur)
        Process a 'halton' Element in the XML stream.
        Parameters:
        cluster - Cluster generator
        cur - Current document node
      • processElementRotate

        private void processElementRotate​(GeneratorSingleCluster cluster,
                                          org.w3c.dom.Node cur)
        Process a 'rotate' Element in the XML stream.
        Parameters:
        cluster - Cluster generator
        cur - Current document node
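        A rotation described by two axes and an angle (compare ATTR_AXIS1, ATTR_AXIS2 and ATTR_ANGLE) can be sketched as a plane rotation that changes only the two selected coordinates. This is an illustration, not the ELKI implementation: the class PlaneRotationSketch is hypothetical, and the zero-based axis indices and the angle in radians are assumptions, since this page does not state the conventions used in the specification file.

```java
/** Hypothetical sketch, not part of ELKI: rotate a vector in the plane of two axes. */
class PlaneRotationSketch {
  /**
   * Rotate v in the plane spanned by axis1 and axis2.
   * Assumptions: axes are zero-based indices, the angle is given in radians.
   */
  static double[] rotate(double[] v, int axis1, int axis2, double angleRad) {
    double c = Math.cos(angleRad), s = Math.sin(angleRad);
    double[] out = v.clone(); // all other coordinates stay unchanged
    out[axis1] = c * v[axis1] - s * v[axis2];
    out[axis2] = s * v[axis1] + c * v[axis2];
    return out;
  }

  public static void main(String[] args) {
    // A quarter turn in the plane of axes 0 and 1 moves (1, 0, 5) to roughly (0, 1, 5).
    double[] r = rotate(new double[] { 1.0, 0.0, 5.0 }, 0, 1, Math.PI / 2);
    System.out.println(java.util.Arrays.toString(r));
  }
}
```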
      • processElementTranslate

        private void processElementTranslate​(GeneratorSingleCluster cluster,
                                             org.w3c.dom.Node cur)
        Process a 'translate' Element in the XML stream.
        Parameters:
        cluster - Cluster generator
        cur - Current document node
      • processElementClipping

        private void processElementClipping​(GeneratorSingleCluster cluster,
                                            org.w3c.dom.Node cur)
        Process a 'clipping' Element in the XML stream.
        Parameters:
        cluster - Cluster generator
        cur - Current document node
      • processElementStatic

        private void processElementStatic​(GeneratorMain gen,
                                          org.w3c.dom.Node cur)
        Process a 'static' cluster Element in the XML stream.
        Parameters:
        gen - Generator
        cur - Current document node
      • processElementPoint

        private void processElementPoint​(java.util.List<double[]> points,
                                         org.w3c.dom.Node cur)
        Parse a 'point' element (point vector for a static cluster)
        Parameters:
        points - current list of points (to append to)
        cur - Current document node
      • parseVector

        private double[] parseVector​(java.lang.String s)
        Parse a string into a vector.

        TODO: Rewrite this using the new Tokenizer

        Parameters:
        s - String to parse
        Returns:
        Vector
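        Since parseVector is private and its body is not shown here, the following is only a sketch of how a whitespace-separated string could be parsed into a double[]. The class VectorParseSketch and its WHITESPACE pattern are hypothetical; the pattern "\\s+" is an assumption and may differ from the actual WHITESPACE_PATTERN.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

/** Hypothetical sketch, not part of ELKI: parse a whitespace-separated vector. */
class VectorParseSketch {
  /** Assumed whitespace pattern; the real WHITESPACE_PATTERN may be defined differently. */
  static final Pattern WHITESPACE = Pattern.compile("\\s+");

  /** Split on whitespace and parse each token as a double. */
  static double[] parseVector(String s) {
    String[] parts = WHITESPACE.split(s.trim());
    double[] d = new double[parts.length];
    for (int i = 0; i < parts.length; i++) {
      // Throws NumberFormatException on malformed tokens, including an empty input string.
      d[i] = Double.parseDouble(parts[i]);
    }
    return d;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(parseVector(" 1.5  -2 3e2 "))); // prints "[1.5, -2.0, 300.0]"
  }
}
```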
      • getLogger

        protected elki.logging.Logging getLogger()
        Specified by:
        getLogger in class elki.datasource.AbstractDatabaseConnection