Class DiscretizationHelper

  • Type Parameters:
    I - The instance type
    All Implemented Interfaces:
    org.api4.java.common.control.ILoggingCustomizable

    public class DiscretizationHelper
    extends java.lang.Object
    implements org.api4.java.common.control.ILoggingCustomizable
    This helper class provides the methods needed to discretize numeric attributes.
    • Constructor Detail

      • DiscretizationHelper

        public DiscretizationHelper()
    • Method Detail

      • createDefaultDiscretizationPolicies

        public java.util.Map<java.lang.Integer,AttributeDiscretizationPolicy> createDefaultDiscretizationPolicies(org.api4.java.ai.ml.core.dataset.IDataset<?> dataset,
                                                                                                                  java.util.List<java.lang.Integer> indices,
                                                                                                                  java.util.Map<java.lang.Integer,java.util.Set<java.lang.Object>> attributeValues,
                                                                                                                  DiscretizationHelper.DiscretizationStrategy discretizationStrategy,
                                                                                                                  int numberOfCategories)
        Creates a default discretization policy for each numeric attribute among the attributes that have to be considered for stratum assignment.
        Parameters:
        dataset - The data set that has to be sampled
        indices - Indices of the attributes that have to be considered for stratum assignment
        attributeValues - Values of the relevant attributes
        discretizationStrategy - The discretization strategy that has to be used
        numberOfCategories - The number of categories to which the numeric values have to be assigned
        Returns:
        A map from attribute index to the discretization policy created for that attribute
      • equalSizePolicy

        public AttributeDiscretizationPolicy equalSizePolicy(java.util.List<java.lang.Double> numericValues,
                                                             int numberOfCategories)
        Creates an equal size policy for the given values with respect to the given number of categories. In an equal size policy, the interval boundaries are chosen such that every interval contains equally many values.
        Parameters:
        numericValues - Distinct attribute values in ascending order
        numberOfCategories - Number of categories
        Returns:
        The created discretization policy consisting of one interval per category
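        The equal size idea described above can be illustrated with a minimal, standalone sketch. It is not the library's implementation: the class EqualSizeSketch, the method equalSizeIntervals, and the representation of an interval as a {lower, upper} array are all hypothetical names chosen for illustration, and capping the number of intervals at the number of values is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of equal size (equal frequency) binning; not part of the documented API.
final class EqualSizeSketch {

    /**
     * Splits distinct, ascending values into chunks of (almost) equal size and
     * returns one {lower, upper} pair per chunk.
     */
    static List<double[]> equalSizeIntervals(List<Double> sortedDistinctValues, int numberOfCategories) {
        if (sortedDistinctValues.isEmpty() || numberOfCategories < 1) {
            throw new IllegalArgumentException("Need at least one value and one category");
        }
        int n = sortedDistinctValues.size();
        // Assumption: never create more intervals than there are values.
        int k = Math.min(numberOfCategories, n);
        List<double[]> intervals = new ArrayList<>();
        for (int i = 0; i < k; i++) {
            int from = (int) ((long) i * n / k);          // index of the first value in chunk i
            int to = (int) ((long) (i + 1) * n / k) - 1;  // index of the last value in chunk i
            intervals.add(new double[] {sortedDistinctValues.get(from), sortedDistinctValues.get(to)});
        }
        return intervals;
    }
}
```

        For the values 1.0 through 6.0 and three categories, this sketch yields the intervals [1.0, 2.0], [3.0, 4.0] and [5.0, 6.0], each covering two values. When the number of values is not divisible by the number of categories, the chunk sizes differ by at most one.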
      • equalLengthPolicy

        public AttributeDiscretizationPolicy equalLengthPolicy(java.util.List<java.lang.Double> numericValues,
                                                               int numberOfCategories)
        Creates an equal length policy for the given values with respect to the given number of categories. In an equal length policy, all intervals have the same length.
        Parameters:
        numericValues - Distinct attribute values in ascending order
        numberOfCategories - Number of categories
        Returns:
        The created discretization policy consisting of one interval per category
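        As a rough illustration of the equal length idea, the following standalone sketch divides the range between the smallest and largest value into intervals of identical width. It is not the library's implementation: EqualLengthSketch, equalLengthIntervals, and the {lower, upper} array representation are hypothetical, and pinning the last upper bound to the maximum is an assumption made to avoid rounding drift.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of equal length (equal width) binning; not part of the documented API.
final class EqualLengthSketch {

    /** Returns numberOfCategories intervals of identical width that together cover [min, max]. */
    static List<double[]> equalLengthIntervals(List<Double> sortedDistinctValues, int numberOfCategories) {
        if (sortedDistinctValues.isEmpty() || numberOfCategories < 1) {
            throw new IllegalArgumentException("Need at least one value and one category");
        }
        double min = sortedDistinctValues.get(0);
        double max = sortedDistinctValues.get(sortedDistinctValues.size() - 1);
        double width = (max - min) / numberOfCategories;
        List<double[]> intervals = new ArrayList<>();
        for (int i = 0; i < numberOfCategories; i++) {
            double lower = min + i * width;
            // Assumption: use max for the last upper bound so rounding never excludes the largest value.
            double upper = (i == numberOfCategories - 1) ? max : min + (i + 1) * width;
            intervals.add(new double[] {lower, upper});
        }
        return intervals;
    }
}
```

        For the values 0.0, 1.0 and 9.0 with three categories, this sketch yields [0.0, 3.0], [3.0, 6.0] and [6.0, 9.0]. The middle interval contains none of the values, which is the typical trade-off compared with an equal size policy.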
      • discretizeAttributeValues

        protected void discretizeAttributeValues(java.util.Map<java.lang.Integer,AttributeDiscretizationPolicy> discretizationPolicies,
                                                 java.util.Map<java.lang.Integer,java.util.Set<java.lang.Object>> attributeValues)
        Discretizes the given attribute values with respect to the provided policies.
        Parameters:
        discretizationPolicies - The discretization policies to apply
        attributeValues - The attribute values to be discretized
      • discretize

        protected int discretize(double value,
                                 AttributeDiscretizationPolicy policy)
        Discretizes the given value. Here, discretization means replacing the original value with a categorical value, which is simply the index of the interval the value was assigned to.
        Parameters:
        value - The (numeric) value to be discretized
        policy - The policy that has to be used for discretization
        Returns:
        The index of the interval to which the value was assigned
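        The mapping from a numeric value to an interval index can be sketched as follows. This is a standalone illustration, not the library's implementation: DiscretizeSketch, the {lower, upper} array representation of an interval, the closed interval bounds, and the exception for values outside every interval are all assumptions.

```java
import java.util.List;

// Hypothetical illustration of value-to-category mapping; not part of the documented API.
final class DiscretizeSketch {

    /**
     * Returns the index of the first interval that contains the value.
     * Intervals are treated as closed, so a shared boundary goes to the lower interval.
     */
    static int discretize(double value, List<double[]> intervals) {
        for (int i = 0; i < intervals.size(); i++) {
            double[] interval = intervals.get(i);
            if (value >= interval[0] && value <= interval[1]) {
                return i;
            }
        }
        // Assumption: a value outside all intervals is treated as an error.
        throw new IllegalArgumentException("Value " + value + " is not covered by any interval");
    }
}
```

        With the intervals [0.0, 3.0], [3.0, 6.0] and [6.0, 9.0], the value 4.5 maps to category 1, and the boundary value 3.0 maps to category 0 because the first matching interval wins.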
      • getLoggerName

        public java.lang.String getLoggerName()
        Specified by:
        getLoggerName in interface org.api4.java.common.control.ILoggingCustomizable
      • setLoggerName

        public void setLoggerName(java.lang.String name)
        Specified by:
        setLoggerName in interface org.api4.java.common.control.ILoggingCustomizable