Class Binarizer


  • public class Binarizer
    extends TransformerMixin<NumpyArray<Double>,NumpyArray<Double>>
    Binarizes data (sets feature values to 0 or 1) according to a threshold. Values greater than the threshold map to 1, while values less than or equal to the threshold map to 0. With the default threshold of 0, only positive values map to 1. Binarization is a common operation on text count data, where the analyst may choose to consider only the presence or absence of a feature rather than its number of occurrences. It can also be used as a pre-processing step for estimators that consider boolean random variables (e.g. modelled using the Bernoulli distribution in a Bayesian setting).
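The thresholding rule described above can be illustrated with a minimal plain-Java sketch. It does not use this library: `BinarizeSketch` and its `binarize` helper are hypothetical names introduced only for illustration, and plain `double[][]` arrays stand in for `NumpyArray<Double>`. The sample word counts are made up.

```java
// Illustrative stand-in only; this is NOT the library's Binarizer class.
final class BinarizeSketch {

    // Returns a new matrix in which each entry is 1.0 if the input value is
    // strictly greater than the threshold, and 0.0 otherwise. The input is
    // left unmodified.
    static double[][] binarize(double[][] x, double threshold) {
        double[][] out = new double[x.length][];
        for (int i = 0; i < x.length; i++) {
            out[i] = new double[x[i].length];
            for (int j = 0; j < x[i].length; j++) {
                out[i][j] = x[i][j] > threshold ? 1.0 : 0.0;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical word counts: rows are documents, columns are terms.
        double[][] counts = {
            {0.0, 3.0, 1.0},
            {2.0, 0.0, 0.0}
        };
        // With a threshold of 0, any positive count becomes 1 (term present).
        double[][] presence = binarize(counts, 0.0);
        for (double[] row : presence) {
            StringBuilder sb = new StringBuilder();
            for (double v : row) {
                sb.append(v).append(' ');
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

Because the comparison is strict, a value exactly equal to the threshold maps to 0, which matches the rule stated in the class description.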
    • Constructor Detail

      • Binarizer

        public Binarizer()
        Instantiates a new Binarizer object.
    • Method Detail

      • setNFeaturesIn

        public void setNFeaturesIn(long value)
        Sets the number of features seen during `fit`.
        Parameters:
        value - The new value for nFeaturesIn.
      • getNFeaturesIn

        public long getNFeaturesIn()
        Gets the number of features seen during `fit`.
      • setFeatureNamesIn

        public void setFeatureNamesIn(String[] value)
        Sets the names of features seen during `fit`. Defined only when `X` has feature names that are all strings.
        Parameters:
        value - The new value for featureNamesIn.
      • getFeatureNamesIn

        public String[] getFeatureNamesIn()
        Gets the names of features seen during `fit`. Defined only when `X` has feature names that are all strings.
      • getThreshold

        public double getThreshold()
        Gets the threshold for binarization.
        Returns:
        The threshold for binarization.
      • setThreshold

        public void setThreshold(double value)
        Sets the threshold for binarization.
        Parameters:
        value - The threshold for binarization.