object Metrics extends StrictLogging
Type Members
-
case class
ContinuousMetric(name: String, function: (Column) ⇒ Column) extends Product with Serializable
Case class for a continuous metric; the continuous metric objects (e.g. Mean, Median, Variance) extend it
- name
: the name of the metric
- function
: the metric function
- case class DiscreteMetric(name: String, function: ((Column, DataFrame)) ⇒ Column) extends Product with Serializable
- case class MetricsDatasets(continuousDF: Option[DataFrame], discreteDF: Option[DataFrame], frequenciesDF: Option[DataFrame]) extends Product with Serializable
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
categoryCountFreqDataframe(e: Column, dataInit: DataFrame): (Column, DataFrame)
Function to compute the DataFrame of Category, Count and Frequencies obtained from the initial DataFrame
- e
: column of the variable.
- dataInit
: initial DataFrame.
- returns
(Column, DataFrame) : tuple2 of the column of the variable and the initial Dataframe
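As a rough illustration of the Category, Count and Frequencies table described above, the sketch below derives the same three values from a plain Scala collection rather than a Spark DataFrame. `CategoryRow` and `categoryCountFreq` are hypothetical names, and frequency is assumed to mean the count divided by the total number of values; the actual implementation may differ.

```scala
// Plain-Scala model of a category / count / frequency table.
// CategoryRow and categoryCountFreq are hypothetical, not part of Metrics.
final case class CategoryRow(category: String, count: Long, frequency: Double)

def categoryCountFreq(values: Seq[String]): Seq[CategoryRow] = {
  val total = values.size.toDouble
  values
    .groupBy(identity) // category -> all of its occurrences
    .toSeq
    .map { case (category, occurrences) =>
      // Assumption: frequency = count / total number of values
      CategoryRow(category, occurrences.size.toLong, occurrences.size / total)
    }
    .sortBy(_.category) // deterministic order for display
}

val rows = categoryCountFreq(Seq("red", "blue", "red", "green", "red"))
```

Here `rows` holds one entry per distinct category, sorted by name, and the frequencies sum to 1 (up to floating-point rounding).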
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
computeContinuousMetric(dataset: DataFrame, continuousAttributes: List[String], operations: List[ContinuousMetric]): Option[DataFrame]
Function to compute the DataFrame metrics by row
- dataset
: initial DataFrame.
- continuousAttributes
: list of the names of the continuous variables.
- operations
: list of metrics you want to calculate.
- returns
Option[DataFrame] : the metrics DataFrame of all variables, by row.
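To make the shape of this computation concrete, here is a minimal sketch that applies a list of named metrics to each listed variable and returns one result row per variable. It works on plain sequences instead of Spark columns: `NamedMetric` and `metricsByRow` are hypothetical stand-ins, and returning `None` for an empty attribute or operation list is only an assumption about why the documented return type is an `Option`.

```scala
// Hypothetical stand-in for ContinuousMetric, over plain sequences
// instead of Spark Columns.
final case class NamedMetric(name: String, function: Seq[Double] => Double)

val meanMetric = NamedMetric("Mean", xs => xs.sum / xs.size)
val minMetric  = NamedMetric("Min", xs => xs.min)
val maxMetric  = NamedMetric("Max", xs => xs.max)

// One result row per variable, one entry per requested metric.
// Assumption: nothing to compute yields None.
def metricsByRow(
    data: Map[String, Seq[Double]],
    attributes: List[String],
    operations: List[NamedMetric]
): Option[List[(String, Map[String, Double])]] =
  if (attributes.isEmpty || operations.isEmpty) None
  else
    Some(attributes.map { attr =>
      val values = data(attr)
      attr -> operations.map(op => op.name -> op.function(values)).toMap
    })

val table = Map(
  "age"    -> Seq(20.0, 30.0, 40.0),
  "height" -> Seq(1.5, 1.75, 2.0)
)
val result =
  metricsByRow(table, List("age", "height"), List(meanMetric, minMetric, maxMetric))
```

Each element of `result` pairs a variable name with its metric values, which mirrors the "by row" layout described for the returned DataFrame.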
-
def
computeDiscretMetric(dataInit: DataFrame, discreteAttrs: List[String], operations: List[DiscreteMetric]): Option[DataFrame]
Function to compute and combine all the partial metric DataFrames by variable (to get one DataFrame by row).
- dataInit
: initial DataFrame.
- discreteAttrs
: list of the names of the discrete variables.
- operations
: list of metrics you want to calculate.
- returns
Option[DataFrame] : DataFrame with all the metrics by variable, by row
-
val
continuousMetrics: List[ContinuousMetric]
List of all available metrics
-
def
customCatCountFreq(colNameDataCatCount: (Column, DataFrame)): Column
Customize catCountFreq for discrete variable
-
def
customCategory(colNameDataCatCount: (Column, DataFrame)): Column
Customize Category for discrete variable
- colNameDataCatCount
: pair of the variable's column and the DataFrame obtained from categoryCountFreqDataframe()
- returns
Column : the computed value of the function metricCategory
-
def
customCountDiscrete(colNameDataCatCount: (Column, DataFrame)): Column
Customize Count Discrete for discrete variable
- colNameDataCatCount
: pair of the variable's column and the DataFrame obtained from categoryCountFreqDataframe()
- returns
Column : the computed value of the function metricCountDiscret
-
def
customCountDistinct(colNameDataCatCount: (Column, DataFrame)): Column
Customize CountDistinct for discrete variable
- colNameDataCatCount
: pair of the variable's column and the DataFrame obtained from categoryCountFreqDataframe()
- returns
Column : the computed value of the function metricCountDistinct
-
def
customCountMissValues(e: Column): Column
Customize missing values
- e
: the column
- returns
Column : the number of missing values, NaN values and null values
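The counting rule stated above (nulls and NaN both count as missing) can be sketched on a plain Scala sequence, with `None` standing in for a null cell. `countMissing` is a hypothetical helper, not the Spark-based function documented here.

```scala
// Plain-Scala model: None plays the role of a null cell.
// countMissing is a hypothetical helper, not part of Metrics.
def countMissing(values: Seq[Option[Double]]): Long =
  values.count {
    case None    => true    // null / absent value
    case Some(v) => v.isNaN // NaN also counts as missing
  }.toLong

val missing = countMissing(Seq(Some(1.0), None, Some(Double.NaN), Some(2.0)))
```

For this input `missing` is 2: one absent value and one NaN.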
-
def
customCountMissValuesDiscrete(colNameDataCatCount: (Column, DataFrame)): Column
Customize number of Missing Values for discrete variable
- colNameDataCatCount
: pair of the variable's column and the DataFrame obtained from categoryCountFreqDataframe()
- returns
Column : the computed value of the function metricMissingValues
-
def
customFrequencies(colNameDataCatCount: (Column, DataFrame)): Column
Customize Frequencies for discrete variable
- colNameDataCatCount
: pair of the variable's column and the DataFrame obtained from categoryCountFreqDataframe()
- returns
Column : the computed value of the function metricFrequency
-
def
customMean(e: Column): Column
Customize mean of the column e
- e
: the column
- returns
Column : the computed value of the mean
-
def
customMedian(e: Column): Column
Customize Median of the column e
- e
: the column
- returns
Column : the computed value of the Median
-
def
customMetric(e: Column, metricName: String, metricFunction: (Column) ⇒ Column): Column
Customize a metric function for continuous variables; used for mean, variance and stddev
- e
: the column
- metricName
: the name of the metric
- metricFunction
: the metric function
- returns
Column : the computed value of the function
-
def
customMetricDiscret(e: Column, dataCategoryCount: DataFrame, metricName: String, metricFunction: (DataFrame) ⇒ Column): Column
Customize a discrete metric for a discrete variable
- e
: the column of the variable
- dataCategoryCount
: the DataFrame obtained from categoryCountFreqDataframe()
- metricName
: the metric name
- metricFunction
: the metric function
- returns
Column : the computed value of the function
-
def
customMetricUDF(e: Column, metricName: String, metricFunction: (String, Column*) ⇒ Column, approxMethod: String, approxValue: Double): Column
Customize a metric function for continuous variables; used for percentile 25, median and percentile 75
- e
: the column
- metricName
: the name of the metric
- metricFunction
: the metric function
- approxMethod
: the approximation method
- approxValue
: the value to pass to stat_method
-
def
customStddev(e: Column): Column
Customize Stddev of the column e
- e
: the column
- returns
Column : the computed value of the Stddev
-
def
customVariance(e: Column): Column
Customize variance of the column e
- e
: the column
- returns
Column : the computed value of the variance
-
def
dataToMetricData(colNamDataCatCountFreq: (Column, DataFrame), operations: List[DiscreteMetric]): DataFrame
Function to compute the metric DataFrame by variable
- colNamDataCatCountFreq
: tuple of the variable's column and the DataFrame with Category, Count and Frequencies obtained from categoryCountFreqDataframe()
- operations
: list of metrics you want to calculate.
- returns
DataFrame : with all the values of the discrete metrics
-
val
discreteMetrics: List[DiscreteMetric]
List of all available metrics.
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
val
logger: Logger
- Attributes
- protected
- Definition Classes
- StrictLogging
-
def
metricCatCountFreq(dataCategoryCount: DataFrame): Column
Function to extract the column that contains the list of struct cat_count_freq
-
def
metricCategory(dataCategoryCount: DataFrame): Column
Function to extract the column that contains the list of categories
- dataCategoryCount
: the DataFrame obtained from categoryCountFreqDataframe()
- returns
Column : the column that contains the list of category values
-
def
metricCountDiscret(dataCategoryCount: DataFrame): Column
Function to extract the column that contains the list of CountDiscrete
- dataCategoryCount
: the DataFrame obtained from categoryCountFreqDataframe()
- returns
Column : the column that contains the list of CountDiscrete values
-
def
metricCountDistinct(dataCategoryCount: DataFrame): Column
Function to extract the column that contains the list of CountDistinct
- dataCategoryCount
: the DataFrame obtained from categoryCountFreqDataframe()
- returns
Column : the column that contains the list of CountDistinct values
-
def
metricFrequency(dataCategoryCount: DataFrame): Column
Function to extract the column that contains the list of frequencies
- dataCategoryCount
: the DataFrame obtained from categoryCountFreqDataframe()
- returns
Column : the column that contains the list of frequency values
-
def
metricMissingValues(dataCategoryCount: DataFrame): Column
Function to extract the column that contains the list of the number of missing values
- dataCategoryCount
: the DataFrame obtained from categoryCountFreqDataframe()
- returns
Column : the column that contains the list of missing-value counts
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
percentile25(e: Column): Column
Customize percentile of order 0.25 of the column e
- e
: the column
- returns
Column : the computed value of the percentile of order 0.25
-
def
percentile75(e: Column): Column
Customize percentile of order 0.75 of the column e
- e
: the column
- returns
Column : the computed value of the percentile of order 0.75
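To show what the orders 0.25, 0.5 and 0.75 select, the following sketch computes exact nearest-rank percentiles on a plain Scala sequence. `nearestRankPercentile` is a hypothetical helper; the functions documented here work on Spark columns and, according to customMetricUDF, take an approximation method, so their results may differ from this exact computation.

```scala
// Exact nearest-rank percentile on an in-memory sequence.
// nearestRankPercentile is a hypothetical helper, not part of Metrics.
def nearestRankPercentile(values: Seq[Double], p: Double): Double = {
  require(values.nonEmpty, "values must not be empty")
  require(p > 0.0 && p <= 1.0, "p must be in (0, 1]")
  val sorted = values.sorted
  val rank = math.ceil(p * sorted.size).toInt // 1-based nearest rank
  sorted(rank - 1)
}

val sample = Seq(7.0, 1.0, 3.0, 9.0, 5.0, 11.0, 13.0, 15.0)
val q1     = nearestRankPercentile(sample, 0.25)
val median = nearestRankPercentile(sample, 0.5)
val q3     = nearestRankPercentile(sample, 0.75)
```

With the eight sample values sorted as 1, 3, 5, 7, 9, 11, 13, 15, the ranks are 2, 4 and 6, so `q1` is 3.0, `median` is 7.0 and `q3` is 11.0.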
-
def
regroupContinuousMetricsByVariable(nameCol: String, metricFrame: DataFrame): DataFrame
Function to regroup and reformat all metrics for a given variable
- nameCol
: the name of the column.
- metricFrame
: the DataFrame of all the computed metrics for each variable by columns.
- returns
DataFrame : the metric DataFrame associated with the variable (nameCol).
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
- object CatCountFreq extends DiscreteMetric
- object CountDiscrete extends DiscreteMetric
- object CountDistinct extends DiscreteMetric
- object CountMissValues extends ContinuousMetric
- object CountMissValuesDiscrete extends DiscreteMetric
- object Kurtosis extends ContinuousMetric
- object Max extends ContinuousMetric
- object Mean extends ContinuousMetric
- object Median extends ContinuousMetric
- object Min extends ContinuousMetric
- object Percentile25 extends ContinuousMetric
- object Percentile75 extends ContinuousMetric
- object Skewness extends ContinuousMetric
- object Stddev extends ContinuousMetric
- object Sum extends ContinuousMetric
- object Variance extends ContinuousMetric