case class ClassificationSplitter(randomizedPivotLocation: Boolean = false) extends Splitter[Char] with Product with Serializable
Find the best split for classification problems.
Created by maxhutch on 12/2/16.
Instance Constructors
- new ClassificationSplitter(randomizedPivotLocation: Boolean = false)
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##(): Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def finalize(): Unit
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
- def getBestCategoricalSplit(data: Seq[(Vector[AnyVal], Char, Double)], calculator: GiniCalculator, index: Int, minCount: Int): (CategoricalSplit, Double)
- def getBestRealSplit(data: Seq[(Vector[AnyVal], Char, Double)], calculator: GiniCalculator, index: Int, minCount: Int, randomizePivotLocation: Boolean = false): (RealSplit, Double)
Find the best split on a continuous variable
If randomizePivotLocation is true, each split pivot is drawn from a uniform random distribution between the two adjacent data points it separates. Any such pivot yields the same split of the data, but randomization can improve generalizability, particularly as part of an ensemble (e.g. random forests).
- data
to split
- index
of the feature to split on
- minCount
minimum number of data points to allow in each of the resulting splits
- randomizePivotLocation
whether to generate split pivots randomly between the data points (default: false)
- returns
the best split of this feature
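The pivot randomization described above can be illustrated with a minimal, standalone sketch. It is not the library's implementation: `PivotSketch`, `pivot`, and `partition` are hypothetical names, and using the midpoint when randomization is off is an assumption made only for illustration.

```scala
import scala.util.Random

// Hypothetical helper, not part of ClassificationSplitter.
object PivotSketch {
  // Choose a pivot between two adjacent sorted feature values, left < right.
  def pivot(left: Double, right: Double, randomize: Boolean, rng: Random = new Random()): Double = {
    require(left < right, "expected left < right")
    if (randomize) left + (right - left) * rng.nextDouble() // uniform in [left, right)
    else (left + right) / 2.0 // assumption: midpoint when not randomized
  }

  // Send values <= pivot to the left side and the rest to the right side.
  def partition(values: Seq[Double], p: Double): (Seq[Double], Seq[Double]) =
    values.partition(_ <= p)
}
```

Because every pivot in [left, right) produces the same partition of the observed values, randomization only changes how unseen points that fall between the two values are routed.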
- def getBestSplit(data: Seq[(Vector[AnyVal], Char, Double)], numFeatures: Int, minInstances: Int): (Split, Double)
Get the best split, considering numFeatures random features (without replacement)
- data
to split
- numFeatures
to consider, randomly
- returns
a split object that optimally divides data
- Definition Classes
- ClassificationSplitter → Splitter
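Drawing numFeatures random features without replacement could look like the following sketch. It only illustrates the sampling step under stated assumptions: `FeatureSubsetSketch` and `sampleFeatures` are hypothetical names, and nothing here is taken from the library's source.

```scala
import scala.util.Random

// Hypothetical helper, not part of ClassificationSplitter.
object FeatureSubsetSketch {
  // Draw up to numFeatures distinct feature indices from 0 until totalFeatures.
  // Shuffling and taking a prefix samples without replacement.
  def sampleFeatures(totalFeatures: Int, numFeatures: Int, rng: Random): Seq[Int] = {
    require(totalFeatures >= 0 && numFeatures >= 0, "counts must be non-negative")
    rng.shuffle((0 until totalFeatures).toVector).take(numFeatures)
  }
}
```

A caller would then evaluate a candidate split for each sampled index and keep the best one; how the returned Double ranks candidates is not stated on this page, so that step is left out.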
- final def getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- val randomizedPivotLocation: Boolean
- final def synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )