com.eharmony.spotz.objective.vw.util

VwCrossValidation

trait VwCrossValidation extends VwDatasetFunctions

Performs k-fold cross-validation on a VW dataset.

Linear Supertypes
VwDatasetFunctions, FileFunctions, AnyRef, Any

Abstract Value Members

  1. abstract def get(name: String): File

    Definition Classes
    FileFunctions
  2. abstract def save(file: File): String

    Definition Classes
    FileFunctions

Concrete Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. def getCache(name: String): File

    Definition Classes
    VwDatasetFunctions
  12. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  13. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  14. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  15. def kFold(vwDataset: Iterator[String], folds: Int, cacheBitSize: Int): Map[Int, (String, String)]

    This method partitions the VW input dataset (the vwDataset argument) into a training set and a test set for every fold. The training and test sets for every fold are then fed to VW to generate cache files. These cache files are added to the SparkContext so that they are accessible on the executors. To keep track of the training and test set caches for every fold, a Map is used where the key is the fold number and the value is (trainingSetCachePath, testSetCachePath). These file names must be unique so that they do not collide with other files. The entirety of this method runs on the driver. All VW input training and test set files, as well as the cache files, are deleted upon JVM exit.

    This strategy has the downside of duplicating the dataset across every node K times. An alternative approach is to generate K cache files, one per fold, train the regressor K - 1 times (once on each of K - 1 of the cache files), and test on the remaining cache file.

    vwDataset
    folds
    returns

    a map representation where key is the fold number and value is (trainingSetFilename, testSetFilename)
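
The partitioning step described above can be sketched in plain Scala. This is only an in-memory illustration, not the library's implementation: the object name `KFoldSketch`, the method `partition`, the zero-based fold numbering, and the rule that line i goes to the test set of fold i % folds are all assumptions made for this example. The real kFold additionally writes each set to a file and has VW build a cache from it, which is omitted here.

```scala
object KFoldSketch {

  /** Splits `lines` into one (trainingSet, testSet) pair per fold.
    *
    * Assumption for this sketch: line i is placed in the test set of
    * fold (i % folds) and in the training set of every other fold.
    * Fold numbers start at 0 here, which may differ from the library.
    */
  def partition(lines: Seq[String], folds: Int): Map[Int, (Seq[String], Seq[String])] = {
    require(folds > 1, "need at least two folds")
    val indexed = lines.zipWithIndex
    (0 until folds).map { fold =>
      // Lines whose index maps to this fold form the test set; the rest train.
      val (test, train) = indexed.partition { case (_, i) => i % folds == fold }
      (fold, (train.map(_._1), test.map(_._1)))
    }.toMap
  }
}
```

With six lines and three folds, every line lands in exactly one test set and in two training sets, so the folds together hold 3 × 6 = 18 lines. This is the K-fold duplication that the description above names as a downside.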

  16. def kFold(vwDataset: Iterable[String], folds: Int, cacheBitSize: Int): Map[Int, (String, String)]

  17. def kFold(inputPath: String, folds: Int, cacheBitSize: Int): Map[Int, (String, String)]

  18. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  19. final def notify(): Unit

    Definition Classes
    AnyRef
  20. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  21. def save(inputIterator: Iterator[String]): String

    Definition Classes
    FileFunctions
  22. def save(inputIterable: Iterable[String]): String

    Definition Classes
    FileFunctions
  23. def save(inputPath: String): String

    Definition Classes
    FileFunctions
  24. def saveAsCache(vwDatasetPath: String, vwCacheFilename: String, bitSize: Int): String

    Definition Classes
    VwDatasetFunctions
  25. def saveAsCache(vwDatasetIterator: Iterator[String], vwCacheFilename: String, bitSize: Int): String

    Definition Classes
    VwDatasetFunctions
  26. def saveAsCache(vwDatasetInputStream: InputStream, vwCacheFilename: String, bitSize: Int): String

    Definition Classes
    VwDatasetFunctions
  27. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  28. def toString(): String

    Definition Classes
    AnyRef → Any
  29. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
