class Dictionary extends Serializable

Class that helps build a dictionary, either from tokenized text or from a saved dictionary.

Linear Supertypes
Serializable, AnyRef, Any

Instance Constructors

  1. new Dictionary(directory: String)
  2. new Dictionary(sentences: Stream[Array[String]], vocabSize: Int)
  3. new Dictionary(words: Array[String], vocabSize: Int)
  4. new Dictionary(sentences: Iterator[Array[String]], vocabSize: Int)
  5. new Dictionary(dataset: RDD[Array[String]], vocabSize: Int)
  6. new Dictionary()
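
The constructors that take a `vocabSize` select the words with the top-k frequencies and discard the rest, as described under `getDiscardSize`. The following is a minimal standalone sketch of that selection, not the library's implementation: `TopKVocabSketch` and `split` are hypothetical names, and the alphabetical tie-breaking is an assumption made only to keep the result deterministic.

```scala
object TopKVocabSketch {
  // Hypothetical helper (not part of Dictionary): splits the distinct words of
  // the input into (selected, discarded) by descending frequency.
  def split(sentences: Iterator[Array[String]],
            vocabSize: Int): (Array[String], Array[String]) = {
    // Count how often each token occurs across all sentences.
    val counts: Map[String, Int] =
      sentences.flatMap(_.iterator).toVector
        .groupBy(identity)
        .map { case (word, occurrences) => (word, occurrences.size) }

    // Most frequent first; ties broken alphabetically (an assumption).
    val ranked: Seq[String] =
      counts.toSeq.sortBy { case (word, count) => (-count, word) }.map(_._1)

    // The first vocabSize words are kept, the remainder is discarded.
    val (kept, dropped) = ranked.splitAt(vocabSize)
    (kept.toArray, dropped.toArray)
  }
}
```

For example, calling `TopKVocabSketch.split(Iterator(Array("a", "b", "a"), Array("c", "a", "b")), 2)` keeps `a` and `b` and discards `c`.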

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
  6. def discardVocab(): Array[String]

    Return the array of all discarded words.

  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. def getDiscardSize(): Int

    Returns the number of discarded words, that is, the words left over after the words with the top-k frequencies are selected.

  12. def getIndex(word: String): Int

    Returns the encoding number (index) of a word. If the word does not exist in the dictionary, returns the dictionary length as the default index.

  13. def getVocabSize(): Int

    Returns the length of the vocabulary.

  14. def getWord(index: Int): String

    Returns the word at the given index. If the index is out of bounds, returns a random word from the discarded word list. If the discarded word list is empty, returns a random word from the existing dictionary.

  15. def getWord(index: Double): String
  16. def getWord(index: Float): String
  17. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  18. def index2Word(): Map[Int, String]
  19. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  20. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  21. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  22. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  23. def print(): Unit

    Prints the word-to-index dictionary.

  24. def printDiscard(): Unit

    Prints the dictionary of discarded words.

  25. def save(saveFolder: String): Unit

    Saves the dictionary and the discarded words to the saveFolder directory.

  26. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  27. def toString(): String
    Definition Classes
    AnyRef → Any
  28. def vocabulary(): Array[String]

    Return the array of all selected words.

  29. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
  32. def word2Index(): Map[String, Int]

    Returns the map from each word to its index in the dictionary.
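
The fallback rules documented for `getIndex` and `getWord` can be illustrated with a small standalone sketch. `LookupSketch` is a hypothetical stand-in, not the `Dictionary` class itself. Treating negative indices as out of bounds and seeding the random generator are assumptions made for the illustration.

```scala
import scala.util.Random

// Hypothetical stand-in that mirrors only the documented lookup fallbacks.
final class LookupSketch(vocab: Array[String],
                         discarded: Array[String],
                         rng: Random = new Random(0)) {

  // Each selected word is encoded by its position in the vocabulary.
  private val wordToIndex: Map[String, Int] = vocab.zipWithIndex.toMap

  // Unknown words map to the vocabulary length, the documented default index.
  def getIndex(word: String): Int =
    wordToIndex.getOrElse(word, vocab.length)

  // In-range indices return the stored word. Out-of-range indices fall back
  // to a random discarded word, or to a random vocabulary word when nothing
  // was discarded. Assumes at least one of the two arrays is non-empty.
  def getWord(index: Int): String =
    if (index >= 0 && index < vocab.length) vocab(index)
    else if (discarded.nonEmpty) discarded(rng.nextInt(discarded.length))
    else vocab(rng.nextInt(vocab.length))
}
```

With `new LookupSketch(Array("a", "b"), Array("c"))`, `getIndex("a")` is `0` and `getIndex("zzz")` is `2`, while `getWord(99)` returns `c` because it is the only discarded word.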
