com.ebiznext.comet.job.infer

InferSchemaJob

class InferSchemaJob extends AnyRef

Infers the schema of the data at a given data path, for a given domain name and schema name.

Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new InferSchemaJob()(implicit settings: Settings)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. def createDataFrameWithFormat(lines: List[String], dataPath: String, header: Boolean): DataFrame

    Create the DataFrame with its associated format.

    lines: list of lines read from file

  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. def getDomainDirectoryName(path: Path): String

    Get the domain directory name.

    path: file path

    returns: the domain directory name

  12. def getFormatFile(lines: List[String]): String

    Get the file format by using the first and the last line of the dataset. We use mapPartitionsWithIndex to retrieve this information to make sure that the first line really corresponds to the first line of the file (and likewise for the last).

    lines: list of lines read from file
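
The first-line/last-line idea described for getFormatFile can be illustrated with a minimal sketch. This is not the library's implementation: the object name `FormatSketch`, the helper `guessFormat`, and the labels "JSON", "ARRAY_JSON" and "DSV" are all hypothetical and chosen only for illustration.

```scala
// A hedged sketch, NOT the actual InferSchemaJob code.
// All names and format labels here are assumptions made for illustration.
object FormatSketch {

  /** Guess a format label from the first and last non-blank lines. */
  def guessFormat(lines: List[String]): String = {
    // Ignore blank lines so trailing newlines do not hide the real last line.
    val nonBlank = lines.map(_.trim).filter(_.nonEmpty)
    (nonBlank.headOption, nonBlank.lastOption) match {
      // A file that opens with '[' and closes with ']' looks like a JSON array.
      case (Some(first), Some(last)) if first.startsWith("[") && last.endsWith("]") =>
        "ARRAY_JSON"
      // Lines that open with '{' and close with '}' look like JSON objects.
      case (Some(first), Some(last)) if first.startsWith("{") && last.endsWith("}") =>
        "JSON"
      // Anything else (including empty input) falls back to delimiter-separated values.
      case _ =>
        "DSV"
    }
  }
}
```

Under these assumptions, `FormatSketch.guessFormat(List("id;name", "1;Ada"))` falls through to the delimiter-separated case, which is where a separator detection step such as getSeparator would then apply.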

  13. def getSchemaPattern(path: Path): String

    Get the schema pattern.

    path: file path

    returns: the schema pattern

  14. def getSeparator(lines: List[String]): String

    Get the file separator by taking the character that appears most often in 10 lines of the dataset.

    lines: list of lines read from file

    returns: the file separator
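
The "most frequent character in 10 lines" heuristic described for getSeparator could look roughly like the sketch below. It is an illustration under stated assumptions, not the library's code: the object `SeparatorSketch`, the helper `guessSeparator`, the `sampleSize` parameter, and the choice to ignore letters, digits, spaces and quote characters are all hypothetical.

```scala
// A hedged sketch, NOT the actual InferSchemaJob code.
// Names and the character filter below are assumptions made for illustration.
object SeparatorSketch {

  /** Return the most frequent candidate character in the first `sampleSize` lines, if any. */
  def guessSeparator(lines: List[String], sampleSize: Int = 10): Option[String] = {
    // Keep only characters that could plausibly act as a field separator.
    val candidates: List[Char] = lines
      .take(sampleSize)
      .flatMap(_.toList)
      .filterNot(c => c.isLetterOrDigit || c == ' ' || c == '"' || c == '\'')

    if (candidates.isEmpty) None
    else {
      // Count occurrences per character and keep the most frequent one.
      val (mostFrequent, _) = candidates
        .groupBy(identity)
        .map { case (c, occurrences) => (c, occurrences.size) }
        .maxBy(_._2)
      Some(mostFrequent.toString)
    }
  }
}
```

With this sketch, a sample whose lines contain more `;` than any other punctuation yields `Some(";")`. Ties between equally frequent characters are resolved arbitrarily, so a real implementation may need a tie-breaking rule.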

  15. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  16. def infer(domainName: String, schemaName: String, dataPath: String, savePath: String, header: Boolean): Try[Unit]

    Forces any Spark job to implement its entry point within the "run" method.

    returns: Spark Session used for the job

  17. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  18. def name: String
  19. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  20. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  21. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  22. def readFile(path: Path): Dataset[String]

    Read the file without specifying the format.

    path: file path

    returns: a Dataset of strings containing the file's data

  23. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  24. def toString(): String
    Definition Classes
    AnyRef → Any
  25. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  27. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
