case class Domain(name: String, directory: String, metadata: Option[Metadata] = None, schemas: List[Schema] = Nil, comment: Option[String] = None, extensions: Option[List[String]] = None, ack: Option[String] = None) extends Product with Serializable

Suppose you want to import customers and orders from your Sales system. Sales is then the domain, and customer and order are its datasets. In a DBMS, a domain would be implemented as a DBMS schema and a dataset as a DBMS table. In BigQuery, the domain name would be the BigQuery dataset name, and each dataset would be implemented as a BigQuery table.

name

Domain name. Make sure you use a name that may be used as a folder name on the target storage.

  • When using HDFS or Cloud Storage, ingested files are stored in a sub-directory named after the domain name.
  • When using BigQuery, files are ingested and stored in tables under a dataset named after the domain name.

directory

: Folder on the local filesystem where incoming files are stored. Typically, this folder will be scanned periodically to move the dataset to the cluster for ingestion. Files located in this folder are moved to the pending folder for ingestion by the "import" command.

metadata

: Default Schema metadata. This metadata is applied to the schemas defined in this domain. Metadata properties may be redefined at the schema level. See Metadata Entity for more details.

schemas

: List of schemas, one for each dataset in this domain. A domain usually contains multiple schemas, each defining how the contents of the input file should be parsed. See Schema for more details.

comment

: Domain Description (free text)

extensions

: Recognized filename extensions. json, csv, dsv and psv are recognized by default. Only files with these extensions will be moved to the pending folder.

ack

: Ack extension used for each file; ".ack" if not specified. Files are moved to the pending folder only once a file with the same name as the source file and with this extension is present. To move a file without requiring an ack file, explicitly set this property to the empty string "".
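
As a minimal sketch of how these parameters fit together, the snippet below builds a domain for the Sales example. The `Metadata`, `Schema` and `Domain` types here are simplified stand-ins that only mirror the constructor signature shown on this page so the snippet compiles on its own; the object name, the directory path and the schema names are hypothetical.

```scala
// Simplified stand-ins so the snippet compiles on its own.
// In a real project, use the library's own Metadata, Schema and Domain classes.
object DomainConstructionSketch {
  final case class Metadata()           // placeholder: real fields omitted
  final case class Schema(name: String) // placeholder: the real Schema has more fields

  // Mirrors the constructor signature documented on this page.
  final case class Domain(
      name: String,
      directory: String,
      metadata: Option[Metadata] = None,
      schemas: List[Schema] = Nil,
      comment: Option[String] = None,
      extensions: Option[List[String]] = None,
      ack: Option[String] = None
  )

  // Hypothetical "sales" domain with two datasets, customers and orders.
  val sales: Domain = Domain(
    name = "sales",                // must be usable as a folder name on the target storage
    directory = "/incoming/sales", // hypothetical local folder holding incoming files
    schemas = List(Schema("customers"), Schema("orders")),
    comment = Some("Customers and orders exported from the Sales system"),
    ack = Some("")                 // empty string: no ack file required
    // metadata and extensions are left at their defaults (None)
  )

  def main(args: Array[String]): Unit =
    println(s"${sales.name}: ${sales.schemas.map(_.name).mkString(", ")}")
}
```

Leaving `ack` as `None` instead would keep the documented ".ack" default, so files would wait for their ack file before being moved to the pending folder.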

Linear Supertypes
Serializable, Serializable, Product, Equals, AnyRef, Any

Instance Constructors

  1. new Domain(name: String, directory: String, metadata: Option[Metadata] = None, schemas: List[Schema] = Nil, comment: Option[String] = None, extensions: Option[List[String]] = None, ack: Option[String] = None)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. val ack: Option[String]
  5. def asDot(includeAllAttrs: Boolean): String
  6. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  7. def checkValidity(schemaHandler: SchemaHandler)(implicit settings: Settings): Either[List[String], Boolean]

    Checks whether this domain is valid. A domain is valid if:

    • The domain name is a valid attribute
    • All the schemas defined in this domain are valid
    • No schema is defined twice
    • Partition columns are valid columns
    • The input directory is a valid path
  8. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  9. val comment: Option[String]
  10. val directory: String
  11. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  12. val extensions: Option[List[String]]
  13. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  14. def findSchema(filename: String): Option[Schema]

    Gets the schema for a filename. Schemas are matched against filenames using filename patterns; the schema whose pattern matches the filename is returned.

    filename

    : dataset filename

  15. def getAck(): String

    An ack file should be present for each file to ingest.

    returns

    the ack attribute or ".ack" by default

  16. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  17. def getExtensions(): List[String]

    List of file extensions to scan for in the domain directory.

    returns

    the list of extensions, or the default ones: ".json", ".csv", ".dsv", ".psv"

  18. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  19. def mapping(schema: Schema)(implicit settings: Settings): Option[String]

    Loads the Elasticsearch template file if it exists.

    schema

    : Schema to map to an Elasticsearch index

    returns

    The ES template, optionally containing the PROPERTIES string, which is replaced by the mappings dynamically computed from the schema attributes

  20. val metadata: Option[Metadata]
  21. val name: String
  22. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  23. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  24. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  25. val schemas: List[Schema]
  26. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  27. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  28. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  29. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
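
The behaviour documented for `getAck()`, `getExtensions()`, `findSchema` and `checkValidity` can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `DomainMembersSketch`, `SchemaStub`, `ackOrDefault`, `extensionsOrDefault` and `report` are hypothetical names, filename patterns are assumed to be regular expressions that must match the whole filename, and `Left` is assumed to carry the error messages of a failed validity check.

```scala
import java.util.regex.Pattern

object DomainMembersSketch {
  // Defaults as documented for getAck() and getExtensions().
  val DefaultAck: String = ".ack"
  val DefaultExtensions: List[String] = List(".json", ".csv", ".dsv", ".psv")

  // Placeholder schema: a name plus a filename pattern (assumed to be a regex).
  final case class SchemaStub(name: String, pattern: Pattern)

  // Stand-in for getAck(): the configured value, or ".ack" when unset.
  def ackOrDefault(ack: Option[String]): String =
    ack.getOrElse(DefaultAck)

  // Stand-in for getExtensions(): the configured list, or the defaults when unset.
  def extensionsOrDefault(extensions: Option[List[String]]): List[String] =
    extensions.getOrElse(DefaultExtensions)

  // Stand-in for findSchema: first schema whose pattern matches the whole filename.
  def findSchema(schemas: List[SchemaStub], filename: String): Option[SchemaStub] =
    schemas.find(_.pattern.matcher(filename).matches())

  // Turns a checkValidity-style result into a readable report.
  // Assumption: Left holds error messages, Right(true) signals a valid domain.
  def report(result: Either[List[String], Boolean]): String = result match {
    case Left(errors) => errors.mkString("invalid domain:\n - ", "\n - ", "")
    case Right(true)  => "domain is valid"
    case Right(false) => "domain is invalid (no details reported)"
  }

  def main(args: Array[String]): Unit = {
    val schemas = List(
      SchemaStub("customers", Pattern.compile("customers-.*\\.csv")),
      SchemaStub("orders", Pattern.compile("orders-.*\\.csv"))
    )
    println(ackOrDefault(None))
    println(findSchema(schemas, "orders-2020.csv").map(_.name))
    println(report(Left(List("schema orders is defined twice"))))
  }
}
```

Note that an explicitly empty ack (`Some("")`) is returned unchanged by `ackOrDefault`, which matches the documented way of disabling the ack file requirement.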
