trait ReorgTableHelper extends Serializable
Linear Supertypes: Serializable, AnyRef, Any
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def fileHasDifferentTypes(fileSchema: StructType, tablePhysicalSchema: StructType): Boolean
Determine whether fileSchema has any column whose type differs from tablePhysicalSchema.
- fileSchema
the current parquet schema to be checked.
- tablePhysicalSchema
the current table schema.
- returns
whether the file has any column whose type differs from the corresponding table column.
- Attributes
- protected
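A minimal sketch of the check this method performs. It models a schema as a plain Map from column name to type name (a hypothetical stand-in; the real implementation walks Spark's StructType fields):

```scala
// Simplified sketch of fileHasDifferentTypes: schemas are modeled as
// Map[column name -> type name] instead of Spark's StructType.
object SchemaTypeCheckSketch {
  def fileHasDifferentTypes(
      fileSchema: Map[String, String],
      tablePhysicalSchema: Map[String, String]): Boolean = {
    // A file qualifies if any column it shares with the table schema
    // carries a different physical type; columns absent from the table
    // are ignored here (they are the concern of fileHasExtraColumns).
    fileSchema.exists { case (name, fileType) =>
      tablePhysicalSchema.get(name).exists(_ != fileType)
    }
  }
}
```

Files flagged by a check like this need rewriting so that their physical types match the current table schema.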
- def fileHasExtraColumns(fileSchema: StructType, tablePhysicalSchema: StructType, protocol: Protocol, metadata: Metadata): Boolean
Determine whether fileSchema has any column that does not exist in tablePhysicalSchema. This is possible after running ALTER TABLE commands, e.g., ALTER TABLE DROP COLUMN.
- fileSchema
the current parquet schema to be checked.
- tablePhysicalSchema
the current table schema.
- protocol
the protocol used to check row_id and row_commit_version.
- metadata
the metadata used to check row_id and row_commit_version.
- returns
whether the file has any dropped column.
- Attributes
- protected
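The core of this check can be sketched in the same simplified style, again with a Map standing in for StructType (the protocol and metadata arguments, which gate the row_id and row_commit_version columns, are omitted from this sketch):

```scala
// Simplified sketch of fileHasExtraColumns: a column present in the
// file but absent from the table schema was dropped (e.g. via
// ALTER TABLE ... DROP COLUMN), yet its data still exists in the file.
object ExtraColumnCheckSketch {
  def fileHasExtraColumns(
      fileSchema: Map[String, String],
      tablePhysicalSchema: Map[String, String]): Boolean =
    fileSchema.keys.exists(name => !tablePhysicalSchema.contains(name))
}
```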
- def filterParquetFiles(files: Seq[AddFile], dataPath: Path, configuration: Configuration, ignoreCorruptFiles: Boolean, assumeBinaryIsString: Boolean, assumeInt96IsTimestamp: Boolean)(filterFileFn: (StructType) => Boolean): Seq[AddFile]
- Attributes
- protected
- def filterParquetFilesOnExecutors(spark: SparkSession, files: Seq[AddFile], snapshot: Snapshot, ignoreCorruptFiles: Boolean)(filterFileFn: (StructType) => Boolean): Seq[AddFile]
Apply a filter on the list of AddFile to keep only the files whose physical parquet schema satisfies the given filter function.
Note: Filtering happens on the executors: **any variable captured by filterFileFn must be Serializable.**
- Attributes
- protected
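The filtering pattern itself reduces to applying a schema predicate to each file. A driver-side sketch with hypothetical stand-in types (the real helper reads parquet footers on executors, which is why anything captured by the predicate must be Serializable):

```scala
// Hypothetical stand-in for AddFile: just a path plus its physical schema.
final case class AddFileStub(path: String, schema: Map[String, String])

object FilterFilesSketch {
  // Keep only the files whose physical schema satisfies the predicate.
  // In the real helper the predicate is shipped to executors alongside
  // the footer-reading work, so captured state must be Serializable.
  def filterFiles(files: Seq[AddFileStub])(
      filterFileFn: Map[String, String] => Boolean): Seq[AddFileStub] =
    files.filter(f => filterFileFn(f.schema))
}
```

A predicate such as `schema => schema.get("a").contains("long")` would then keep only the files whose column `a` is physically a long.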
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()