object OrcFilters extends OrcFiltersBase

Helper object for building ORC SearchArguments, which are used for ORC predicate push-down.

Due to a limitation of the ORC SearchArgument builder, we had to implement separate checking and conversion passes over the Filter to make sure we only convert predicates that are known to be convertible.

An ORC SearchArgument must be built in one pass using a single builder. For example, you can't build a = 1 and b = 2 first, and then combine them into a = 1 AND b = 2. This is quite different from the cases in Spark SQL or Parquet, where complex filters can be easily built using existing simpler ones.

The annoying part is that SearchArgument builder methods like startAnd(), startOr(), and startNot() mutate the internal state of the builder instance. This forces us to translate all convertible filters with a single builder instance. However, if we try to translate a filter before checking whether it can be converted, an inconvertible filter may leave the builder with an inconsistent internal state.

For example, to convert an And filter with builder b, we call b.startAnd() first and then try to convert its children. Say we convert the left child successfully but find that the right child is inconvertible. The b.startAnd() call can't be rolled back, so b is now in an inconsistent state.
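
The failure mode described above can be reproduced with a toy model. This is only a sketch: ToyBuilder, the Filter case classes, and NaiveConversion are hypothetical stand-ins invented for illustration, not the ORC SearchArgument builder or Spark's Filter classes. The builder method is named equalTo rather than equals to avoid clashing with Any.equals.

```scala
// Toy model only: nothing here is the real ORC or Spark API.
import scala.collection.mutable.ArrayBuffer

sealed trait Filter
final case class EqualTo(column: String, value: Any) extends Filter
final case class And(left: Filter, right: Filter) extends Filter
// Stands in for any filter that has no ORC equivalent.
final case class Unsupported(description: String) extends Filter

// A builder that, like the one described above, records calls by mutating itself.
final class ToyBuilder {
  private val tokens = ArrayBuffer.empty[String]
  private var depth = 0 // groups opened but not yet closed

  def startAnd(): ToyBuilder = { tokens += "and("; depth += 1; this }
  def equalTo(column: String, value: Any): ToyBuilder = { tokens += s"$column = $value"; this }
  def end(): ToyBuilder = { tokens += ")"; depth -= 1; this }

  def openGroups: Int = depth
  def emitted: Seq[String] = tokens.toSeq
}

object NaiveConversion {
  // Converts while walking the tree; returns false when a node cannot be converted.
  def convert(filter: Filter, builder: ToyBuilder): Boolean = filter match {
    case EqualTo(column, value) =>
      builder.equalTo(column, value)
      true
    case And(left, right) =>
      builder.startAnd() // mutates the builder before we know both children convert
      if (convert(left, builder) && convert(right, builder)) {
        builder.end()
        true
      } else {
        false // too late: startAnd() and the left child are already recorded
      }
    case Unsupported(_) => false
  }
}
```

Calling NaiveConversion.convert(And(EqualTo("a", 1), Unsupported("udf(b)")), builder) on a fresh ToyBuilder returns false, yet the builder still holds an open and( group and the left leaf, with no way to undo them.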

The workaround employed here is to trim the Spark filters before trying to convert them. This way, we only do the actual conversion on the part of the Filter that is known to be convertible.
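
A minimal sketch of the trim-then-convert idea, again on a toy model. The Filter case classes, ToyBuilder, and TrimThenConvert are hypothetical names, and the string output merely stands in for a SearchArgument. The trimming rules shown (keep the surviving side of an And, drop an Or unless both sides survive) are assumptions that are safe only because this toy has no negation; they are not a description of the actual Spark implementation.

```scala
// Toy model only: nothing here is the real ORC or Spark API.
sealed trait Filter
final case class EqualTo(column: String, value: Any) extends Filter
final case class And(left: Filter, right: Filter) extends Filter
final case class Or(left: Filter, right: Filter) extends Filter
final case class Unsupported(description: String) extends Filter

final class ToyBuilder {
  private val out = new StringBuilder
  def startAnd(): ToyBuilder = { out ++= "(and "; this }
  def startOr(): ToyBuilder = { out ++= "(or "; this }
  def leaf(column: String, value: Any): ToyBuilder = { out ++= s"[$column = $value] "; this }
  def end(): ToyBuilder = { out ++= ") "; this }
  def build(): String = out.toString.trim
}

object TrimThenConvert {
  // Pass 1: a pure function with no builder involved. Keeps only the convertible part.
  def trim(filter: Filter): Option[Filter] = filter match {
    case leaf: EqualTo => Some(leaf)
    case And(left, right) =>
      // Without negation, dropping one conjunct only makes the filter less selective.
      (trim(left), trim(right)) match {
        case (Some(l), Some(r)) => Some(And(l, r))
        case (l, r) => l.orElse(r)
      }
    case Or(left, right) =>
      // Keep a disjunction only if both sides survive; otherwise matching rows could be skipped.
      for (l <- trim(left); r <- trim(right)) yield Or(l, r)
    case Unsupported(_) => None
  }

  // Pass 2: every node is known to be convertible, so each start call is matched by end().
  def convert(filter: Filter, builder: ToyBuilder): ToyBuilder = filter match {
    case EqualTo(column, value) => builder.leaf(column, value)
    case And(left, right) =>
      builder.startAnd()
      convert(left, builder)
      convert(right, builder)
      builder.end()
    case Or(left, right) =>
      builder.startOr()
      convert(left, builder)
      convert(right, builder)
      builder.end()
    case Unsupported(d) =>
      throw new IllegalArgumentException(s"untrimmed filter reached conversion: $d")
  }

  // A single builder instance is created only after trimming has succeeded.
  def toSearchArgument(filter: Filter): Option[String] =
    trim(filter).map(f => convert(f, new ToyBuilder).build())
}
```

Because trim never touches a builder, an inconvertible subtree is discarded before any mutable state exists, and convert can assume that every group it opens will be closed.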

P.S.: Hive seems to use SearchArgument only together with ExprNodeGenericFuncDesc. Usages of the builder methods mentioned above can only be found in test code, where all tested filters are known to be convertible.

Linear Supertypes
OrcFiltersBase, AnyRef, Any

Type Members

  1. case class OrcPrimitiveField(fieldName: String, fieldType: DataType) extends Product with Serializable
    Definition Classes
    OrcFiltersBase

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
  6. def convertibleFilters(dataTypeMap: Map[String, OrcPrimitiveField], filters: Seq[Filter]): Seq[Filter]
  7. def createFilter(schema: StructType, filters: Seq[Filter]): Option[SearchArgument]

    Creates an ORC filter as a SearchArgument instance.

  8. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  12. def getPredicateLeafType(dataType: DataType): Type

    Gets the PredicateLeaf type that corresponds to the given DataType.

  13. def getSearchableTypeMap(schema: StructType, caseSensitive: Boolean): Map[String, OrcPrimitiveField]

    This method returns a map which contains ORC field name and data type. Each key represents a column; dots are used as separators for nested columns. If any part of the names contains dots, it is quoted to avoid confusion. See org.apache.spark.sql.connector.catalog.quoted for implementation details.

    BinaryType, UserDefinedType, ArrayType and MapType are ignored.

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    OrcFiltersBase
  14. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  15. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  16. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  17. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  18. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  19. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  20. def toString(): String
    Definition Classes
    AnyRef → Any
  21. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  22. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
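
The key format described for getSearchableTypeMap (dots separating nested column names, with quoting for name parts that themselves contain dots) can be sketched on a toy schema. Field, Primitive, Struct, and SearchableKeys are hypothetical stand-ins for illustration, not Spark's StructType or OrcPrimitiveField. The backtick quoting rule is an assumption about what the referenced quoted helper does, and the caseSensitive flag is ignored here.

```scala
// Toy model only: nothing here is the real Spark API.
sealed trait FieldType
final case class Primitive(name: String) extends FieldType
final case class Struct(fields: Seq[Field]) extends FieldType
final case class Field(name: String, fieldType: FieldType)

object SearchableKeys {
  // Assumption: a part containing a dot or backtick is wrapped in backticks,
  // and embedded backticks are doubled.
  def quoteIfNeeded(part: String): String =
    if (part.contains(".") || part.contains("`")) "`" + part.replace("`", "``") + "`"
    else part

  // Flattens nested structs into dotted keys, each mapped to its leaf type name.
  def flatten(fields: Seq[Field], prefix: Seq[String] = Nil): Map[String, String] =
    fields.flatMap { field =>
      val path = prefix :+ field.name
      field.fieldType match {
        case Primitive(typeName) =>
          Seq(path.map(quoteIfNeeded).mkString(".") -> typeName)
        case Struct(children) =>
          flatten(children, path).toSeq
      }
    }.toMap
}
```

With this sketch, a top-level struct named a.b that contains a field c yields the key `a.b`.c, which stays distinguishable from a struct a containing a struct b containing c (key a.b.c).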
