Packages

  • package root
    Definition Classes
    root
  • package org
    Definition Classes
    root
  • package apache
    Definition Classes
    org
  • package spark
    Definition Classes
    apache
  • package sql
    Definition Classes
    spark
  • package catalyst

Catalyst is a library for manipulating relational query plans. All classes in catalyst are considered an internal API to Spark SQL and are subject to change between minor releases.

    Definition Classes
    sql
  • package analysis

Provides a logical query plan Analyzer and supporting classes for performing analysis. Analysis consists of translating UnresolvedAttributes and UnresolvedRelations into fully typed objects using information in a schema Catalog.

    Definition Classes
    catalyst
  • package catalog
    Definition Classes
    catalyst
  • package csv
    Definition Classes
    catalyst
  • package dsl

A collection of implicit conversions that create a DSL for constructing catalyst data structures.

    scala> import org.apache.spark.sql.catalyst.dsl.expressions._
    
    // Standard operators are added to expressions.
    scala> import org.apache.spark.sql.catalyst.expressions.Literal
    scala> Literal(1) + Literal(1)
    res0: org.apache.spark.sql.catalyst.expressions.Add = (1 + 1)
    
    // There is a conversion from 'symbols to unresolved attributes.
    scala> 'a.attr
    res1: org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute = 'a
    
    // These unresolved attributes can be used to create more complicated expressions.
    scala> 'a === 'b
    res2: org.apache.spark.sql.catalyst.expressions.EqualTo = ('a = 'b)
    
    // SQL verbs can be used to construct logical query plans.
    scala> import org.apache.spark.sql.catalyst.plans.logical._
    scala> import org.apache.spark.sql.catalyst.dsl.plans._
    scala> LocalRelation($"key".int, $"value".string).where('key === 1).select('value).analyze
    res3: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan =
    Project [value#3]
     Filter (key#2 = 1)
      LocalRelation [key#2,value#3], []
    Definition Classes
    catalyst
  • package encoders
    Definition Classes
    catalyst
  • package expressions

A set of classes that can be used to represent trees of relational expressions. A key goal of the expression library is to hide the details of naming and scoping from developers who want to manipulate trees of relational operators. As such, the library defines a special type of expression, a NamedExpression, in addition to the standard collection of expressions.

    Standard Expressions

    A library of standard expressions (e.g., Add, EqualTo), aggregates (e.g., SUM, COUNT), and other computations (e.g. UDFs). Each expression type is capable of determining its output schema as a function of its children's output schema.

    Named Expressions

Some expressions are named and thus can be referenced by later operators in the dataflow graph. The two types of named expressions are AttributeReferences and Aliases. AttributeReferences refer to attributes of the input tuple for a given operator and form the leaves of some expression trees. Aliases assign a name to intermediate computations. For example, in the SQL statement SELECT a+b AS c FROM ..., the expressions a and b would be represented by AttributeReferences and c would be represented by an Alias.

    During analysis, all named expressions are assigned a globally unique expression id, which can be used for equality comparisons. While the original names are kept around for debugging purposes, they should never be used to check if two attributes refer to the same value, as plan transformations can result in the introduction of naming ambiguity. For example, consider a plan that contains subqueries, both of which are reading from the same table. If an optimization removes the subqueries, scoping information would be destroyed, eliminating the ability to reason about which subquery produced a given attribute.

    Evaluation

    The result of expressions can be evaluated using the Expression.apply(Row) method.
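The ideas above — expressions deriving their output type from their children, and named expressions compared by a unique id rather than by name — can be sketched outside Spark with a toy expression hierarchy. The names below (MiniExpr, MiniAdd, MiniAttr) are illustrative and are not part of the catalyst API:

```scala
// A toy expression hierarchy mimicking catalyst's design: each node
// derives its output type from its children, and named expressions
// carry a unique expression id used for equality instead of the raw name.
sealed trait MiniExpr { def dataType: String }

case class MiniLiteral(value: Int) extends MiniExpr { val dataType = "int" }

case class MiniAdd(left: MiniExpr, right: MiniExpr) extends MiniExpr {
  // Output type is a function of the children's output types, as in catalyst.
  require(left.dataType == right.dataType, "type mismatch")
  val dataType: String = left.dataType
}

// A named expression: equality is by expression id, not by name, so two
// attributes both called "a" coming from different subqueries stay distinct.
case class MiniAttr(name: String, exprId: Long) extends MiniExpr {
  val dataType = "int"
  override def equals(other: Any): Boolean = other match {
    case MiniAttr(_, id) => id == exprId
    case _               => false
  }
  override def hashCode: Int = exprId.hashCode
}

val sum = MiniAdd(MiniLiteral(1), MiniLiteral(2))
```

This mirrors why, after analysis, two attributes with the same display name but different ids never compare equal.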

    Definition Classes
    catalyst
  • package json
    Definition Classes
    catalyst
  • package optimizer
    Definition Classes
    catalyst
  • package parser
    Definition Classes
    catalyst
  • package planning

Contains classes for enumerating possible physical plans for a given logical query plan.

    Definition Classes
    catalyst
  • package plans

A collection of common abstractions for query plans as well as a base logical plan representation.

    Definition Classes
    catalyst
  • package rules

A framework for applying batches of rewrite rules to trees, possibly to a fixed point.
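The batch-to-fixed-point idea can be sketched in plain Scala: apply every rule in order, and repeat until the tree stops changing or an iteration budget is exhausted. This is in the spirit of catalyst's rule executor; the names below are illustrative, not the real API:

```scala
// A rule is just a tree-to-tree function.
type Rule[T] = T => T

// Apply all rules repeatedly until the result stabilizes (fixed point)
// or the iteration budget runs out.
def executeToFixedPoint[T](rules: Seq[Rule[T]], maxIterations: Int)(plan: T): T = {
  var current = plan
  var changed = true
  var iteration = 0
  while (changed && iteration < maxIterations) {
    val next = rules.foldLeft(current)((p, rule) => rule(p))
    changed = next != current // stop once no rule changes the tree
    current = next
    iteration += 1
  }
  current
}

// Example with a stand-in "plan" type (List[Int]) and two toy rules.
val dedupe: Rule[List[Int]]    = xs => xs.distinct
val dropZeros: Rule[List[Int]] = xs => xs.filterNot(_ == 0)
val result = executeToFixedPoint(Seq(dedupe, dropZeros), maxIterations = 10)(List(1, 1, 0, 2))
```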

    Definition Classes
    catalyst
  • package streaming
    Definition Classes
    catalyst
  • package trees

A library for easily manipulating trees of operators. Operators that extend TreeNode are granted the following interface:

• Scala-collection-like methods (foreach, map, flatMap, collect, etc.)

• transform - accepts a partial function that is used to generate a new tree. When the partial function can be applied to a given tree segment, that segment is replaced with the result. After attempting to apply the partial function to a given node, the transform function recursively attempts to apply the function to that node's children.

• debugging support - pretty printing, easy splicing of trees, etc.
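The transform behavior described above can be sketched with a self-contained toy tree (Node here is illustrative, not the real TreeNode): the partial function is applied where it matches, and the transformation then recurses into the children of the result.

```scala
// A toy tree node with a catalyst-style transform: try the partial
// function at this node, then recurse into the (possibly new) children.
case class Node(value: Int, children: Seq[Node] = Nil) {
  def transform(rule: PartialFunction[Node, Node]): Node = {
    val afterRule = if (rule.isDefinedAt(this)) rule(this) else this
    afterRule.copy(children = afterRule.children.map(_.transform(rule)))
  }
  // Pre-order traversal of all values, for inspection.
  def collectValues: Seq[Int] = value +: children.flatMap(_.collectValues)
}

val tree = Node(1, Seq(Node(2), Node(3, Seq(Node(2)))))
// Replace every node holding 2 with a node holding 20.
val rewritten = tree.transform { case n @ Node(2, _) => n.copy(value = 20) }
```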
    Definition Classes
    catalyst
  • package types
    Definition Classes
    catalyst
  • package util
    Definition Classes
    catalyst
  • ArrayBasedMapBuilder
  • ArrayBasedMapData
  • ArrayData
  • ArrayDataIndexedSeq
  • BadRecordException
  • CannotParseJSONFieldException
  • CharVarcharCodegenUtils
  • CharVarcharUtils
  • CompressionCodecs
  • DateTimeUtils
  • DropMalformedMode
  • EmptyJsonFieldValueException
  • FailFastMode
  • FailureSafeParser
  • GeneratedColumn
  • GeneratedColumnAnalyzer
  • GenericArrayData
  • HyperLogLogPlusPlusHelper
  • InternalRowComparableWrapper
  • IntervalMathUtils
  • IntervalUtils
  • JsonArraysAsStructsException
  • MapData
  • MetadataColumnHelper
  • NumberConverter
  • ParseMode
  • PartialArrayDataResultException
  • PartialMapDataResultException
  • PartialResultArrayException
  • PartialResultException
  • PartialValueException
  • PermissiveMode
  • QuantileSummaries
  • RandomIndicesGenerator
  • RandomUUIDGenerator
  • ResolveDefaultColumns
  • RowDeltaUtils
  • SQLOrderingUtil
  • StringAsDataTypeException
  • StringKeyHashMap
  • StringUtils
  • ToNumberParser
  • TypeUtils
  • UTF8StringUtils
  • UnsafeRowUtils
  • WriteDeltaProjections

package util

Linear Supertypes
Logging, AnyRef, Any

Type Members

  1. class ArrayBasedMapBuilder extends Serializable

A builder of ArrayBasedMapData, which fails if a null map key is detected and removes duplicated map keys according to a last-wins policy.
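The builder's contract can be sketched as: reject null keys, and when the same key appears more than once, keep only the last value seen. The helper below is an illustrative stand-in, not the real ArrayBasedMapBuilder API:

```scala
// Last-wins map construction: later entries overwrite earlier ones,
// and a null key is rejected up front.
def buildMap[K, V](entries: Seq[(K, V)]): Map[K, V] = {
  require(entries.forall(_._1 != null), "null map key is not allowed")
  // foldLeft with Map.updated makes later entries win over earlier ones.
  entries.foldLeft(Map.empty[K, V]) { case (acc, (k, v)) => acc.updated(k, v) }
}

val m = buildMap(Seq("a" -> 1, "b" -> 2, "a" -> 3))
```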

  2. class ArrayBasedMapData extends MapData

A simple MapData implementation which is backed by 2 arrays.

Note that the user is responsible for guaranteeing that the key array does not contain duplicate elements; otherwise the behavior is undefined.

  3. abstract class ArrayData extends SpecializedGetters with Serializable
  4. class ArrayDataIndexedSeq[T] extends IndexedSeq[T]

Implements an IndexedSeq interface for ArrayData. Notice that if the original ArrayData is a primitive array and contains null elements, it is better to ask for IndexedSeq[Any], instead of IndexedSeq[Int], in order to keep the null elements.

  5. case class BadRecordException(record: () => UTF8String, partialResults: () => Array[InternalRow] = () => Array.empty[InternalRow], cause: Throwable) extends Exception with Product with Serializable

Exception thrown when the underlying parser meets a bad record and can't parse it.

    record

    a function that returns the record that caused the parser to fail

    partialResults

    a function that returns a row array containing the partial results of parsing this bad record.

    cause

    the underlying exception explaining why the record is bad and can't be parsed.

  6. case class CannotParseJSONFieldException(fieldName: String, fieldValue: String, jsonType: JsonToken, dataType: DataType) extends RuntimeException with Product with Serializable

No-stacktrace equivalent of QueryExecutionErrors.cannotParseJSONFieldError. Used for code control flow in the parser without the overhead of creating a full exception.

  7. class CharVarcharCodegenUtils extends AnyRef
  8. case class EmptyJsonFieldValueException(dataType: DataType) extends RuntimeException with Product with Serializable

No-stacktrace equivalent of QueryExecutionErrors.emptyJsonFieldValueError. Used for code control flow in the parser without the overhead of creating a full exception.

  9. class FailureSafeParser[IN] extends AnyRef
  10. class GenericArrayData extends ArrayData
  11. class HyperLogLogPlusPlusHelper extends Serializable
  12. class InternalRowComparableWrapper extends AnyRef

Wraps the InternalRow with the corresponding DataType to make it comparable with the values in InternalRow. It uses Spark's internal murmur hash to compute a hash code from a row, and uses RowOrdering to perform equality checks.

  13. case class JsonArraysAsStructsException() extends RuntimeException with Product with Serializable

    Exception thrown when the underlying parser parses a JSON array as a struct.

  14. abstract class MapData extends Serializable

This is an internal data representation for map type in Spark SQL. This should not implement equals and hashCode because the type cannot be used as join keys, grouping keys, or in equality tests. See SPARK-9415 and PR#13847 for the discussions.

  15. implicit class MetadataColumnHelper extends AnyRef
  16. sealed trait ParseMode extends AnyRef
  17. case class PartialArrayDataResultException(partialResult: ArrayData, cause: Throwable) extends PartialValueException with Product with Serializable

Exception thrown when the underlying parser returns a partial array result.

    partialResult

    the partial array result.

    cause

    the underlying exception explaining why the parser cannot return the full result.

  18. case class PartialMapDataResultException(partialResult: MapData, cause: Throwable) extends PartialValueException with Product with Serializable

Exception thrown when the underlying parser returns a partial map result.

    partialResult

    the partial map result.

    cause

    the underlying exception explaining why the parser cannot return the full result.

  19. case class PartialResultArrayException(partialResults: Array[InternalRow], cause: Throwable) extends Exception with Product with Serializable

Exception thrown when the underlying parser returns a partial list of parsed results.

    partialResults

    the partial results of parsing bad records.

    cause

    the underlying exception explaining why the parser cannot return the full result.

  20. case class PartialResultException(partialResult: InternalRow, cause: Throwable) extends PartialValueException with Product with Serializable

Exception thrown when the underlying parser returns a partial result of parsing an object/row.

    partialResult

    the partial result of parsing a bad record.

    cause

    the underlying exception explaining why the parser cannot return the full result.

  21. abstract class PartialValueException extends Exception
  22. class QuantileSummaries extends Serializable

Helper class to compute approximate quantile summaries. This implementation is based on the algorithm proposed in the paper "Space-efficient Online Computation of Quantile Summaries" by Greenwald, Michael and Khanna, Sanjeev (https://doi.org/10.1145/375663.375670).

    In order to optimize for speed, it maintains an internal buffer of the last seen samples, and only inserts them after crossing a certain size threshold. This guarantees a near-constant runtime complexity compared to the original algorithm.

  23. case class RandomIndicesGenerator(randomSeed: Long) extends Product with Serializable

This class is used to generate random indices of a given length.

    This implementation uses the "inside-out" version of Fisher-Yates algorithm. Reference: https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_%22inside-out%22_algorithm
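The inside-out Fisher–Yates variant referenced above builds a random permutation in a single pass, without a separate shuffle step. The sketch below is an illustrative, self-contained version (not the RandomIndicesGenerator API):

```scala
import scala.util.Random

// Inside-out Fisher–Yates: produce a random permutation of 0 until n.
// For each new index i, pick a uniform slot j in [0, i], move the element
// currently at j to i, and place i at j.
def randomIndices(n: Int, seed: Long): Array[Int] = {
  val rng = new Random(seed)
  val out = new Array[Int](n)
  for (i <- 0 until n) {
    val j = rng.nextInt(i + 1) // uniform in [0, i]
    out(i) = out(j)            // displaced element moves to the new slot
    out(j) = i                 // new element lands at slot j
  }
  out
}

val perm = randomIndices(10, seed = 42L)
```

Because the generator is seeded, the same seed always reproduces the same permutation, which matches the class taking a randomSeed parameter.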

  24. case class RandomUUIDGenerator(randomSeed: Long) extends Product with Serializable

This class is used to generate a UUID from Pseudo-Random Numbers.

    For the algorithm, see RFC 4122: A Universally Unique IDentifier (UUID) URN Namespace, section 4.4 "Algorithms for Creating a UUID from Truly Random or Pseudo-Random Numbers".

  25. case class StringAsDataTypeException(fieldName: String, fieldValue: String, dataType: DataType) extends RuntimeException with Product with Serializable

Exception thrown when the underlying parser cannot parse a String as the requested data type.

  26. class StringKeyHashMap[T] extends AnyRef
  27. class ToNumberParser extends Serializable

This class represents a parser to implement the to_number or try_to_number SQL functions.

    It works by consuming an input string and a format string. This class accepts the format string as a field, and proceeds to iterate through the format string to generate a sequence of tokens (or throw an exception if the format string is invalid). Then when the function is called with an input string, this class steps through the sequence of tokens and compares them against the input string, returning a Spark Decimal object if they match (or throwing an exception otherwise).
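The two-phase design described above — tokenize the format string once, then match inputs against the token sequence — can be sketched in a heavily reduced form. Only digit ('9') and separator (',') tokens are modeled here, and a Long stands in for Spark's Decimal; none of these names belong to the real ToNumberParser:

```scala
// Phase 1: turn a format string into tokens, failing on invalid characters.
sealed trait Token
case object Digit extends Token
case object Comma extends Token

def tokenize(format: String): Seq[Token] = format.map {
  case '9' => Digit
  case ',' => Comma
  case c   => throw new IllegalArgumentException(s"invalid format char: $c")
}

// Phase 2: step through the tokens, comparing each against the input.
// Separators must match literally; digits accumulate into the result.
def parse(input: String, tokens: Seq[Token]): Option[Long] = {
  if (input.length != tokens.length) None
  else {
    val pairs = input.zip(tokens)
    val ok = pairs.forall {
      case (c, Digit) => c.isDigit
      case (c, Comma) => c == ','
    }
    if (!ok) None
    else Some(pairs.collect { case (c, Digit) => c }.mkString.toLong)
  }
}

val tokens = tokenize("9,999")
```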

  28. case class WriteDeltaProjections(rowProjection: Option[ProjectingInternalRow], rowIdProjection: ProjectingInternalRow, metadataProjection: Option[ProjectingInternalRow]) extends Product with Serializable

Value Members

  1. val AUTO_GENERATED_ALIAS: String
  2. val INTERNAL_METADATA_KEYS: Seq[String]
  3. val METADATA_COL_ATTR_KEY: String
  4. val QUALIFIED_ACCESS_ONLY: String

If set, this metadata column can only be accessed with qualifiers, e.g. qualifiers.col or qualifiers.*. If not set, metadata columns cannot be accessed via star.

  5. def escapeSingleQuotedString(str: String): String
  6. def fileToString(file: File, encoding: Charset = UTF_8): String
  7. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  8. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  9. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  10. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  11. def logDebug(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  12. def logDebug(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  13. def logError(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  14. def logError(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  15. def logInfo(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  16. def logInfo(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  17. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  18. def logTrace(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  19. def logTrace(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  20. def logWarning(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  21. def logWarning(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  22. def quietly[A](f: => A): A

Silences output to stderr or stdout for the duration of f.

  23. def quoteIdentifier(name: String): String
  24. def quoteIfNeeded(part: String): String
  25. def quoteNameParts(name: Seq[String]): String
  26. def removeInternalMetadata(schema: StructType): StructType
  27. def resourceToBytes(resource: String, classLoader: ClassLoader = Utils.getSparkClassLoader): Array[Byte]
  28. def resourceToString(resource: String, encoding: String = UTF_8.name(), classLoader: ClassLoader = Utils.getSparkClassLoader): String
  29. def sideBySide(left: Seq[String], right: Seq[String]): Seq[String]
  30. def sideBySide(left: String, right: String): Seq[String]
  31. def stackTraceToString(t: Throwable): String
  32. def stringToFile(file: File, str: String): File
  33. def toPrettySQL(e: Expression): String
  34. def truncatedString[T](seq: Seq[T], sep: String, maxFields: Int): String

    Shorthand for calling truncatedString() without start or end strings.

  35. def truncatedString[T](seq: Seq[T], start: String, sep: String, end: String, maxFields: Int): String

Format a sequence with semantics similar to calling .mkString(). Any elements beyond maxNumToStringFields will be dropped and replaced by a "... N more fields" placeholder.

    returns

    the trimmed and formatted string.
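The truncation semantics can be sketched with a small stand-alone helper (illustrative; the real method lives in org.apache.spark.sql.catalyst.util): behave like mkString, but replace everything past maxFields with a count placeholder.

```scala
// mkString-like formatting that truncates long sequences to maxFields
// elements plus a "... N more fields" placeholder.
def truncated[T](seq: Seq[T], start: String, sep: String, end: String, maxFields: Int): String = {
  if (seq.length <= maxFields) seq.mkString(start, sep, end)
  else {
    val dropped = seq.length - maxFields
    (seq.take(maxFields).map(_.toString) :+ s"... $dropped more fields")
      .mkString(start, sep, end)
  }
}

val s = truncated(Seq(1, 2, 3, 4, 5), "[", ", ", "]", maxFields = 3)
```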

  36. def usePrettyExpression(e: Expression): Expression
  37. object ArrayBasedMapData extends Serializable
  38. object ArrayData extends Serializable
  39. object CharVarcharUtils extends Logging with SparkCharVarcharUtils
  40. object CompressionCodecs
  41. object DateTimeUtils extends SparkDateTimeUtils

Helper functions for converting between internal and external date and time representations. Dates are exposed externally as java.sql.Date and are represented internally as the number of days since the Unix epoch (1970-01-01). Timestamps are exposed externally as java.sql.Timestamp and are stored internally as longs, which are capable of storing timestamps with microsecond precision.
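The days-since-epoch encoding described above is the same one java.time exposes through toEpochDay, which makes the round trip easy to demonstrate without touching Spark internals:

```scala
import java.time.LocalDate

// Internally, a date is just a day count from 1970-01-01; java.time
// uses the identical encoding, so conversion is a single call each way.
val epochDays: Long = LocalDate.of(1970, 1, 2).toEpochDay // one day after the epoch
val back: LocalDate = LocalDate.ofEpochDay(epochDays)
```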

  42. case object DropMalformedMode extends ParseMode with Product with Serializable

This mode ignores corrupted records entirely.

  43. case object FailFastMode extends ParseMode with Product with Serializable

This mode throws an exception when it encounters corrupted records.

  44. object GeneratedColumn

This object contains utility methods and values for Generated Columns.

  45. object GeneratedColumnAnalyzer extends Analyzer

    Analyzer for processing generated column expressions using built-in functions only.

  46. object HyperLogLogPlusPlusHelper extends Serializable

Constants used in the implementation of the HyperLogLogPlusPlus aggregate function.

    See the Appendix to "HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm" (https://docs.google.com/document/d/1gyjfMHy43U9OWBXxfaeG-3MjGzejW1dlpyMwEYAAWEI/view?fullscreen) for more information.

  47. object InternalRowComparableWrapper
  48. object IntervalMathUtils

    Helper functions for interval arithmetic operations with overflow.

  49. object IntervalUtils extends SparkIntervalUtils
  50. object NumberConverter
  51. object ParseMode extends Logging
  52. case object PermissiveMode extends ParseMode with Product with Serializable

    This mode permissively parses the records.

  53. object QuantileSummaries extends Serializable
  54. object ResolveDefaultColumns extends QueryErrorsBase with ResolveDefaultColumnsUtils

    This object contains fields to help process DEFAULT columns.

  55. object RowDeltaUtils

    A utility that holds constants for handling deltas of rows.

  56. object SQLOrderingUtil
  57. object StringKeyHashMap

Builds a map with String keys, supporting either case-sensitive or case-insensitive key lookup.
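The case-sensitivity switch can be sketched as normalizing keys on the way in and on lookup. The class below is an illustrative stand-in, not the real StringKeyHashMap API:

```scala
import scala.collection.mutable

// Normalize keys to lower case when case-insensitive; otherwise store as-is.
class KeyMap[T](caseSensitive: Boolean) {
  private val underlying = mutable.HashMap.empty[String, T]
  private def norm(key: String): String =
    if (caseSensitive) key else key.toLowerCase
  def put(key: String, value: T): Unit = underlying(norm(key)) = value
  def get(key: String): Option[T] = underlying.get(norm(key))
}

val insensitive = new KeyMap[Int](caseSensitive = false)
insensitive.put("Foo", 1)
```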

  58. object StringUtils extends Logging
  59. object ToNumberParser extends Serializable
  60. object TypeUtils extends QueryErrorsBase

    Functions to help with checking for valid data types and value comparison of various types.

  61. object UTF8StringUtils

    Helper functions for casting string to numeric values.

  62. object UnsafeRowUtils
