public class SparkBigQueryUtil extends Object
| Constructor and Description |
|---|
| SparkBigQueryUtil() |
| Modifier and Type | Method and Description |
|---|---|
| static org.apache.hadoop.fs.Path | createGcsPath(SparkBigQueryConfig config, org.apache.hadoop.conf.Configuration conf, String applicationId) - Checks whether the temporaryGcsBucket or persistentGcsBucket parameter is present in the config and creates an org.apache.hadoop.fs.Path object backed by GCS. |
| static SparkBigQueryConfig | createSparkBigQueryConfig(org.apache.spark.sql.SQLContext sqlContext, scala.collection.immutable.Map<String,String> options, scala.Option<org.apache.spark.sql.types.StructType> schema, DataSourceVersion dataSourceVersion) |
| static @NotNull com.google.common.collect.ImmutableMap<String,String> | extractJobLabels(org.apache.spark.SparkConf sparkConf) |
| static com.google.common.collect.ImmutableList<org.apache.spark.sql.sources.Filter> | extractPartitionAndClusteringFilters(com.google.cloud.bigquery.TableInfo table, com.google.common.collect.ImmutableList<org.apache.spark.sql.sources.Filter> filters) |
| static String | getJobId(org.apache.spark.sql.internal.SQLConf sqlConf) |
| static String | getTableNameFromOptions(Map<String,String> options) |
| static Stream<TypeConverter> | getTypeConverterStream() |
| static boolean | isDataFrameShowMethodInStackTrace() |
| static boolean | isJson(org.apache.spark.sql.types.Metadata metadata) |
| static List<String> | optimizeLoadUriListForSpark(List<String> uris) - Optimizes the URI list for BigQuery load, using the Spark-specific file prefix and suffix patterns, based on BigQueryUtil.optimizeLoadUriList(). |
| static com.google.cloud.bigquery.TableId | parseSimpleTableId(org.apache.spark.sql.SparkSession spark, Map<String,String> options) |
| static com.google.cloud.bigquery.JobInfo.WriteDisposition | saveModeToWriteDisposition(org.apache.spark.sql.SaveMode saveMode) |
| static <K,V> com.google.common.collect.ImmutableMap<K,V> | scalaMapToJavaMap(scala.collection.immutable.Map<K,V> map) |
| static int | sparkDateToBigQuery(Object sparkValue) |
| static long | sparkTimestampToBigQuery(Object sparkValue) |
public static List<String> optimizeLoadUriListForSpark(List<String> uris)

Optimizes the URI list for BigQuery load, using the Spark-specific file prefix and suffix patterns, based on BigQueryUtil.optimizeLoadUriList().

Parameters:
uris - A list of URIs to be loaded by BigQuery load

public static org.apache.hadoop.fs.Path createGcsPath(SparkBigQueryConfig config, org.apache.hadoop.conf.Configuration conf, String applicationId)

Checks whether the temporaryGcsBucket or persistentGcsBucket parameter is present in the config and creates an org.apache.hadoop.fs.Path object backed by GCS.

Parameters:
config - the SparkBigQueryConfig
conf - Hadoop configuration parameters
applicationId - a unique identifier for the Spark application

public static String getJobId(org.apache.spark.sql.internal.SQLConf sqlConf)
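To illustrate the kind of rewrite optimizeLoadUriListForSpark performs (collapsing the many part files Spark writes into a directory into a single wildcard URI, following BigQueryUtil.optimizeLoadUriList()), here is a simplified, self-contained sketch. The grouping and the "part-" pattern below are assumptions for illustration; the real method uses the connector's Spark-specific prefix and suffix patterns:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UriCollapseSketch {

    // Simplified stand-in for the connector's URI optimization: if every file
    // in a directory matches Spark's "part-*" naming, replace the individual
    // URIs with one wildcard URI for that directory.
    public static List<String> collapse(List<String> uris) {
        Map<String, List<String>> byDir = new LinkedHashMap<>();
        for (String uri : uris) {
            int slash = uri.lastIndexOf('/');
            byDir.computeIfAbsent(uri.substring(0, slash), d -> new ArrayList<>())
                 .add(uri.substring(slash + 1));
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : byDir.entrySet()) {
            if (e.getValue().stream().allMatch(f -> f.startsWith("part-"))) {
                result.add(e.getKey() + "/part-*");
            } else {
                for (String f : e.getValue()) {
                    result.add(e.getKey() + "/" + f);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> uris = List.of(
            "gs://bucket/tmp/part-00000.parquet",
            "gs://bucket/tmp/part-00001.parquet");
        System.out.println(collapse(uris)); // [gs://bucket/tmp/part-*]
    }
}
```

Fewer load URIs matter because a BigQuery load job accepts a bounded number of source URIs, and a wildcard counts as one.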
public static com.google.cloud.bigquery.JobInfo.WriteDisposition saveModeToWriteDisposition(org.apache.spark.sql.SaveMode saveMode)
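A plausible sketch of the SaveMode-to-WriteDisposition translation, using local enums standing in for org.apache.spark.sql.SaveMode and com.google.cloud.bigquery.JobInfo.WriteDisposition. The mapping shown (and the handling of SaveMode.Ignore) is an assumption for illustration, not the connector's verified source:

```java
public class WriteDispositionSketch {
    // Local stand-ins for Spark's SaveMode and BigQuery's
    // JobInfo.WriteDisposition, so the sketch compiles on its own.
    enum SaveMode { Append, Overwrite, ErrorIfExists, Ignore }
    enum WriteDisposition { WRITE_APPEND, WRITE_TRUNCATE, WRITE_EMPTY }

    // Assumed mapping: Append appends rows, Overwrite truncates first,
    // ErrorIfExists only writes if the table is empty.
    static WriteDisposition toWriteDisposition(SaveMode mode) {
        switch (mode) {
            case Append:        return WriteDisposition.WRITE_APPEND;
            case Overwrite:     return WriteDisposition.WRITE_TRUNCATE;
            case ErrorIfExists: return WriteDisposition.WRITE_EMPTY;
            default:
                throw new UnsupportedOperationException(
                    "SaveMode " + mode + " is not supported here");
        }
    }

    public static void main(String[] args) {
        System.out.println(toWriteDisposition(SaveMode.Overwrite)); // WRITE_TRUNCATE
    }
}
```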
public static com.google.cloud.bigquery.TableId parseSimpleTableId(org.apache.spark.sql.SparkSession spark, Map<String,String> options)
public static long sparkTimestampToBigQuery(Object sparkValue)
public static int sparkDateToBigQuery(Object sparkValue)
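The return types of sparkTimestampToBigQuery (long) and sparkDateToBigQuery (int) line up with Spark's internal encodings: TIMESTAMP values as microseconds since the Unix epoch and DATE values as days since the epoch, which BigQuery's write paths use as well. The sketch below shows the arithmetic involved; treating the conversions as this epoch arithmetic is an assumption about the utility's behavior, not its exact source:

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class EpochConversionSketch {

    // DATE: days since 1970-01-01, as an int.
    static int dateToEpochDays(LocalDate date) {
        return (int) date.toEpochDay();
    }

    // TIMESTAMP: microseconds since the Unix epoch, as a long.
    static long timestampToEpochMicros(Instant instant) {
        return ChronoUnit.MICROS.between(Instant.EPOCH, instant);
    }

    public static void main(String[] args) {
        System.out.println(dateToEpochDays(LocalDate.of(1970, 1, 2)));        // 1
        System.out.println(timestampToEpochMicros(Instant.ofEpochSecond(1))); // 1000000
    }
}
```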
public static <K,V> com.google.common.collect.ImmutableMap<K,V> scalaMapToJavaMap(scala.collection.immutable.Map<K,V> map)
public static boolean isDataFrameShowMethodInStackTrace()
public static boolean isJson(org.apache.spark.sql.types.Metadata metadata)
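isJson inspects the Spark column Metadata (a string-keyed bag of values attached to a StructField) to decide whether a string column actually carries BigQuery JSON data. A minimal sketch of such a check, using a plain Map in place of org.apache.spark.sql.types.Metadata; the specific key and value ("sqlType" mapped to "JSON") are an assumption for illustration:

```java
import java.util.Map;

public class JsonMetadataSketch {
    // Stand-in for org.apache.spark.sql.types.Metadata. The assumed
    // convention: a column is JSON-typed when its metadata carries
    // "sqlType" -> "JSON".
    static boolean isJson(Map<String, String> metadata) {
        return "JSON".equals(metadata.get("sqlType"));
    }

    public static void main(String[] args) {
        System.out.println(isJson(Map.of("sqlType", "JSON"))); // true
        System.out.println(isJson(Map.of()));                  // false
    }
}
```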
public static com.google.common.collect.ImmutableList<org.apache.spark.sql.sources.Filter> extractPartitionAndClusteringFilters(com.google.cloud.bigquery.TableInfo table,
com.google.common.collect.ImmutableList<org.apache.spark.sql.sources.Filter> filters)
public static Stream<TypeConverter> getTypeConverterStream()
@NotNull public static com.google.common.collect.ImmutableMap<String,String> extractJobLabels(org.apache.spark.SparkConf sparkConf)
public static SparkBigQueryConfig createSparkBigQueryConfig(org.apache.spark.sql.SQLContext sqlContext, scala.collection.immutable.Map<String,String> options, scala.Option<org.apache.spark.sql.types.StructType> schema, DataSourceVersion dataSourceVersion)
Copyright © 2024. All rights reserved.