@Experimental
public abstract class SqlTransform
extends org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PInput,org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.Row>>
SqlTransform is the DSL interface of Beam SQL. It translates a SQL query into a PTransform, so developers can use standard SQL queries in a Beam pipeline.
A typical pipeline with Beam SQL DSL is:
PipelineOptions options = PipelineOptionsFactory.create();
Pipeline p = Pipeline.create(options);
//create table from TextIO;
PCollection<Row> inputTableA = p.apply(TextIO.read().from("/my/input/patha")).apply(...);
PCollection<Row> inputTableB = p.apply(TextIO.read().from("/my/input/pathb")).apply(...);
//run a simple query, and register the output as a table in BeamSql;
String sql1 = "select MY_FUNC(c1), c2 from PCOLLECTION";
PCollection<Row> outputTableA = inputTableA.apply(
    SqlTransform
        .query(sql1)
        .registerUdf("MY_FUNC", MY_FUNC.class));
//run a JOIN with one table from TextIO, and one table from another query
PCollection<Row> outputTableB =
    PCollectionTuple
        .of(new TupleTag<>("TABLE_O_A"), outputTableA)
        .and(new TupleTag<>("TABLE_B"), inputTableB)
        .apply(SqlTransform.query("select * from TABLE_O_A JOIN TABLE_B where ..."));
//output the final result with TextIO
outputTableB.apply(...).apply(TextIO.write().to("/my/output/path"));
p.run().waitUntilFinish();
| Constructor and Description |
|---|
| SqlTransform() |
| Modifier and Type | Method and Description |
|---|---|
| org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.Row> | expand(org.apache.beam.sdk.values.PInput input) |
| static SqlTransform | query(java.lang.String queryString): Returns a SqlTransform representing an equivalent execution plan. |
| SqlTransform | registerUdaf(java.lang.String functionName, org.apache.beam.sdk.transforms.Combine.CombineFn combineFn): Registers a Combine.CombineFn as a UDAF function used in this query. |
| SqlTransform | registerUdf(java.lang.String functionName, java.lang.Class<? extends BeamSqlUdf> clazz): Registers a UDF function used in this query. |
| SqlTransform | registerUdf(java.lang.String functionName, org.apache.beam.sdk.transforms.SerializableFunction sfn): Registers a SerializableFunction as a UDF function used in this query. |
| SqlTransform | withAutoUdfUdafLoad(boolean autoUdfUdafLoad) |
| SqlTransform | withDefaultTableProvider(java.lang.String name, TableProvider tableProvider) |
| SqlTransform | withTableProvider(java.lang.String name, TableProvider tableProvider) |
public org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.Row> expand(org.apache.beam.sdk.values.PInput input)
Specified by: expand in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PInput,org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.Row>>
public static SqlTransform query(java.lang.String queryString)
Returns a SqlTransform representing an equivalent execution plan.
The SqlTransform can be applied to a PCollection or PCollectionTuple
representing all the input tables.
The PTransform outputs a PCollection of Row.
If the PTransform is applied to a PCollection, the PCollection is registered as a table
named PCOLLECTION.
If the PTransform is applied to a PCollectionTuple, TupleTag.getId() is used as the name of the corresponding PCollection.
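If the SQL query uses only a subset of the tables from the upstream PCollectionTuple, this is valid;
If the SQL query references a table not included in the upstream PCollectionTuple, an IllegalStateException is thrown during query validation;
Tables from the upstream PCollectionTuple are only valid in the scope of the current query call.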
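The table-naming rules for query(java.lang.String) can be illustrated with a small plain-Java model. This is only a sketch of the rules as stated here, not Beam's implementation: the class TableNamingSketch and its methods are hypothetical, and the list of referenced tables is passed in by hand instead of being parsed from a SQL string.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the table-naming rules; hypothetical, not part of Beam. */
final class TableNamingSketch {

  /** A single PCollection input is registered under the fixed name PCOLLECTION. */
  public static Set<String> namesForSingleInput() {
    return new LinkedHashSet<>(Arrays.asList("PCOLLECTION"));
  }

  /** A PCollectionTuple input registers one table per TupleTag id. */
  public static Set<String> namesForTuple(List<String> tupleTagIds) {
    return new LinkedHashSet<>(tupleTagIds);
  }

  /** Referencing a subset of the tables is valid; an unknown table is rejected. */
  public static void validate(Set<String> registered, List<String> referenced) {
    for (String table : referenced) {
      if (!registered.contains(table)) {
        throw new IllegalStateException("Table not found in input: " + table);
      }
    }
  }

  public static void main(String[] args) {
    Set<String> tables = namesForTuple(Arrays.asList("TABLE_O_A", "TABLE_B"));
    validate(tables, Arrays.asList("TABLE_B")); // subset of the inputs: accepted
    try {
      validate(tables, Arrays.asList("TABLE_C")); // not among the inputs: rejected
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Because the registered names are derived from a single input each time, the model also reflects the last rule: the names exist only for the query call they were built for.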
public SqlTransform withTableProvider(java.lang.String name, TableProvider tableProvider)
public SqlTransform withDefaultTableProvider(java.lang.String name, TableProvider tableProvider)
public SqlTransform withAutoUdfUdafLoad(boolean autoUdfUdafLoad)
public SqlTransform registerUdf(java.lang.String functionName, java.lang.Class<? extends BeamSqlUdf> clazz)
Registers a UDF function used in this query. Refer to BeamSqlUdf for more about how to implement a UDF in Beam SQL.
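A class-based UDF might look like the following minimal sketch. It assumes the Beam Java SDK and its SQL extension are on the classpath; the names UdfClassSketch, CubicInteger and CUBIC are hypothetical, and the column c1 is borrowed from the pipeline example at the top of this page.

```java
import org.apache.beam.sdk.extensions.sql.BeamSqlUdf;
import org.apache.beam.sdk.extensions.sql.SqlTransform;

public class UdfClassSketch {

  /** Hypothetical UDF that cubes an integer column value. */
  public static class CubicInteger implements BeamSqlUdf {
    // A BeamSqlUdf exposes its logic through a public static method named "eval".
    public static Integer eval(Integer input) {
      return input == null ? null : input * input * input;
    }
  }

  /** Builds a query that calls the UDF under the SQL name CUBIC. */
  public static SqlTransform cubicQuery() {
    return SqlTransform
        .query("select CUBIC(c1) from PCOLLECTION")
        .registerUdf("CUBIC", CubicInteger.class);
  }
}
```

The returned SqlTransform would then be applied to a PCollection<Row> in the same way as in the pipeline example above.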
public SqlTransform registerUdf(java.lang.String functionName, org.apache.beam.sdk.transforms.SerializableFunction sfn)
Registers a SerializableFunction as a UDF function used in this query. Note: the SerializableFunction must have a constructor without arguments.
public SqlTransform registerUdaf(java.lang.String functionName, org.apache.beam.sdk.transforms.Combine.CombineFn combineFn)
Registers a Combine.CombineFn as a UDAF function used in this query.
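The SerializableFunction form of registerUdf and registerUdaf could be used together roughly as sketched below. This assumes the Beam Java SDK and its SQL extension are on the classpath; FunctionRegistrationSketch, CubicIntegerFn, SquareSum, CUBIC and SQUARESUM are hypothetical names, and the columns c1 and c2 are borrowed from the pipeline example at the top of this page. The function is written as a named class, not a lambda, so that it has the required constructor without arguments.

```java
import org.apache.beam.sdk.extensions.sql.SqlTransform;
import org.apache.beam.sdk.transforms.Combine;
import org.apache.beam.sdk.transforms.SerializableFunction;

public class FunctionRegistrationSketch {

  /** Hypothetical UDF: a named class with an implicit no-argument constructor. */
  public static class CubicIntegerFn implements SerializableFunction<Integer, Integer> {
    @Override
    public Integer apply(Integer input) {
      return input * input * input;
    }
  }

  /** Hypothetical UDAF that sums the squares of its inputs. */
  public static class SquareSum extends Combine.CombineFn<Integer, Integer, Integer> {
    @Override
    public Integer createAccumulator() {
      return 0;
    }

    @Override
    public Integer addInput(Integer accumulator, Integer input) {
      return accumulator + input * input;
    }

    @Override
    public Integer mergeAccumulators(Iterable<Integer> accumulators) {
      int sum = 0;
      for (Integer partial : accumulators) {
        sum += partial;
      }
      return sum;
    }

    @Override
    public Integer extractOutput(Integer accumulator) {
      return accumulator;
    }
  }

  /** Registers both functions on one query. */
  public static SqlTransform aggregateQuery() {
    return SqlTransform
        .query("select c2, SQUARESUM(CUBIC(c1)) from PCOLLECTION group by c2")
        .registerUdf("CUBIC", new CubicIntegerFn())
        .registerUdaf("SQUARESUM", new SquareSum());
  }
}
```

Each registration call returns a SqlTransform, so the calls can be chained before the transform is applied to its input.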