public class ExternalTableDefn extends TableDefn
The pattern is:
external table spec + parameters --> external table
Since an input source is a parameterized (partial) external table, we can reuse the table metadata structures and APIs, avoiding the need to have a separate (but otherwise identical) structure for external tables. An external table can be thought of as a "connection", though Druid does not use that term. When used as a connection, the external table spec omits the format. The format is instead provided at ingest time, along with the list of tables (or objects).
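The "spec + parameters" pattern can be sketched with plain maps. This is a minimal illustration, not Druid's implementation: the class `PartialSpecSketch`, the method `complete`, the sample JSON strings, and the rule that a parameter may only fill in a missing property are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names are part of Druid's API.
final class PartialSpecSketch {

  // Combine the properties stored in a (partial) spec with the parameters
  // supplied at ingest time. Assumption for this sketch: a parameter may
  // only fill in a missing property, never replace a stored one.
  static Map<String, Object> complete(Map<String, Object> spec, Map<String, Object> params) {
    Map<String, Object> table = new LinkedHashMap<>(spec);
    for (Map.Entry<String, Object> param : params.entrySet()) {
      if (table.containsKey(param.getKey())) {
        throw new IllegalArgumentException("Already defined in the spec: " + param.getKey());
      }
      table.put(param.getKey(), param.getValue());
    }
    return table;
  }

  public static void main(String[] args) {
    // A "connection": the source is stored, the format is omitted.
    Map<String, Object> spec = new LinkedHashMap<>();
    spec.put("source", "{\"type\":\"http\"}");

    // The format arrives at ingest time.
    Map<String, Object> params = new LinkedHashMap<>();
    params.put("format", "{\"type\":\"csv\"}");

    System.out.println(complete(spec, params).keySet());
  }
}
```

The stored spec is never modified, so the same partial spec can be completed many times with different parameters.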
To keep all this straight, we adopt the following terms:
FROM clause in an MSQ query.
TABLE function. If the partial spec includes a format, then it is essentially a partial table. If it omits the format, then it is essentially a connection.
EXTERN function or one of the input-source-specific functions. In this case, there is no catalog entry: all information comes from the SQL table function.
InputFormatDefn, and has parameters for all supported formats: the user must specify all input source and format properties.
properties field will contain the source property, which has the JSON-serialized form of the input source (minus items to be parameterized).
Similarly, if the external table also defines a format (rather than requiring the
format at ingest time), then the format property holds the JSON-serialized
form of the input format, minus columns. The columns can be provided in the spec,
in the columns field. The InputFormatDefn converts the columns to
the form needed by the input format.
Druid's input sources all require formats. However, some sources may not actually
need the format. A JDBC input source, for example, needs no format. In other cases,
there may be a subset of formats. Each InputSourceDefn is responsible for
working out which formats (if any) are required. This class is agnostic about whether
the format is supplied. (Remember that, when used as a connection, the external table
will provide no format until ingest time.)
By contrast, the input source is always required.
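The rule above (source always required, format optional) can be expressed as a small check. This is a hedged sketch only: `SourceFormatCheckSketch` and `validateAndCheckFormat` are hypothetical names, and the property keys `source` and `format` follow the property names used in the text rather than any verified constant values.

```java
import java.util.Map;

// Hypothetical sketch: not Druid's validation code.
final class SourceFormatCheckSketch {
  // Assumed keys, taken from the property names used in the text.
  static final String SOURCE = "source";
  static final String FORMAT = "format";

  // Rejects a spec without an input source. Returns true when the spec also
  // carries a format, and false when the format is deferred to ingest time
  // (the "connection" case).
  static boolean validateAndCheckFormat(Map<String, Object> properties) {
    if (properties.get(SOURCE) == null) {
      throw new IllegalArgumentException(
          "An external table spec must define the " + SOURCE + " property");
    }
    return properties.get(FORMAT) != null;
  }
}
```

Whether a missing format is acceptable for a given source is left to the source definition, as the text describes. This check only records whether one was supplied.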
ExternalTableSpec which holds the input source, input format
and row signature in the form required by SQL.
This class handles table specifications in three forms:
ExternalTableSpec by the convert(ResolvedTable) function.
InputSourceDefn.adHocTableFn() method provides the function definition which handles the conversion.
tableFn(ResolvedTable) method creates the required function by caching the table spec. That function then combines the parameters to produce the required ExternalTableSpec.
To handle these formats, and the need to adjust JSON, conversion to an
ExternalTableSpec occurs in multiple steps:
InputSource or InputFormat objects using a Jackson conversion.
InputSourceDefn and InputFormatDefn classes, either directly (for a fully-defined table function) or starting here (for other use cases).
format property. The format type name is typically the same as the JSON type name, but need not be.
InputSourceDefn or InputFormatDefn which are put into the TableDefnRegistry and thus available to this class. The result is that this class is ignorant of the actual details of sources and formats: it instead delegates to the input source and input format definitions for that work.
Input sources and input formats defined in an extension are considered "ephemeral": they can go away if the corresponding extension is removed from the system. In that case, any table functions defined by those extensions are no longer available, and any SQL statements that use those functions will no longer work. The catalog may contain an external table spec that references those definitions. Such specs will continue to reside in the catalog, and can be retrieved, but they will fail any query that attempts to reference them.
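The delegation and the "ephemeral" behavior described above can be illustrated together. The following is a simplified sketch under stated assumptions: `DelegationSketch`, `SourceDefn`, and `Registry` are hypothetical stand-ins (not the real `InputSourceDefn` or `TableDefnRegistry`), and looking up the definition by a `type` key is an assumption about how the serialized source identifies itself.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: stand-ins for the real definition and registry classes.
final class DelegationSketch {

  // Stand-in for an input source definition contributed by core or an extension.
  interface SourceDefn {
    String typeName();
    String convert(Map<String, Object> sourceProperties);
  }

  // Stand-in for the registry that holds all known definitions.
  static final class Registry {
    private final Map<String, SourceDefn> defns = new HashMap<>();

    void register(SourceDefn defn) {
      defns.put(defn.typeName(), defn);
    }

    SourceDefn lookup(String typeName) {
      return defns.get(typeName);
    }
  }

  // The table class knows nothing about individual sources: it finds the
  // definition by type name and delegates. If the definition is gone (for
  // example, its extension was removed), the stored spec is still readable
  // but any attempt to use it fails here.
  static String convert(Registry registry, Map<String, Object> sourceProperties) {
    Object type = sourceProperties.get("type"); // assumed key
    SourceDefn defn = type == null ? null : registry.lookup(type.toString());
    if (defn == null) {
      throw new IllegalStateException("No input source definition registered for type: " + type);
    }
    return defn.convert(sourceProperties);
  }
}
```

Failing at use rather than at load keeps the catalog entry retrievable, which matches the behavior the paragraph describes for specs that reference a removed extension.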
| Modifier and Type | Field and Description |
|---|---|
| static String | FORMAT_PROPERTY: Property which holds the optional input format specification, serialized as JSON. |
| static com.fasterxml.jackson.core.type.TypeReference<Map<String,Object>> | MAP_TYPE_REF: Type reference used to deserialize JSON to a generic map. |
| static String | SOURCE_PROPERTY: Property which holds the input source specification, serialized as JSON. |
| static String | TABLE_TYPE: Identifier for external tables. |
DESCRIPTION_PROPERTY

| Constructor and Description |
|---|
| ExternalTableDefn() |
| Modifier and Type | Method and Description |
|---|---|
| void | bind(TableDefnRegistry registry): Called after the table definition is added to the registry, along with all other definitions. |
| ExternalTableSpec | convert(ResolvedTable table): Return the ExternalTableSpec for a catalog entry for a fully-defined table. |
| static boolean | isExternalTable(ResolvedTable table) |
| TableFunction | tableFn(ResolvedTable table): Return a table function definition for a partial table as given by the catalog table spec. |
| void | validate(ResolvedTable table): Validate a table spec using the table, field and column definitions defined here. |
| protected void | validateColumn(ColumnSpec colSpec): Table-specific validation of a column spec. |
merge, mergeColumns, validateColumns
mergeProperties, name, properties, property, toPropertyMap, typeValue, validate
public static final String TABLE_TYPE
public static final String SOURCE_PROPERTY
public static final String FORMAT_PROPERTY
public void bind(TableDefnRegistry registry)
Description copied from class: TableDefn
public void validate(ResolvedTable table)
Description copied from class: TableDefn
public TableFunction tableFn(ResolvedTable table)
ExternalTableSpec.
protected void validateColumn(ColumnSpec colSpec)
Description copied from class: TableDefn
Overrides: validateColumn in class TableDefn
public ExternalTableSpec convert(ResolvedTable table)
Return the ExternalTableSpec for a catalog entry for a fully-defined table. This form exists for completeness, since ingestion never reads the same data twice. This form is handy for tests, and will become generally useful when MSQ fully supports queries and those queries can read from external tables.
public static boolean isExternalTable(ResolvedTable table)
Copyright © 2011–2023 The Apache Software Foundation. All rights reserved.