public final class Table extends Object implements AutoCloseable
| Modifier and Type | Class and Description |
|---|---|
| `static class` | `Table.AggregateOperation` Class representing aggregate operations. |
| `static class` | `Table.OrderByArg` |
| `static class` | `Table.TableOperation` |
| `static class` | `Table.TestBuilder` Create a table on the GPU with data from the CPU. |
| Constructor and Description |
|---|
| `Table(ColumnVector... columns)` The Table class makes a copy of the array of ColumnVectors passed to it. |
| Modifier and Type | Method and Description |
|---|---|
| `static Table.OrderByArg` | `asc(int index)` |
| `void` | `close()` |
| `static Table` | `concatenate(Table... tables)` Concatenate multiple tables together to form a single table. |
| `static Aggregate` | `count(int index)` |
| `static Table.OrderByArg` | `desc(int index)` |
| `Table` | `filter(ColumnVector mask)` Filters this table using a column of boolean values as a mask, returning a new table. |
| `ColumnVector` | `getColumn(int index)` Return the ColumnVector at the specified index. |
| `int` | `getNumberOfColumns()` |
| `long` | `getRowCount()` |
| `Table.AggregateOperation` | `groupBy(GroupByOptions groupByOptions, int... indices)` |
| `Table.AggregateOperation` | `groupBy(int... indices)` |
| `static Aggregate` | `max(int index)` |
| `static Aggregate` | `mean(int index)` |
| `static Aggregate` | `min(int index)` |
| `Table.TableOperation` | `onColumns(int... indices)` |
| `Table` | `orderBy(boolean areNullsSmallest, Table.OrderByArg... args)` Orders the table using the sort keys, returning a newly allocated table. |
| `static Table` | `readCSV(Schema schema, byte[] buffer)` Read CSV formatted data using the default CSVOptions. |
| `static Table` | `readCSV(Schema schema, CSVOptions opts, byte[] buffer)` Read CSV formatted data. |
| `static Table` | `readCSV(Schema schema, CSVOptions opts, byte[] buffer, long offset, long len)` Read CSV formatted data. |
| `static Table` | `readCSV(Schema schema, CSVOptions opts, File path)` Read a CSV file. |
| `static Table` | `readCSV(Schema schema, CSVOptions opts, HostMemoryBuffer buffer, long offset, long len)` Read CSV formatted data. |
| `static Table` | `readCSV(Schema schema, File path)` Read a CSV file using the default CSVOptions. |
| `static Table` | `readORC(byte[] buffer)` Read ORC formatted data. |
| `static Table` | `readORC(File path)` Read an ORC file using the default ORCOptions. |
| `static Table` | `readORC(ORCOptions opts, byte[] buffer)` Read ORC formatted data. |
| `static Table` | `readORC(ORCOptions opts, byte[] buffer, long offset, long len)` Read ORC formatted data. |
| `static Table` | `readORC(ORCOptions opts, File path)` Read an ORC file. |
| `static Table` | `readORC(ORCOptions opts, HostMemoryBuffer buffer, long offset, long len)` Read ORC formatted data. |
| `static Table` | `readParquet(byte[] buffer)` Read Parquet formatted data. |
| `static Table` | `readParquet(File path)` Read a Parquet file using the default ParquetOptions. |
| `static Table` | `readParquet(ParquetOptions opts, byte[] buffer)` Read Parquet formatted data. |
| `static Table` | `readParquet(ParquetOptions opts, byte[] buffer, long offset, long len)` Read Parquet formatted data. |
| `static Table` | `readParquet(ParquetOptions opts, File path)` Read a Parquet file. |
| `static Table` | `readParquet(ParquetOptions opts, HostMemoryBuffer buffer, long offset, long len)` Read Parquet formatted data. |
| `static Aggregate` | `sum(int index)` |
| `String` | `toString()` |
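Because `Table` implements `AutoCloseable` and reference-counts its columns, try-with-resources is the natural usage pattern. A minimal lifecycle sketch follows; it assumes `ColumnVector.fromInts` as a host-to-device factory, and requires a CUDA-capable GPU plus the cudf native library at runtime:

```java
import ai.rapids.cudf.ColumnVector;
import ai.rapids.cudf.Table;

public class TableLifecycle {
  public static void main(String[] args) {
    // The Table constructor copies the array of ColumnVectors and takes its
    // own reference on each column, so the caller still closes the originals.
    try (ColumnVector first = ColumnVector.fromInts(1, 2, 3);
         ColumnVector second = ColumnVector.fromInts(4, 5, 6);
         Table table = new Table(first, second)) {
      assert table.getNumberOfColumns() == 2;
      assert table.getRowCount() == 3;
    } // close() decrements the refcounts; GPU memory is freed at zero
  }
}
```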
public Table(ColumnVector... columns)
The Table class makes a copy of the array of ColumnVectors passed to it. The class will decrease the refcount on itself and all of its contents when closed, and free resources once the refcount reaches zero.
Parameters:
columns - Array of ColumnVectors

public ColumnVector getColumn(int index)
Return the ColumnVector at the specified index. If you want to keep a reference to the column past the lifetime of the table, you will need to increment the reference count on the column yourself.

public final long getRowCount()
public final int getNumberOfColumns()

public void close()
Specified by:
close in interface AutoCloseable

public static Table readCSV(Schema schema, File path)
Read a CSV file using the default CSVOptions.
Parameters:
schema - the schema of the file. You may use Schema.INFERRED to infer the schema.
path - the local file to read.

public static Table readCSV(Schema schema, CSVOptions opts, File path)
Read a CSV file.
Parameters:
schema - the schema of the file. You may use Schema.INFERRED to infer the schema.
opts - various CSV parsing options.
path - the local file to read.

public static Table readCSV(Schema schema, byte[] buffer)
Read CSV formatted data using the default CSVOptions.
Parameters:
schema - the schema of the data. You may use Schema.INFERRED to infer the schema.
buffer - raw UTF8 formatted bytes.

public static Table readCSV(Schema schema, CSVOptions opts, byte[] buffer)
Read CSV formatted data.
Parameters:
schema - the schema of the data. You may use Schema.INFERRED to infer the schema.
opts - various CSV parsing options.
buffer - raw UTF8 formatted bytes.

public static Table readCSV(Schema schema, CSVOptions opts, byte[] buffer, long offset, long len)
Read CSV formatted data.
Parameters:
schema - the schema of the data. You may use Schema.INFERRED to infer the schema.
opts - various CSV parsing options.
buffer - raw UTF8 formatted bytes.
offset - the starting offset into buffer.
len - the number of bytes to parse.

public static Table readCSV(Schema schema, CSVOptions opts, HostMemoryBuffer buffer, long offset, long len)
Read CSV formatted data.
Parameters:
schema - the schema of the data. You may use Schema.INFERRED to infer the schema.
opts - various CSV parsing options.
buffer - raw UTF8 formatted bytes.
offset - the starting offset into buffer.
len - the number of bytes to parse.

public static Table readParquet(File path)
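The simplest CSV entry point pairs Schema.INFERRED with a local file. A short sketch, assuming a file named `data.csv` exists (the file name is illustrative, not from the docs):

```java
import java.io.File;
import ai.rapids.cudf.Schema;
import ai.rapids.cudf.Table;

public class CsvRead {
  public static void main(String[] args) {
    // Infer column names and types from the file itself,
    // using the default CSVOptions.
    try (Table t = Table.readCSV(Schema.INFERRED, new File("data.csv"))) {
      System.out.println(t.getRowCount() + " rows, "
          + t.getNumberOfColumns() + " columns");
    }
  }
}
```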
Read a Parquet file using the default ParquetOptions.
Parameters:
path - the local file to read.

public static Table readParquet(ParquetOptions opts, File path)
Read a Parquet file.
Parameters:
opts - various Parquet parsing options.
path - the local file to read.

public static Table readParquet(byte[] buffer)
Read Parquet formatted data.
Parameters:
buffer - raw Parquet formatted bytes.

public static Table readParquet(ParquetOptions opts, byte[] buffer)
Read Parquet formatted data.
Parameters:
opts - various Parquet parsing options.
buffer - raw Parquet formatted bytes.

public static Table readParquet(ParquetOptions opts, byte[] buffer, long offset, long len)
Read Parquet formatted data.
Parameters:
opts - various Parquet parsing options.
buffer - raw Parquet formatted bytes.
offset - the starting offset into buffer.
len - the number of bytes to parse.

public static Table readParquet(ParquetOptions opts, HostMemoryBuffer buffer, long offset, long len)
Read Parquet formatted data.
Parameters:
opts - various Parquet parsing options.
buffer - raw Parquet formatted bytes.
offset - the starting offset into buffer.
len - the number of bytes to parse.

public static Table readORC(File path)
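The offset/len overloads let you parse a slice of an in-memory buffer. A sketch of the byte[] variant; the file name is illustrative, and `ParquetOptions.DEFAULT` is assumed here as the stock options instance (check the ParquetOptions docs for the exact constant):

```java
import java.io.File;
import java.nio.file.Files;
import ai.rapids.cudf.ParquetOptions;
import ai.rapids.cudf.Table;

public class ParquetRead {
  public static void main(String[] args) throws Exception {
    byte[] bytes = Files.readAllBytes(new File("data.parquet").toPath());
    // Parse the whole buffer: offset 0, len = bytes.length.
    try (Table t = Table.readParquet(ParquetOptions.DEFAULT,
                                     bytes, 0, bytes.length)) {
      System.out.println(t);
    }
  }
}
```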
Read an ORC file using the default ORCOptions.
Parameters:
path - the local file to read.

public static Table readORC(ORCOptions opts, File path)
Read an ORC file.
Parameters:
opts - various ORC parsing options.
path - the local file to read.

public static Table readORC(byte[] buffer)
Read ORC formatted data.
Parameters:
buffer - raw ORC formatted bytes.

public static Table readORC(ORCOptions opts, byte[] buffer)
Read ORC formatted data.
Parameters:
opts - various ORC parsing options.
buffer - raw ORC formatted bytes.

public static Table readORC(ORCOptions opts, byte[] buffer, long offset, long len)
Read ORC formatted data.
Parameters:
opts - various ORC parsing options.
buffer - raw ORC formatted bytes.
offset - the starting offset into buffer.
len - the number of bytes to parse.

public static Table readORC(ORCOptions opts, HostMemoryBuffer buffer, long offset, long len)
Read ORC formatted data.
Parameters:
opts - various ORC parsing options.
buffer - raw ORC formatted bytes.
offset - the starting offset into buffer.
len - the number of bytes to parse.

public static Table concatenate(Table... tables)
Concatenate multiple tables together to form a single table.
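A minimal concatenate sketch (`ColumnVector.fromInts` assumed as the host-to-device factory). Note the result is a new table that must be closed independently of its inputs:

```java
import ai.rapids.cudf.ColumnVector;
import ai.rapids.cudf.Table;

public class Concat {
  public static void main(String[] args) {
    // Input tables must have matching schemas.
    try (ColumnVector a = ColumnVector.fromInts(1, 2);
         ColumnVector b = ColumnVector.fromInts(3, 4);
         Table first = new Table(a);
         Table second = new Table(b);
         Table combined = Table.concatenate(first, second)) {
      assert combined.getRowCount() == 4;
    }
  }
}
```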
public Table orderBy(boolean areNullsSmallest, Table.OrderByArg... args)
Orders the table using the sort keys, returning a newly allocated table.
Example usage: orderBy(true, Table.asc(0), Table.desc(3)...);
Parameters:
areNullsSmallest - whether nulls are to be considered smaller than non-null values.
args - suppliers to initialize the sort keys.

public static Table.OrderByArg asc(int index)
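A sketch of the orderBy usage shown above, with a nullable key column (`ColumnVector.fromBoxedInts` is assumed here as the nullable int factory):

```java
import ai.rapids.cudf.ColumnVector;
import ai.rapids.cudf.Table;

public class Sort {
  public static void main(String[] args) {
    try (ColumnVector keys = ColumnVector.fromBoxedInts(3, null, 1);
         Table t = new Table(keys);
         // areNullsSmallest = true: the null sorts first;
         // column 0 ascending via Table.asc(0).
         Table sorted = t.orderBy(true, Table.asc(0))) {
      System.out.println(sorted);
    }
  }
}
```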
public static Table.OrderByArg desc(int index)
public static Aggregate count(int index)
public static Aggregate max(int index)
public static Aggregate min(int index)
public static Aggregate sum(int index)
public static Aggregate mean(int index)
public Table.AggregateOperation groupBy(GroupByOptions groupByOptions, int... indices)
public Table.AggregateOperation groupBy(int... indices)
public Table.TableOperation onColumns(int... indices)
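The static Aggregate helpers (count, sum, min, max, mean) are meant to be fed into the Table.AggregateOperation returned by groupBy. The sketch below assumes that Table.AggregateOperation exposes an aggregate(...) method accepting those helpers and returning a Table; consult the Table.AggregateOperation docs for the exact entry point:

```java
import ai.rapids.cudf.ColumnVector;
import ai.rapids.cudf.Table;

public class GroupBy {
  public static void main(String[] args) {
    try (ColumnVector keys = ColumnVector.fromInts(1, 1, 2);
         ColumnVector vals = ColumnVector.fromInts(10, 20, 30);
         Table t = new Table(keys, vals);
         // Group on column 0 and sum column 1 within each group.
         Table sums = t.groupBy(0).aggregate(Table.sum(1))) {
      System.out.println(sums);
    }
  }
}
```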
public Table filter(ColumnVector mask)
Given a mask column, each element `i` from the input columns is copied to the output columns if the corresponding element `i` in the mask is non-null and `true`. This operation is stable: the input order is preserved.
This table and the mask column must have the same number of rows.
The output table has as many rows as there are elements in the mask that are both non-null and `true`.
If the original table row count is zero, there is no error, and an empty table is returned.
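A filter sketch covering the null-handling rule above (`ColumnVector.fromBoxedBooleans` is assumed here as the nullable BOOL8 factory):

```java
import ai.rapids.cudf.ColumnVector;
import ai.rapids.cudf.Table;

public class Filter {
  public static void main(String[] args) {
    try (ColumnVector vals = ColumnVector.fromInts(10, 20, 30);
         Table t = new Table(vals);
         // Rows are kept only where the mask is non-null AND true,
         // so the null at index 1 and the false at index 2 are dropped.
         ColumnVector mask = ColumnVector.fromBoxedBooleans(true, null, false);
         Table kept = t.filter(mask)) {
      assert kept.getRowCount() == 1;
    }
  }
}
```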
Parameters:
mask - column of type DType.BOOL8 used as a mask to filter the input columns

Copyright © 2019. All rights reserved.