public class JCudfSerialization extends Object
The goal is to transfer data from a local GPU to a remote GPU as quickly and efficiently as possible using built-in Java communication channels. Compatibility between different releases of CUDF is not guaranteed, which leaves room to adapt if internal memory layouts and formats change.
This version optimizes for reduced memory transfers, and as such will try to do as few transfers as possible when putting the data back onto the GPU. This means that it will slice a single large memory buffer into smaller buffers used by the resulting ColumnVectors. The downside is that generally none of the memory can be released until all of the ColumnVectors are closed. This is assumed not to be a problem: for processing efficiency, the transferred data will likely be combined with similar batches from other processes into a single larger buffer soon after it arrives.
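
The slicing idea described above can be sketched on the host with `java.nio.ByteBuffer`. This is only an analogy under stated assumptions: the real class works with GPU memory, and `BufferSliceSketch` and `sliceInto` are hypothetical names that are not part of the cudf API. The sketch shows one backing buffer carved into views that share its storage, which is why the backing memory stays alive as long as any view is in use.

```java
import java.nio.ByteBuffer;

public class BufferSliceSketch {
    // Carve one backing buffer into consecutive views. No bytes are copied:
    // every view shares storage with the original buffer.
    public static ByteBuffer[] sliceInto(ByteBuffer whole, int[] lengths) {
        ByteBuffer[] views = new ByteBuffer[lengths.length];
        int offset = 0;
        for (int i = 0; i < lengths.length; i++) {
            ByteBuffer window = whole.duplicate(); // independent position/limit, same storage
            window.clear();                        // position = 0, limit = capacity
            window.limit(offset + lengths[i]);
            window.position(offset);
            views[i] = window.slice();             // view covering [offset, offset + length)
            offset += lengths[i];
        }
        return views;
    }

    public static void main(String[] args) {
        ByteBuffer whole = ByteBuffer.allocate(16);
        ByteBuffer[] views = sliceInto(whole, new int[] {4, 12});
        views[1].put(0, (byte) 7);         // write through the second view
        System.out.println(whole.get(4));  // prints 7: the write landed in the shared buffer
    }
}
```

Because the views alias the same storage, a single transfer fills all of them at once, at the cost of tying their lifetimes together.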
There is a known bug where the null count of a range of values is lost and replaced with 0 if it is known that there can be no nulls in the data, or 1 if there is the possibility of a null being in the data. This is not likely to cause issues if the data is processed using cudf, as the null count is only used as a flag to check whether a validity buffer is needed. Code that processes the data outside of cudf should not rely on the null count being exact.
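
The consequence of the null-count behavior described above can be illustrated with a small model. This is a sketch, not the library's code: `NullCountFlagSketch`, `reportedNullCount`, and `needsValidityBuffer` are hypothetical names introduced here, and the value of five nulls is made up for illustration.

```java
public class NullCountFlagSketch {
    // Models the documented behavior: 0 when no nulls are possible, otherwise 1,
    // regardless of how many nulls the range actually holds.
    public static long reportedNullCount(boolean mayContainNulls) {
        return mayContainNulls ? 1L : 0L;
    }

    // Safe use of the value: treat it only as a "validity buffer needed" flag.
    public static boolean needsValidityBuffer(long reportedNullCount) {
        return reportedNullCount > 0;
    }

    public static void main(String[] args) {
        long actualNulls = 5; // hypothetical true count in the serialized range
        long reported = reportedNullCount(actualNulls > 0);
        // prints "reported=1 needsValidity=true"
        System.out.println("reported=" + reported
                + " needsValidity=" + needsValidityBuffer(reported));
    }
}
```

Code that needs the exact number of nulls would have to recount them from the validity data rather than trust the reported value.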
| Constructor and Description |
|---|
| JCudfSerialization() |
| Modifier and Type | Method and Description |
|---|---|
| static long | getSerializedSizeInBytes(ColumnVector[] columns, long rowOffset, long numRows) |
| static Table | readTableFrom(InputStream in)<br>Read a serialized table from the given InputStream. |
| static void | writeToStream(ColumnVector[] columns, OutputStream out, long rowOffset, long numRows)<br>Write all or part of a set of columns out in an internal format. |
| static void | writeToStream(Table t, OutputStream out, long rowOffset, long numRows)<br>Write all or part of a table out in an internal format. |
public static long getSerializedSizeInBytes(ColumnVector[] columns, long rowOffset, long numRows)
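
Both getSerializedSizeInBytes and writeToStream select a contiguous range of rows through rowOffset and numRows. A caller that wants to send a table in several pieces therefore needs a list of such pairs. The following is a minimal sketch of that bookkeeping only, using plain Java; `RowRangeSketch`, `rowRanges`, and the `maxRows` batch size are hypothetical and not part of the cudf API, and the sketch does not call into cudf.

```java
import java.util.ArrayList;
import java.util.List;

public class RowRangeSketch {
    // Returns {rowOffset, numRows} pairs that cover rows [0, totalRows)
    // in consecutive chunks of at most maxRows rows each.
    public static List<long[]> rowRanges(long totalRows, long maxRows) {
        if (totalRows < 0 || maxRows <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and maxRows > 0");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long offset = 0; offset < totalRows; offset += maxRows) {
            long numRows = Math.min(maxRows, totalRows - offset);
            ranges.add(new long[] {offset, numRows});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Prints three lines: "0 4", "4 4", "8 2"
        for (long[] range : rowRanges(10, 4)) {
            System.out.println(range[0] + " " + range[1]);
        }
    }
}
```

Each pair could then be passed as the rowOffset and numRows arguments of one write call, with the last chunk carrying whatever rows remain.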
public static void writeToStream(Table t, OutputStream out, long rowOffset, long numRows) throws IOException
Parameters:
t - the table to be written.
out - the stream to write the serialized table out to.
rowOffset - the first row to write out.
numRows - the number of rows to write out.
Throws:
IOException
public static void writeToStream(ColumnVector[] columns, OutputStream out, long rowOffset, long numRows) throws IOException
Parameters:
columns - the columns to be written.
out - the stream to write the serialized table out to.
rowOffset - the first row to write out.
numRows - the number of rows to write out.
Throws:
IOException
public static Table readTableFrom(InputStream in) throws IOException
Parameters:
in - the stream to read the table data from.
Throws:
IOException - on any error.
EOFException - if the data stream ended unexpectedly in the middle of processing.
Copyright © 2019. All rights reserved.