public class NFSFileVec extends ByteVec
A Vec that is lazily loaded from an NFS file on demand. Each machine is expected to have the same filesystem view of the file, with the same byte contents, and lazily loads only the sections of the file assigned to it. In effect, the file starts out striped across some globally visible file system (e.g. NFS, or simply replicated on local disk) and is loaded into memory, again striped across the machines, without any network traffic or data movement.
Useful for "memory mapping" large data files, often pure text files, into RAM.
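The lazy, per-section loading described above can be illustrated with plain `java.io`. This is a minimal sketch, not the H2O implementation: the class `LazyFileSlices` and the method `readSlice` are hypothetical names, and the 4-byte section size is chosen only for the demo. It shows a node reading just the byte range it was assigned while leaving the rest of the file untouched.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration of section-wise lazy loading; not H2O code. */
final class LazyFileSlices {

    /** Reads only bytes [offset, offset + len) of the file, clamped at end of file. */
    static byte[] readSlice(Path file, long offset, int len) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long remaining = Math.max(0L, raf.length() - offset);
            int n = (int) Math.min(len, remaining);
            byte[] buf = new byte[n];
            raf.seek(offset);    // jump straight to the assigned section
            raf.readFully(buf);  // read that section and nothing else
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("slices", ".txt");
        Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
        // Assume this node was assigned the second 4-byte section of the file.
        byte[] mine = readSlice(tmp, 4, 4);
        System.out.println(new String(mine, StandardCharsets.US_ASCII)); // prints 4567
        Files.delete(tmp);
    }
}
```

Because every machine sees the same bytes at the same path, each one can run the same read with a different offset and no data has to be exchanged between them.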
Vec.CollectDomain, Vec.VectorGroup

| Modifier and Type | Field and Description |
|---|---|
| static int | CHUNK_SZ |
KEY_PREFIX_LEN, PERCENTILES, T_BAD, T_ENUM, T_NUM, T_STR, T_TIME, T_TIMELAST, T_UUID

| Modifier and Type | Method and Description |
|---|---|
| long | byteSize(): Size of vector data. |
| protected Value | chunkIdx(int cidx): Get a Chunk's Value by index. |
| static long | chunkOffset(Key ckey): Convert a chunk-key to a file offset. |
| long | length(): Number of elements in the vector; returned as a long instead of an int because Vecs support more than 2^32 elements. |
| static NFSFileVec | make(java.io.File f): Make a new NFSFileVec key which holds the filename implicitly. |
| static NFSFileVec | make(java.io.File f, Futures fs): Make a new NFSFileVec key which holds the filename implicitly. |
| int | nChunks(): Number of chunks, returned as an int; chunk count is limited by the max size of a Java long[]. |
| boolean | writable(): Default read/write behavior for Vecs. |
chunkForChunkIdx, getFirstBytes, isInt, naCnt, openStream

align, at, at16h, at16l, at8, atStr, base, bins, cardinality, checksum, chunkForRow, chunkKey, chunkKey, domain, equals, factor, getVecKey, group, hashCode, isBad, isConst, isEnum, isNA, isNumeric, isString, isTime, isUUID, lazy_bins, makeCon, makeCon, makeCopy, makeRepSeq, makeSeq, makeSimpleTransf, makeTransf, makeTransf, makeVec, makeVec, makeVec, makeZero, makeZero, makeZero, makeZeros, max, maxs, mean, min, mins, newKey, ninfs, nzCnt, open, pctiles, pinfs, postWrite, preWriting, remove_impl, set, set, set, set, setDomain, sigma, startRollupStats, startRollupStats, stride, timeParse, toEnum, toString

clone, frozenType, read_impl, read, readExternal, readJSON_impl, readJSON, write_impl, write, writeExternal, writeHTML_impl, writeHTML, writeJSON_impl, writeJSON

public static final int CHUNK_SZ
public static NFSFileVec make(java.io.File f)
public static NFSFileVec make(java.io.File f, Futures fs)
public long length()
From Vec: Number of elements in the vector; returned as a long instead of an int because Vecs support more than 2^32 elements. Overridden by subclasses that compute length in an alternative way, such as file-backed Vecs.

public int nChunks()
From Vec: Number of chunks, returned as an int; chunk count is limited by the max size of a Java long[]. Overridden by subclasses that compute chunks in an alternative way, such as file-backed Vecs.

public boolean writable()
From Vec: Default read/write behavior for Vecs.

public long byteSize()
Size of vector data.

public static long chunkOffset(Key ckey)
Convert a chunk-key to a file offset.
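
The relationship between `length()`, `nChunks()` and `chunkOffset` for a file-backed Vec can be modelled with simple fixed-size chunk arithmetic. The sketch below is an assumption-laden illustration, not the library's code: the class `ChunkLayout`, the 4 KiB chunk size, the ceiling-division rounding, and the use of a plain `int` chunk index in place of a `Key` are all hypothetical. The actual value of `CHUNK_SZ` and the real rounding rule are not stated in this page.

```java
/** Hypothetical model of fixed-size file chunking; not H2O code. */
final class ChunkLayout {
    // Assumed for the demo only: 4 KiB chunks. The real CHUNK_SZ is not given here.
    static final int LOG_CHUNK = 12;
    static final int CHUNK_BYTES = 1 << LOG_CHUNK;

    /** File offset at which chunk number cidx starts (a long, since files can exceed 2 GiB). */
    static long chunkOffset(int cidx) {
        return (long) cidx << LOG_CHUNK;
    }

    /** Number of chunks needed to cover fileLen bytes (assumed: ceiling division, at least 1). */
    static int nChunks(long fileLen) {
        return (int) Math.max(1L, (fileLen + CHUNK_BYTES - 1) >> LOG_CHUNK);
    }

    /** Bytes held by chunk cidx; only the last chunk may be short. */
    static int chunkLength(long fileLen, int cidx) {
        return (int) Math.min(CHUNK_BYTES, fileLen - chunkOffset(cidx));
    }

    public static void main(String[] args) {
        long fileLen = 10_000L;
        for (int c = 0; c < nChunks(fileLen); c++) {
            System.out.println("chunk " + c
                + " offset=" + chunkOffset(c)
                + " length=" + chunkLength(fileLen, c));
        }
    }
}
```

Under these assumptions a 10,000-byte file yields three chunks starting at offsets 0, 4096 and 8192, with the last chunk holding the remaining 1,808 bytes.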