public class GenericIndexed<T> extends Object implements CloseableIndexed<T>, Serializer
V1 Storage Format:
byte 1: version (0x1)
byte 2 == 0x1 => allowReverseLookup
bytes 3-6 => numBytesUsed
bytes 7-10 => numElements
bytes 10-((numElements * 4) + 10): integers representing *end* offsets of byte serialized values
bytes ((numElements * 4) + 10)-(numBytesUsed + 2): 4-byte integer representing length of value, followed by bytes for value. The stored length has no meaning if the next offset is strictly greater than the current offset. If the two offsets are the same, -1 in this field means null, and 0 means some object (potentially non-null, e.g. in the string case, one that is serialized as an empty sequence of bytes).
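
The V1 layout described above can be illustrated with a small, self-contained sketch. This is not Druid's implementation: the class name GenericIndexedV1Sketch and its methods are hypothetical, big-endian byte order (the ByteBuffer default) is assumed, and so are two conventions the description does not spell out, namely that end offsets are relative to the start of the value section and include each value's 4-byte length field, and that numBytesUsed counts everything after the numBytesUsed field itself.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative sketch of the V1 layout for string values; not Druid's code. */
final class GenericIndexedV1Sketch {
    // version (1 byte) + allowReverseLookup (1) + numBytesUsed (4) + numElements (4)
    static final int OFFSETS_START = 10;

    /** Writes the values in the V1 layout; null is stored with length -1. */
    static ByteBuffer encode(String[] values, boolean allowReverseLookup) {
        byte[][] encoded = new byte[values.length][];
        int valueBytes = 0;
        for (int i = 0; i < values.length; i++) {
            encoded[i] = values[i] == null
                ? new byte[0]
                : values[i].getBytes(StandardCharsets.UTF_8);
            valueBytes += Integer.BYTES + encoded[i].length;
        }
        // Assumption: numBytesUsed covers numElements, the offset table and the values.
        int numBytesUsed = Integer.BYTES + values.length * Integer.BYTES + valueBytes;
        ByteBuffer buf = ByteBuffer.allocate(2 + Integer.BYTES + numBytesUsed);
        buf.put((byte) 0x1);
        buf.put((byte) (allowReverseLookup ? 0x1 : 0x0));
        buf.putInt(numBytesUsed);
        buf.putInt(values.length);
        int end = 0;
        for (byte[] e : encoded) {
            end += Integer.BYTES + e.length; // assumed: end offset includes the length field
            buf.putInt(end);
        }
        for (int i = 0; i < values.length; i++) {
            buf.putInt(values[i] == null ? -1 : encoded[i].length);
            buf.put(encoded[i]);
        }
        buf.flip();
        return buf;
    }

    /** Reads the value at the given index using only absolute reads on buf. */
    static String get(ByteBuffer buf, int index) {
        int numElements = buf.getInt(6);
        if (index < 0 || index >= numElements) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + numElements);
        }
        int valuesStart = OFFSETS_START + numElements * Integer.BYTES;
        int prevEnd = index == 0 ? 0 : buf.getInt(OFFSETS_START + (index - 1) * Integer.BYTES);
        int end = buf.getInt(OFFSETS_START + index * Integer.BYTES);
        int start = prevEnd + Integer.BYTES; // skip this value's length field
        if (start == end) {
            // Zero payload bytes: the length field distinguishes null (-1) from empty (0).
            return buf.getInt(valuesStart + prevEnd) == -1 ? null : "";
        }
        byte[] bytes = new byte[end - start];
        ByteBuffer view = buf.duplicate();
        view.position(valuesStart + start);
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

In this sketch the length field is consulted only when a value has zero payload bytes, which mirrors the rule that the stored length carries no meaning otherwise.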
V2 Storage Format:
Meta, header and value files are separate, and the header file is stored in native endian byte order.
Meta File:
byte 1: version (0x2)
byte 2 == 0x1 => allowReverseLookup
bytes 3-6: numberOfElementsPerValueFile expressed as power of 2. That means all the value files contain the same number of items except the last value file, which may have fewer elements.
bytes 7-10 => numElements
bytes 11-14 => columnNameLength
bytes 15-columnNameLength => columnName
Header file name is identified as: StringUtils.format("%s_header", columnName)
value files are identified as: StringUtils.format("%s_value_%d", columnName, fileNumber)
number of value files == numElements/numberOfElementsPerValueFile
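
As a rough illustration of the V2 meta file and file naming rules above, the following sketch parses a meta buffer and derives the header and value file names. The class GenericIndexedV2MetaSketch and its methods are hypothetical, String.format stands in for Druid's StringUtils.format, and the meta file is assumed to be big-endian with a UTF-8 column name (neither is stated above). The file count is rounded up here because the last value file may hold fewer elements, and the index arithmetic is an assumption that follows from the power-of-2 encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative sketch of the V2 meta file; not Druid's code. */
final class GenericIndexedV2MetaSketch {
    final boolean allowReverseLookup;
    final int log2ElementsPerValueFile;
    final int numElements;
    final String columnName;

    GenericIndexedV2MetaSketch(boolean allowReverseLookup, int log2, int numElements, String name) {
        this.allowReverseLookup = allowReverseLookup;
        this.log2ElementsPerValueFile = log2;
        this.numElements = numElements;
        this.columnName = name;
    }

    /** Parses the meta layout: version, flag, log2 size, numElements, name length, name. */
    static GenericIndexedV2MetaSketch parse(ByteBuffer meta) {
        ByteBuffer buf = meta.duplicate(); // duplicate() is big-endian (assumed for the meta file)
        byte version = buf.get();
        if (version != 0x2) {
            throw new IllegalArgumentException("unexpected version byte: " + version);
        }
        boolean allowReverseLookup = buf.get() == 0x1;
        int log2 = buf.getInt();
        int numElements = buf.getInt();
        byte[] name = new byte[buf.getInt()];
        buf.get(name);
        // Assumption: the column name is UTF-8 encoded.
        return new GenericIndexedV2MetaSketch(
            allowReverseLookup, log2, numElements, new String(name, StandardCharsets.UTF_8));
    }

    String headerFileName() {
        return String.format("%s_header", columnName);
    }

    String valueFileName(int fileNumber) {
        return String.format("%s_value_%d", columnName, fileNumber);
    }

    /** Rounds up, since the last value file may be only partially filled. */
    int numValueFiles() {
        int mask = (1 << log2ElementsPerValueFile) - 1;
        return (numElements >> log2ElementsPerValueFile) + ((numElements & mask) != 0 ? 1 : 0);
    }

    /** Value file that would hold the element at the given global index. */
    int fileNumberOf(int index) {
        return index >> log2ElementsPerValueFile;
    }

    /** Position of the element inside its value file. */
    int indexWithinFile(int index) {
        return index & ((1 << log2ElementsPerValueFile) - 1);
    }
}
```

For example, with 10 elements and 4 elements per value file (stored as 2), this sketch yields three value files, and element 9 would be the second entry of the file named with fileNumber 2.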
The version EncodedStringDictionaryWriter.VERSION is reserved and must never be specified as the
GenericIndexed version byte, else it will interfere with string column deserialization.
See Also: GenericIndexedWriter

| Modifier and Type | Class and Description |
|---|---|
| class | GenericIndexed.BufferIndexed: Single-threaded view. |

| Modifier and Type | Field and Description |
|---|---|
| static ObjectStrategy<String> | STRING_STRATEGY |
| static ObjectStrategy<ByteBuffer> | UTF8_STRATEGY: An ObjectStrategy that returns a big-endian ByteBuffer pointing to original data. |

| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| static <T> GenericIndexed<T> | fromArray(T[] objects, ObjectStrategy<T> strategy) |
| static <T> GenericIndexed<T> | fromIterable(Iterable<T> objectsIterable, ObjectStrategy<T> strategy) |
| T | get(int index): Get the value at specified position. |
| Class<? extends T> | getClazz() |
| long | getSerializedSize(): Returns the number of bytes that this Serializer will write to the output _channel_ (not smoosher) on a Serializer.writeTo(java.nio.channels.WritableByteChannel, org.apache.druid.java.util.common.io.smoosh.FileSmoosher) call. |
| int | indexOf(T value): Returns the index of "value" in this GenericIndexed object, or (-(insertion point) - 1) if the value is not present, in the manner of Arrays.binarySearch. |
| void | inspectRuntimeShape(RuntimeShapeInspector inspector): Implementations of this method should call inspector.visit() with all fields of this class which meet two conditions. |
| boolean | isSorted(): Indicates if this value set is sorted, the implication being that the contract of Indexed.indexOf(T) is strengthened to return a negative number equal to (-(insertion point) - 1) when the value is not present in the set. |
| Iterator<T> | iterator() |
| static GenericIndexed<ResourceHolder<ByteBuffer>> | ofCompressedByteBuffers(Iterable<ByteBuffer> buffers, CompressionStrategy compression, int bufferSize, ByteOrder order, Closer closer) |
| static <T> GenericIndexed<T> | read(ByteBuffer buffer, ObjectStrategy<T> strategy) |
| static <T> GenericIndexed<T> | read(ByteBuffer buffer, ObjectStrategy<T> strategy, SmooshedFileMapper fileMapper) |
| GenericIndexed.BufferIndexed | singleThreaded(): Create a non-thread-safe Indexed, which may perform better than the underlying Indexed. |
| int | size(): Number of elements in the value set. |
| String | toString() |
| void | writeTo(WritableByteChannel channel, FileSmoosher smoosher): Writes serialized form of this object to the given channel. |

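Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Indexed: checkIndex
Methods inherited from interface java.lang.Iterable: forEach, spliterator

public static final ObjectStrategy<ByteBuffer> UTF8_STRATEGY
An ObjectStrategy that returns a big-endian ByteBuffer pointing to original data. Its comparison uses StringUtils.compareUtf8UsingJavaStringOrdering(byte[], byte[]) so that behavior is consistent with STRING_STRATEGY.

public static final ObjectStrategy<String> STRING_STRATEGY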
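
To see why a dedicated comparison is needed for UTF-8 bytes to stay consistent with Java String ordering, consider the sketch below. It does not call StringUtils.compareUtf8UsingJavaStringOrdering; the class Utf8OrderingSketch and its helper are hypothetical and only demonstrate that a plain unsigned byte-wise comparison of UTF-8 can disagree with String.compareTo, which orders by UTF-16 code units.

```java
import java.nio.charset.StandardCharsets;

/** Shows that raw UTF-8 byte order and Java String order can disagree. */
final class Utf8OrderingSketch {
    /** Lexicographic comparison treating each byte as unsigned. */
    static int compareUnsignedBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        String bmp = "\uFF21";                 // U+FF21, UTF-8 bytes EF BC A1
        String supplementary = "\uD83D\uDE00"; // U+1F600, UTF-8 bytes F0 9F 98 80

        // String.compareTo compares UTF-16 code units: 0xFF21 > 0xD83D.
        int javaOrder = Integer.signum(bmp.compareTo(supplementary));

        // Byte-wise comparison of the UTF-8 forms: 0xEF < 0xF0.
        int byteOrder = Integer.signum(compareUnsignedBytes(
            bmp.getBytes(StandardCharsets.UTF_8),
            supplementary.getBytes(StandardCharsets.UTF_8)));

        System.out.println("String.compareTo sign: " + javaOrder); // prints 1
        System.out.println("UTF-8 byte order sign: " + byteOrder); // prints -1
    }
}
```

The disagreement arises only when a supplementary character is compared with a BMP character in the range U+E000 to U+FFFF, so a dictionary sorted one way could be searched incorrectly under the other ordering.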
public static <T> GenericIndexed<T> read(ByteBuffer buffer, ObjectStrategy<T> strategy)
public static <T> GenericIndexed<T> read(ByteBuffer buffer, ObjectStrategy<T> strategy, SmooshedFileMapper fileMapper)
public static <T> GenericIndexed<T> fromArray(T[] objects, ObjectStrategy<T> strategy)
public static GenericIndexed<ResourceHolder<ByteBuffer>> ofCompressedByteBuffers(Iterable<ByteBuffer> buffers, CompressionStrategy compression, int bufferSize, ByteOrder order, Closer closer)
public static <T> GenericIndexed<T> fromIterable(Iterable<T> objectsIterable, ObjectStrategy<T> strategy)
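public int size()
Description copied from interface Indexed: Number of elements in the value set.

public T get(int index)
Description copied from interface Indexed: Get the value at specified position.

public int indexOf(@Nullable T value)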
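
The return convention of indexOf follows Arrays.binarySearch, so a negative result can be converted back into an insertion point. The sketch below uses Arrays.binarySearch on a plain sorted array rather than a GenericIndexed instance; the class IndexOfContractSketch and the helper insertionPoint are hypothetical names for illustration only.

```java
import java.util.Arrays;

/** Demonstrates the (-(insertion point) - 1) convention with Arrays.binarySearch. */
final class IndexOfContractSketch {
    /** Recovers the insertion point from a negative search result. */
    static int insertionPoint(int result) {
        if (result >= 0) {
            throw new IllegalArgumentException("value is present at index " + result);
        }
        return -(result + 1);
    }

    public static void main(String[] args) {
        String[] sorted = {"apple", "cherry", "mango"};

        int found = Arrays.binarySearch(sorted, "cherry");
        System.out.println(found); // prints 1

        int missing = Arrays.binarySearch(sorted, "banana");
        System.out.println(missing);                 // prints -2
        System.out.println(insertionPoint(missing)); // prints 1
    }
}
```

Because the encoding subtracts one, a missing value that would be inserted at position 0 yields -1 rather than 0, so any non-negative result always means the value is present.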
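
public boolean isSorted()
Description copied from interface Indexed: Indicates if this value set is sorted, the implication being that the contract of Indexed.indexOf(T) is strengthened to return a negative number equal to (-(insertion point) - 1) when the value is not present in the set.

public long getSerializedSize()
Description copied from interface Serializer: Returns the number of bytes that this Serializer will write to the output _channel_ (not smoosher) on a Serializer.writeTo(java.nio.channels.WritableByteChannel, org.apache.druid.java.util.common.io.smoosh.FileSmoosher) call.
Specified by: getSerializedSize in interface Serializer

public void writeTo(WritableByteChannel channel, FileSmoosher smoosher) throws IOException
Description copied from interface Serializer: Writes serialized form of this object to the given channel.
Specified by: writeTo in interface Serializer
Throws: IOException

public GenericIndexed.BufferIndexed singleThreaded()
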
public void inspectRuntimeShape(RuntimeShapeInspector inspector)
Description copied from interface HotLoopCallee: Implementations of this method should call inspector.visit() with all fields of this class, which meet two conditions:
1. They are used in methods of this class, annotated with CalledFromHotLoop
2. They are either:
   a. Nullable objects
   b. Instances of HotLoopCallee
   c. Objects, which don't always have a specific class in runtime. For example, a field of type Set could be HashSet or TreeSet in runtime, depending on how this instance (the instance on which inspectRuntimeShape() is called) is configured.
   d. ByteBuffer or similar objects, where byte order matters
   e. boolean flags, affecting branch taking
   f. Arrays of objects, meeting any of conditions a-e.
Specified by: inspectRuntimeShape in interface HotLoopCallee

public void close()
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable

Copyright © 2011–2022 The Apache Software Foundation. All rights reserved.