Package io.github.jbellis.jvector.pq
Class ProductQuantization
java.lang.Object
io.github.jbellis.jvector.pq.ProductQuantization
A Product Quantization implementation for float vectors.
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| static ProductQuantization | compute(RandomAccessVectorValues<float[]> ravv, int M, boolean globallyCenter) | Initializes the codebooks by clustering the input data using Product Quantization. |
| void | decode(byte[] encoded, float[] target) | Decodes the quantized representation (byte array) to its approximate original vector. |
| float | decodedCosine(byte[] encoded, float[] other) | Computes the cosine of the (approximate) original decoded vector with another vector. |
| byte[] | encode(float[] vector) | Encodes the input vector using the PQ codebooks. |
| byte[][] | encodeAll | Encodes the given vectors in parallel using the PQ codebooks. |
| boolean | equals | |
| float[] | getCenter() | |
| int | getOriginalDimension() | |
| int | getSubspaceCount() | |
| int | hashCode() | |
| static ProductQuantization | load | |
| long | memorySize() | |
| void | write(DataOutput out) | |
-
Method Details
-
compute
public static ProductQuantization compute(RandomAccessVectorValues<float[]> ravv, int M, boolean globallyCenter)
Initializes the codebooks by clustering the input data using Product Quantization.
- Parameters:
ravv - the vectors to quantize
M - number of subspaces
globallyCenter - whether to center the vectors globally before quantization (not recommended when using the quantization for dot product)
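The effect of globallyCenter can be illustrated with a minimal, self-contained sketch. It assumes that "centering globally" means subtracting the component-wise mean of all input vectors, which is consistent with the "global centroid" mentioned under decodedCosine but is not confirmed here. The class and method names are hypothetical and are not part of this package. Centering shifts every vector by the same offset, so dot products between centered vectors generally differ from dot products between the originals, which may be why it is discouraged for dot product.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the library's implementation.
class GlobalCenteringSketch {
    // Component-wise mean of all vectors (assumed definition of the global centroid).
    static float[] centroid(float[][] vectors) {
        int dim = vectors[0].length;
        float[] mean = new float[dim];
        for (float[] v : vectors) {
            for (int i = 0; i < dim; i++) {
                mean[i] += v[i];
            }
        }
        for (int i = 0; i < dim; i++) {
            mean[i] /= vectors.length;
        }
        return mean;
    }

    // Returns a new vector equal to v minus the centroid.
    static float[] subtract(float[] v, float[] center) {
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] - center[i];
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] data = {{1f, 2f}, {3f, 6f}};
        float[] c = centroid(data);
        System.out.println(Arrays.toString(c));                    // [2.0, 4.0]
        System.out.println(Arrays.toString(subtract(data[0], c))); // [-1.0, -2.0]
    }
}
```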
-
encodeAll
Encodes the given vectors in parallel using the PQ codebooks.
-
encode
public byte[] encode(float[] vector)
Encodes the input vector using the PQ codebooks.
- Returns:
- one byte per subspace
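As a rough illustration of "one byte per subspace", the following sketch encodes a vector by picking, for each subspace, the index of the nearest centroid. It is a generic product quantization example, not this class's code: the PqEncodeSketch class, the codebooks[m][k] layout, and the assumption that all subspaces have equal length are hypothetical.

```java
// Hypothetical illustration only; not the library's implementation.
class PqEncodeSketch {
    // codebooks[m][k] is centroid k of subspace m.
    // Assumes vector.length is divisible by the number of subspaces
    // and at most 256 centroids per subspace, so an index fits in one byte.
    static byte[] encode(float[] vector, float[][][] codebooks) {
        int subspaces = codebooks.length;
        int subDim = vector.length / subspaces;
        byte[] code = new byte[subspaces];
        for (int m = 0; m < subspaces; m++) {
            int best = 0;
            float bestDist = Float.MAX_VALUE;
            for (int k = 0; k < codebooks[m].length; k++) {
                // Squared Euclidean distance between the subvector and centroid k.
                float dist = 0f;
                for (int i = 0; i < subDim; i++) {
                    float d = vector[m * subDim + i] - codebooks[m][k][i];
                    dist += d * d;
                }
                if (dist < bestDist) {
                    bestDist = dist;
                    best = k;
                }
            }
            code[m] = (byte) best;
        }
        return code;
    }

    public static void main(String[] args) {
        float[][][] codebooks = {
            {{0f, 0f}, {1f, 1f}},
            {{0f, 0f}, {5f, 5f}}
        };
        byte[] code = encode(new float[] {0.9f, 1.2f, 4f, 6f}, codebooks);
        System.out.println(code[0] + "," + code[1]); // 1,1
    }
}
```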
-
decodedCosine
public float decodedCosine(byte[] encoded, float[] other)
Computes the cosine of the (approximate) original decoded vector with another vector.
This method computes the cosine without materializing the decoded vector as a new float[], which is roughly 1.5x as fast as decode() + dot().
It is the caller's responsibility to center the `other` vector by subtracting the global centroid before calling this method.
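The idea of computing the cosine without materializing the decoded vector can be sketched as follows: walk the selected centroid of each subspace and accumulate the dot product and both squared norms in a single pass. This is a hedged illustration under an assumed codebooks[m][k] layout with equal-length subspaces; the class name is hypothetical, and the sketch expects `other` to be already centered by the caller, as the description above requires.

```java
// Hypothetical illustration only; not the library's implementation.
class DecodedCosineSketch {
    // codebooks[m][k] is centroid k of subspace m; other is assumed already centered.
    static float decodedCosine(byte[] encoded, float[][][] codebooks, float[] other) {
        int subDim = other.length / encoded.length;
        float dot = 0f;
        float normA = 0f;
        float normB = 0f;
        for (int m = 0; m < encoded.length; m++) {
            // Java bytes are signed, so mask to get an index in 0..255.
            float[] centroid = codebooks[m][encoded[m] & 0xFF];
            for (int i = 0; i < subDim; i++) {
                float a = centroid[i];
                float b = other[m * subDim + i];
                dot += a * b;
                normA += a * a;
                normB += b * b;
            }
        }
        return (float) (dot / Math.sqrt((double) normA * normB));
    }

    public static void main(String[] args) {
        float[][][] codebooks = {
            {{0f, 0f}, {1f, 1f}},
            {{0f, 0f}, {5f, 5f}}
        };
        // Code {1, 0} stands for the decoded vector [1, 1, 0, 0].
        System.out.println(decodedCosine(new byte[] {1, 0}, codebooks, new float[] {1f, 1f, 0f, 0f})); // 1.0
    }
}
```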
-
decode
public void decode(byte[] encoded, float[] target)
Decodes the quantized representation (byte array) to its approximate original vector.
-
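To illustrate what decode produces, this sketch copies the centroid selected by each byte into the matching slice of the target array. It is a generic product quantization example under an assumed codebooks[m][k] layout with equal-length subspaces, and the class name is hypothetical. It ignores global centering; whether and how the real method restores the centroid is not stated here.

```java
// Hypothetical illustration only; not the library's implementation.
class PqDecodeSketch {
    // Writes the approximate vector into target; codebooks[m][k] is centroid k of subspace m.
    static void decode(byte[] encoded, float[][][] codebooks, float[] target) {
        int subDim = target.length / encoded.length;
        for (int m = 0; m < encoded.length; m++) {
            // Java bytes are signed, so mask to get an index in 0..255.
            int k = encoded[m] & 0xFF;
            System.arraycopy(codebooks[m][k], 0, target, m * subDim, subDim);
        }
    }

    public static void main(String[] args) {
        float[][][] codebooks = {
            {{0f, 0f}, {1f, 1f}},
            {{0f, 0f}, {5f, 5f}}
        };
        float[] target = new float[4];
        decode(new byte[] {1, 1}, codebooks, target);
        System.out.println(java.util.Arrays.toString(target)); // [1.0, 1.0, 5.0, 5.0]
    }
}
```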
getOriginalDimension
public int getOriginalDimension()
- Returns:
- The dimension of the vectors being quantized.
-
getSubspaceCount
public int getSubspaceCount()
- Returns:
- the number of subspaces, which is also the number of bytes each vector is compressed to
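Taken together with getOriginalDimension, the subspace count gives a quick estimate of the compression achieved per vector: an uncompressed float vector takes 4 bytes per dimension, while the encoded form takes one byte per subspace. The sketch below only does that arithmetic; the class and method names are hypothetical, and the figures ignore the memory used by the codebooks themselves.

```java
// Hypothetical illustration only; plain arithmetic, no library calls.
class PqCompressionSketch {
    // Size in bytes of one uncompressed float vector.
    static int rawBytes(int originalDimension) {
        return originalDimension * Float.BYTES;
    }

    // Ratio of uncompressed size to encoded size (one byte per subspace).
    static float compressionRatio(int originalDimension, int subspaceCount) {
        return (float) rawBytes(originalDimension) / subspaceCount;
    }

    public static void main(String[] args) {
        // A 128-dimensional vector is 512 bytes; with 16 subspaces it becomes 16 bytes.
        System.out.println(rawBytes(128));             // 512
        System.out.println(compressionRatio(128, 16)); // 32.0
    }
}
```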
-
write
- Throws:
IOException
-
load
- Throws:
IOException
-
equals
-
hashCode
public int hashCode()
-
getCenter
public float[] getCenter()
-
memorySize
public long memorySize()
-