Interface TermToBytesRefAttribute
- All Superinterfaces:
Attribute
- All Known Implementing Classes:
CharTermAttributeImpl, CollatedTermAttributeImpl, NumericTokenStream.NumericTermAttributeImpl, Token
This attribute is requested by TermsHashPerField to index the contents.
This attribute can be used to customize the final byte[] encoding of terms.
Consumers of this attribute call getBytesRef() up-front, and then
invoke fillBytesRef() for each term. Example:
final TermToBytesRefAttribute termAtt = tokenStream.getAttribute(TermToBytesRefAttribute.class);
final BytesRef bytes = termAtt.getBytesRef();
while (tokenStream.incrementToken()) {
  // you must call termAtt.fillBytesRef() before doing something with the bytes.
  // this encodes the term value (internally it might be a char[], etc) into the bytes.
  int hashCode = termAtt.fillBytesRef();
  if (isInteresting(bytes)) {
    // because the bytes are reused by the attribute (like CharTermAttribute's char[] buffer),
    // you should make a copy if you need persistent access to the bytes, otherwise they will
    // be rewritten across calls to incrementToken()
    doSomethingWith(new BytesRef(bytes));
  }
}
...
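The copy-before-keeping rule in the example comments can be illustrated without any Lucene classes. The sketch below is a stand-in only: `ReusedBufferSketch`, `collect`, and the 16-byte `shared` array are hypothetical names, with the array playing the role of the attribute's reused bytes and the overwrite playing the role of fillBytesRef(). It assumes each term encodes to at most 16 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: not part of the Lucene API.
final class ReusedBufferSketch {

    /**
     * Writes each term into one shared buffer and keeps a reference per term.
     * When copy is true a snapshot of the bytes is kept; when false the shared
     * buffer itself is kept, so every kept entry aliases the same array.
     * Assumes each term encodes to at most 16 bytes.
     */
    static List<byte[]> collect(String[] terms, boolean copy) {
        byte[] shared = new byte[16]; // plays the role of the attribute's reused bytes
        List<byte[]> kept = new ArrayList<>();
        for (String term : terms) {
            // Plays the role of fillBytesRef(): overwrite the shared buffer in place.
            byte[] encoded = term.getBytes(StandardCharsets.UTF_8);
            System.arraycopy(encoded, 0, shared, 0, encoded.length);
            kept.add(copy ? Arrays.copyOf(shared, encoded.length) : shared);
        }
        return kept;
    }
}
```

With `copy` set to false, every kept entry reflects the most recently written term, which is the failure the example comments warn about. With `copy` set to true, each entry keeps its own term.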
Method Summary
Modifier and Type | Method | Description
int | fillBytesRef() | Updates the bytes returned by getBytesRef() to contain this term's final encoding, and returns its hashcode.
BytesRef | getBytesRef() | Retrieve this attribute's BytesRef.
Method Details
fillBytesRef
int fillBytesRef()
Updates the bytes returned by getBytesRef() to contain this term's final encoding, and returns its hashcode.
- Returns:
the hashcode as defined by BytesRef.hashCode():
int hash = 0;
for (int i = termBytes.offset; i < termBytes.offset + termBytes.length; i++) {
  hash = 31 * hash + termBytes.bytes[i];
}
Implement this for performance reasons, if your code can calculate the hash on-the-fly. If this is not the case, just return termBytes.hashCode().
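The following sketch shows one way to compute the hash on the fly, in the same pass that encodes the term, rather than in a second loop over the finished bytes. It uses plain arrays instead of Lucene types. The class and method names (`OnTheFlyHashSketch`, `referenceHash`, `encodeAsciiAndHash`) are hypothetical, and the encoder assumes ASCII-only terms so that each char maps to exactly one byte.

```java
// Hypothetical illustration: not part of the Lucene API.
final class OnTheFlyHashSketch {

    /** The hash as defined by the loop above: 31 * hash + byte over the given range. */
    static int referenceHash(byte[] bytes, int offset, int length) {
        int hash = 0;
        for (int i = offset; i < offset + length; i++) {
            hash = 31 * hash + bytes[i];
        }
        return hash;
    }

    /**
     * Encodes an ASCII-only term into dest and computes the same hash in one pass.
     * Assumes every char is below 0x80 and dest has room for termLength bytes.
     */
    static int encodeAsciiAndHash(char[] term, int termLength, byte[] dest) {
        int hash = 0;
        for (int i = 0; i < termLength; i++) {
            byte b = (byte) term[i]; // one byte per char under the ASCII assumption
            dest[i] = b;
            hash = 31 * hash + b;    // fold the byte into the hash as it is written
        }
        return hash;
    }
}
```

Both methods apply the same recurrence to the same bytes in the same order, so they return the same value. The single-pass version only saves the second traversal.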
getBytesRef
BytesRef getBytesRef()
Retrieve this attribute's BytesRef. The bytes are updated from the current term when the consumer calls fillBytesRef().
- Returns:
this Attribute's internal BytesRef.