public final class ByteQuadsCanonicalizer extends Object
Replacement for BytesToNameCanonicalizer which aims at more localized
memory access due to flattening of name quad data.
Performance improvement is modest for simple JSON document data binding (maybe 3%),
but should help more for larger symbol tables, or for binary formats like Smile.
Hash area is divided into 4 sections:
Primary area (1/2 of total size), direct match from hash (LSB)
Secondary area (1/4 of total size), match from hash (LSB) >> 1
Tertiary area (1/8 of total size), match from hash (LSB) >> 2
Spill-over area (remaining 1/8) with linear scan, insertion order
and within every area, entries are 4 ints, where 1 - 3 ints contain 1 - 12
UTF-8 encoded bytes of name (null-padded), and last int is offset in
_names that contains actual name Strings.
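The layout above can be illustrated with a minimal sketch. It is not the library's actual implementation: the class QuadLayoutSketch, its method names, the power-of-two slot count, and the big-endian byte order inside each int are all assumptions made for illustration. It only shows how the 1/2, 1/4, 1/8, 1/8 split and the 4-int entry format fit together.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of the documented layout; not Jackson's own code.
final class QuadLayoutSketch {
    static final int INTS_PER_ENTRY = 4;

    // Assumption: primarySlots is a power of two and at least 4.
    // Returns the int offsets at which the four areas start:
    // primary, secondary, tertiary, spill-over.
    static int[] areaStarts(int primarySlots) {
        int secondaryStart = primarySlots * INTS_PER_ENTRY;                        // after 1/2
        int tertiaryStart = secondaryStart + (primarySlots / 2) * INTS_PER_ENTRY;  // after 3/4
        int spilloverStart = tertiaryStart + (primarySlots / 4) * INTS_PER_ENTRY;  // after 7/8
        return new int[] { 0, secondaryStart, tertiaryStart, spilloverStart };
    }

    // Candidate entry offsets to probe for a hash, in order:
    // primary, secondary, tertiary. Spill-over would be scanned linearly.
    static int[] candidateOffsets(int hash, int primarySlots) {
        int[] starts = areaStarts(primarySlots);
        int index = hash & (primarySlots - 1);  // low bits (LSB) of the hash
        return new int[] {
            starts[0] + index * INTS_PER_ENTRY,
            starts[1] + (index >> 1) * INTS_PER_ENTRY,
            starts[2] + (index >> 2) * INTS_PER_ENTRY
        };
    }

    // Packs a name of 1 - 12 UTF-8 bytes into a 4-int entry.
    // Unused bytes stay zero; the last int holds the index of the name String.
    // Assumption: bytes are packed most-significant first within each int.
    static int[] packEntry(String name, int nameIndex) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        if (utf8.length < 1 || utf8.length > 12) {
            throw new IllegalArgumentException("name must encode to 1 - 12 bytes");
        }
        int[] entry = new int[INTS_PER_ENTRY];
        for (int i = 0; i < utf8.length; i++) {
            entry[i / 4] |= (utf8[i] & 0xFF) << (24 - 8 * (i % 4));
        }
        entry[3] = nameIndex;
        return entry;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(areaStarts(64)));
        System.out.println(Arrays.toString(candidateOffsets(0x1234ABCD, 64)));
        System.out.println(Arrays.toString(packEntry("id", 7)));
    }
}
```

Because each lookup touches at most three fixed offsets in one int array before falling back to the spill-over scan, the name data for a candidate sits next to its slot, which is the kind of localized access the class description refers to.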
Actual canonicalizer instance that can be used by a parser if (and only if)
canonicalization is enabled; otherwise a non-null "placeholder" instance.
Since:
2.13
release
public void release()
Method called by the using code to indicate it is done with this instance.
This lets instance merge accumulated changes into parent (if need be),
safely and efficiently, and without calling code having to know about parent
information.
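The release-and-merge pattern can be sketched as follows. This is a simplified, hypothetical model rather than the class's real internals: SymbolTableSketch, its HashMap storage, and the "replace the shared snapshot only if the child has more entries" rule are assumptions chosen to show why the caller never needs to touch the parent directly.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical parent/child symbol table; illustrates the pattern only.
final class SymbolTableSketch {
    private final SymbolTableSketch parent;                      // null for the root
    private final AtomicReference<Map<String, Integer>> shared;  // set on the root only
    private Map<String, Integer> symbols;                        // snapshot or private copy
    private boolean dirty;

    private SymbolTableSketch(SymbolTableSketch parent, Map<String, Integer> snapshot) {
        this.parent = parent;
        this.shared = (parent == null) ? new AtomicReference<>(snapshot) : null;
        this.symbols = snapshot;
    }

    static SymbolTableSketch createRoot() {
        return new SymbolTableSketch(null, new HashMap<>());
    }

    // Children start from the root's current snapshot and never mutate it in place.
    SymbolTableSketch makeChild() {
        if (parent != null) {
            throw new IllegalStateException("only the root creates children");
        }
        return new SymbolTableSketch(this, shared.get());
    }

    void add(String name) {
        if (parent == null) {
            throw new IllegalStateException("root is only updated via release()");
        }
        if (symbols.containsKey(name)) {
            return;
        }
        if (!dirty) {
            symbols = new HashMap<>(symbols);  // copy on first write
            dirty = true;
        }
        symbols.put(name, symbols.size());
    }

    boolean maybeDirty() {
        return dirty;
    }

    int size() {
        return (parent == null) ? shared.get().size() : symbols.size();
    }

    // Called by the using code when done; merges into the parent if needed.
    void release() {
        if (parent == null || !dirty) {
            return;
        }
        Map<String, Integer> current = parent.shared.get();
        if (symbols.size() > current.size()) {
            // If another child merged first, the swap fails and this merge is skipped.
            parent.shared.compareAndSet(current, symbols);
        }
        dirty = false;
    }
}
```

In this sketch a caller would obtain a child with makeChild(), add names while parsing, and call release() in a finally block; later children then start from the merged snapshot.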
size
public int size()
Returns:
Number of symbol entries contained by this canonicalizer instance
bucketCount
public int bucketCount()
Returns:
Number of primary slots the table currently has
maybeDirty
public boolean maybeDirty()
Method called to quickly check whether a child symbol table
may have gained additional entries. Used to decide whether
a child table should be merged into the shared table.
Returns:
Whether main hash area has been modified
hashSeed
public int hashSeed()
isCanonicalizing
public boolean isCanonicalizing()
Returns:
True for "real", canonicalizing child tables; false for
root table as well as placeholder "child" tables.
Since:
2.13
primaryCount
public int primaryCount()
Method mostly needed by unit tests; calculates the number of
entries that are in the primary slot set. These are
"perfect" entries, accessible with a single lookup.
Returns:
Number of entries in the primary hash area
secondaryCount
public int secondaryCount()
Method mostly needed by unit tests; calculates the number of entries
in secondary buckets.
Returns:
Number of entries in the secondary hash area
tertiaryCount
public int tertiaryCount()
Method mostly needed by unit tests; calculates the number of entries
in tertiary buckets.
Returns:
Number of entries in the tertiary hash area
spilloverCount
public int spilloverCount()
Method mostly needed by unit tests; calculates the number of entries
in the shared spill-over area.