public class ZenoString extends UnicodeString
A ZenoString holds its content as a list of segments. The segments are always non-empty; an empty string contains no segments.
The key to the performance of the data structure (and its name) is the algorithm for consolidating segments when strings are concatenated, so as to keep the number of segments increasing logarithmically with the string size, with short segments at the extremities to allow efficient further concatenation at the ends.
For further details see the paper by Michael Kay at Balisage 2021.
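The consolidation idea can be illustrated with a toy model. The sketch below is not Saxon's implementation: the class name ToySegmentedString, its methods, and the merge rule (merge the last two segments while the last is more than half as long as its left neighbour) are all assumptions chosen for illustration. It handles appending on the right only, and it measures lengths in UTF-16 chars rather than code points. Under this rule each segment is at least twice as long as the one to its right, so the segment count grows logarithmically with the total length and the shortest segments sit at the end where the next append happens.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration only; NOT the Saxon ZenoString implementation. */
final class ToySegmentedString {
    // Segments in left-to-right order; every stored segment is non-empty.
    private final List<StringBuilder> segments = new ArrayList<>();

    /** Append text on the right, then consolidate the trailing segments. */
    ToySegmentedString append(String text) {
        if (text.isEmpty()) {
            return this; // never store an empty segment
        }
        segments.add(new StringBuilder(text));
        // Hypothetical rule: merge the last two segments while the last one
        // is more than half as long as its left neighbour.
        while (segments.size() >= 2) {
            StringBuilder last = segments.get(segments.size() - 1);
            StringBuilder prev = segments.get(segments.size() - 2);
            if (2L * last.length() <= prev.length()) {
                break; // lengths at least halve here, so stop merging
            }
            prev.append(last);
            segments.remove(segments.size() - 1);
        }
        return this;
    }

    /** Lengths of the segments, left to right (in UTF-16 chars). */
    List<Long> segmentLengths() {
        List<Long> lengths = new ArrayList<>();
        for (StringBuilder s : segments) {
            lengths.add((long) s.length());
        }
        return lengths;
    }

    @Override
    public String toString() {
        StringBuilder all = new StringBuilder();
        for (StringBuilder s : segments) {
            all.append(s);
        }
        return all.toString();
    }
}
```

With this rule, appending seven one-character strings leaves segments of lengths 4, 2 and 1, much like the digits of a binary counter.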
| Modifier and Type | Field and Description |
|---|---|
| static ZenoString | EMPTY - An empty ZenoString |
| Modifier and Type | Method and Description |
|---|---|
| int | codePointAt(long index) - Get the code point at a given position in the string |
| IntIterator | codePoints() - Get an iterator over the code points present in the string. |
| ZenoString | concat(UnicodeString other) - Concatenate another string |
| static UnicodeString | concatSegments(UnicodeString left, UnicodeString right) |
| java.util.List<java.lang.Long> | debugSegmentLengths() - This method is for diagnostics and unit testing only: it exposes the lengths of the internal segments. |
| UnicodeString | economize() - Get an equivalent UnicodeString that uses the most economical representation available |
| int | getWidth() - Get the number of bits needed to hold all the characters in this string |
| long | indexOf(int codePoint, long from) - Get the position of the first occurrence of the specified codepoint, starting the search at a given position in the string |
| long | indexWhere(java.util.function.IntPredicate predicate, long from) - Get the position of the first occurrence of a codepoint that matches a supplied predicate, starting the search at a given position in the string |
| boolean | isEmpty() - Ask whether the string is empty |
| long | length() - Get the length of the string |
| static ZenoString | of(UnicodeString content) - Construct a ZenoString from a supplied UnicodeString |
| UnicodeString | substring(long start, long end) - Get a substring of this codepoint sequence, with a given start and end position |
| java.lang.String | toString() |
| void | writeSegments(UnicodeWriter writer) - Write each of the segments in turn to a UnicodeWriter |
Methods inherited from class UnicodeString: asAtomic, checkSubstringBounds, compareTo, equals, estimatedLength, hashCode, hasSubstring, indexOf, indexOf, length32, prefix, requireInt, substring, tidy, verifyCharacters

public static final ZenoString EMPTY

public static ZenoString of(UnicodeString content)
Parameters: content - the supplied UnicodeString

public IntIterator codePoints()
codePoints in class UnicodeString

public long length()
length in class UnicodeString

public boolean isEmpty()
isEmpty in class UnicodeString

public int getWidth()
getWidth in class UnicodeString

public long indexOf(int codePoint, long from)
indexOf in class UnicodeString
Parameters: codePoint - the sought codePoint
from - the position from which the search should start (0-based), in the range 0 to length()-1
Throws: java.lang.IndexOutOfBoundsException - if the from value is out of range

public long indexWhere(java.util.function.IntPredicate predicate, long from)
indexWhere in class UnicodeString
Parameters: predicate - condition that the codepoint must satisfy
from - the position from which the search should start (0-based)

public int codePointAt(long index)
codePointAt in class UnicodeString
Parameters: index - the given position (0-based)
Throws: java.lang.IndexOutOfBoundsException - if the index is out of range

public UnicodeString substring(long start, long end)
substring in class UnicodeString
Parameters: start - the start position (0-based): that is, the position of the first code point to be included
end - the end position (0-based): specifically, the position of the first code point not to be included

public ZenoString concat(UnicodeString other)
concat in class UnicodeString
Parameters: other - the string to be appended to this one

public void writeSegments(UnicodeWriter writer) throws java.io.IOException
Parameters: writer - the writer to which the string is to be written
Throws: java.io.IOException

public static UnicodeString concatSegments(UnicodeString left, UnicodeString right)

public UnicodeString economize()
economize in class UnicodeString

public java.lang.String toString()
toString in class java.lang.Object

public java.util.List<java.lang.Long> debugSegmentLengths()
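The position conventions documented above (0-based code point positions, a search that starts at from, and a substring whose end position is exclusive) can be illustrated without the library. The sketch below works on a plain int[] of code points. The class CodePointOps and its methods are hypothetical stand-ins and not part of the Saxon API, and returning -1 when nothing matches is an assumption, since the text above does not state the not-found result.

```java
import java.util.Arrays;
import java.util.function.IntPredicate;

/** Hypothetical helper operating on an array of code points; not Saxon code. */
final class CodePointOps {
    private CodePointOps() {
    }

    /** First position at or after 'from' whose code point satisfies the predicate. */
    static long indexWhere(int[] cps, IntPredicate predicate, long from) {
        for (long i = Math.max(0, from); i < cps.length; i++) {
            if (predicate.test(cps[(int) i])) {
                return i;
            }
        }
        return -1; // assumption: -1 signals "not found"
    }

    /** First position at or after 'from' holding the given code point. */
    static long indexOf(int[] cps, int codePoint, long from) {
        // As documented for indexOf: 'from' must lie in the range 0 to length()-1.
        if (from < 0 || from >= cps.length) {
            throw new IndexOutOfBoundsException("from=" + from);
        }
        return indexWhere(cps, cp -> cp == codePoint, from);
    }

    /** Code points from 'start' (inclusive) to 'end' (exclusive). */
    static int[] substring(int[] cps, long start, long end) {
        if (start < 0 || end < start || end > cps.length) {
            throw new IndexOutOfBoundsException("start=" + start + ", end=" + end);
        }
        return Arrays.copyOfRange(cps, (int) start, (int) end);
    }
}
```

Positions here count code points, not UTF-16 chars, so a character outside the Basic Multilingual Plane occupies one position rather than two.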
Copyright (c) 2004-2022 Saxonica Limited. All rights reserved.