| Interface | Description |
|---|---|
| TwineConsumer | Interface that accepts a sequence of Unicode codepoints. |
| UnicodeWriter | Interface that accepts strings in the form of UnicodeString objects, which are written to some destination. |
| UniStringConsumer | Interface that accepts a string in the form of a sequence of CharSequences, which are conceptually concatenated (though in some implementations, the final string may never be materialized in memory). |

| Class | Description |
|---|---|
| AbstractUniStringConsumer | This abstract implementation of UniStringConsumer exists largely for C#, as a place to capture the default methods defined in the interface and to avoid them proliferating into multiple subclasses. |
| BMPString | An implementation of UnicodeString that wraps a Java string which is known to contain no surrogates. |
| CodepointIterator | Iterator over a string to produce a sequence of single-character strings. |
| CompressedWhitespace | This class provides a compressed representation of a sequence of whitespace characters. |
| EmptyUnicodeString | A zero-length Unicode string. |
| IndentWhitespace | This class provides a compressed representation of a string used to represent indentation: specifically, an integer number of newlines followed by an integer number of spaces. |
| LargeTextBuffer | The segments (other than the last) have a fixed size of 65536 codepoints, and each segment may use one, two, or three bytes per codepoint, depending on the largest codepoint present in the segment. |
| Slice16 | A Unicode string consisting entirely of 16-bit BMP characters, implemented as a range of an underlying byte array. |
| Slice24 | A Unicode string consisting of 24-bit characters, implemented as a range of an underlying byte array holding three bytes per codepoint. |
| Slice8 | A Unicode string consisting entirely of 8-bit characters, implemented as a range of an underlying byte array. |
| StringConstants | Contains constants representing some frequently used strings, either as a UnicodeString or in some cases as a byte array. |
| StringTool | |
| StringView | An implementation of the CodePoints interface that wraps an ordinary Java string. |
| ToLower | Class to perform lowercase conversion. |
| ToUpper | Class to perform uppercase conversion. |
| Twine16 | Twine16 is a Unicode string consisting entirely of codepoints in the range 0-65535 (that is, the basic multilingual plane), excluding surrogates. |
| Twine24 | Twine24 is a Unicode string that accommodates any codepoint value up to 24 bits. |
| Twine8 | Twine8 is a Unicode string whose codepoints are all in the range 0-255 (that is, Latin-1). |
| UnicodeBuilder | Builder class to construct a UnicodeString by appending text incrementally. |
| UnicodeChar | A UnicodeString containing a single codepoint. |
| UnicodeString | A UnicodeString is a sequence of Unicode codepoints that supports codepoint addressing. |
| UnicodeWriterToWriter | Implementation of UnicodeWriter that converts Unicode strings to ordinary Java strings and sends them to a supplied Writer. |
| WhitespaceString | This abstract class represents a couple of different implementations of strings containing whitespace only. |
| ZenoString | A ZenoString is an implementation of UnicodeString that comprises a list of segments representing substrings of the total string. |
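
Several of the classes above (Slice8, Slice16, Slice24, the Twine variants, and the segments of LargeTextBuffer) store one, two, or three bytes per codepoint, chosen by the largest codepoint present. The following is a minimal, self-contained sketch of that idea using only the standard Java library. It is not the package's own code: the class name `CodepointWidthSketch`, its methods, and the big-endian byte order are all assumptions made for illustration.

```java
public class CodepointWidthSketch {

    /** Returns 1, 2, or 3: the bytes per codepoint needed for the largest codepoint in s. */
    static int bytesPerCodepoint(String s) {
        int max = s.codePoints().max().orElse(0);
        if (max <= 0xFF) {
            return 1;   // Latin-1 range
        }
        if (max <= 0xFFFF) {
            return 2;   // basic multilingual plane
        }
        return 3;       // anything up to 24 bits
    }

    /** Packs the codepoints of s into a byte array at a fixed width (big-endian: an assumption). */
    static byte[] pack(String s) {
        int width = bytesPerCodepoint(s);
        int[] cps = s.codePoints().toArray();
        byte[] out = new byte[cps.length * width];
        for (int i = 0; i < cps.length; i++) {
            for (int b = 0; b < width; b++) {
                out[i * width + b] = (byte) (cps[i] >>> (8 * (width - 1 - b)));
            }
        }
        return out;
    }

    /** Reads the codepoint at the given index directly, without scanning earlier codepoints. */
    static int codepointAt(byte[] packed, int width, int index) {
        int cp = 0;
        for (int b = 0; b < width; b++) {
            cp = (cp << 8) | (packed[index * width + b] & 0xFF);
        }
        return cp;
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00";              // 'a' followed by U+1F600
        int width = bytesPerCodepoint(s);
        byte[] packed = pack(s);
        System.out.println(width);                // prints 3
        System.out.println(packed.length);        // prints 6
        System.out.println(Integer.toHexString(codepointAt(packed, width, 1))); // prints 1f600
    }
}
```

Because every codepoint occupies the same number of bytes, the position of codepoint `index` is simply `index * width`, which is what makes direct addressing possible in this kind of layout.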
This package contains classes used to handle Unicode strings: notably implementations of the
UnicodeString interface, which represents a string as a sequence of directly-addressable
Unicode codepoints (without relying on surrogate pairs).
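
To illustrate why direct codepoint addressing differs from indexing an ordinary Java string, here is a small sketch that uses only `java.lang.String`. It does not use the UnicodeString API itself; the class name `CodepointAddressingSketch` and its helper method are hypothetical.

```java
public class CodepointAddressingSketch {

    /** Expands a Java string into an array with one entry per Unicode codepoint. */
    static int[] toCodepoints(String s) {
        return s.codePoints().toArray();
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (stored as a surrogate pair in UTF-16), 'b'
        String s = "a\uD83D\uDE00b";
        int[] cps = toCodepoints(s);

        System.out.println(s.length());    // prints 4: UTF-16 code units
        System.out.println(cps.length);    // prints 3: Unicode codepoints

        // Index 2 of the Java string is the low surrogate, not 'b'.
        System.out.println(Character.isLowSurrogate(s.charAt(2))); // prints true

        // Index 2 of the codepoint array is 'b', as a reader would expect.
        System.out.println((char) cps[2]); // prints b
    }
}
```

With a codepoint-addressed representation, positions count characters rather than UTF-16 code units, so no index can land in the middle of a surrogate pair.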
Copyright (c) 2004-2022 Saxonica Limited. All rights reserved.