public final class UCharacterProperty extends Object
Internal class used for the Unicode character property database.
This class stores binary data read from uprops.icu. It cannot parse the data into higher-level information; it only returns the raw bytes when required.
Because of the form most commonly used for retrieval, an array of char is used to store the binary data.
UCharacterPropertyDB also contains information on accessing indexes to significant points in the binary data.
Responsibility for molding the binary data into a more meaningful form lies with UCharacter.
| Modifier and Type | Field and Description |
|---|---|
| static UCharacterProperty | INSTANCE |
| static char | LATIN_CAPITAL_LETTER_I_WITH_DOT_ABOVE_<br>Latin capital letter i with dot above |
| static char | LATIN_SMALL_LETTER_DOTLESS_I_<br>Latin small letter dotless i |
| static char | LATIN_SMALL_LETTER_I_<br>Latin lowercase i |
| char[] | m_scriptExtensions_<br>Script_Extensions data |
| Trie2_16 | m_trie_<br>Trie data |
| VersionInfo | m_unicodeVersion_<br>Unicode version |
| static int | MAX_SCRIPT |
| static int | SCRIPT_HIGH_MASK |
| static int | SCRIPT_HIGH_SHIFT |
| static int | SCRIPT_LOW_MASK<br>Integer properties mask and shift values for scripts. |
| static int | SCRIPT_X_MASK<br>Script_Extensions: mask includes Script |
| static int | SCRIPT_X_WITH_COMMON |
| static int | SCRIPT_X_WITH_INHERITED |
| static int | SCRIPT_X_WITH_OTHER |
| static int | SRC_BIDI<br>From ubidi_props.c/ubidi.icu |
| static int | SRC_CASE<br>From ucase.c/ucase.icu |
| static int | SRC_CASE_AND_NORM<br>From ucase.c/ucase.icu as well as unorm.cpp/unorm.icu |
| static int | SRC_CHAR<br>From uchar.c/uprops.icu main trie |
| static int | SRC_CHAR_AND_PROPSVEC<br>From uchar.c/uprops.icu main trie as well as properties vectors trie |
| static int | SRC_COUNT<br>One more than the highest UPropertySource (SRC_) constant. |
| static int | SRC_EMOJI |
| static int | SRC_ID_COMPAT_MATH |
| static int | SRC_IDSU |
| static int | SRC_INPC |
| static int | SRC_INSC |
| static int | SRC_NAMES<br>From unames.c/unames.icu |
| static int | SRC_NFC<br>From normalizer2impl.cpp/nfc.nrm |
| static int | SRC_NFC_CANON_ITER<br>From normalizer2impl.cpp/nfc.nrm canonical iterator data |
| static int | SRC_NFKC<br>From normalizer2impl.cpp/nfkc.nrm |
| static int | SRC_NFKC_CF<br>From normalizer2impl.cpp/nfkc_cf.nrm |
| static int | SRC_NONE<br>No source, not a supported property. |
| static int | SRC_PROPSVEC<br>From uchar.c/uprops.icu properties vectors trie |
| static int | SRC_VO |
| static int | TYPE_MASK<br>Character type mask |
| Modifier and Type | Method and Description |
|---|---|
| UnicodeSet | addPropertyStarts(UnicodeSet set) |
| int | digit(int c) |
| int | getAdditional(int codepoint, int column)<br>Gets the Unicode additional properties. |
| VersionInfo | getAge(int codepoint)<br>Gets the "age" of the code point. |
| static int | getEuropeanDigit(int ch)<br>Returns the digit values of characters like 'A' - 'Z', normal, half-width and full-width. |
| int | getIntPropertyMaxValue(int which) |
| int | getIntPropertyValue(int c, int which) |
| static int | getMask(int type)<br>Gets the type mask |
| int | getMaxValues(int column)<br>Gets the maximum values for some enum/int properties. |
| int | getNumericValue(int c) |
| int | getProperty(int ch)<br>Gets the main property value for code point ch. |
| int | getType(int c) |
| double | getUnicodeNumericValue(int c) |
| boolean | hasBinaryProperty(int c, int which) |
| static int | mergeScriptCodeOrIndex(int scriptX) |
| void | upropsvec_addPropertyStarts(UnicodeSet set) |
public static final UCharacterProperty INSTANCE
public Trie2_16 m_trie_
public VersionInfo m_unicodeVersion_
public static final char LATIN_CAPITAL_LETTER_I_WITH_DOT_ABOVE_
public static final char LATIN_SMALL_LETTER_DOTLESS_I_
public static final char LATIN_SMALL_LETTER_I_
public static final int TYPE_MASK
public static final int SRC_NONE
public static final int SRC_CHAR
public static final int SRC_PROPSVEC
public static final int SRC_NAMES
public static final int SRC_CASE
public static final int SRC_BIDI
public static final int SRC_CHAR_AND_PROPSVEC
public static final int SRC_CASE_AND_NORM
public static final int SRC_NFC
public static final int SRC_NFKC
public static final int SRC_NFKC_CF
public static final int SRC_NFC_CANON_ITER
public static final int SRC_INPC
public static final int SRC_INSC
public static final int SRC_VO
public static final int SRC_EMOJI
public static final int SRC_IDSU
public static final int SRC_ID_COMPAT_MATH
public static final int SRC_COUNT
public char[] m_scriptExtensions_
public static final int SCRIPT_X_MASK
public static final int SCRIPT_HIGH_MASK
public static final int SCRIPT_HIGH_SHIFT
public static final int MAX_SCRIPT
public static final int SCRIPT_LOW_MASK
public static final int SCRIPT_X_WITH_COMMON
public static final int SCRIPT_X_WITH_INHERITED
public static final int SCRIPT_X_WITH_OTHER
public final int getProperty(int ch)
ch - code point whose property value is to be retrieved
public int getAdditional(int codepoint,
int column)
codepoint - code point whose additional properties are to be retrieved
column - The column index.
public VersionInfo getAge(int codepoint)
Get the "age" of the code point.
The "age" is the Unicode version when the code point was first designated (as a non-character or for Private Use) or assigned a character.
This can be useful to avoid emitting code points to receiving processes that do not accept newer characters.
The data is from the UCD file DerivedAge.txt.
This API does not check the validity of the codepoint.
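The filtering use case described above can be sketched as follows. This is a minimal, standalone illustration and not ICU code: the `AgeFilterSketch` class, the packed-int version encoding, and the `toyAge` lookup are all hypothetical stand-ins for a real age source such as `getAge`, and the toy table covers only two code points.

```java
import java.util.function.IntUnaryOperator;

final class AgeFilterSketch {
    /** Packs major.minor into one int so versions compare with plain integer comparison. */
    static int version(int major, int minor) {
        return (major << 8) | minor;
    }

    /**
     * Replaces every code point whose age is newer than maxVersion with U+FFFD.
     * ageOf is a hypothetical stand-in for an age lookup such as getAge.
     */
    static String dropNewerThan(String text, int maxVersion, IntUnaryOperator ageOf) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp ->
            out.appendCodePoint(ageOf.applyAsInt(cp) <= maxVersion ? cp : 0xFFFD));
        return out.toString();
    }

    /** Toy age table for demonstration only; real data comes from DerivedAge.txt. */
    static int toyAge(int cp) {
        if (cp == 0x20AC) return version(2, 1);   // EURO SIGN
        if (cp == 0x1F600) return version(6, 1);  // GRINNING FACE
        return version(1, 1);                     // assumption: everything else is "old"
    }
}
```

A sender that knows its receiver only understands Unicode 3.0 could call `dropNewerThan(text, version(3, 0), AgeFilterSketch::toyAge)` before emitting the text.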
codepoint - The code point.
public boolean hasBinaryProperty(int c,
int which)
public int getType(int c)
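How a character type might be pulled out of a packed property word, and how a type mask can then be used for set-membership tests, is sketched below. This page does not document the bit layout, so everything here is an assumption: that `TYPE_MASK` selects the low 5 bits, that `getMask` maps a type to a single bit (`1 << type`), and that the category numbers used in the usage note are illustrative. The class and method names are hypothetical.

```java
final class TypeMaskSketch {
    // Assumption: the character type occupies the low 5 bits of the property word.
    static final int TYPE_MASK = 0x1F;

    /** Extracts the (assumed) type field from a packed property word. */
    static int typeOf(int props) {
        return props & TYPE_MASK;
    }

    /** Maps a type number to a one-bit mask so several types can be OR-ed together. */
    static int maskOf(int type) {
        return 1 << type;
    }

    /** Tests whether the type stored in props belongs to the set described by typeSet. */
    static boolean isOneOf(int props, int typeSet) {
        return (maskOf(typeOf(props)) & typeSet) != 0;
    }
}
```

With hypothetical type numbers 1 (uppercase letter) and 2 (lowercase letter), `isOneOf(props, maskOf(1) | maskOf(2))` answers "is this a cased letter?" with a single AND instead of two comparisons.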
public int getIntPropertyValue(int c,
int which)
public int getIntPropertyMaxValue(int which)
public int getMaxValues(int column)
public static final int getMask(int type)
type - character type
public static int getEuropeanDigit(int ch)
ch - character to test
public int digit(int c)
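The behaviour described for getEuropeanDigit (letters used as digits, in both normal and full-width forms) can be illustrated with a small standalone sketch. It is not the library's implementation; it assumes that 'A'/'a' map to 10 through 'Z'/'z' to 35, as in radix-36 notation, and that any other character yields -1. The class and method names are hypothetical.

```java
final class EuropeanDigitSketch {
    /**
     * Returns 10..35 for ASCII and full-width Latin letters, or -1 otherwise.
     * The value range and the -1 sentinel are assumptions of this sketch.
     */
    static int europeanDigit(int ch) {
        if (ch >= 'A' && ch <= 'Z') return ch - 'A' + 10;
        if (ch >= 'a' && ch <= 'z') return ch - 'a' + 10;
        // U+FF21..U+FF3A: FULLWIDTH LATIN CAPITAL LETTER A..Z
        if (ch >= 0xFF21 && ch <= 0xFF3A) return ch - 0xFF21 + 10;
        // U+FF41..U+FF5A: FULLWIDTH LATIN SMALL LETTER A..Z
        if (ch >= 0xFF41 && ch <= 0xFF5A) return ch - 0xFF41 + 10;
        return -1;
    }
}
```

For example, `europeanDigit('F')` and `europeanDigit(0xFF26)` (full-width F) both give 15, which is what a hexadecimal parser would need.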
public int getNumericValue(int c)
public double getUnicodeNumericValue(int c)
public static final int mergeScriptCodeOrIndex(int scriptX)
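The SCRIPT_HIGH_MASK, SCRIPT_HIGH_SHIFT and SCRIPT_LOW_MASK constants suggest that a script code is split into a high part and a low part inside a wider integer. A minimal sketch of how such a split value could be merged back together is shown below. The constant values are assumptions chosen for illustration (this page does not list them), and the class is hypothetical rather than the library's code.

```java
final class ScriptCodeSketch {
    // Assumed values; the real constants are not given on this page.
    static final int SCRIPT_HIGH_MASK = 0x00300000;  // two high bits of the code
    static final int SCRIPT_HIGH_SHIFT = 12;         // moves them down to bits 9..8
    static final int SCRIPT_LOW_MASK = 0x000000FF;   // low eight bits of the code

    /** Joins the high and low bit groups of scriptX into one contiguous value. */
    static int merge(int scriptX) {
        return ((scriptX & SCRIPT_HIGH_MASK) >> SCRIPT_HIGH_SHIFT)
                | (scriptX & SCRIPT_LOW_MASK);
    }
}
```

Under these assumed values, a word with all high and low bits set merges to 0x3FF, and any bits outside the two masks are ignored.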
public UnicodeSet addPropertyStarts(UnicodeSet set)
public void upropsvec_addPropertyStarts(UnicodeSet set)