Class Normalizer2Impl

java.lang.Object
org.graalvm.shadowed.com.ibm.icu.impl.Normalizer2Impl

public final class Normalizer2Impl extends Object
Low-level implementation of the Unicode Normalization Algorithm. For the data structure and details, see the documentation at the end of the C++ header normalizer2impl.h and the design doc at https://unicode-org.github.io/icu/design/normalization/custom.html
  • Field Details

  • Constructor Details

    • Normalizer2Impl

      public Normalizer2Impl()
  • Method Details

    • load

      public Normalizer2Impl load(ByteBuffer bytes)
    • load

      public Normalizer2Impl load(String name)
    • addLcccChars

      public void addLcccChars(UnicodeSet set)
    • addPropertyStarts

      public void addPropertyStarts(UnicodeSet set)
    • addCanonIterPropertyStarts

      public void addCanonIterPropertyStarts(UnicodeSet set)
    • ensureCanonIterData

      public Normalizer2Impl ensureCanonIterData()
      Builds the canonical-iterator data for this instance. This must be called before isCanonSegmentStarter(int) or getCanonStartSet(int, UnicodeSet); otherwise those methods will crash.
      Returns:
      this
    • getNorm16

      public int getNorm16(int c)
    • getRawNorm16

      public int getRawNorm16(int c)
    • getCompQuickCheck

      public int getCompQuickCheck(int norm16)
    • isAlgorithmicNoNo

      public boolean isAlgorithmicNoNo(int norm16)
    • isCompNo

      public boolean isCompNo(int norm16)
    • isDecompYes

      public boolean isDecompYes(int norm16)
    • getCC

      public int getCC(int norm16)
    • getCCFromNormalYesOrMaybe

      public static int getCCFromNormalYesOrMaybe(int norm16)
    • getCCFromYesOrMaybeYes

      public static int getCCFromYesOrMaybeYes(int norm16)
    • getCCFromYesOrMaybeYesCP

      public int getCCFromYesOrMaybeYesCP(int c)
    • getFCD16

      public int getFCD16(int c)
      Returns the FCD data for code point c.
      Parameters:
      c - A Unicode code point.
      Returns:
      The lccc(c) in bits 15..8 and tccc(c) in bits 7..0.
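      The bit layout described above can be unpacked with plain shifts and masks. The following is a minimal sketch that does not call Normalizer2Impl at all: the class name Fcd16Demo and the helpers lccc, tccc and pack are hypothetical, and the sample value is built by hand rather than read from normalization data.

```java
// Hypothetical helper, not part of Normalizer2Impl: it only illustrates the
// documented layout of an FCD value (lccc in bits 15..8, tccc in bits 7..0).
final class Fcd16Demo {
    /** Lead canonical combining class: bits 15..8 of the FCD value. */
    static int lccc(int fcd16) {
        return (fcd16 >> 8) & 0xFF;
    }

    /** Trail canonical combining class: bits 7..0 of the FCD value. */
    static int tccc(int fcd16) {
        return fcd16 & 0xFF;
    }

    /** Packs two combining classes into the same layout (for the demo only). */
    static int pack(int lccc, int tccc) {
        return ((lccc & 0xFF) << 8) | (tccc & 0xFF);
    }

    public static void main(String[] args) {
        // Assumed sample value: lccc = 230, tccc = 220.
        int fcd16 = pack(230, 220);
        System.out.println("lccc=" + lccc(fcd16) + " tccc=" + tccc(fcd16));
    }
}
```

      With a real instance, the argument to lccc and tccc would be the int returned by getFCD16(c).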
    • singleLeadMightHaveNonZeroFCD16

      public boolean singleLeadMightHaveNonZeroFCD16(int lead)
      Returns true if the single-or-lead code unit lead might have non-zero FCD data.
    • getFCD16FromNormData

      public int getFCD16FromNormData(int c)
      Gets the FCD value from the regular normalization data.
    • getDecomposition

      public String getDecomposition(int c)
      Gets the decomposition for one code point.
      Parameters:
      c - code point
      Returns:
      c's decomposition, if it has one; returns null if it does not have a decomposition
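      The "decomposition or null" contract can be illustrated without this class. The sketch below is only a stand-in: it uses the JDK's java.text.Normalizer with canonical decomposition (NFD), which is an assumption, since the result of getDecomposition(int) depends on the normalization data loaded into the Normalizer2Impl instance. The class name DecompositionSketch and the method decompositionOrNull are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical stand-in that mimics the documented contract with the JDK's
// own normalizer; it does not use Normalizer2Impl or its data.
final class DecompositionSketch {
    /**
     * Returns the NFD form of code point c, or null if c is unchanged by NFD,
     * that is, if c has no canonical decomposition.
     */
    static String decompositionOrNull(int c) {
        String original = new String(Character.toChars(c));
        String nfd = Normalizer.normalize(original, Normalizer.Form.NFD);
        return nfd.equals(original) ? null : nfd;
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute) decomposes into 'e' followed by U+0301.
        String d = decompositionOrNull(0x00E9);
        System.out.println(d == null ? "null" : d.length() + " chars");
        // 'a' has no decomposition, so the sketch returns null.
        System.out.println(decompositionOrNull('a'));
    }
}
```

      Callers of the real method need the same null check before using the returned string.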
    • getRawDecomposition

      public String getRawDecomposition(int c)
      Gets the raw decomposition for one code point.
      Parameters:
      c - code point
      Returns:
      c's raw decomposition, if it has one; returns null if it does not have a decomposition
    • isCanonSegmentStarter

      public boolean isCanonSegmentStarter(int c)
      Returns true if code point c starts a canonical-iterator string segment. ensureCanonIterData() must have been called before this method, or else this method will crash.
      Parameters:
      c - A Unicode code point.
      Returns:
      true if c starts a canonical-iterator string segment.
    • getCanonStartSet

      public boolean getCanonStartSet(int c, UnicodeSet set)
      Returns true if there are characters whose decomposition starts with c. If so, then the set is cleared and then filled with those characters. ensureCanonIterData() must have been called before this method, or else this method will crash.
      Parameters:
      c - A Unicode code point.
      set - A UnicodeSet to receive the characters whose decompositions start with c, if there are any.
      Returns:
      true if there are characters whose decomposition starts with c.
    • decompose

      public Appendable decompose(CharSequence s, StringBuilder dest)
    • decompose

      public void decompose(CharSequence s, int src, int limit, StringBuilder dest, int destLengthEstimate)
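      Decomposes the range s[src, limit) and writes the result to dest. limit is an int index into s; the C++ convention that limit can be NULL for a NUL-terminated source does not apply to this Java signature. destLengthEstimate is the initial dest buffer capacity and can be -1.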
    • decompose

      public int decompose(CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
    • decomposeAndAppend

      public void decomposeAndAppend(CharSequence s, boolean doDecompose, Normalizer2Impl.ReorderingBuffer buffer)
    • compose

      public boolean compose(CharSequence s, int src, int limit, boolean onlyContiguous, boolean doCompose, Normalizer2Impl.ReorderingBuffer buffer)
    • composeQuickCheck

      public int composeQuickCheck(CharSequence s, int src, int limit, boolean onlyContiguous, boolean doSpan)
      Very similar to compose(): make the same changes in both places if relevant. If doSpan is true, this behaves as spanQuickCheckYes (ignore bit 0 of the return value); if doSpan is false, it behaves as quickCheck.
      Returns:
      Bits 31..1 hold the spanQuickCheckYes length (equal to s.length() if the result is "yes"), and bit 0 is set if the result is "maybe". If bit 0 is clear and the span length is less than s.length(), the quick check result is "no".
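      The packed return value can be decoded as follows. This is a minimal sketch of one reading of the description above, not code taken from the library: the class QuickCheckResultDemo, the enum Result and the methods spanLength, isMaybe and interpret are hypothetical names, and the packed values in main are built by hand.

```java
// Hypothetical decoder for the packed int described above:
// bits 31..1 = span length, bit 0 = "maybe" flag.
final class QuickCheckResultDemo {
    enum Result { YES, MAYBE, NO }

    /** Span length stored in bits 31..1. */
    static int spanLength(int packed) {
        return packed >>> 1;
    }

    /** True if bit 0 is set, meaning the quick check result is "maybe". */
    static boolean isMaybe(int packed) {
        return (packed & 1) != 0;
    }

    /** Interprets the packed value for an input of the given length. */
    static Result interpret(int packed, int length) {
        if (isMaybe(packed)) {
            return Result.MAYBE;
        }
        return spanLength(packed) == length ? Result.YES : Result.NO;
    }

    public static void main(String[] args) {
        int length = 5;
        System.out.println(interpret(5 << 1, length));       // span covers the whole input
        System.out.println(interpret((3 << 1) | 1, length)); // "maybe" bit set
        System.out.println(interpret(3 << 1, length));       // span stops early, no "maybe"
    }
}
```

      When doSpan is true, only spanLength is meaningful, since bit 0 is to be ignored in that mode.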
    • composeAndAppend

      public void composeAndAppend(CharSequence s, boolean doCompose, boolean onlyContiguous, Normalizer2Impl.ReorderingBuffer buffer)
    • makeFCD

      public int makeFCD(CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
    • makeFCDAndAppend

      public void makeFCDAndAppend(CharSequence s, boolean doMakeFCD, Normalizer2Impl.ReorderingBuffer buffer)
    • hasDecompBoundaryBefore

      public boolean hasDecompBoundaryBefore(int c)
    • norm16HasDecompBoundaryBefore

      public boolean norm16HasDecompBoundaryBefore(int norm16)
    • hasDecompBoundaryAfter

      public boolean hasDecompBoundaryAfter(int c)
    • norm16HasDecompBoundaryAfter

      public boolean norm16HasDecompBoundaryAfter(int norm16)
    • isDecompInert

      public boolean isDecompInert(int c)
    • hasCompBoundaryBefore

      public boolean hasCompBoundaryBefore(int c)
    • hasCompBoundaryAfter

      public boolean hasCompBoundaryAfter(int c, boolean onlyContiguous)
    • isCompInert

      public boolean isCompInert(int c, boolean onlyContiguous)
    • hasFCDBoundaryBefore

      public boolean hasFCDBoundaryBefore(int c)
    • hasFCDBoundaryAfter

      public boolean hasFCDBoundaryAfter(int c)
    • isFCDInert

      public boolean isFCDInert(int c)
    • composePair

      public int composePair(int a, int b)