Class URIUtility


  • public final class URIUtility
    extends java.lang.Object
    Contains utility methods for processing Uniform Resource Identifiers (URIs) and Internationalized Resource Identifiers (IRIs) under RFC3986 and RFC3987, respectively. In the following documentation, URIs and IRIs include URI references and IRI references, for convenience.

    There are five components to a URI: scheme, authority, path, query, and fragment identifier. The generic syntax of these components is defined in RFC3986 and extended in RFC3987. According to RFC3986, different URI schemes can further restrict the syntax of the authority, path, and query components (see also RFC 7320). However, the syntax of fragment identifiers depends on the media type (also known as MIME type) of the resource a URI references (see also RFC 3986 and RFC 7320). As of September 3, 2019, only the following media types specify a syntax for fragment identifiers:

    • The following application/* media types: epub+zip, pdf, senml+cbor, senml+json, senml-exi, sensml+cbor, sensml+json, sensml-exi, smil, vnd.3gpp-v2x-local-service-information, vnd.3gpp.mcdata-signalling, vnd.collection.doc+json, vnd.hc+json, vnd.hyper+json, vnd.hyper-item+json, vnd.mason+json, vnd.microsoft.portable-executable, vnd.oma.bcast.sgdu, vnd.shootproof+json
    • The following image/* media types: avci, avcs, heic, heic-sequence, heif, heif-sequence, hej2k, hsj2, jxra, jxrs, jxsi, jxss
    • The XML media types: application/xml, application/xml-external-parsed-entity, text/xml, text/xml-external-parsed-entity, application/xml-dtd
    • All media types with subtypes ending in "+xml" (see RFC 7303) use XPointer Framework syntax as fragment identifiers, except the following application/* media types: dicom+xml (syntax not defined), senml+xml (own syntax), sensml+xml (own syntax), ttml+xml (own syntax), xliff+xml (own syntax), yang-data+xml (syntax not defined)
    • font/collection
    • multipart/x-mixed-replace
    • text/plain
    • text/csv
    • text/html
    • text/markdown
    • text/vnd.a
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  URIUtility.ParseMode
      Specifies whether certain characters are allowed when parsing IRIs and URIs.
    • Method Summary

      Modifier and Type Method Description
      static java.lang.String BuildIRI​(java.lang.String schemeAndAuthority, java.lang.String path, java.lang.String query, java.lang.String fragment)
      Builds an internationalized resource identifier (IRI) from its components.
      static java.lang.String DirectoryPath​(java.lang.String uref)
      Extracts the scheme, the authority, and the path component (up to and including the last "/" in the path if any) from the given URI or IRI, using the IRIStrict parse mode to check the URI or IRI.
      static java.lang.String DirectoryPath​(java.lang.String uref, URIUtility.ParseMode parseMode)
      Extracts the scheme, the authority, and the path component (up to and including the last "/" in the path if any) from the given URI or IRI, using the given parse mode to check the URI or IRI.
      static java.lang.String EncodeStringForURI​(java.lang.String s)
      Encodes characters other than "unreserved" characters for URIs.
      static java.lang.String EscapeURI​(java.lang.String s, int mode)
      Escapes characters that can't appear in URIs or IRIs.
      static boolean HasScheme​(java.lang.String refValue)
      Determines whether the string is a valid IRI with a scheme component.
      static boolean HasSchemeForURI​(java.lang.String refValue)
      Determines whether the string is a valid URI with a scheme component.
      static boolean IsValidCurieReference​(java.lang.String s, int offset, int length)
      Determines whether the substring is a valid CURIE reference under RDFA 1.1.
      static boolean IsValidIRI​(java.lang.String s)
      Returns whether a string is a valid IRI according to the IRIStrict parse mode.
      static boolean IsValidIRI​(java.lang.String s, URIUtility.ParseMode mode)
      Returns whether a string is a valid IRI according to the given parse mode.
      static java.lang.String PercentDecode​(java.lang.String str)
      Decodes percent-encoding (of the form "%XX" where X is a hexadecimal digit) in the given string.
      static java.lang.String PercentDecode​(java.lang.String str, int index, int endIndex)
      Decodes percent-encoding (of the form "%XX" where X is a hexadecimal digit) in the given portion of a string.
      static java.lang.String RelativeResolve​(java.lang.String refValue, java.lang.String absoluteBase)
      Resolves a URI or IRI relative to another URI or IRI.
      static java.lang.String RelativeResolve​(java.lang.String refValue, java.lang.String absoluteBase, URIUtility.ParseMode parseMode)
      Resolves a URI or IRI relative to another URI or IRI.
      static java.lang.String RelativeResolveWithinBaseURI​(java.lang.String refValue, java.lang.String absoluteBase)
      Resolves a URI or IRI relative to another URI or IRI, but only if the resolved URI has no "." or ".." component in its path and only if the resolved URI's directory path matches that of the second URI or IRI.
      static int[] SplitIRI​(java.lang.String s)
      Parses an Internationalized Resource Identifier (IRI) reference under RFC3987.
      static int[] SplitIRI​(java.lang.String s, int offset, int length, URIUtility.ParseMode parseMode)
      Parses a substring that represents an Internationalized Resource Identifier (IRI) under RFC3987.
      static int[] SplitIRI​(java.lang.String s, URIUtility.ParseMode parseMode)
      Parses an Internationalized Resource Identifier (IRI) reference under RFC3987.
      static java.lang.String[] SplitIRIToStrings​(java.lang.String s)
      Parses an Internationalized Resource Identifier (IRI) reference under RFC3987.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • EscapeURI

        public static java.lang.String EscapeURI​(java.lang.String s,
                                                 int mode)
        Escapes characters that can't appear in URIs or IRIs. The function is idempotent; that is, calling the function again on the result with the same mode doesn't change the result.
        Parameters:
        s - A string to escape.
        mode - Has the following meaning:
        0 = Encode reserved code points, code points below U+0021, code points above U+007E, and square brackets within the authority component, and do the IRISurrogateLenient check.
        1 = Encode code points above U+007E, and square brackets within the authority component, and do the IRIStrict check.
        2 = Same as 1, except the check is IRISurrogateLenient.
        3 = Same as 0, except that percent characters that begin illegal percent-encoding are also encoded.
        Returns:
        A string possibly containing escaped characters, or null if s is null.
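The idempotence property can be illustrated with a much-reduced stand-in. The sketch below is not the library's implementation: the class EscapeSketch and the method escapeNonAscii are hypothetical names, and the method only percent-encodes code points above U+007E as UTF-8 bytes (roughly the encoding step shared by modes 1 and 2), with no validity check and no special handling of square brackets. Upper-case hexadecimal digits are an assumption.

```java
import java.nio.charset.StandardCharsets;

final class EscapeSketch {
  private static final char[] HEX = "0123456789ABCDEF".toCharArray();

  // Hypothetical helper: percent-encodes each code point above U+007E
  // as its UTF-8 bytes and copies every other code point unchanged.
  static String escapeNonAscii(String s) {
    if (s == null) {
      return null;
    }
    StringBuilder sb = new StringBuilder();
    int i = 0;
    while (i < s.length()) {
      int cp = s.codePointAt(i);
      i += Character.charCount(cp);
      if (cp <= 0x7E) {
        sb.appendCodePoint(cp);
      } else {
        byte[] utf8 =
            new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
        for (byte b : utf8) {
          sb.append('%').append(HEX[(b >> 4) & 15]).append(HEX[b & 15]);
        }
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String once = escapeNonAscii("http://example.com/caf\u00e9");
    String twice = escapeNonAscii(once);
    System.out.println(once);
    // The output of the first pass is pure ASCII, so a second pass
    // leaves it unchanged.
    System.out.println(once.equals(twice));
  }
}
```

Because the escaped output contains no code point above U+007E, applying the helper again has nothing left to encode, which is the sense in which the documented function is idempotent.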
      • HasScheme

        public static boolean HasScheme​(java.lang.String refValue)
        Determines whether the string is a valid IRI with a scheme component. This can be used to check for relative IRI references.

        The following cases return true:

        xx-x:mm example:/ww
        The following cases return false:
        x@y:/z /x/y/z example.xyz
        Parameters:
        refValue - A string representing an IRI to check.
        Returns:
        true if the string is a valid IRI with a scheme component; otherwise, false.
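The true and false cases listed above follow from the scheme grammar in RFC 3986 (a letter followed by letters, digits, "+", "-", or ".", then a colon). The following sketch checks only that prefix and does not validate the rest of the string as an IRI, so it is weaker than the documented method. The class SchemeSketch and the method startsWithScheme are hypothetical names.

```java
import java.util.regex.Pattern;

final class SchemeSketch {
  // scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
  private static final Pattern SCHEME_PREFIX =
      Pattern.compile("^[A-Za-z][A-Za-z0-9+.\\-]*:");

  // Hypothetical helper: true if the string begins with a syntactically
  // valid scheme and a colon. The remainder of the string is not checked.
  static boolean startsWithScheme(String s) {
    return s != null && SCHEME_PREFIX.matcher(s).find();
  }

  public static void main(String[] args) {
    String[] samples = {"xx-x:mm", "example:/ww", "x@y:/z", "/x/y/z", "example.xyz"};
    for (String sample : samples) {
      System.out.println(sample + " -> " + startsWithScheme(sample));
    }
  }
}
```

In this sketch, "x@y:/z" is rejected because "@" cannot appear in a scheme, and "example.xyz" is rejected because no colon follows the candidate scheme.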
      • HasSchemeForURI

        public static boolean HasSchemeForURI​(java.lang.String refValue)
        Determines whether the string is a valid URI with a scheme component. This can be used to check for relative URI references. The following cases return true:
        http://example/z xx-x:mm example:/ww
        The following cases return false:
        x@y:/z /x/y/z example.xyz
        Parameters:
        refValue - A string representing a URI to check.
        Returns:
        true if the string is a valid URI with a scheme component; otherwise, false.
      • PercentDecode

        public static java.lang.String PercentDecode​(java.lang.String str)
        Decodes percent-encoding (of the form "%XX" where X is a hexadecimal digit) in the given string. Successive percent-encoded bytes are assumed to form characters in UTF-8.
        Parameters:
        str - A string that may contain percent encoding. May be null.
        Returns:
        The string in which percent-encoding was decoded.
      • PercentDecode

        public static java.lang.String PercentDecode​(java.lang.String str,
                                                     int index,
                                                     int endIndex)
        Decodes percent-encoding (of the form "%XX" where X is a hexadecimal digit) in the given portion of a string. Successive percent-encoded bytes are assumed to form characters in UTF-8.
        Parameters:
        str - A string a portion of which may contain percent encoding. May be null.
        index - Index starting at 0 showing where the desired portion of str begins.
        endIndex - Index starting at 0 showing where the desired portion of str ends. The character before this index is the last character.
        Returns:
        The portion of the given string in which percent-encoding was decoded. Returns null if str is null.
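The decoding described for both overloads can be sketched as follows. This is an illustration, not the library's code: the class PercentDecodeSketch is a hypothetical name, runs of successive "%XX" bytes are collected and decoded together as UTF-8, and two behaviors are assumptions of the sketch: a "%" not followed by two hexadecimal digits is copied unchanged, and malformed UTF-8 is replaced the way java.lang.String does by default. No bounds checking is done beyond what String.charAt performs.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class PercentDecodeSketch {
  private static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
  }

  // Appends any collected bytes to the output as UTF-8 text.
  private static void flush(ByteArrayOutputStream pending, StringBuilder out) {
    if (pending.size() > 0) {
      out.append(new String(pending.toByteArray(), StandardCharsets.UTF_8));
      pending.reset();
    }
  }

  static String percentDecode(String str) {
    return str == null ? null : percentDecode(str, 0, str.length());
  }

  // Decodes the portion of str from index (inclusive) to endIndex (exclusive).
  static String percentDecode(String str, int index, int endIndex) {
    if (str == null) {
      return null;
    }
    StringBuilder out = new StringBuilder();
    ByteArrayOutputStream pending = new ByteArrayOutputStream();
    int i = index;
    while (i < endIndex) {
      char c = str.charAt(i);
      int hi = (c == '%' && i + 2 < endIndex) ? hexValue(str.charAt(i + 1)) : -1;
      int lo = hi >= 0 ? hexValue(str.charAt(i + 2)) : -1;
      if (lo >= 0) {
        pending.write((hi << 4) | lo);  // collect one decoded byte
        i += 3;
      } else {
        flush(pending, out);
        out.append(c);                  // literal character (assumption for stray "%")
        i++;
      }
    }
    flush(pending, out);
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(percentDecode("caf%C3%A9"));
    System.out.println(percentDecode("x%41%42y", 1, 7));
  }
}
```

Collecting bytes before decoding matters because a single character such as "é" spans two percent-encoded bytes (%C3%A9); decoding each byte on its own would not yield that character.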
      • EncodeStringForURI

        public static java.lang.String EncodeStringForURI​(java.lang.String s)
        Encodes characters other than "unreserved" characters for URIs.
        Parameters:
        s - A string to encode.
        Returns:
        The encoded string.
        Throws:
        java.lang.NullPointerException - The parameter s is null.
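As an illustration of what encoding everything except "unreserved" characters means, the sketch below keeps only the RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~") and percent-encodes every other UTF-8 byte. The class EncodeSketch and the method encodeAllButUnreserved are hypothetical names, and the use of upper-case hexadecimal digits is an assumption rather than documented behavior.

```java
import java.nio.charset.StandardCharsets;

final class EncodeSketch {
  private static final String UNRESERVED =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";
  private static final char[] HEX = "0123456789ABCDEF".toCharArray();

  // Hypothetical helper: copies unreserved ASCII characters and
  // percent-encodes every other byte of the string's UTF-8 form.
  static String encodeAllButUnreserved(String s) {
    if (s == null) {
      throw new NullPointerException("s");
    }
    StringBuilder sb = new StringBuilder();
    for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
      char c = (char) (b & 0xFF);
      if (b >= 0 && UNRESERVED.indexOf(c) >= 0) {
        sb.append(c);
      } else {
        sb.append('%').append(HEX[(b >> 4) & 15]).append(HEX[b & 15]);
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(encodeAllButUnreserved("a b/c~\u00e9"));
  }
}
```

Unlike an escaping routine that preserves URI delimiters, this kind of encoding also encodes "/", "?", and "#", so it suits individual values placed inside a component rather than whole URIs.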
      • IsValidCurieReference

        public static boolean IsValidCurieReference​(java.lang.String s,
                                                    int offset,
                                                    int length)
        Determines whether the substring is a valid CURIE reference under RDFa 1.1. (The CURIE reference is the part after the colon.)
        Parameters:
        s - A string containing a CURIE reference. Can be null.
        offset - An index starting at 0 showing where the desired portion of "s" begins.
        length - The number of elements in the desired portion of "s" (but not more than the length of "s").
        Returns:
        true if the substring is a valid CURIE reference under RDFa 1.1; otherwise, false. Returns false if s is null.
        Throws:
        java.lang.IllegalArgumentException - Either offset or length is less than 0 or greater than the length of s, or the length of s minus offset is less than length.
        java.lang.NullPointerException - The parameter s is null.
      • BuildIRI

        public static java.lang.String BuildIRI​(java.lang.String schemeAndAuthority,
                                                java.lang.String path,
                                                java.lang.String query,
                                                java.lang.String fragment)
        Builds an internationalized resource identifier (IRI) from its components.
        Parameters:
        schemeAndAuthority - A string representing a scheme component, an authority component, or both. Examples of this parameter include "example://example", "example:", and "//example", but not "example". Can be null or empty.
        path - A string representing a path component. Can be null or empty.
        query - The query string. Can be null or empty.
        fragment - The fragment identifier. Can be null or empty.
        Returns:
        An IRI built from the given components.
        Throws:
        java.lang.IllegalArgumentException - Invalid schemeAndAuthority parameter, or the arguments result in an invalid IRI.
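The way the four parameters map onto the generic syntax can be sketched as a plain concatenation with the "?" and "#" delimiters. This is only an illustration: the class BuildSketch and the method build are hypothetical names, no validation is performed (the documented method throws on invalid input), and treating a null component as absent is an assumption, since the documentation above does not say how null and empty values differ.

```java
final class BuildSketch {
  // Hypothetical helper: joins the components with their delimiters.
  // A null component is skipped entirely (an assumption of this sketch).
  static String build(String schemeAndAuthority, String path,
      String query, String fragment) {
    StringBuilder sb = new StringBuilder();
    if (schemeAndAuthority != null) {
      sb.append(schemeAndAuthority);
    }
    if (path != null) {
      sb.append(path);
    }
    if (query != null) {
      sb.append('?').append(query);
    }
    if (fragment != null) {
      sb.append('#').append(fragment);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(build("example://example", "/a/b", "x=1", "frag"));
    System.out.println(build("//example", "/p", null, null));
  }
}
```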
      • IsValidIRI

        public static boolean IsValidIRI​(java.lang.String s)
        Returns whether a string is a valid IRI according to the IRIStrict parse mode.
        Parameters:
        s - A text string. Can be null.
        Returns:
        true if the string is not null and is a valid IRI; otherwise, false.
      • IsValidIRI

        public static boolean IsValidIRI​(java.lang.String s,
                                         URIUtility.ParseMode mode)
        Returns whether a string is a valid IRI according to the given parse mode.
        Parameters:
        s - A text string. Can be null.
        mode - The parse mode to use when checking for a valid IRI.
        Returns:
        true if the string is not null and is a valid IRI; otherwise, false.
      • RelativeResolve

        public static java.lang.String RelativeResolve​(java.lang.String refValue,
                                                       java.lang.String absoluteBase)
        Resolves a URI or IRI relative to another URI or IRI.
        Parameters:
        refValue - A string representing a URI or IRI reference. Example: dir/file.txt.
        absoluteBase - A string representing an absolute URI or IRI reference. Can be null. Example: http://example.com/my/path/.
        Returns:
        The resolved IRI, or null if refValue is null or is not a valid IRI. If absoluteBase is null or is not a valid IRI, returns refValue. Example: http://example.com/my/path/dir/file.txt.
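The documented example can be reproduced with the JDK's own java.net.URI class, used here purely as a stand-in. java.net.URI follows an older URI specification and does not accept unescaped non-ASCII IRIs in the same way, so its results can differ from this class in edge cases. The class ResolveSketch and the method resolve are hypothetical names; the null and invalid-input handling mirrors the return value description above.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class ResolveSketch {
  // Hypothetical helper built on java.net.URI, not on URIUtility.
  static String resolve(String refValue, String absoluteBase) {
    if (refValue == null) {
      return null;
    }
    URI ref;
    try {
      ref = new URI(refValue);
    } catch (URISyntaxException e) {
      return null;  // refValue is not valid
    }
    if (absoluteBase == null) {
      return refValue;
    }
    URI base;
    try {
      base = new URI(absoluteBase);
    } catch (URISyntaxException e) {
      return refValue;  // base is not valid
    }
    return base.resolve(ref).toString();
  }

  public static void main(String[] args) {
    // prints "http://example.com/my/path/dir/file.txt"
    System.out.println(resolve("dir/file.txt", "http://example.com/my/path/"));
  }
}
```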
      • RelativeResolve

        public static java.lang.String RelativeResolve​(java.lang.String refValue,
                                                       java.lang.String absoluteBase,
                                                       URIUtility.ParseMode parseMode)
        Resolves a URI or IRI relative to another URI or IRI.
        Parameters:
        refValue - A string representing a URI or IRI reference. Example: dir/file.txt. Can be null.
        absoluteBase - A string representing an absolute URI or IRI reference. Can be null. Example: http://example.com/my/path/.
        parseMode - Parse mode that specifies whether certain characters are allowed when parsing IRIs and URIs.
        Returns:
        The resolved IRI, or null if refValue is null or is not a valid IRI. If absoluteBase is null or is not a valid IRI, returns refValue.
        Throws:
        java.lang.NullPointerException - The parameter refValue or absoluteBase is null.
      • SplitIRIToStrings

        public static java.lang.String[] SplitIRIToStrings​(java.lang.String s)
        Parses an Internationalized Resource Identifier (IRI) reference under RFC3987. If the IRI reference is syntactically valid, splits the string into its components and returns an array containing those components.
        Parameters:
        s - A string that contains an IRI. Can be null.
        Returns:
        If the string is a valid IRI reference, returns an array of five strings. Each of the five strings corresponds to the IRI's scheme, authority, path, query, or fragment identifier, respectively. If a component is absent, the corresponding element will be null. If the string is null or is not a valid IRI, returns null.
      • SplitIRI

        public static int[] SplitIRI​(java.lang.String s)
        Parses an Internationalized Resource Identifier (IRI) reference under RFC3987. If the IRI reference is syntactically valid, splits the string into its components and returns an array containing the indices into the components.
        Parameters:
        s - A string that contains an IRI. Can be null.
        Returns:
        If the string is a valid IRI reference, returns an array of 10 integers. Each of the five pairs corresponds to the start and end index of the IRI's scheme, authority, path, query, or fragment identifier, respectively. The scheme, authority, query, and fragment identifier, if present, will each be given without the ending colon, the starting "//", the starting "?", and the starting "#", respectively. If a component is absent, both indices in that pair will be -1. If the string is null or is not a valid IRI, returns null.
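The two return shapes (ten indices here, five strings for SplitIRIToStrings) can be illustrated with the non-validating regular expression from RFC 3986, Appendix B. The sketch is not the library's parser: the class SplitSketch and its methods are hypothetical names, the expression accepts strings that are not valid IRIs, and treating each end index as exclusive is an assumption of the sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SplitSketch {
  // Generic component pattern from RFC 3986, Appendix B (no validation).
  private static final Pattern GENERIC = Pattern.compile(
      "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?",
      Pattern.DOTALL);
  // Capture groups for scheme, authority, path, query, fragment.
  private static final int[] GROUPS = {2, 4, 5, 7, 9};

  // Returns five (start, end) pairs; an absent component yields -1, -1.
  static int[] splitToIndices(String s) {
    if (s == null) {
      return null;
    }
    Matcher m = GENERIC.matcher(s);
    if (!m.matches()) {
      return null;
    }
    int[] ret = new int[10];
    for (int i = 0; i < 5; i++) {
      ret[i * 2] = m.start(GROUPS[i]);
      ret[i * 2 + 1] = m.end(GROUPS[i]);
    }
    return ret;
  }

  // Converts the index pairs to strings; absent components become null.
  static String[] toStrings(String s, int[] indices) {
    String[] parts = new String[5];
    for (int i = 0; i < 5; i++) {
      int start = indices[i * 2];
      parts[i] = start < 0 ? null : s.substring(start, indices[i * 2 + 1]);
    }
    return parts;
  }

  public static void main(String[] args) {
    String iri = "http://example.com/a/b?x=1#frag";
    int[] indices = splitToIndices(iri);
    System.out.println(java.util.Arrays.toString(indices));
    System.out.println(java.util.Arrays.toString(toStrings(iri, indices)));
  }
}
```

Returning indices rather than substrings lets a caller inspect components without allocating new strings; the delimiters (":", "//", "?", "#") fall outside every pair, as the return value description states.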
      • SplitIRI

        public static int[] SplitIRI​(java.lang.String s,
                                     int offset,
                                     int length,
                                     URIUtility.ParseMode parseMode)
        Parses a substring that represents an Internationalized Resource Identifier (IRI) under RFC3987. If the IRI is syntactically valid, splits the string into its components and returns an array containing the indices into the components.
        Parameters:
        s - A string that contains an IRI. Can be null.
        offset - An index starting at 0 showing where the desired portion of "s" begins.
        length - The length of the desired portion of "s" (but not more than the length of "s").
        parseMode - Parse mode that specifies whether certain characters are allowed when parsing IRIs and URIs.
        Returns:
        If the string is a valid IRI, returns an array of 10 integers. Each of the five pairs corresponds to the start and end index of the IRI's scheme, authority, path, query, or fragment component, respectively. The scheme, authority, query, and fragment components, if present, will each be given without the ending colon, the starting "//", the starting "?", and the starting "#", respectively. If a component is absent, both indices in that pair will be -1 (an index won't be less than 0 in any other case). If the string is null or is not a valid IRI, returns null.
        Throws:
        java.lang.IllegalArgumentException - Either offset or length is less than 0 or greater than the length of s, or the length of s minus offset is less than length.
        java.lang.NullPointerException - The parameter s is null.
      • SplitIRI

        public static int[] SplitIRI​(java.lang.String s,
                                     URIUtility.ParseMode parseMode)
        Parses an Internationalized Resource Identifier (IRI) reference under RFC3987. If the IRI is syntactically valid, splits the string into its components and returns an array containing the indices into the components.
        Parameters:
        s - A string representing an IRI. Can be null.
        parseMode - Parse mode that specifies whether certain characters are allowed when parsing IRIs and URIs.
        Returns:
        If the string is a valid IRI reference, returns an array of 10 integers. Each of the five pairs corresponds to the start and end index of the IRI's scheme, authority, path, query, or fragment identifier, respectively. The scheme, authority, query, and fragment identifier, if present, will each be given without the ending colon, the starting "//", the starting "?", and the starting "#", respectively. If a component is absent, both indices in that pair will be -1. If the string is null or is not a valid IRI, returns null.
      • DirectoryPath

        public static java.lang.String DirectoryPath​(java.lang.String uref)
        Extracts the scheme, the authority, and the path component (up to and including the last "/" in the path if any) from the given URI or IRI, using the IRIStrict parse mode to check the URI or IRI. Any "./" or "../" in the path is not condensed.
        Parameters:
        uref - A text string representing a URI or IRI. Can be null.
        Returns:
        The directory path of the URI or IRI. Returns null if uref is null or not a valid URI or IRI.
        Throws:
        java.lang.NullPointerException - The parameter uref is null.
      • DirectoryPath

        public static java.lang.String DirectoryPath​(java.lang.String uref,
                                                     URIUtility.ParseMode parseMode)
        Extracts the scheme, the authority, and the path component (up to and including the last "/" in the path if any) from the given URI or IRI, using the given parse mode to check the URI or IRI. Any "./" or "../" in the path is not condensed.
        Parameters:
        uref - A text string representing a URI or IRI. Can be null.
        parseMode - The parse mode to use to check the URI or IRI.
        Returns:
        The directory path of the URI or IRI. Returns null if uref is null or not a valid URI or IRI.
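The "directory path" described by both overloads can be sketched with plain string scanning. This is an illustration only: the class DirectoryPathSketch is a hypothetical name, the input is not validated, and returning the scheme and authority alone when the path contains no "/" is an assumption of the sketch. As in the documented method, "./" and "../" segments are left as they are.

```java
final class DirectoryPathSketch {
  // Hypothetical helper: keeps everything up to and including the last "/"
  // of the path, dropping the rest of the path, the query, and the fragment.
  static String directoryPath(String uref) {
    if (uref == null) {
      return null;
    }
    // End of the path: the first "?" or "#", or the end of the string.
    int pathEnd = uref.length();
    for (int i = 0; i < uref.length(); i++) {
      char c = uref.charAt(i);
      if (c == '?' || c == '#') {
        pathEnd = i;
        break;
      }
    }
    // Start of the path: skip "scheme:" and "//authority" when present.
    int pathStart = 0;
    int colon = uref.indexOf(':');
    int slash = uref.indexOf('/');
    if (colon > 0 && colon < pathEnd && (slash < 0 || colon < slash)) {
      pathStart = colon + 1;
    }
    if (uref.startsWith("//", pathStart) && pathStart + 2 <= pathEnd) {
      int authEnd = uref.indexOf('/', pathStart + 2);
      pathStart = (authEnd < 0 || authEnd > pathEnd) ? pathEnd : authEnd;
    }
    int lastSlash = uref.lastIndexOf('/', pathEnd - 1);
    return uref.substring(0, lastSlash >= pathStart ? lastSlash + 1 : pathStart);
  }

  public static void main(String[] args) {
    // prints "http://example.com/my/path/"
    System.out.println(directoryPath("http://example.com/my/path/file.txt?q=1#f"));
  }
}
```

Locating the start of the path first keeps the "//" that introduces the authority from being mistaken for the last "/" of an empty path, as in "http://example.com".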
      • RelativeResolveWithinBaseURI

        public static java.lang.String RelativeResolveWithinBaseURI​(java.lang.String refValue,
                                                                    java.lang.String absoluteBase)
        Resolves a URI or IRI relative to another URI or IRI, but only if the resolved URI has no "." or ".." component in its path and only if the resolved URI's directory path matches that of the second URI or IRI.
        Parameters:
        refValue - A string representing a URI or IRI reference. Example: dir/file.txt.
        absoluteBase - A string representing an absolute URI reference. Example: http://example.com/my/path/.
        Returns:
        The resolved IRI, or null if refValue is null or is not a valid IRI, or refValue if absoluteBase is null or an empty string, or null if absoluteBase is neither null nor empty and is not a valid IRI. Returns null instead if the resolved IRI has a "." or ".." component in its path or if the resolved IRI's directory path does not match that of absoluteBase. Example: http://example.com/my/path/dir/file.txt.
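The containment check can be sketched on top of java.net.URI, again as a stand-in rather than the library's code. The class WithinBaseSketch and its methods are hypothetical names. The sketch makes one interpretive assumption: "matches" is read as "the resolved directory path starts with the base's directory path", because the documented example resolves into a subdirectory (dir/) of the base. Note also that java.net.URI removes most "." and ".." segments during resolution, so the dot-segment test here mainly catches leftovers that could not be removed.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Objects;

final class WithinBaseSketch {
  // Returns the path up to and including its last "/", or "" if it has none.
  private static String directoryOf(String path) {
    return path.substring(0, path.lastIndexOf('/') + 1);
  }

  static String resolveWithinBase(String refValue, String absoluteBase) {
    if (refValue == null) {
      return null;
    }
    URI ref;
    try {
      ref = new URI(refValue);
    } catch (URISyntaxException e) {
      return null;
    }
    if (absoluteBase == null || absoluteBase.isEmpty()) {
      return refValue;
    }
    URI base;
    try {
      base = new URI(absoluteBase);
    } catch (URISyntaxException e) {
      return null;
    }
    URI resolved = base.resolve(ref);
    String basePath = base.getRawPath();
    String path = resolved.getRawPath();
    if (basePath == null || path == null) {
      return null;
    }
    // Reject any remaining "." or ".." path segment.
    for (String segment : path.split("/", -1)) {
      if (segment.equals(".") || segment.equals("..")) {
        return null;
      }
    }
    boolean sameOrigin = Objects.equals(base.getScheme(), resolved.getScheme())
        && Objects.equals(base.getRawAuthority(), resolved.getRawAuthority());
    // Assumption: "matches" means the result stays at or below the base directory.
    boolean insideBase = directoryOf(path).startsWith(directoryOf(basePath));
    return sameOrigin && insideBase ? resolved.toString() : null;
  }

  public static void main(String[] args) {
    String base = "http://example.com/my/path/";
    System.out.println(resolveWithinBase("dir/file.txt", base));
    System.out.println(resolveWithinBase("../secret.txt", base));
  }
}
```

A check of this kind is useful when references come from untrusted input and must not escape the directory that the base identifies.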