Object UrlUtils
-
public class UrlUtils
-
-
Field Summary
Fields
Modifier and Type | Field | Description
private final List<String> | INTERNAL_URL_PREFIXES |
private final List<String> | INTERNAL_URLS |
public final static UrlUtils | INSTANCE |
-
Method Summary
Modifier and Type | Method | Description
final static Boolean | isInternal(String url) | Test if the url is an internal URL.
final static Boolean | isNotInternal(String url) | Test if the url is not an internal URL.
final static Boolean | isLocalFile(String url) | Check if the given url is a local file url, that is, a url that starts with AppConstants.LOCAL_FILE_SERVE_PREFIX.
final static String | pathToLocalURL(Path path) | Convert a path to a URL. The path is encoded to base64 and appended to AppConstants.LOCAL_FILE_FAKE_SERVER_HOME. For example, C:\Users\pereg\AppData\Local\Temp\pulsar\test.txt will be converted to http://localfile.org?path=QzpcVXNlcnNccGVyZWdcQXBwRGF0YVxMb2NhbFxUZW1wXHB1bHNhclx0ZXN0LnR4dA==
final static Path | localURLToPath(String url) | Convert a URL to a path. The path is decoded from base64 and the prefix AppConstants.LOCAL_FILE_SERVE_PREFIX is removed.
final static Boolean | isBrowserURL(String str) | Checks if the given string is a browser-specific url.
final static Boolean | isMappedBrowserURL(String url) | Checks if the given URL is a browser-specific URL by verifying if it starts with a predefined prefix.
final static String | browserURLToStandardURL(String url) | Converts a browser url string into a complete URL.
final static String | standardURLToBrowserURL(String url) | Extracts the browser url from a given URL and re-encodes it.
final static URL | getURLOrNull(String spec) | Creates a URL object from the String representation.
final static Boolean | isStandard(String str) | Test if the str is a standard URL.
final static Boolean | isAllowed(String str) | Test if the str is an allowed URL.
final static URL | normalize(String url, Boolean ignoreQuery) | Normalize a url spec.
final static String | normalizeOrEmpty(String url, Boolean ignoreQuery) | Normalize a url spec.
final static String | normalizeOrNull(String url, Boolean ignoreQuery) | Normalize a url spec.
final static List<String> | normalizeUrls(Iterable<String> urls, Boolean ignoreQuery) | Normalize a url spec.
final Map<String, String> | splitQueryParameters(String url) | Split the query parameters of a url.
final String | getQueryParameters(String url, String parameterName) | Get the query parameter of a url.
final String | removeQueryParameters(String url, String parameterNames) | Remove the query parameters of a url.
final String | keepQueryParameters(String url, String parameterNames) | Keep the query parameters of a url, and remove the others.
final static URL | resolveURL(URL base, String targetUrl) | Resolve relative URL-s and fix a java.net.URL error in handling of URLs with pure query targets.
final static Pair<String, String> | splitUrlArgs(String configuredUrl) | Split url and args.
final static String | mergeUrlArgs(String url, String args) | Merge url and args.
final static String | getUrlWithoutParameters(String url) | Get the url without parameters.
final static Pair<String, String> | normalizedUrlAndKey(String originalUrl, Boolean norm) | Returns the normalized url and key.
final static String | reverseUrl(String url) | Reverses a url's domain.
final static String | reverseUrl(URL url) | Reverses a url's domain.
final static String | reverseUrl(Integer tenantId, String unreversedUrl) | Get the reversed and tenanted format of unreversedUrl. unreversedUrl can be either tenanted or not tenanted. This method might change the tenant id of the original url. A tenant id of zero means no tenant.
final static String | reverseUrlOrEmpty(String url) | Reverses a url's domain.
final static String | reverseUrlOrNull(String url) | Reverses a url's domain.
final static String | unreverseUrl(String reversedUrl) | Get the unreversed url of a reversed url.
final static String | unreverseUrl(Integer tenantId, String reversedUrl) | Get the unreversed and tenanted url of reversedUrl. reversedUrl can be either tenanted or not tenanted. This method might change the tenant id of the original url.
final static String | unreverseUrlOrNull(String reversedUrl) | Get the unreversed url of a reversed url.
final static String | getStartKey(Integer tenantId, String unreversedUrl) | Get start key for tenanted tables.
final static String | getStartKey(String unreversedUrl) | Get start key for non-tenanted tables.
final static String | getEndKey(String unreversedUrl) | Get end key for non-tenanted tables.
final static String | getEndKey(Integer tenantId, String unreversedUrl) | Get end key for tenanted tables.
final static String | decodeKeyLowerBound(String startKey) | The unicode character \u0001 is used as the lower key bound, but clients usually encode the character as the string "\\u0001" or "\\\\u0001", so these are decoded back to the character. Note: the character is displayed as <U+0001> in some output systems. All three forms, \u0001, "\\u0001", and "\\\\u0001", are treated as the lower key bound.
final static String | decodeKeyUpperBound(String endKey) | The unicode character \uFFFF is used as the upper key bound, but clients usually encode the character as the string "\\uFFFF" or "\\\\uFFFF", so these are decoded back to the character. Note: the character may display as <U+FFFF> in some output systems. All three forms, \uFFFF, "\\uFFFF", and "\\\\uFFFF", are treated as the upper key bound.
final static String | getReversedHost(String reversedUrl) | Given a reversed url, returns the reversed host. E.g. "com.foo.bar:http:8983/to/index.html?a=b" -> "com.foo.bar"
final static String | reverseHost(String hostName) | Reverse the host name.
final static String | unreverseHost(String reversedHostName) | Unreverse the host name.
final Boolean | isPublicSuffix(String domain) | Indicates whether this domain name represents a public suffix, as defined by the Mozilla Foundation's Public Suffix List (PSL).
final String | getPublicSuffix(String url) | Get the host's public suffix.
final String | getPublicSuffix(URL url) | Get the host's public suffix.
final Boolean | isTopPrivateDomain(URL url) | Indicates whether this domain name is composed of exactly one subdomain component followed by a public suffix (see isPublicSuffix).
final String | getTopPrivateDomain(URL url) | Returns the portion of this domain name that is one level beneath the public suffix (see isPublicSuffix).
final String | getTopPrivateDomain(String url) | Returns the portion of this domain name that is one level beneath the public suffix (see isPublicSuffix).
final String | getTopPrivateDomainOrNull(String url) | Returns the portion of this domain name that is one level beneath the public suffix (see isPublicSuffix).
final String | getOrigin(String url) | Returns the lowercase origin for the url.
final String | getOriginOrNull(String url) | Returns the lowercase origin for the url or null if the url is not well-formed.
final String | getHostName(String url) | Returns the lowercase hostname for the url.
final String | getHostName(String url, String defaultValue) |
final String | getHostNameOrNull(String url) | Returns the lowercase hostname for the url or null if the url is not well-formed.
final List<String> | getINTERNAL_URL_PREFIXES() | The prefix of allowed urls
final List<String> | getINTERNAL_URLS() | The urls of all allowed internal urls
-
Method Detail
-
isInternal
final static Boolean isInternal(String url)
Test if the url is an internal URL. Internal URLs are URLs that are used to identify internal resources and will never be fetched from the internet.
- Parameters:
url - The url to test
- Returns:
true if the given url is an internal URL, false otherwise
-
isNotInternal
final static Boolean isNotInternal(String url)
Test if the url is not an internal URL. Internal URLs are URLs that are used to identify internal resources and will never be fetched from the internet.
- Parameters:
url - The url to test
- Returns:
true if the given url is not an internal URL, false otherwise
-
isLocalFile
final static Boolean isLocalFile(String url)
Check if the given url is a local file url, that is, a url that starts with AppConstants.LOCAL_FILE_SERVE_PREFIX.
-
pathToLocalURL
final static String pathToLocalURL(Path path)
Convert a path to a URL. The path is encoded to base64 and appended to AppConstants.LOCAL_FILE_FAKE_SERVER_HOME.
For example:
C:\Users\pereg\AppData\Local\Temp\pulsar\test.txt
will be converted to:
http://localfile.org?path=QzpcVXNlcnNccGVyZWdcQXBwRGF0YVxMb2NhbFxUZW1wXHB1bHNhclx0ZXN0LnR4dA==
- Parameters:
path - The path to convert
TODO: consider just using the path.
-
localURLToPath
final static Path localURLToPath(String url)
Convert a URL to a path. The path is decoded from base64 and the prefix AppConstants.LOCAL_FILE_SERVE_PREFIX is removed.
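The round trip described by pathToLocalURL and localURLToPath can be sketched as follows. This is a minimal illustration, not the library's code: the class name, the method names, and the prefix constant are hypothetical, the prefix value is simply copied from the documented example, and a plain String stands in for Path to keep the output independent of the operating system.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class LocalFileUrlSketch {
    // Assumed prefix, taken from the documented example; the real value is defined in AppConstants.
    static final String ASSUMED_PREFIX = "http://localfile.org?path=";

    // Encode the path text as base64 and append it to the prefix.
    static String pathToLocalUrl(String path) {
        byte[] bytes = path.getBytes(StandardCharsets.UTF_8);
        return ASSUMED_PREFIX + Base64.getEncoder().encodeToString(bytes);
    }

    // Strip the prefix and decode the remaining base64 text back to the path.
    static String localUrlToPath(String url) {
        if (!url.startsWith(ASSUMED_PREFIX)) {
            throw new IllegalArgumentException("Not a local file url: " + url);
        }
        String encoded = url.substring(ASSUMED_PREFIX.length());
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String path = "C:\\Users\\pereg\\AppData\\Local\\Temp\\pulsar\\test.txt";
        String url = pathToLocalUrl(path);
        System.out.println(url);
        System.out.println(localUrlToPath(url));
    }
}
```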
-
isBrowserURL
final static Boolean isBrowserURL(String str)
Checks if the given string is a browser-specific url.
This function determines whether the string is a browser-specific url by checking if it exists in the internal URL list (INTERNAL_URLS), or if it starts with any of the internal URL prefixes (INTERNAL_URL_PREFIXES).
- Parameters:
str - The string to be checked.
- Returns:
true if the string is a browser-specific url, false otherwise.
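The membership-or-prefix check described above might look like the sketch below. The list contents are hypothetical placeholders, since the real INTERNAL_URLS and INTERNAL_URL_PREFIXES values are private to UrlUtils and are not given on this page.

```java
import java.util.List;

class BrowserUrlSketch {
    // Hypothetical values for illustration only.
    static final List<String> INTERNAL_URLS = List.of("about:blank");
    static final List<String> INTERNAL_URL_PREFIXES = List.of("chrome://", "edge://");

    // True if the string is listed exactly or starts with one of the prefixes.
    static boolean isBrowserUrl(String str) {
        return INTERNAL_URLS.contains(str)
                || INTERNAL_URL_PREFIXES.stream().anyMatch(str::startsWith);
    }

    public static void main(String[] args) {
        System.out.println(isBrowserUrl("about:blank"));
        System.out.println(isBrowserUrl("chrome://settings"));
        System.out.println(isBrowserUrl("https://example.com"));
    }
}
```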
-
isMappedBrowserURL
final static Boolean isMappedBrowserURL(String url)
Checks if the given URL is a browser-specific URL by verifying if it starts with a predefined prefix.
- Parameters:
url - The URL to check.
- Returns:
true if the URL starts with the browser-specific url prefix, false otherwise.
-
browserURLToStandardURL
final static String browserURLToStandardURL(String url)
Converts a browser url string into a complete URL. The function URL-encodes the url string and appends it to a predefined prefix to form the final URL.
- Parameters:
url - The browser url string to be converted.
- Returns:
The complete URL string containing the prefix and the encoded url parameter.
-
standardURLToBrowserURL
final static String standardURLToBrowserURL(String url)
Extracts the browser url from a given URL and re-encodes it. The function retrieves the url parameter from the URL, re-encodes it, and reconstructs the URL.
- Parameters:
url - The URL containing the browser url.
- Returns:
The reconstructed URL with the re-encoded url parameter.
-
getURLOrNull
final static URL getURLOrNull(String spec)
Creates a URL object from the String representation.
- Parameters:
spec - the String to parse as a URL.
- Returns:
the URL parsed from spec, or null if no protocol is specified, an unknown protocol is found, spec is null, or the parsed URL fails to comply with the specific syntax of the associated protocol.
-
isStandard
final static Boolean isStandard(String str)
Test if the str is a standard URL.
- Parameters:
str - The string to test
- Returns:
true if the given str is a standard URL, false otherwise
-
isAllowed
final static Boolean isAllowed(String str)
Test if the str is an allowed URL.
- Parameters:
str - The string to test
- Returns:
true if the given str is an allowed URL, false otherwise
-
normalize
final static URL normalize(String url, Boolean ignoreQuery)
Normalize a url spec.
A URL may have appended to it a "fragment", also known as a "ref" or a "reference". The fragment is indicated by the sharp sign character "#" followed by more characters. For example: http://java.sun.com/index.html#chapter1
The fragment will be removed after the normalization. If ignoreQuery is true, the query string will be removed.
- Parameters:
url - The url to normalize; a trailing argument list is allowed and will be removed
ignoreQuery - If true, the result url does not contain a query string
- Returns:
The normalized URL
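The two documented effects of normalization, dropping the fragment and optionally dropping the query string, can be illustrated with plain string operations. This sketch is an assumption-laden simplification: the class and method names are hypothetical, and the real method also removes a trailing argument list and returns a URL object, which are not reproduced here.

```java
class NormalizeSketch {
    // Remove the fragment, and the query string as well when ignoreQuery is true.
    static String stripFragmentAndQuery(String url, boolean ignoreQuery) {
        int hash = url.indexOf('#');
        String result = hash >= 0 ? url.substring(0, hash) : url;
        if (ignoreQuery) {
            int question = result.indexOf('?');
            if (question >= 0) {
                result = result.substring(0, question);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(stripFragmentAndQuery("http://java.sun.com/index.html#chapter1", false));
        System.out.println(stripFragmentAndQuery("http://example.com/a?b=1#c", true));
    }
}
```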
-
normalizeOrEmpty
final static String normalizeOrEmpty(String url, Boolean ignoreQuery)
Normalize a url spec.
A URL may have appended to it a "fragment", also known as a "ref" or a "reference". The fragment is indicated by the sharp sign character "#" followed by more characters. For example: http://java.sun.com/index.html#chapter1
The fragment will be removed after the normalization. If ignoreQuery is true, the query string will be removed.
- Parameters:
url - The url to normalize; a trailing argument list is allowed and will be removed
ignoreQuery - If true, the result url does not contain a query string
- Returns:
The normalized url, or an empty string ("") if the given string violates RFC 2396
-
normalizeOrNull
final static String normalizeOrNull(String url, Boolean ignoreQuery)
Normalize a url spec.
A URL may have appended to it a "fragment", also known as a "ref" or a "reference". The fragment is indicated by the sharp sign character "#" followed by more characters. For example: http://java.sun.com/index.html#chapter1
The fragment will be removed after the normalization. If ignoreQuery is true, the query string will be removed.
- Parameters:
url - The url to normalize; a trailing argument list is allowed and will be removed
ignoreQuery - If true, the result url does not contain a query string
- Returns:
The normalized url, or null if the given string violates RFC 2396
-
normalizeUrls
final static List<String> normalizeUrls(Iterable<String> urls, Boolean ignoreQuery)
Normalize a url spec.
A URL may have appended to it a "fragment", also known as a "ref" or a "reference". The fragment is indicated by the sharp sign character "#" followed by more characters. For example: http://java.sun.com/index.html#chapter1
The fragment will be removed after the normalization. If ignoreQuery is true, the query string will be removed.
- Parameters:
urls - The urls to normalize; a trailing argument list is allowed and will be removed
ignoreQuery - If true, the result urls do not contain a query string
- Returns:
The normalized URLs
-
splitQueryParameters
final Map<String, String> splitQueryParameters(String url)
Split the query parameters of a url.
- Parameters:
url - The url to split
- Returns:
The query parameters of the url
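One plausible way to split a query string into a name-to-value map is sketched below using only the JDK. The class name is hypothetical, and details such as how repeated or value-less parameters are handled are assumptions rather than documented behavior of UrlUtils.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryParamsSketch {
    // Split the raw query on '&' and '=', decoding each name and value.
    static Map<String, String> splitQueryParameters(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(url).getRawQuery();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            String[] kv = pair.split("=", 2);
            String name = URLDecoder.decode(kv[0], StandardCharsets.UTF_8);
            // Assumption: a parameter without '=' maps to an empty value.
            String value = kv.length > 1 ? URLDecoder.decode(kv[1], StandardCharsets.UTF_8) : "";
            // Assumption: a repeated name keeps its last value.
            params.put(name, value);
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(splitQueryParameters("https://example.com/search?q=url+utils&page=2"));
    }
}
```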
-
getQueryParameters
final String getQueryParameters(String url, String parameterName)
Get the query parameter of a url.
- Parameters:
url - The url to split
parameterName - The name of the query parameter
- Returns:
The query parameter of the url
-
removeQueryParameters
final String removeQueryParameters(String url, String parameterNames)
Remove the query parameters of a url.
- Parameters:
url - The url to split
parameterNames - The names of the query parameters
- Returns:
The url without the query parameters
-
keepQueryParameters
final String keepQueryParameters(String url, String parameterNames)
Keep the query parameters of a url, and remove the others.
- Parameters:
url - The url to split
parameterNames - The names of the query parameters
- Returns:
The url with only the query parameters
-
resolveURL
final static URL resolveURL(URL base, String targetUrl)
Resolve relative URL-s and fix a java.net.URL error in handling of URLs with pure query targets.
- Parameters:
base - base url
- Returns:
resolved absolute url.
-
splitUrlArgs
final static Pair<String, String> splitUrlArgs(String configuredUrl)
Split url and args
- Parameters:
configuredUrl - url and args in $url $args format
- Returns:
url and args pair
-
mergeUrlArgs
final static String mergeUrlArgs(String url, String args)
Merge url and args
- Parameters:
url - url
args - args
- Returns:
url and args in $url $args format
-
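The $url $args format used by splitUrlArgs and mergeUrlArgs can be pictured with the sketch below. It assumes the url and the args are separated by the first space, uses Map.Entry in place of Pair, and uses a made-up argument string ("-expires 1d") purely for illustration.

```java
import java.util.Map;

class UrlArgsSketch {
    // Split at the first space: the left part is the url, the rest is the args.
    static Map.Entry<String, String> splitUrlArgs(String configuredUrl) {
        String s = configuredUrl.trim();
        int pos = s.indexOf(' ');
        if (pos < 0) {
            return Map.entry(s, "");
        }
        return Map.entry(s.substring(0, pos), s.substring(pos + 1).trim());
    }

    // Join url and args with a single space, or return the url alone when args is blank.
    static String mergeUrlArgs(String url, String args) {
        return args.isBlank() ? url.trim() : url.trim() + " " + args.trim();
    }

    public static void main(String[] args) {
        Map.Entry<String, String> pair = splitUrlArgs("https://example.com/a -expires 1d");
        System.out.println(pair.getKey());
        System.out.println(pair.getValue());
        System.out.println(mergeUrlArgs(pair.getKey(), pair.getValue()));
    }
}
```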
getUrlWithoutParameters
final static String getUrlWithoutParameters(String url)
Get the url without parameters
- Parameters:
url - url
- Returns:
url without parameters
-
normalizedUrlAndKey
final static Pair<String, String> normalizedUrlAndKey(String originalUrl, Boolean norm)
Returns the normalized url and key
- Returns:
normalized url and key
-
reverseUrl
final static String reverseUrl(String url)
Reverses a url's domain. This form is better for storing in HBase because scans within the same domain are faster.
E.g. "http://bar.foo.com:8983/to/index.html?a=b" becomes "com.foo.bar:http:8983/to/index.html?a=b".
- Parameters:
url - url to be reversed
- Returns:
Reversed url
-
reverseUrl
final static String reverseUrl(URL url)
Reverses a url's domain. This form is better for storing in HBase because scans within the same domain are faster.
E.g. "http://bar.foo.com:8983/to/index.html?a=b" becomes "com.foo.bar:http:8983/to/index.html?a=b".
- Parameters:
url - url to be reversed
- Returns:
Reversed url
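The transformation in the example above (reversed host, then protocol, then port, then the rest of the url) can be sketched with java.net.URI. This is a hedged illustration with hypothetical names, not the implementation behind reverseUrl; in particular, error handling for urls without a host is left out.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ReverseUrlSketch {
    // "bar.foo.com" becomes "com.foo.bar".
    static String reverseHost(String host) {
        List<String> parts = new ArrayList<>(Arrays.asList(host.split("\\.")));
        Collections.reverse(parts);
        return String.join(".", parts);
    }

    // Emit reversedHost:protocol[:port] followed by the path and query.
    static String reverseUrl(String url) {
        URI uri = URI.create(url);
        StringBuilder sb = new StringBuilder();
        sb.append(reverseHost(uri.getHost()));
        sb.append(':').append(uri.getScheme());
        if (uri.getPort() != -1) {
            sb.append(':').append(uri.getPort());
        }
        if (uri.getRawPath() != null) {
            sb.append(uri.getRawPath());
        }
        if (uri.getRawQuery() != null) {
            sb.append('?').append(uri.getRawQuery());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseUrl("http://bar.foo.com:8983/to/index.html?a=b"));
    }
}
```

Putting the reversed host first means that keys of the same domain sort next to each other, which is what makes range scans over one domain cheap.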
-
reverseUrl
final static String reverseUrl(Integer tenantId, String unreversedUrl)
Get the reversed and tenanted format of unreversedUrl. unreversedUrl can be either tenanted or not tenanted. This method might change the tenant id of the original url.
A tenant id of zero means no tenant.
- Parameters:
unreversedUrl - the unreversed url, either tenanted or not tenanted
- Returns:
the tenanted and reversed url of unreversedUrl
-
reverseUrlOrEmpty
final static String reverseUrlOrEmpty(String url)
Reverses a url's domain. This form is better for storing in HBase because scans within the same domain are faster.
E.g. "http://bar.foo.com:8983/to/index.html?a=b" becomes "com.foo.bar:http:8983/to/index.html?a=b".
- Parameters:
url - url to be reversed
- Returns:
Reversed url or empty string if the url is invalid
-
reverseUrlOrNull
final static String reverseUrlOrNull(String url)
Reverses a url's domain. This form is better for storing in HBase because scans within the same domain are faster.
E.g. "http://bar.foo.com:8983/to/index.html?a=b" becomes "com.foo.bar:http:8983/to/index.html?a=b".
- Parameters:
url - url to be reversed
- Returns:
Reversed url or null if the url is invalid
-
unreverseUrl
final static String unreverseUrl(String reversedUrl)
Get the unreversed url of a reversed url.
- Returns:
the unreversed url of reversedUrl
-
unreverseUrl
final static String unreverseUrl(Integer tenantId, String reversedUrl)
Get the unreversed and tenanted url of reversedUrl. reversedUrl can be either tenanted or not tenanted. This method might change the tenant id of the original url.
- Parameters:
tenantId - the expected tenant id of the reversedUrl
reversedUrl - the reversed url, either tenanted or not tenanted
- Returns:
the unreversed url of reversedUrl
-
unreverseUrlOrNull
final static String unreverseUrlOrNull(String reversedUrl)
Get the unreversed url of a reversed url.
- Returns:
the unreversed url of reversedUrl or null if the url is invalid
-
getStartKey
final static String getStartKey(Integer tenantId, String unreversedUrl)
Get start key for tenanted table
- Parameters:
unreversedUrl - unreversed key, which is the original url
- Returns:
reversed and tenanted key
-
getStartKey
final static String getStartKey(String unreversedUrl)
Get start key for non-tenanted table
- Parameters:
unreversedUrl - unreversed key, which is the original url
- Returns:
reversed key
-
getEndKey
final static String getEndKey(String unreversedUrl)
Get end key for non-tenanted tables
- Parameters:
unreversedUrl - unreversed key, which is the original url
- Returns:
reversed key with the key bound decoded
-
getEndKey
final static String getEndKey(Integer tenantId, String unreversedUrl)
Get end key for tenanted tables
- Parameters:
unreversedUrl - unreversed key, which is the original url
- Returns:
reversed and tenanted key with the key bound decoded
-
decodeKeyLowerBound
final static String decodeKeyLowerBound(String startKey)
The unicode character \u0001 is used as the lower key bound, but clients usually encode the character as the string "\\u0001" or "\\\\u0001", so these are decoded back to the character.
Note: the character is displayed as <U+0001> in some output systems.
All three forms, \u0001, "\\u0001", and "\\\\u0001", are treated as the lower key bound.
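A minimal sketch of the decoding step follows, assuming the two escaped forms are the literal texts written above as Java string literals (a backslash or two backslashes followed by u0001). The class name and the replacement order are choices made for this illustration, not taken from the UrlUtils source.

```java
class KeyBoundSketch {
    // U+0001 as a one-character string, built from its code point.
    static final String LOWER_BOUND = String.valueOf((char) 0x0001);

    static String decodeKeyLowerBound(String startKey) {
        // Replace the longer, double-escaped form first so that no stray backslash is left behind.
        return startKey
                .replace("\\\\u0001", LOWER_BOUND)
                .replace("\\u0001", LOWER_BOUND);
    }

    public static void main(String[] args) {
        String decoded = decodeKeyLowerBound("com.foo.bar:http\\u0001");
        // Print the code point of the last character, which should be 1.
        System.out.println((int) decoded.charAt(decoded.length() - 1));
    }
}
```

The same approach would apply to the upper bound with U+FFFF in place of U+0001.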
-
decodeKeyUpperBound
final static String decodeKeyUpperBound(String endKey)
The unicode character \uFFFF is used as the upper key bound, but clients usually encode the character as the string "\\uFFFF" or "\\\\uFFFF", so these are decoded back to the character.
Note: the character may display as <U+FFFF> in some output systems.
All three forms, \uFFFF, "\\uFFFF", and "\\\\uFFFF", are treated as the upper key bound.
-
getReversedHost
final static String getReversedHost(String reversedUrl)
Given a reversed url, returns the reversed host. E.g. "com.foo.bar:http:8983/to/index.html?a=b" -> "com.foo.bar"
- Parameters:
reversedUrl - Reversed url
- Returns:
Reversed host
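Since the reversed host is everything before the first colon of a reversed url, extracting it and turning it back into a normal host name can be sketched as below. The names are hypothetical and the code assumes the host:protocol:port layout shown in the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ReversedHostSketch {
    // Take everything before the first ':' of the reversed url.
    static String getReversedHost(String reversedUrl) {
        int pos = reversedUrl.indexOf(':');
        return pos < 0 ? reversedUrl : reversedUrl.substring(0, pos);
    }

    // "com.foo.bar" becomes "bar.foo.com"; reversing twice restores the input.
    static String unreverseHost(String reversedHost) {
        List<String> parts = new ArrayList<>(Arrays.asList(reversedHost.split("\\.")));
        Collections.reverse(parts);
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        String reversedHost = getReversedHost("com.foo.bar:http:8983/to/index.html?a=b");
        System.out.println(reversedHost);
        System.out.println(unreverseHost(reversedHost));
    }
}
```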
-
reverseHost
final static String reverseHost(String hostName)
Reverse the host name.
- Parameters:
hostName - host name
- Returns:
reversed host name
-
unreverseHost
final static String unreverseHost(String reversedHostName)
Unreverse the host name.
- Parameters:
reversedHostName - reversed host name
- Returns:
host name
-
isPublicSuffix
final Boolean isPublicSuffix(String domain)
Indicates whether this domain name represents a public suffix, as defined by the Mozilla Foundation's Public Suffix List (PSL). A public suffix is one under which Internet users can directly register names, such as com, co.uk or pvt.k12.wy.us. Examples of domain names that are not public suffixes include google.com, foo.co.uk, and myblog.blogspot.com.
Public suffixes are a proper superset of registry suffixes (see isRegistrySuffix). The list of public suffixes additionally contains privately owned domain names under which Internet users can register subdomains. An example of a public suffix that is not a registry suffix is blogspot.com. Note that it is true that all public suffixes have registry suffixes, since domain name registries collectively control all internet domain names.
For considerations on whether the public suffix or registry suffix designation is more suitable for your application, see this article.
- Returns:
true if this domain name appears exactly on the public suffix list
-
getPublicSuffix
final String getPublicSuffix(String url)
Get the host's public suffix. For example, co.uk, com, etc.
- Since:
6.0
-
getPublicSuffix
final String getPublicSuffix(URL url)
Get the host's public suffix. For example, co.uk, com, etc.
-
isTopPrivateDomain
final Boolean isTopPrivateDomain(URL url)
Indicates whether this domain name is composed of exactly one subdomain component followed by a public suffix (see isPublicSuffix). For example, returns true for google.com, foo.co.uk, and myblog.blogspot.com, but not for www.google.com, co.uk, or blogspot.com.
This method can be used to determine whether a domain is probably the highest level for which cookies may be set, though even that depends on individual browsers' implementations of cookie controls. See RFC 2109 (http://www.ietf.org/rfc/rfc2109.txt) for details.
-
getTopPrivateDomain
final String getTopPrivateDomain(URL url)
Returns the portion of this domain name that is one level beneath the public suffix (see isPublicSuffix). For example, for x.adwords.google.co.uk it returns google.co.uk, since co.uk is a public suffix. Similarly, for myblog.blogspot.com it returns the same domain, myblog.blogspot.com, since blogspot.com is a public suffix.
If isTopPrivateDomain is true, the current domain name instance is returned.
This method can be used to determine the probable highest level parent domain for which cookies may be set, though even that depends on individual browsers' implementations of cookie controls.
-
getTopPrivateDomain
final String getTopPrivateDomain(String url)
Returns the portion of this domain name that is one level beneath the public suffix (see isPublicSuffix). For example, for x.adwords.google.co.uk it returns google.co.uk, since co.uk is a public suffix. Similarly, for myblog.blogspot.com it returns the same domain, myblog.blogspot.com, since blogspot.com is a public suffix.
If isTopPrivateDomain is true, the current domain name instance is returned.
This method can be used to determine the probable highest level parent domain for which cookies may be set, though even that depends on individual browsers' implementations of cookie controls.
-
getTopPrivateDomainOrNull
final String getTopPrivateDomainOrNull(String url)
Returns the portion of this domain name that is one level beneath the public suffix (see isPublicSuffix). For example, for x.adwords.google.co.uk it returns google.co.uk, since co.uk is a public suffix. Similarly, for myblog.blogspot.com it returns the same domain, myblog.blogspot.com, since blogspot.com is a public suffix.
If isTopPrivateDomain is true, the current domain name instance is returned.
This method can be used to determine the probable highest level parent domain for which cookies may be set, though even that depends on individual browsers' implementations of cookie controls.
-
getOrigin
final String getOrigin(String url)
Returns the lowercase origin for the url.
- Parameters:
url - The url to check.
- Returns:
The lowercase origin for the url.
-
getOriginOrNull
final String getOriginOrNull(String url)
Returns the lowercase origin for the url or null if the url is not well-formed.
- Parameters:
url - The url to check.
- Returns:
The lowercase origin for the url, or null if the url is not well-formed.
-
getHostName
final String getHostName(String url)
Returns the lowercase hostname for the url.
- Parameters:
url - The url to check.
- Returns:
The lowercase hostname for the url.
-
getHostName
final String getHostName(String url, String defaultValue)
-
getHostNameOrNull
final String getHostNameOrNull(String url)
Returns the lowercase hostname for the url or null if the url is not well-formed.
- Parameters:
url - The url to check.
- Returns:
The lowercase hostname for the url, or null if the url is not well-formed.
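The "lowercase hostname or null" contract can be illustrated with java.net.URI. This sketch is only an approximation under stated assumptions: the class name is hypothetical, and the real method may use a different parser and a different notion of "well-formed".

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

class HostNameSketch {
    // Return the lowercase host, or null when the url cannot be parsed or has no host.
    static String getHostNameOrNull(String url) {
        try {
            String host = new URI(url).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(getHostNameOrNull("https://WWW.Example.COM/path"));
        System.out.println(getHostNameOrNull("not a url"));
    }
}
```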
-
getINTERNAL_URL_PREFIXES
final List<String> getINTERNAL_URL_PREFIXES()
The prefix of allowed urls
-
getINTERNAL_URLS
final List<String> getINTERNAL_URLS()
The urls of all allowed internal urls
-