Interface UrlAware
public interface UrlAware
UrlAware encapsulates a URL along with additional specifications defining its loading behavior.
A URL represents a Uniform Resource Locator, a pointer to a "resource" on the World Wide Web. A resource can be something as simple as a file or a directory, or it can be a reference to a more complicated object, such as a query to a database or to a search engine.
In Java, a URL object represents a URL. In PulsarRPA, a UrlAware object represents a URL with extra information telling the system how to fetch it.
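To make the idea of "a URL plus load arguments" concrete, here is a minimal Java sketch that splits a url specification such as `http://example.com -priority -2000` into its url and its arguments. The class `UrlSpecSketch`, its fields, and the split-at-first-whitespace rule are assumptions made for illustration; they are not part of PulsarRPA, and the library may parse specifications differently.

```java
/** Hypothetical illustration only; not the PulsarRPA implementation. */
final class UrlSpecSketch {
    final String url;   // the bare url
    final String args;  // the load arguments that followed the url, or "" if none

    private UrlSpecSketch(String url, String args) {
        this.url = url;
        this.args = args;
    }

    /** Splits "url args..." at the first run of whitespace (an assumed convention). */
    static UrlSpecSketch parse(String spec) {
        String[] parts = spec.trim().split("\\s+", 2);
        String url = parts[0];
        String args = parts.length > 1 ? parts[1] : "";
        return new UrlSpecSketch(url, args);
    }

    public static void main(String[] unused) {
        UrlSpecSketch s = parse("http://example.com -priority -2000");
        System.out.println(s.url);   // http://example.com
        System.out.println(s.args);  // -priority -2000
    }
}
```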
-
-
Method Summary
Modifier and Type  Method                         Description
abstract String    getUrl()                       The url specification, can be followed by load arguments.
abstract Unit      setUrl(String url)             The url specification, can be followed by load arguments.
abstract String    getArgs()                      The explicitly specified load arguments.
abstract Unit      setArgs(String args)           The explicitly specified load arguments.
abstract String    getHref()                      The hypertext reference, which defines the address of the linked document.
abstract Unit      setHref(String href)           The hypertext reference, which defines the address of the linked document.
abstract String    getReferrer()                  The referrer url, which is the url of the webpage that contains the hyperlink.
abstract Unit      setReferrer(String referrer)   The referrer url, which is the url of the webpage that contains the hyperlink.
abstract Integer   getPriority()                  Represents the priority of a task, determining the order of execution.
abstract Unit      setPriority(Integer priority)  Represents the priority of a task, determining the order of execution.
abstract String    getConfiguredUrl()             The configured url, which is always "$url $args".
abstract Boolean   isStandard()                   If true, the url is standard and can be converted to a java.net.URL.
abstract URL       getToURL()                     The url converted to a java.net.URL.
abstract URL       getToURLOrNull()               The url converted to a java.net.URL, or null if the url is invalid.
abstract Boolean   isNil()                        A url is Nil if it equals AppConstants.
abstract Boolean   isPersistable()                If true, the url is persistable and can be saved to the database.
abstract String    getText()                      The text of the url, which can be the text of the hyperlink.
abstract Unit      setText(String text)           The text of the url, which can be the text of the hyperlink.
abstract Integer   getOrder()                     The order of the url.
abstract Unit      setOrder(Integer order)        The order of the url.
abstract String    getLabel()                     The url label, which should be a shortcut for the -label option in load options.
abstract Instant   getDeadline()                  The deadline, which should be a shortcut for the -deadline option in load options.
abstract String    getLang()                      Required website language, reserved for future use.
abstract String    getCountry()                   Required website country, reserved for future use.
abstract String    getDistrict()                  Required website district, reserved for future use.
abstract Integer   getNMaxRetry()                 The maximum number of retries.
abstract Integer   getDepth()                     The depth of the url from the root url.
-
Method Detail
-
getHref
abstract String getHref()
The hypertext reference, which defines the address of the linked document. The href is usually extracted from the webpage and serves as the browser's primary choice for navigation.
-
setHref
abstract Unit setHref(String href)
The hypertext reference, which defines the address of the linked document. The href is usually extracted from the webpage and serves as the browser's primary choice for navigation.
-
getReferrer
abstract String getReferrer()
The referrer url, which is the url of the webpage that contains the hyperlink.
-
setReferrer
abstract Unit setReferrer(String referrer)
The referrer url, which is the url of the webpage that contains the hyperlink.
-
getPriority
abstract Integer getPriority()
Represents the priority of a task, determining the order of execution.
The priority value is an integer where a smaller value indicates a higher priority. This is consistent with java.util.concurrent.PriorityBlockingQueue.
If the priority value is not within the range defined by Priority13, it will be adjusted to the nearest valid value. For example, a priority of -2001 will be adjusted to Priority13.HIGHER2.
Note: The priority specified in args or LoadOptions takes precedence over the priority in UrlAware, Hyperlink, etc.
Priority can be set in the following ways:
1. In the url, for example:
    http://example.com -priority -2000
2. In the args, for example:
    Hyperlink("http://example.com", "", args = "-priority -2000")
3. In the ai.platon.pulsar.skeleton.common.options.LoadOptions object, for example:
    session.load("http://example.com", options.apply { priority = -2000 })
4. In the UrlAware object, for example:
    Hyperlink("http://example.com", "", priority = -2000)
If a url is normalized like this:
    session.normalize(url: UrlAware, options: LoadOptions)
the priority will be set in the following order:
1. The priority in the url
2. The priority in the args
3. The priority in the options
Note: Consider using only url args to set the priority.
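The "smaller value means higher priority" rule can be seen directly with java.util.concurrent.PriorityBlockingQueue, which always hands out the smallest element first. The sketch below is only an illustration of that ordering: the `Task` type and `executionOrder` helper are hypothetical names, not PulsarRPA classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

/** Hypothetical illustration of priority ordering; not PulsarRPA code. */
final class PriorityOrderSketch {
    /** A url paired with an integer priority (assumed shape). */
    static final class Task implements Comparable<Task> {
        final String url;
        final int priority;

        Task(String url, int priority) {
            this.url = url;
            this.priority = priority;
        }

        @Override
        public int compareTo(Task other) {
            // Smaller priority value sorts first, so it is taken from the queue first.
            return Integer.compare(priority, other.priority);
        }
    }

    /** Returns the urls in the order the queue would release them. */
    static List<String> executionOrder(List<Task> tasks) {
        PriorityBlockingQueue<Task> queue = new PriorityBlockingQueue<>();
        queue.addAll(tasks);
        List<String> order = new ArrayList<>();
        Task next;
        while ((next = queue.poll()) != null) {
            order.add(next.url);
        }
        return order;
    }

    public static void main(String[] unused) {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("http://example.com/normal", 0));
        tasks.add(new Task("http://example.com/urgent", -2000));
        tasks.add(new Task("http://example.com/later", 1000));
        // The task with priority -2000 comes out first, then 0, then 1000.
        System.out.println(executionOrder(tasks));
    }
}
```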
-
setPriority
abstract Unit setPriority(Integer priority)
Represents the priority of a task, determining the order of execution.
The priority value is an integer where a smaller value indicates a higher priority. This is consistent with java.util.concurrent.PriorityBlockingQueue.
If the priority value is not within the range defined by Priority13, it will be adjusted to the nearest valid value. For example, a priority of -2001 will be adjusted to Priority13.HIGHER2.
Note: The priority specified in args or LoadOptions takes precedence over the priority in UrlAware, Hyperlink, etc.
Priority can be set in the following ways:
1. In the url, for example:
    http://example.com -priority -2000
2. In the args, for example:
    Hyperlink("http://example.com", "", args = "-priority -2000")
3. In the ai.platon.pulsar.skeleton.common.options.LoadOptions object, for example:
    session.load("http://example.com", options.apply { priority = -2000 })
4. In the UrlAware object, for example:
    Hyperlink("http://example.com", "", priority = -2000)
If a url is normalized like this:
    session.normalize(url: UrlAware, options: LoadOptions)
the priority will be set in the following order:
1. The priority in the url
2. The priority in the args
3. The priority in the options
Note: Consider using only url args to set the priority.
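The adjustment of an out-of-range priority to the nearest valid value can be pictured as snapping an integer to the closest entry in a fixed list of levels. The sketch below assumes a made-up list of levels spaced 1000 apart; the actual values defined by Priority13 are not given in this document, so only the general technique is shown. The names `PrioritySnapSketch`, `ASSUMED_LEVELS`, and `nearestLevel` are hypothetical.

```java
/** Hypothetical illustration of "adjust to the nearest valid value"; not PulsarRPA code. */
final class PrioritySnapSketch {
    // Assumed levels for illustration only; the real Priority13 values may differ.
    static final int[] ASSUMED_LEVELS = {-3000, -2000, -1000, 0, 1000, 2000, 3000};

    /** Returns the assumed level closest to the requested priority. */
    static int nearestLevel(int priority) {
        int best = ASSUMED_LEVELS[0];
        for (int level : ASSUMED_LEVELS) {
            // Use long arithmetic so extreme int inputs cannot overflow.
            long candidate = Math.abs((long) level - priority);
            long current = Math.abs((long) best - priority);
            if (candidate < current) {
                best = level;
            }
        }
        return best;
    }

    public static void main(String[] unused) {
        System.out.println(nearestLevel(-2001)); // -2000
        System.out.println(nearestLevel(0));     // 0
    }
}
```

With these assumed levels, -2001 snaps to -2000, which mirrors the -2001 to Priority13.HIGHER2 example in the description.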
-
getConfiguredUrl
abstract String getConfiguredUrl()
The configured url, which is always "$url $args".
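As a rough picture of the "$url $args" form, the following Java sketch joins a url and its load arguments with a single space. The class and method names are hypothetical, and the handling of missing or blank args (returning the url alone) is an assumption of this sketch, not documented PulsarRPA behavior.

```java
/** Hypothetical illustration of the "$url $args" form; not PulsarRPA code. */
final class ConfiguredUrlSketch {
    /** Joins url and args with one space; returns the url alone when args is null or blank (assumed). */
    static String configuredUrl(String url, String args) {
        if (args == null || args.trim().isEmpty()) {
            return url.trim();
        }
        return url.trim() + " " + args.trim();
    }

    public static void main(String[] unused) {
        // Prints: http://example.com -priority -2000
        System.out.println(configuredUrl("http://example.com", "-priority -2000"));
    }
}
```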
-
isStandard
abstract Boolean isStandard()
If true, the url is standard and can be converted to a java.net.URL
-
getToURL
abstract URL getToURL()
The url converted to a java.net.URL.
-
getToURLOrNull
abstract URL getToURLOrNull()
The url converted to a java.net.URL, or null if the url is invalid.
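The relationship between isStandard, getToURL, and getToURLOrNull can be sketched with the standard java.net classes: attempt the conversion, and report null (or false) when it fails. This is a minimal sketch under the assumption that "standard" means "parses as an absolute URL with a known protocol"; the class `UrlConversionSketch` and its static helpers are hypothetical and may not match how PulsarRPA validates urls.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

/** Hypothetical illustration of a null-returning URL conversion; not PulsarRPA code. */
final class UrlConversionSketch {
    /** Returns the parsed URL, or null if the text cannot be converted. */
    static URL toURLOrNull(String spec) {
        if (spec == null) {
            return null;
        }
        try {
            // URI rejects illegal characters; toURL rejects relative URIs and unknown protocols.
            return new URI(spec).toURL();
        } catch (URISyntaxException | MalformedURLException | IllegalArgumentException e) {
            return null;
        }
    }

    /** Assumed meaning of "standard": the text converts to a java.net.URL. */
    static boolean isStandard(String spec) {
        return toURLOrNull(spec) != null;
    }

    public static void main(String[] unused) {
        System.out.println(isStandard("http://example.com")); // true
        System.out.println(isStandard("not a url"));          // false
    }
}
```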
-
isPersistable
abstract Boolean isPersistable()
If true, the url is persistable and can be saved to the database. Not all urls are persistable; for example, a ListenableHyperlink with events is not persistable.
-
setText
abstract Unit setText(String text)
The text of the url, which can be the text of the hyperlink.
-
getLabel
abstract String getLabel()
The url label, which should be a shortcut for the -label option in load options.
-
getDeadline
abstract Instant getDeadline()
The deadline, which should be a shortcut for the -deadline option in load options.
-
getCountry
abstract String getCountry()
Required website country, reserved for future use
-
getDistrict
abstract String getDistrict()
Required website district, reserved for future use
-
getNMaxRetry
abstract Integer getNMaxRetry()
The maximum number of retries.
-