public interface RobotsTxt
Use read(java.io.InputStream) to read and parse robots.txt.
| Modifier and Type | Method and Description |
|---|---|
| default Grant | ask(String userAgent, String path) - Asks for grant. |
| Integer | getCrawlDelay() - Deprecated. Use ask(java.lang.String, java.lang.String) to get a Grant, from which Grant.getCrawlDelay() can be invoked. |
| List<String> | getDisallowList(String userAgent) - Gets a list of disallowed resources. |
| String | getHost() - Gets host. |
| List<String> | getSitemaps() - Gets site maps. |
| boolean | query(String userAgent, String path) - Checks access to the given HTTP path. |
| static RobotsTxt | read(InputStream input) - Reads and parses robots.txt content from the input stream. |
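To make the summary concrete, here is a small, self-contained sketch of the semantics these methods describe. It is not the library's implementation, and the class RobotsTxtSketch is hypothetical; it only approximates the behavior under simplifying assumptions: Disallow and Crawl-delay directives are collected from user-agent groups that match, a path is allowed unless some Disallow rule is a prefix of it, and a missing Crawl-delay reads as 0 (matching getCrawlDelay()'s documented default). Real robots.txt handling (Allow overrides, wildcards, longest-match precedence) is more involved.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch class; not part of the library.
class RobotsTxtSketch {
    private final List<String> disallowed = new ArrayList<>();
    private int crawlDelay = 0; // mirrors "0 if no delay declared"

    // Rough analogue of RobotsTxt.read(InputStream): parse Disallow and
    // Crawl-delay lines from the groups that apply to the given user agent.
    static RobotsTxtSketch read(InputStream input, String userAgent) throws IOException {
        RobotsTxtSketch r = new RobotsTxtSketch();
        boolean applies = false;
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(input, StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                int colon = line.indexOf(':');
                if (colon < 0 || line.trim().startsWith("#")) continue;
                String field = line.substring(0, colon).trim().toLowerCase();
                String value = line.substring(colon + 1).trim();
                switch (field) {
                    case "user-agent":
                        applies = value.equals("*") || value.equalsIgnoreCase(userAgent);
                        break;
                    case "disallow":
                        if (applies && !value.isEmpty()) r.disallowed.add(value);
                        break;
                    case "crawl-delay":
                        if (applies) {
                            try {
                                r.crawlDelay = Integer.parseInt(value);
                            } catch (NumberFormatException ignore) {
                                // malformed value: treat as undeclared
                            }
                        }
                        break;
                }
            }
        }
        return r;
    }

    // Rough analogue of query(userAgent, path): access is allowed unless some
    // Disallow rule is a prefix of the path (no Allow overrides, no wildcards).
    boolean query(String path) {
        return disallowed.stream().noneMatch(path::startsWith);
    }

    List<String> getDisallowList() { return disallowed; }
    int getCrawlDelay() { return crawlDelay; }

    public static void main(String[] args) throws IOException {
        String robots = "User-agent: *\nDisallow: /private/\nCrawl-delay: 10\n";
        RobotsTxtSketch r = read(
                new ByteArrayInputStream(robots.getBytes(StandardCharsets.UTF_8)), "MyBot");
        System.out.println(r.getDisallowList());   // [/private/]
        System.out.println(r.query("/index.html")); // true
        System.out.println(r.query("/private/a"));  // false
        System.out.println(r.getCrawlDelay());      // 10
    }
}
```

The sketch folds the user agent into read(...) for brevity; the real interface takes it per call in ask/query/getDisallowList, so one parsed RobotsTxt can serve many crawlers.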
boolean query(String userAgent, String path)
Checks access to the given HTTP path.
Parameters: userAgent - user agent used to evaluate authorization; path - path to access.
Returns: true if access to the requested path is allowed.

default Grant ask(String userAgent, String path)
Asks for grant.
Parameters: userAgent - user agent used to evaluate authorization; path - path to access.
Returns: access grant (never null).

@Deprecated Integer getCrawlDelay()
Deprecated. Use ask(java.lang.String, java.lang.String) to get a Grant, from which Grant.getCrawlDelay() can be invoked.
Returns: crawl delay, or 0 if no delay declared.

String getHost()
Gets host.
Returns: host, or null if no host declared.

List<String> getDisallowList(String userAgent)
Gets a list of disallowed resources.
Parameters: userAgent - user agent.

static RobotsTxt read(InputStream input) throws IOException
Reads and parses robots.txt content from the input stream.
Parameters: input - stream of content.
Throws: IOException - if unable to read content.

Copyright © 2019. All rights reserved.