Class SeedUrlConfiguration

    • Method Detail

      • hasSeedUrls

        public final boolean hasSeedUrls()
        For responses, returns true if the service returned a value for the SeedUrls property. This does not check whether the value is non-empty; for that, call isEmpty() on the list returned by seedUrls(). The check is useful because the SDK never returns a null collection or map, so this method is how you distinguish a service that returned nothing (or null) from one that returned an empty collection or map. For requests, returns true if a value for the property was specified in the request builder, and false otherwise.
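
        The three cases this method helps separate (absent, empty, populated) can be sketched as follows. SeedUrlsStandIn is a hypothetical stand-in written only for this illustration; it models the hasSeedUrls()/seedUrls() contract described above and is not the SDK class.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for SeedUrlConfiguration; it only models the
// hasSeedUrls()/seedUrls() contract described above, not the real SDK class.
final class SeedUrlsStandIn {
    private final List<String> raw; // null means the value was never supplied

    SeedUrlsStandIn(List<String> raw) {
        this.raw = raw;
    }

    boolean hasSeedUrls() {
        return raw != null;
    }

    List<String> seedUrls() {
        // Never null: an absent value is presented as an empty list.
        return raw == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(raw);
    }
}

final class SeedUrlPresence {
    // Check presence first, then emptiness, so "absent" and "empty" stay distinct.
    static String classify(SeedUrlsStandIn config) {
        if (!config.hasSeedUrls()) {
            return "absent";
        }
        return config.seedUrls().isEmpty() ? "empty" : "populated";
    }

    public static void main(String[] args) {
        System.out.println(classify(new SeedUrlsStandIn(null)));
        System.out.println(classify(new SeedUrlsStandIn(Collections.<String>emptyList())));
        System.out.println(classify(new SeedUrlsStandIn(Arrays.asList("https://abc.example.com"))));
    }
}
```

        Calling seedUrls().isEmpty() alone would report both the absent and the empty case as empty, which is why the presence check comes first.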
      • seedUrls

        public final List<String> seedUrls()

        The list of seed or starting point URLs of the websites you want to crawl.

        The list can include a maximum of 100 seed URLs.

        Attempts to modify the collection returned by this method will result in an UnsupportedOperationException.

        This method will never return null. If you would like to know whether the service returned this field (so that you can differentiate between null and empty), you can use the hasSeedUrls() method.

        Returns:
        The seed URLs as an unmodifiable list; never null. The list holds at most 100 seed URLs.
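
        Because the returned list rejects modification, changes have to be made on a copy. The sketch below assumes the returned list behaves like one produced by Collections.unmodifiableList; SeedUrlLists, withExtraSeed, and the client-side size check are hypothetical helpers for illustration, not part of the SDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the SDK.
final class SeedUrlLists {
    // Documented limit: the list can include a maximum of 100 seed URLs.
    static final int MAX_SEED_URLS = 100;

    // Returns a new mutable list containing the original URLs plus one more.
    static List<String> withExtraSeed(List<String> seedUrls, String url) {
        if (seedUrls.size() >= MAX_SEED_URLS) {
            throw new IllegalArgumentException(
                    "at most " + MAX_SEED_URLS + " seed URLs are allowed");
        }
        List<String> copy = new ArrayList<>(seedUrls);
        copy.add(url);
        return copy;
    }

    public static void main(String[] args) {
        // Stand-in for the value returned by seedUrls().
        List<String> returned = Collections.unmodifiableList(
                Arrays.asList("https://abc.example.com"));
        try {
            returned.add("https://docs.example.com");
        } catch (UnsupportedOperationException e) {
            System.out.println("direct modification rejected");
        }
        List<String> updated = withExtraSeed(returned, "https://docs.example.com");
        System.out.println(updated);
    }
}
```

        The copy can then be passed to a new builder; the original list is left untouched.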

      • webCrawlerMode

        public final WebCrawlerMode webCrawlerMode()

        The mode that determines which URLs are crawled, relative to the seed URLs. You can choose one of the following modes:

        • HOST_ONLY: crawl only the host names of the seed URLs. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.

        • SUBDOMAINS: crawl the seed host names and their subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.

        • EVERYTHING: crawl the seed host names, their subdomains, and other domains that the web pages link to.

        The default mode is HOST_ONLY.
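
        The host examples above can be restated as a small predicate. This is only a sketch that mirrors the documented examples; CrawlScopeSketch and inScope are hypothetical names, and the service's actual URL-matching rules may differ.

```java
import java.util.Locale;

// Hypothetical helper that mirrors the documented examples; the service's
// actual matching rules may differ.
final class CrawlScopeSketch {
    enum Mode { HOST_ONLY, SUBDOMAINS, EVERYTHING }

    // Decides whether a host reached by following links falls inside the crawl scope.
    static boolean inScope(Mode mode, String seedHost, String candidateHost) {
        String seed = seedHost.toLowerCase(Locale.ROOT);
        String candidate = candidateHost.toLowerCase(Locale.ROOT);
        switch (mode) {
            case HOST_ONLY:
                return candidate.equals(seed);
            case SUBDOMAINS:
                // The seed host itself, or any host that ends with ".<seed host>".
                return candidate.equals(seed) || candidate.endsWith("." + seed);
            case EVERYTHING:
                // Assumes the candidate was reached through a link on a crawled page.
                return true;
            default:
                throw new IllegalArgumentException("unexpected mode: " + mode);
        }
    }

    public static void main(String[] args) {
        System.out.println(inScope(Mode.HOST_ONLY, "abc.example.com", "a.abc.example.com"));
        System.out.println(inScope(Mode.SUBDOMAINS, "abc.example.com", "a.abc.example.com"));
    }
}
```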

        If the service returns an enum value that is not available in the current SDK version, webCrawlerMode will return WebCrawlerMode.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from webCrawlerModeAsString().

        Returns:
        The crawl mode: HOST_ONLY, SUBDOMAINS, or EVERYTHING, as described above. The default mode is HOST_ONLY.

        See Also:
        WebCrawlerMode
      • webCrawlerModeAsString

        public final String webCrawlerModeAsString()

        Returns the crawl mode as the raw string value returned by the service. The modes HOST_ONLY, SUBDOMAINS, and EVERYTHING are described under webCrawlerMode(). The default mode is HOST_ONLY.

        If the service returns an enum value that is not available in the current SDK version, webCrawlerMode will return WebCrawlerMode.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from webCrawlerModeAsString().
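
        One way to handle a mode that the current SDK version does not know is to branch on the enum and fall back to the raw string. The sketch below uses a local stand-in enum, CrawlModeSketch, with a hypothetical fromRaw helper; neither is the SDK's WebCrawlerMode, and only the constant names are taken from the text above.

```java
// Local stand-in that mirrors the documented modes plus the "unknown"
// sentinel; it is not the SDK's WebCrawlerMode enum.
enum CrawlModeSketch {
    HOST_ONLY, SUBDOMAINS, EVERYTHING, UNKNOWN_TO_SDK_VERSION;

    // Maps a raw service value to a known constant, or to the sentinel.
    static CrawlModeSketch fromRaw(String raw) {
        for (CrawlModeSketch mode : values()) {
            if (mode != UNKNOWN_TO_SDK_VERSION && mode.name().equals(raw)) {
                return mode;
            }
        }
        return UNKNOWN_TO_SDK_VERSION;
    }
}

final class CrawlModeHandling {
    // Branches on the typed value and keeps the raw string for the unknown case.
    static String describe(String rawMode) {
        switch (CrawlModeSketch.fromRaw(rawMode)) {
            case HOST_ONLY:
                return "crawl the seed host only";
            case SUBDOMAINS:
                return "crawl the seed host and its subdomains";
            case EVERYTHING:
                return "crawl the seed host, its subdomains, and linked domains";
            default:
                return "unrecognized mode: " + rawMode;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("SUBDOMAINS"));
        System.out.println(describe("SOME_FUTURE_MODE"));
    }
}
```

        With the real SDK, the typed value would come from webCrawlerMode() and the raw value from webCrawlerModeAsString(); logging the raw value in the fallback branch keeps the information that the enum cannot represent.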

        Returns:
        The crawl mode as a string: HOST_ONLY, SUBDOMAINS, or EVERYTHING, or the raw value returned by the service if this SDK version does not recognize it.

        See Also:
        WebCrawlerMode
      • hashCode

        public final int hashCode()
        Overrides:
        hashCode in class Object
      • equals

        public final boolean equals(Object obj)
        Overrides:
        equals in class Object
      • toString

        public final String toString()
        Returns a string representation of this object. This is useful for testing and debugging. Sensitive data will be redacted from this string using a placeholder value.
        Overrides:
        toString in class Object
      • getValueForField

        public final <T> Optional<T> getValueForField(String fieldName,
                                                      Class<T> clazz)