Package ai.platon.pulsar.common.collect
Interface Summary

Interface                    Description
CrawlableFatLinkCollector
ExternalUrlLoader            A URL loader loads URLs from external sources, such as files or databases.
Loadable
UrlCache                     The URL cache holds URLs.
UrlPool                      A UrlPool contains many UrlCaches; URLs added to the pool are processed in crawl loops.
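
The relationship between loader, cache, and pool described above can be illustrated with a minimal, self-contained sketch. Every type below (SketchUrlLoader, SketchUrlCache, SketchUrlPool) is a hypothetical stand-in written for this illustration; none of them is the actual ExternalUrlLoader, UrlCache, or UrlPool API, whose signatures are not shown on this page. Keying the caches by an integer priority is also an assumption of the sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.TreeMap;

// Hypothetical stand-ins only; these are NOT the real ai.platon.pulsar types.
final class UrlPoolSketch {

    /** A loader pulls URLs from some external source (a file, a database, ...). */
    interface SketchUrlLoader {
        List<String> loadAll();
    }

    /** A cache simply holds URLs until a crawl loop asks for them. */
    static final class SketchUrlCache {
        private final Queue<String> queue = new ArrayDeque<>();

        void add(String url) { queue.add(url); }

        String poll() { return queue.poll(); }
    }

    /** A pool groups several caches; here they are keyed by an integer priority. */
    static final class SketchUrlPool {
        // Assumption of this sketch: a lower key is served first.
        private final TreeMap<Integer, SketchUrlCache> caches = new TreeMap<>();

        void add(int priority, String url) {
            caches.computeIfAbsent(priority, p -> new SketchUrlCache()).add(url);
        }

        /** Copies everything the loader returns into the cache with the given priority. */
        void loadFrom(SketchUrlLoader loader, int priority) {
            for (String url : loader.loadAll()) {
                add(priority, url);
            }
        }

        /** Returns the next URL from the first non-empty cache, or null if the pool is empty. */
        String next() {
            for (SketchUrlCache cache : caches.values()) {
                String url = cache.poll();
                if (url != null) return url;
            }
            return null;
        }
    }

    /** One pass of a crawl loop: drain the pool in priority order. */
    static List<String> drain(SketchUrlPool pool) {
        List<String> order = new ArrayList<>();
        for (String url = pool.next(); url != null; url = pool.next()) {
            order.add(url);
        }
        return order;
    }

    public static void main(String[] args) {
        SketchUrlPool pool = new SketchUrlPool();
        pool.add(5, "https://example.com/normal");
        // The lambda plays the role of an external source.
        pool.loadFrom(() -> List.of("https://example.com/a", "https://example.com/b"), 1);
        // prints [https://example.com/a, https://example.com/b, https://example.com/normal]
        System.out.println(drain(pool));
    }
}
```

The sketch keeps the loader separate from the pool so that the source of URLs can be swapped without touching the crawl loop, which is one plausible reading of why the package splits these roles into separate interfaces.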

Class Summary

Class                        Description
ChainedDataCollector
PauseDataCollector
DelayCacheCollector
UrlTopic                     A URL topic groups URLs.
UrlTopicComparator
AbstractExternalUrlLoader
DelayExternalUrlLoader
OneLoadExternalUrlLoader
LoadingIterator
ConcurrentLoadingIterable
AbstractUrlCache
ConcurrentUrlCache
LoadingUrlCache              Contains a set of loading queues that can load URLs from an external source using urlLoader.
UrlFeeder                    The URL feeder collects URLs from the URL pool and feeds them to the crawlers.
UrlFeederHelper              A helper class that queries a UrlFeeder for a data collector or inserts one into it.
LocalFileUrlLoader
TemporaryLocalFileUrlLoader
DelayUrl                     The delay URL.
AbstractUrlPool              The abstract URL pool.
ConcurrentUrlPool            The concurrent URL pool.
LoadingUrlPool               A LoadingUrlPool is a UrlPool whose items can be loaded from an external source using a loader.
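
The feeder role described above (collect URLs, then hand them to crawlers) can be sketched as an iterator that refills itself from a chain of collectors. All names below (SketchCollector, QueueCollector, ChainedCollector, Feeder) are hypothetical and exist only for this illustration; they are not the real UrlFeeder or ChainedDataCollector classes, and the one-URL-per-call collection policy is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;

// Hypothetical stand-ins only; these are NOT the real ai.platon.pulsar types.
final class UrlFeederSketch {

    /** A collector moves URLs it currently holds into the sink and returns how many it moved. */
    interface SketchCollector {
        int collectTo(List<String> sink);
    }

    /** Collector backed by a queue; hands out at most one URL per call (assumed policy). */
    static final class QueueCollector implements SketchCollector {
        private final Queue<String> queue = new ArrayDeque<>();

        QueueCollector(List<String> urls) { queue.addAll(urls); }

        @Override
        public int collectTo(List<String> sink) {
            String url = queue.poll();
            if (url == null) return 0;
            sink.add(url);
            return 1;
        }
    }

    /** Chains collectors: each call asks every child once, in order. */
    static final class ChainedCollector implements SketchCollector {
        private final List<SketchCollector> children;

        ChainedCollector(List<SketchCollector> children) { this.children = children; }

        @Override
        public int collectTo(List<String> sink) {
            int moved = 0;
            for (SketchCollector child : children) {
                moved += child.collectTo(sink);
            }
            return moved;
        }
    }

    /** The feeder exposes collected URLs as an iterator that crawlers can pull from. */
    static final class Feeder implements Iterator<String> {
        private final SketchCollector source;
        private final List<String> buffer = new ArrayList<>();

        Feeder(SketchCollector source) { this.source = source; }

        @Override
        public boolean hasNext() {
            // Refill from the collectors only when the buffer has run dry.
            return !buffer.isEmpty() || source.collectTo(buffer) > 0;
        }

        @Override
        public String next() {
            if (!hasNext()) throw new NoSuchElementException();
            return buffer.remove(0);
        }
    }

    /** Pulls every URL the feeder can supply, in the order a crawler would see them. */
    static List<String> feedAll(Feeder feeder) {
        List<String> fed = new ArrayList<>();
        while (feeder.hasNext()) {
            fed.add(feeder.next());
        }
        return fed;
    }

    public static void main(String[] args) {
        SketchCollector first = new QueueCollector(List.of("a1", "a2"));
        SketchCollector second = new QueueCollector(List.of("b1"));
        Feeder feeder = new Feeder(new ChainedCollector(List.of(first, second)));
        // prints [a1, b1, a2]: each refill takes one URL from every collector in turn
        System.out.println(feedAll(feeder));
    }
}
```

Because each refill visits every collector once, the sketch interleaves sources instead of exhausting the first one, so a large source cannot starve a small one. Whether the real feeder behaves this way is not stated on this page.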