Package ai.platon.pulsar.common.collect
Class UrlFeeder
All Implemented Interfaces: kotlin.collections.Iterable
public final class UrlFeeder implements Iterable<UrlAware>
The URL feeder collects URLs from the URL pool and feeds them to the crawlers.
The feeder collects URLs using DataCollectors; each DataCollector collects URLs from exactly one UrlCache.
Users can register multiple UrlCaches and DataCollectors for different types of tasks.
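The relationship described above (one collector per cache, several collectors per feeder) can be sketched as follows. This is a minimal illustration, not the library's implementation: `QueueCollector` and `FeederSketch` are hypothetical stand-ins for `PriorityDataCollector` and `UrlFeeder`, plain strings stand in for `UrlAware`, and the rule that a smaller priority value is served first is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a collector bound to exactly one cache (a queue here).
class QueueCollector {
    final String name;
    final int priority;          // assumption: a smaller value is served first
    final Queue<String> cache;   // the single cache this collector reads from

    QueueCollector(String name, int priority, Queue<String> cache) {
        this.name = name;
        this.priority = priority;
        this.cache = cache;
    }

    // Moves at most one url from the cache into the sink; returns how many were moved.
    int collectTo(List<String> sink) {
        String url = cache.poll();
        if (url == null) return 0;
        sink.add(url);
        return 1;
    }
}

// Hypothetical feeder: asks collectors in priority order until one yields a url.
class FeederSketch {
    private final List<QueueCollector> collectors = new ArrayList<>();

    FeederSketch addCollector(QueueCollector collector) {
        collectors.add(collector);
        // The sort is stable, so collectors with equal priority keep registration order.
        collectors.sort(Comparator.comparingInt(c -> c.priority));
        return this; // fluent style, mirroring the UrlFeeder return type of addCollector
    }

    // Returns the next url, or null when every cache is empty.
    String next() {
        List<String> sink = new ArrayList<>(1);
        for (QueueCollector collector : collectors) {
            if (collector.collectTo(sink) > 0) return sink.get(0);
        }
        return null;
    }
}
```

In this sketch a low-priority cache is only drained once every higher-priority cache is empty, which is one simple way to give different task types different urgency.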
-
-
Field Summary
Fields
Modifier and Type    Field    Description
private final ConcurrentLoadingIterable<UrlAware>    loadingIterable
private final Integer    cacheSize
private final Collection<PriorityDataCollector<UrlAware>>    openCollectors
private final List<PriorityDataCollector<UrlAware>>    collectors
private final String    abstract
private final String    report
private final UrlPool    urlPool
private final Integer    lowerCacheSize
private final Boolean    enableDefaults
-
Method Summary
-
-
Method Detail
-
getLoadingIterable
final ConcurrentLoadingIterable<UrlAware> getLoadingIterable()
-
getCacheSize
final Integer getCacheSize()
-
getOpenCollectors
final Collection<PriorityDataCollector<UrlAware>> getOpenCollectors()
-
getCollectors
final List<PriorityDataCollector<UrlAware>> getCollectors()
-
getAbstract
final String getAbstract()
-
getUrlPool
final UrlPool getUrlPool()
-
getLowerCacheSize
final Integer getLowerCacheSize()
-
getEnableDefaults
final Boolean getEnableDefaults()
-
addFirst
final Unit addFirst(UrlAware url)
Add a hyperlink to the very beginning of the fetch queue, so it will be served first.
-
addLast
final Unit addLast(UrlAware url)
Add a hyperlink to the end of the fetch queue, so it will be served last.
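The ordering contract of addFirst and addLast can be illustrated with a double-ended queue. This is only a sketch of the observable behaviour: `FetchQueueSketch` is a hypothetical class, strings stand in for `UrlAware`, and how UrlFeeder realizes the ordering internally (for example through separate caches) is not described on this page.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical single-queue model of the fetch queue.
class FetchQueueSketch {
    private final Deque<String> queue = new ArrayDeque<>();

    // The url jumps ahead of everything already queued.
    void addFirst(String url) {
        queue.addFirst(url);
    }

    // The url waits behind everything already queued.
    void addLast(String url) {
        queue.addLast(url);
    }

    // Returns the url to fetch next, or null if the queue is empty.
    String next() {
        return queue.pollFirst();
    }
}
```

With two urls added via addLast and a third added via addFirst, the third is the one returned by the first call to `next()`.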
-
estimatedOrder
final Integer estimatedOrder(Integer priority)
Estimate the position in the fetch order of the next task to be added with the given priority.
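One plausible way to compute such an estimate is to count the tasks already queued that would be served before a new task of the given priority. The sketch below rests on assumptions that this page does not confirm: a smaller priority value is served first, tasks of equal priority are served in arrival order, and `OrderEstimatorSketch` is a hypothetical class rather than part of the library.

```java
import java.util.TreeMap;

// Hypothetical estimator that tracks how many tasks are queued per priority value.
class OrderEstimatorSketch {
    // priority value -> number of queued tasks at that priority
    private final TreeMap<Integer, Integer> queuedByPriority = new TreeMap<>();

    // Records one more queued task at the given priority.
    void enqueue(int priority) {
        queuedByPriority.merge(priority, 1, Integer::sum);
    }

    // Counts queued tasks whose priority value is less than or equal to the given one,
    // that is, the tasks assumed to be served before a newly added task.
    int estimatedOrder(int priority) {
        int ahead = 0;
        for (int count : queuedByPriority.headMap(priority, true).values()) {
            ahead += count;
        }
        return ahead;
    }
}
```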
-
addDefaultCollectors
final UrlFeeder addDefaultCollectors()
-
addCollector
final UrlFeeder addCollector(PriorityDataCollector<UrlAware> collector)
-
addCollectors
final UrlFeeder addCollectors(Iterable<PriorityDataCollector<UrlAware>> collectors)
-
findByName
final List<PriorityDataCollector<UrlAware>> findByName(String name)
-
findByName
final List<PriorityDataCollector<UrlAware>> findByName(Iterable<String> names)
-
findByName
final List<PriorityDataCollector<UrlAware>> findByName(Regex regex)
-
findByNameLike
final List<PriorityDataCollector<UrlAware>> findByNameLike(String name)
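The lookup overloads above are undocumented, so the following sketch only shows one reasonable reading of them: exact match for a single name, membership for a set of names, full-string match for a regular expression, and substring match for findByNameLike. `CollectorRegistrySketch` is hypothetical, collectors are reduced to their names, and `java.util.regex.Pattern` stands in for Kotlin's `Regex`.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical registry that stores collector names only.
class CollectorRegistrySketch {
    private final List<String> names = new ArrayList<>();

    CollectorRegistrySketch add(String name) {
        names.add(name);
        return this;
    }

    // Exact match on a single name.
    List<String> findByName(String name) {
        return filter(n -> n.equals(name));
    }

    // Match against any of the given names.
    List<String> findByName(Collection<String> wanted) {
        return filter(wanted::contains);
    }

    // The whole name must match the pattern (assumption: full match, not partial).
    List<String> findByName(Pattern regex) {
        return filter(n -> regex.matcher(n).matches());
    }

    // Assumption: "like" means the name contains the given fragment.
    List<String> findByNameLike(String fragment) {
        return filter(n -> n.contains(fragment));
    }

    // Returns matching names in registration order.
    private List<String> filter(Predicate<String> predicate) {
        return names.stream().filter(predicate).collect(Collectors.toList());
    }
}
```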
-
remove
final Boolean remove(DataCollector<UrlAware> collector)
-
removeAll
final Boolean removeAll(Collection<DataCollector<UrlAware>> collectors)
-