Class AggregationSpout
java.lang.Object
org.apache.storm.topology.base.BaseComponent
org.apache.storm.topology.base.BaseRichSpout
com.digitalpebble.stormcrawler.persistence.AbstractQueryingSpout
com.digitalpebble.stormcrawler.opensearch.persistence.AbstractSpout
com.digitalpebble.stormcrawler.opensearch.persistence.AggregationSpout
- All Implemented Interfaces:
Serializable, org.apache.storm.spout.ISpout, org.apache.storm.topology.IComponent, org.apache.storm.topology.IRichSpout, org.opensearch.action.ActionListener<org.opensearch.action.search.SearchResponse>
- Direct Known Subclasses:
HybridSpout
public class AggregationSpout
extends AbstractSpout
implements org.opensearch.action.ActionListener<org.opensearch.action.search.SearchResponse>
Spout which pulls URLs from an ES index. Use a single instance unless you use 'es.status.routing'
with the StatusUpdaterBolt, in which case you need exactly as many spout instances as there are
ES shards. Guarantees a good mix of URLs by aggregating them by an arbitrary field,
e.g. key.
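The effect of aggregating by a field can be illustrated with a small in-memory sketch. It is not the spout's implementation, which builds a query and lets the index do the grouping; it only shows how limits such as maxBucketNum and maxURLsPerBucket (both listed in the Field Summary) would keep a single key from dominating a batch. The class BucketMixSketch, the Candidate type and the selectMix method are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of StormCrawler.
class BucketMixSketch {

    /** Hypothetical stand-in for a status document: the bucketing key and the URL. */
    static final class Candidate {
        final String key;
        final String url;

        Candidate(String key, String url) {
            this.key = key;
            this.url = url;
        }
    }

    /**
     * Keeps at most maxURLsPerBucket URLs for each of the first maxBucketNum
     * keys encountered, preserving input order within each bucket.
     */
    static List<String> selectMix(
            List<Candidate> candidates, int maxBucketNum, int maxURLsPerBucket) {
        Map<String, List<String>> buckets = new LinkedHashMap<>();
        for (Candidate c : candidates) {
            List<String> bucket = buckets.get(c.key);
            if (bucket == null) {
                if (buckets.size() >= maxBucketNum) {
                    continue; // bucket limit reached: ignore URLs from new keys
                }
                bucket = new ArrayList<>();
                buckets.put(c.key, bucket);
            }
            if (bucket.size() < maxURLsPerBucket) {
                bucket.add(c.url); // per-bucket limit not reached yet
            }
        }
        List<String> selected = new ArrayList<>();
        for (List<String> bucket : buckets.values()) {
            selected.addAll(bucket);
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = List.of(
                new Candidate("a.example", "https://a.example/1"),
                new Candidate("a.example", "https://a.example/2"),
                new Candidate("a.example", "https://a.example/3"),
                new Candidate("b.example", "https://b.example/1"),
                new Candidate("c.example", "https://c.example/1"),
                new Candidate("b.example", "https://b.example/2"));
        // Two buckets, two URLs each: the third a.example URL and all of c.example are skipped.
        System.out.println(selectMix(candidates, 2, 2));
    }
}
```

With the limits set to 2 and 2, the sample input yields two URLs from a.example and two from b.example, even though a.example has more candidates waiting.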
Nested Class Summary
Nested classes/interfaces inherited from class com.digitalpebble.stormcrawler.persistence.AbstractQueryingSpout
AbstractQueryingSpout.InProcessMap<K extends Object,V extends Object> -
Field Summary
Fields
Fields inherited from class com.digitalpebble.stormcrawler.opensearch.persistence.AbstractSpout
bucketSortField, client, filterQueries, indexName, logIdprefix, maxBucketNum, maxURLsPerBucket, OSBoltType, OSStatusBucketFieldParamName, OSStatusBucketSortFieldParamName, OSStatusFilterParamName, OSStatusGlobalSortFieldParamName, OSStatusIndexNameParamName, OSStatusMaxBucketParamName, OSStatusMaxURLsParamName, OSStatusQueryTimeoutParamName, partitionField, queryDate, queryTimeout, shardID, totalSortField
Fields inherited from class com.digitalpebble.stormcrawler.persistence.AbstractQueryingSpout
_collector, beingProcessed, buffer, eventCounter, isInQuery, lastTimeResetToNOW, maxDelayBetweenQueries, minDelayBetweenQueries, queryTimes, resetFetchDateAfterNSecs, resetFetchDateParamName, StatusMaxDelayParamName, StatusMinDelayParamName, StatusTTLPurgatory
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type    Method    Description
void    onFailure
void    onResponse(org.opensearch.action.search.SearchResponse response)
void    open(Map<String, Object> stormConf, org.apache.storm.task.TopologyContext context, org.apache.storm.spout.SpoutOutputCollector collector)
protected void    populateBuffer()    Builds a query and uses it to retrieve the results from OS
protected void    sortValuesForKey(String key, Object[] sortValues)
Methods inherited from class com.digitalpebble.stormcrawler.opensearch.persistence.AbstractSpout
ack, addHitToBuffer, close, fail, fromKeyValues
Methods inherited from class com.digitalpebble.stormcrawler.persistence.AbstractQueryingSpout
activate, deactivate, declareOutputFields, getTimeLastQuerySent, markQueryReceivedNow, nextTuple
Methods inherited from class org.apache.storm.topology.base.BaseComponent
getComponentConfiguration
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.storm.topology.IComponent
getComponentConfiguration
-
Field Details
-
currentBuckets
-
-
Constructor Details
-
AggregationSpout
public AggregationSpout()
-
-
Method Details
-
open
public void open(Map<String, Object> stormConf, org.apache.storm.task.TopologyContext context, org.apache.storm.spout.SpoutOutputCollector collector)
- Specified by:
open in interface org.apache.storm.spout.ISpout
- Overrides:
open in class AbstractSpout
-
populateBuffer
protected void populateBuffer()
Description copied from class: AbstractSpout
Builds a query and uses it to retrieve the results from OS.
- Specified by:
populateBuffer in class AbstractSpout
-
onFailure
- Specified by:
onFailure in interface org.opensearch.action.ActionListener<org.opensearch.action.search.SearchResponse>
-
onResponse
public void onResponse(org.opensearch.action.search.SearchResponse response)
- Specified by:
onResponse in interface org.opensearch.action.ActionListener<org.opensearch.action.search.SearchResponse>
-
sortValuesForKey
-