Class StatusUpdaterBolt
java.lang.Object
org.apache.storm.topology.base.BaseComponent
org.apache.storm.topology.base.BaseRichBolt
com.digitalpebble.stormcrawler.persistence.AbstractStatusUpdaterBolt
com.digitalpebble.stormcrawler.opensearch.persistence.StatusUpdaterBolt
- All Implemented Interfaces:
com.github.benmanes.caffeine.cache.RemovalListener<String,List<org.apache.storm.tuple.Tuple>>, Serializable, org.apache.storm.task.IBolt, org.apache.storm.topology.IComponent, org.apache.storm.topology.IRichBolt, org.opensearch.action.bulk.BulkProcessor.Listener
public class StatusUpdaterBolt
extends AbstractStatusUpdaterBolt
implements com.github.benmanes.caffeine.cache.RemovalListener<String,List<org.apache.storm.tuple.Tuple>>, org.opensearch.action.bulk.BulkProcessor.Listener
Simple bolt which stores the status of URLs in OpenSearch. It takes the tuples coming from the
'status' stream and is meant to be used in combination with a spout that reads from the index.
-
Field Summary
Fields inherited from class com.digitalpebble.stormcrawler.persistence.AbstractStatusUpdaterBolt
_collector, AS_IS_NEXTFETCHDATE_METADATA, cacheConfigParamName, maxFetchErrorsParamName, roundDateParamName, useCacheParamName
-
Constructor Summary
Constructors
Constructor and Description
StatusUpdaterBolt(String boltType)
    Loads the configuration using a substring different from the default value 'status', in order to distinguish it from the spout configurations.
-
Method Summary
Modifier and Type, Method and Description
void afterBulk(long executionId, org.opensearch.action.bulk.BulkRequest request, Throwable throwable)
void afterBulk(long executionId, org.opensearch.action.bulk.BulkRequest request, org.opensearch.action.bulk.BulkResponse response)
void beforeBulk(long executionId, org.opensearch.action.bulk.BulkRequest request)
void cleanup()
protected String getIndexName
    Must be overridden to implement custom index names based on metadata. By default, the index name coming from the configuration is used.
void onRemoval(@Nullable String key, @Nullable List<org.apache.storm.tuple.Tuple> value, @NotNull com.github.benmanes.caffeine.cache.RemovalCause cause)
void prepare(Map<String, Object> stormConf, org.apache.storm.task.TopologyContext context, org.apache.storm.task.OutputCollector collector)
void store(String url, Status status, Metadata metadata, Optional<Date> nextFetch, org.apache.storm.tuple.Tuple tuple)
Methods inherited from class com.digitalpebble.stormcrawler.persistence.AbstractStatusUpdaterBolt
ack, declareOutputFields, execute, getDocumentID
Methods inherited from class org.apache.storm.topology.base.BaseComponent
getComponentConfiguration
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.storm.topology.IComponent
getComponentConfiguration
-
Constructor Details
-
StatusUpdaterBolt
public StatusUpdaterBolt()
-
StatusUpdaterBolt
public StatusUpdaterBolt(String boltType)
Loads the configuration using a substring different from the default value 'status', in order to distinguish it from the spout configurations.
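The idea of a bolt type acting as a substring of the configuration keys can be illustrated with a minimal, self-contained sketch. It does not use the StormCrawler classes: the class name, the helper method, the key pattern "opensearch.%s.index.name" and the bolt type "queue" are all assumptions made for illustration, and the actual parameter names used by the bolt may differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of StormCrawler. */
class BoltTypeConfigSketch {

    // ASSUMPTION: illustrative key pattern; the real parameter names may differ.
    static final String INDEX_NAME_KEY_PATTERN = "opensearch.%s.index.name";

    /** Looks up the index name for a given bolt type, or returns the fallback. */
    static String indexNameFor(Map<String, Object> conf, String boltType, String fallback) {
        // The bolt type is substituted into the key, so each type reads its own entry.
        String key = String.format(INDEX_NAME_KEY_PATTERN, boltType);
        Object value = conf.get(key);
        return value != null ? value.toString() : fallback;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("opensearch.status.index.name", "status");
        conf.put("opensearch.queue.index.name", "queue-index");

        // Default bolt type versus a custom one ("queue" is a made-up value).
        System.out.println(indexNameFor(conf, "status", "fallback"));
        System.out.println(indexNameFor(conf, "queue", "fallback"));
    }
}
```

With such a scheme, two components configured with different bolt types can share one configuration map without overwriting each other's settings.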
-
-
Method Details
-
prepare
public void prepare(Map<String, Object> stormConf, org.apache.storm.task.TopologyContext context, org.apache.storm.task.OutputCollector collector)
- Specified by:
prepare in interface org.apache.storm.task.IBolt
- Overrides:
prepare in class AbstractStatusUpdaterBolt
-
cleanup
public void cleanup()
- Specified by:
cleanup in interface org.apache.storm.task.IBolt
- Overrides:
cleanup in class org.apache.storm.topology.base.BaseRichBolt
-
store
public void store(String url, Status status, Metadata metadata, Optional<Date> nextFetch, org.apache.storm.tuple.Tuple tuple) throws Exception
- Specified by:
store in class AbstractStatusUpdaterBolt
- Throws:
Exception
-
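onRemoval
public void onRemoval(@Nullable String key, @Nullable List<org.apache.storm.tuple.Tuple> value, @NotNull com.github.benmanes.caffeine.cache.RemovalCause cause)
- Specified by:
onRemoval in interface com.github.benmanes.caffeine.cache.RemovalListener<String,List<org.apache.storm.tuple.Tuple>>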
-
afterBulk
public void afterBulk(long executionId, org.opensearch.action.bulk.BulkRequest request, org.opensearch.action.bulk.BulkResponse response)
- Specified by:
afterBulk in interface org.opensearch.action.bulk.BulkProcessor.Listener
-
afterBulk
public void afterBulk(long executionId, org.opensearch.action.bulk.BulkRequest request, Throwable throwable)
- Specified by:
afterBulk in interface org.opensearch.action.bulk.BulkProcessor.Listener
-
beforeBulk
public void beforeBulk(long executionId, org.opensearch.action.bulk.BulkRequest request)
- Specified by:
beforeBulk in interface org.opensearch.action.bulk.BulkProcessor.Listener
-
getIndexName
Must be overridden to implement custom index names based on metadata. By default, the index name coming from the configuration is used.
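The override pattern described for getIndexName can be sketched as follows. This is a stand-alone illustration, not the StormCrawler code: the parameter type (a plain Map standing in for the metadata), the class names and the "domain" key are assumptions, since the signature of the real method is not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-ins, not part of StormCrawler. */
class IndexNameSketch {

    /** Mimics the default behaviour: the configured index name is always used. */
    static class Default {
        private final String configuredIndexName;

        Default(String configuredIndexName) {
            this.configuredIndexName = configuredIndexName;
        }

        // ASSUMPTION: the metadata is modelled as a simple Map for this sketch.
        protected String getIndexName(Map<String, String> metadata) {
            return configuredIndexName;
        }
    }

    /** Overrides the default to derive the index name from a metadata value. */
    static class PerDomain extends Default {
        PerDomain(String configuredIndexName) {
            super(configuredIndexName);
        }

        @Override
        protected String getIndexName(Map<String, String> metadata) {
            String base = super.getIndexName(metadata);
            // "domain" is a made-up metadata key used only for illustration.
            String domain = metadata.get("domain");
            return domain == null ? base : base + "-" + domain;
        }
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("domain", "example.com");

        System.out.println(new Default("status").getIndexName(metadata));   // status
        System.out.println(new PerDomain("status").getIndexName(metadata)); // status-example.com
    }
}
```

Falling back to the configured name when the metadata value is absent keeps the subclass consistent with the default behaviour.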
-