java.lang.Object
org.apache.storm.topology.base.BaseComponent
org.apache.storm.topology.base.BaseRichBolt
com.digitalpebble.stormcrawler.persistence.AbstractStatusUpdaterBolt
com.digitalpebble.stormcrawler.opensearch.persistence.StatusUpdaterBolt
All Implemented Interfaces:
com.github.benmanes.caffeine.cache.RemovalListener<String,List<org.apache.storm.tuple.Tuple>>, Serializable, org.apache.storm.task.IBolt, org.apache.storm.topology.IComponent, org.apache.storm.topology.IRichBolt, org.opensearch.action.bulk.BulkProcessor.Listener

public class StatusUpdaterBolt extends AbstractStatusUpdaterBolt implements com.github.benmanes.caffeine.cache.RemovalListener<String,List<org.apache.storm.tuple.Tuple>>, org.opensearch.action.bulk.BulkProcessor.Listener
Simple bolt which stores the status of URLs in OpenSearch. It consumes the tuples arriving on the 'status' stream and is meant to be used together with a spout that reads from the same index.
  • Constructor Details

    • StatusUpdaterBolt

      public StatusUpdaterBolt()
    • StatusUpdaterBolt

      public StatusUpdaterBolt(String boltType)
      Loads the configuration using a substring other than the default value 'status', so that the settings for this bolt can be distinguished from the spout configuration.
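
      The effect of the boltType substring can be illustrated with a minimal, self-contained sketch. It is not the StormCrawler implementation: the class name BoltTypeConfigSketch, the helper indexNameFor, and the key pattern "opensearch.%s.index.name" are assumptions made for illustration, and the real parameter names may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT the StormCrawler code: shows how a bolt type
// substring could select one group of configuration keys over another.
public class BoltTypeConfigSketch {

    // Assumed key pattern; the actual parameter names may differ.
    public static final String INDEX_NAME_PATTERN = "opensearch.%s.index.name";

    /** Returns the index name configured for the given bolt type, or the fallback. */
    public static String indexNameFor(
            Map<String, Object> conf, String boltType, String fallback) {
        String key = String.format(INDEX_NAME_PATTERN, boltType);
        Object value = conf.get(key);
        return value != null ? value.toString() : fallback;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("opensearch.status.index.name", "status");
        conf.put("opensearch.archive.index.name", "status-archive");

        // The default substring 'status' and a custom one resolve different keys.
        System.out.println(indexNameFor(conf, "status", "fallback"));
        System.out.println(indexNameFor(conf, "archive", "fallback"));
    }
}
```

      Under these assumptions, a bolt created with a custom substring would read its own group of keys and leave the keys under 'status' to the spout.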
  • Method Details

    • prepare

      public void prepare(Map<String,Object> stormConf, org.apache.storm.task.TopologyContext context, org.apache.storm.task.OutputCollector collector)
      Specified by:
      prepare in interface org.apache.storm.task.IBolt
      Overrides:
      prepare in class AbstractStatusUpdaterBolt
    • cleanup

      public void cleanup()
      Specified by:
      cleanup in interface org.apache.storm.task.IBolt
      Overrides:
      cleanup in class org.apache.storm.topology.base.BaseRichBolt
    • store

      public void store(String url, Status status, Metadata metadata, Optional<Date> nextFetch, org.apache.storm.tuple.Tuple tuple) throws Exception
      Specified by:
      store in class AbstractStatusUpdaterBolt
      Throws:
      Exception
    • onRemoval

      public void onRemoval(@Nullable String key, @Nullable List<org.apache.storm.tuple.Tuple> value, @NotNull com.github.benmanes.caffeine.cache.RemovalCause cause)
      Specified by:
      onRemoval in interface com.github.benmanes.caffeine.cache.RemovalListener<String,List<org.apache.storm.tuple.Tuple>>
    • afterBulk

      public void afterBulk(long executionId, org.opensearch.action.bulk.BulkRequest request, org.opensearch.action.bulk.BulkResponse response)
      Specified by:
      afterBulk in interface org.opensearch.action.bulk.BulkProcessor.Listener
    • afterBulk

      public void afterBulk(long executionId, org.opensearch.action.bulk.BulkRequest request, Throwable throwable)
      Specified by:
      afterBulk in interface org.opensearch.action.bulk.BulkProcessor.Listener
    • beforeBulk

      public void beforeBulk(long executionId, org.opensearch.action.bulk.BulkRequest request)
      Specified by:
      beforeBulk in interface org.opensearch.action.bulk.BulkProcessor.Listener
    • getIndexName

      protected String getIndexName(Metadata m)
      Override this method to derive custom index names from metadata. By default, the index name taken from the configuration is used.
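
      The override pattern described for getIndexName might look like the sketch below. To keep the example runnable without Storm or StormCrawler on the classpath, Metadata and StatusUpdater are simplified stand-ins for the real classes, and PerCollectionStatusUpdater and the "collection" metadata key are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch using stand-in types, NOT the real StormCrawler classes.
public class IndexNameSketch {

    // Stand-in for Metadata; only a single-value lookup is modeled.
    public static class Metadata {
        private final Map<String, String> values = new HashMap<>();

        public void setValue(String key, String value) {
            values.put(key, value);
        }

        public String getFirstValue(String key) {
            return values.get(key);
        }
    }

    // Stand-in for the bolt: by default the configured index name is returned.
    public static class StatusUpdater {
        protected final String indexName;

        public StatusUpdater(String indexName) {
            this.indexName = indexName;
        }

        protected String getIndexName(Metadata m) {
            return indexName;
        }
    }

    // Hypothetical subclass: appends a "collection" metadata value if present.
    public static class PerCollectionStatusUpdater extends StatusUpdater {
        public PerCollectionStatusUpdater(String indexName) {
            super(indexName);
        }

        @Override
        protected String getIndexName(Metadata m) {
            String collection = m.getFirstValue("collection");
            return collection == null ? indexName : indexName + "-" + collection;
        }
    }

    public static void main(String[] args) {
        Metadata md = new Metadata();
        md.setValue("collection", "news");
        StatusUpdater bolt = new PerCollectionStatusUpdater("status");
        System.out.println(bolt.getIndexName(md));
    }
}
```

      Falling back to the configured name when the metadata key is absent keeps the default behaviour for URLs that carry no such value.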