java.lang.Object
org.apache.storm.topology.base.BaseComponent
org.apache.storm.topology.base.BaseRichBolt
com.digitalpebble.stormcrawler.opensearch.bolt.DeletionBolt
All Implemented Interfaces:
com.github.benmanes.caffeine.cache.RemovalListener<String,List<org.apache.storm.tuple.Tuple>>, Serializable, org.apache.storm.task.IBolt, org.apache.storm.topology.IComponent, org.apache.storm.topology.IRichBolt, org.opensearch.action.bulk.BulkProcessor.Listener

public class DeletionBolt extends org.apache.storm.topology.base.BaseRichBolt implements com.github.benmanes.caffeine.cache.RemovalListener<String,List<org.apache.storm.tuple.Tuple>>, org.opensearch.action.bulk.BulkProcessor.Listener
Deletes documents from OpenSearch. Connect this bolt to the StatusUpdaterBolt via the 'deletion' stream; it removes the documents that have a status of ERROR. Note that this component also attempts to delete documents that were never indexed, and it currently does not delete documents that were indexed under their canonical URL.
  • Constructor Summary

    Constructors
    Constructor
    Description
    DeletionBolt()
     
    DeletionBolt(String indexName)
    Sets the index name instead of taking it from the configuration.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    afterBulk(long executionId, org.opensearch.action.bulk.BulkRequest request, Throwable failure)
     
    void
    afterBulk(long executionId, org.opensearch.action.bulk.BulkRequest request, org.opensearch.action.bulk.BulkResponse response)
     
    void
    beforeBulk(long executionId, org.opensearch.action.bulk.BulkRequest request)
     
    void
    cleanup()
     
    void
    declareOutputFields(org.apache.storm.topology.OutputFieldsDeclarer arg0)
     
    void
    execute(org.apache.storm.tuple.Tuple tuple)
     
    protected String
    getIndexName(Metadata m)
    Override this method to derive custom index names from metadata. By default, the index name from the configuration is used.
    void
    onRemoval(@Nullable String key, @Nullable List<org.apache.storm.tuple.Tuple> value, @NotNull com.github.benmanes.caffeine.cache.RemovalCause cause)
     
    void
    prepare(Map<String,Object> conf, org.apache.storm.task.TopologyContext context, org.apache.storm.task.OutputCollector collector)
     

    Methods inherited from class org.apache.storm.topology.base.BaseComponent

    getComponentConfiguration

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.apache.storm.topology.IComponent

    getComponentConfiguration
  • Constructor Details

    • DeletionBolt

      public DeletionBolt()
    • DeletionBolt

      public DeletionBolt(String indexName)
      Sets the index name instead of taking it from the configuration.
  • Method Details

    • prepare

      public void prepare(Map<String,Object> conf, org.apache.storm.task.TopologyContext context, org.apache.storm.task.OutputCollector collector)
      Specified by:
      prepare in interface org.apache.storm.task.IBolt
    • onRemoval

      public void onRemoval(@Nullable String key, @Nullable List<org.apache.storm.tuple.Tuple> value, @NotNull com.github.benmanes.caffeine.cache.RemovalCause cause)
      Specified by:
      onRemoval in interface com.github.benmanes.caffeine.cache.RemovalListener<String,List<org.apache.storm.tuple.Tuple>>
    • cleanup

      public void cleanup()
      Specified by:
      cleanup in interface org.apache.storm.task.IBolt
      Overrides:
      cleanup in class org.apache.storm.topology.base.BaseRichBolt
    • execute

      public void execute(org.apache.storm.tuple.Tuple tuple)
      Specified by:
      execute in interface org.apache.storm.task.IBolt
    • declareOutputFields

      public void declareOutputFields(org.apache.storm.topology.OutputFieldsDeclarer arg0)
      Specified by:
      declareOutputFields in interface org.apache.storm.topology.IComponent
    • getIndexName

      protected String getIndexName(Metadata m)
      Override this method to derive custom index names from metadata. By default, the index name from the configuration is used.
    • beforeBulk

      public void beforeBulk(long executionId, org.opensearch.action.bulk.BulkRequest request)
      Specified by:
      beforeBulk in interface org.opensearch.action.bulk.BulkProcessor.Listener
    • afterBulk

      public void afterBulk(long executionId, org.opensearch.action.bulk.BulkRequest request, org.opensearch.action.bulk.BulkResponse response)
      Specified by:
      afterBulk in interface org.opensearch.action.bulk.BulkProcessor.Listener
    • afterBulk

      public void afterBulk(long executionId, org.opensearch.action.bulk.BulkRequest request, Throwable failure)
      Specified by:
      afterBulk in interface org.opensearch.action.bulk.BulkProcessor.Listener
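
Example: the getIndexName(Metadata m) hook described above can be overridden in a subclass to pick an index per document. The sketch below is only an illustration of that override pattern and does not use the real classes: Metadata and BoltSketch are simplified stand-ins written for this example, and the "lang" metadata key and the "content" index name are assumptions. A real subclass would extend DeletionBolt and receive the StormCrawler Metadata object instead.

```java
import java.util.HashMap;
import java.util.Map;

final class IndexNameSketch {

    /** Hypothetical stand-in for the crawler's Metadata: one value per key. */
    static class Metadata {
        private final Map<String, String> values = new HashMap<>();

        void setValue(String key, String value) {
            values.put(key, value);
        }

        String getFirstValue(String key) {
            return values.get(key);
        }
    }

    /** Stand-in for the bolt that keeps only the index-name logic. */
    static class BoltSketch {
        private final String indexName;

        BoltSketch(String indexName) {
            this.indexName = indexName;
        }

        /** Default behaviour: ignore the metadata and use the configured name. */
        protected String getIndexName(Metadata m) {
            return indexName;
        }
    }

    /** Subclass that appends a language suffix when the metadata carries one. */
    static class PerLanguageBoltSketch extends BoltSketch {
        PerLanguageBoltSketch(String defaultIndexName) {
            super(defaultIndexName);
        }

        @Override
        protected String getIndexName(Metadata m) {
            String base = super.getIndexName(m);
            String lang = m.getFirstValue("lang"); // assumed metadata key
            return lang == null ? base : base + "-" + lang;
        }
    }

    public static void main(String[] args) {
        BoltSketch bolt = new PerLanguageBoltSketch("content");

        Metadata french = new Metadata();
        french.setValue("lang", "fr");

        System.out.println(bolt.getIndexName(french));         // content-fr
        System.out.println(bolt.getIndexName(new Metadata())); // content
    }
}
```

Falling back to the default name when the key is absent keeps the override consistent with the documented default, where the index name comes from the configuration.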