Class RedirectionBolt
java.lang.Object
org.apache.storm.topology.base.BaseComponent
org.apache.storm.topology.base.BaseRichBolt
com.digitalpebble.stormcrawler.tika.RedirectionBolt
All Implemented Interfaces:
Serializable, org.apache.storm.task.IBolt, org.apache.storm.topology.IComponent, org.apache.storm.topology.IRichBolt
public class RedirectionBolt
extends org.apache.storm.topology.base.BaseRichBolt
Uses Tika only if a document has not been parsed with anything else. Emits the tuples to be
processed with Tika on a stream of the same name ('tika').
Remember to set
jsoup.treat.non.html.as.error: false
Use in your topologies as follows:
builder.setBolt("jsoup", new JSoupParserBolt()).localOrShuffleGrouping("sitemap");
builder.setBolt("shunt", new RedirectionBolt()).localOrShuffleGrouping("jsoup");
builder.setBolt("tika", new ParserBolt()).localOrShuffleGrouping("shunt", "tika");
builder.setBolt("indexer", new IndexingBolt(), numWorkers)
    .localOrShuffleGrouping("shunt").localOrShuffleGrouping("tika");
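The description above does not say how the bolt decides that a document "has not been parsed with anything else". As a rough illustration only, the sketch below assumes that a non-blank extracted text means an earlier parser (for example the JSoup bolt) already handled the document. The class `RedirectionSketch` and the method `chooseStream` are hypothetical names invented for this example; they are not part of StormCrawler, and the code deliberately avoids any Storm dependency so it can run on its own. The only names taken from elsewhere are the 'tika' stream from the description and "default", which is Storm's default stream ID.

```java
/** Plain-Java sketch of the routing rule; hypothetical, not the StormCrawler source. */
final class RedirectionSketch {

    // Storm's default stream ID; bolts subscribed without a stream name read from it.
    static final String DEFAULT_STREAM = "default";

    // Named stream for documents that still need Tika, as stated in the class description.
    static final String TIKA_STREAM = "tika";

    /**
     * ASSUMPTION: a document counts as "already parsed" when an upstream
     * parser produced non-blank text for it.
     */
    static String chooseStream(String extractedText) {
        boolean alreadyParsed =
                extractedText != null && !extractedText.trim().isEmpty();
        return alreadyParsed ? DEFAULT_STREAM : TIKA_STREAM;
    }

    public static void main(String[] args) {
        System.out.println(chooseStream("Hello from an HTML page")); // prints "default"
        System.out.println(chooseStream(""));                        // prints "tika"
        System.out.println(chooseStream(null));                      // prints "tika"
    }
}
```

Under this assumption, the topology wiring above follows naturally: the "tika" bolt subscribes with `localOrShuffleGrouping("shunt", "tika")` and so sees only documents sent to the 'tika' stream, while the "indexer" bolt subscribes with `localOrShuffleGrouping("shunt")` and so receives the already parsed documents from the default stream.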
Constructor Summary
Constructors
RedirectionBolt()
Method Summary
Modifier and Type    Method
void                 declareOutputFields(org.apache.storm.topology.OutputFieldsDeclarer declarer)
void                 execute(org.apache.storm.tuple.Tuple tuple)
void                 prepare(Map conf, org.apache.storm.task.TopologyContext context, org.apache.storm.task.OutputCollector collector)
Methods inherited from class org.apache.storm.topology.base.BaseRichBolt
cleanup
Methods inherited from class org.apache.storm.topology.base.BaseComponent
getComponentConfiguration
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.storm.topology.IComponent
getComponentConfiguration
Constructor Details
RedirectionBolt
public RedirectionBolt()
Method Details
prepare
public void prepare(Map conf, org.apache.storm.task.TopologyContext context, org.apache.storm.task.OutputCollector collector)
execute
public void execute(org.apache.storm.tuple.Tuple tuple)
declareOutputFields
public void declareOutputFields(org.apache.storm.topology.OutputFieldsDeclarer declarer)