Serialized Form

  • Package com.digitalpebble.stormcrawler.tika

    • Class com.digitalpebble.stormcrawler.tika.ParserBolt

      class ParserBolt extends org.apache.storm.topology.base.BaseRichBolt implements Serializable
      • Serialized Fields

        • collector
          org.apache.storm.task.OutputCollector collector
        • emitOutlinks
          boolean emitOutlinks
        • eventCounter
          org.apache.storm.metric.api.MultiCountMetric eventCounter
        • extractEmbedded
          boolean extractEmbedded
        • htmlMapperClass
          Class<? extends org.apache.tika.parser.html.HtmlMapper> htmlMapperClass
        • metadataTransfer
          MetadataTransfer metadataTransfer
        • mimeTypeWhiteList
          List<String> mimeTypeWhiteList
          Regular expressions to apply to the MIME type.
        • parseFilters
          ParseFilter parseFilters
        • protocolMDprefix
          String protocolMDprefix
        • tika
          org.apache.tika.Tika tika
        • upperCaseElementNames
          boolean upperCaseElementNames
        • urlFilters
          URLFilters urlFilters
    • Class com.digitalpebble.stormcrawler.tika.RedirectionBolt

      class RedirectionBolt extends org.apache.storm.topology.base.BaseRichBolt implements Serializable
      • Serialized Fields

        • collector
          org.apache.storm.task.OutputCollector collector
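
The mimeTypeWhiteList field of ParserBolt is described above only as a list of regular expressions applied to the MIME type. The following is a minimal sketch of how such a whitelist could be evaluated, not the ParserBolt implementation. The class MimeTypeWhiteListSketch and the method isAllowed are hypothetical names, and two behaviours are assumptions: an empty list accepts every MIME type, and a pattern counts as a match when it is found anywhere in the MIME type (Matcher.find()) rather than having to match it in full.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class MimeTypeWhiteListSketch {

    // Hypothetical helper, not part of StormCrawler.
    // Assumption: an empty whitelist means "accept every MIME type".
    // Assumption: a pattern matches if it is found anywhere in the MIME type.
    public static boolean isAllowed(List<String> mimeTypeWhiteList, String mimeType) {
        if (mimeTypeWhiteList.isEmpty()) {
            return true;
        }
        for (String regex : mimeTypeWhiteList) {
            if (Pattern.compile(regex).matcher(mimeType).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Example patterns chosen for illustration only.
        List<String> whiteList = Arrays.asList("application/.+word.*", "application/pdf");
        System.out.println(isAllowed(whiteList, "application/pdf"));
        System.out.println(isAllowed(whiteList, "application/msword"));
        System.out.println(isAllowed(whiteList, "text/html"));
    }
}
```

Compiling each pattern on every call keeps the sketch short; a long-running bolt would more plausibly compile the patterns once during setup and reuse them.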