Class MetadataIndexer

  • All Implemented Interfaces:
    ai.platon.pulsar.common.config.Configurable , ai.platon.pulsar.common.config.Parameterized , ai.platon.pulsar.skeleton.crawl.common.LazyConfigurable , ai.platon.pulsar.skeleton.crawl.index.IndexingFilter

    
    public final class MetadataIndexer
     implements IndexingFilter
                        

    Indexer that can be configured to extract metadata from the crawldb, parse metadata, or content metadata. Set the properties "index.db", "index.parse", or "index.content"; each value is a comma-delimited list of metadata keys, for example <value>key1,key2,key3</value>.
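
    As a rough illustration of the comma-delimited format, the sketch below splits such a property value into individual keys. It is not the class's actual implementation: the class name MetadataKeySketch and the helper parseKeys are hypothetical, and plain java.util.Properties stands in for ImmutableConfig, whose API is not shown on this page. Trimming whitespace and skipping empty entries are assumptions as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, for illustration only; not part of the pulsar API.
final class MetadataKeySketch {

    // Splits a comma-delimited property value into trimmed, non-empty keys.
    // A missing (null) property yields an empty list.
    static List<String> parseKeys(String raw) {
        List<String> keys = new ArrayList<>();
        if (raw == null) {
            return keys;
        }
        for (String part : raw.split(",")) {
            String key = part.trim();
            if (!key.isEmpty()) {
                keys.add(key);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // java.util.Properties is a stand-in for the real configuration object.
        Properties props = new Properties();
        props.setProperty("index.db", "key1,key2,key3");
        props.setProperty("index.parse", "author, keywords");
        // "index.content" is deliberately left unset.

        for (String name : new String[] {"index.db", "index.parse", "index.content"}) {
            System.out.println(name + " -> " + parseKeys(props.getProperty(name)));
        }
    }
}
```

    In this sketch an unset property produces an empty key list, so a metadata source with no configured keys would contribute nothing to the index document.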

    • Field Summary

      Fields 
      Modifier and Type Field Description
      private ImmutableConfig conf
    • Constructor Summary

      Constructors 
      Constructor Description
      MetadataIndexer(ImmutableConfig conf)
    • Method Summary

      Modifier and Type Method Description
      ImmutableConfig getConf()
      Unit setConf(ImmutableConfig conf)
      Unit configure(ImmutableConfig conf1)
      IndexDocument filter(IndexDocument doc, String url, WebPage page)
      • Methods inherited from class ai.platon.pulsar.common.config.Parameterized

        getParams
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • MetadataIndexer

        MetadataIndexer(ImmutableConfig conf)