Class GeneralIndexingFilter

  • All Implemented Interfaces:
    ai.platon.pulsar.common.config.Configurable, ai.platon.pulsar.common.config.Parameterized, ai.platon.pulsar.skeleton.crawl.common.LazyConfigurable, ai.platon.pulsar.skeleton.crawl.index.IndexingFilter

    
    public final class GeneralIndexingFilter
     implements IndexingFilter

    Adds basic searchable fields to a document.

    • Field Summary

      Fields 
      Modifier and Type Field Description
      private ImmutableConfig conf
    • Method Summary

      Modifier and Type Method Description
      ImmutableConfig getConf()
      Unit setConf(ImmutableConfig conf)
      Unit configure(ImmutableConfig conf1)
      IndexDocument filter(IndexDocument doc, String url, WebPage page)
      • Methods inherited from class ai.platon.pulsar.common.config.Parameterized

        getParams
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • GeneralIndexingFilter

        GeneralIndexingFilter(ImmutableConfig conf)
    • Method Detail

      • getConf

         ImmutableConfig getConf()
      • filter

         IndexDocument filter(IndexDocument doc, String url, WebPage page)
        Parameters:
        doc - The IndexDocument to add fields to
        url - The URL to be filtered for anchor text
        page - The WebPage object corresponding to the URL
        Returns:
        The filtered IndexDocument
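
The contract of filter (take a document, a URL, and a page; return the enriched document) can be illustrated with a minimal, self-contained sketch. The types Doc and Page below are hypothetical stand-ins, not the real IndexDocument and WebPage classes, and the field names "url", "title", and "content" are assumptions chosen for illustration; the fields this class actually adds are not listed on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: Doc and Page are hypothetical stand-ins,
// NOT the real ai.platon.pulsar IndexDocument and WebPage types.
class IndexingFilterSketch {

    /** Stand-in for IndexDocument: an ordered map of field name to value. */
    static final class Doc {
        final Map<String, String> fields = new LinkedHashMap<>();
    }

    /** Stand-in for WebPage: carries only the data this sketch needs. */
    static final class Page {
        final String title;
        final String text;

        Page(String title, String text) {
            this.title = title;
            this.text = text;
        }
    }

    /**
     * Mirrors the shape of filter(doc, url, page): adds a few basic
     * searchable fields (assumed names) and returns the same document.
     */
    static Doc filter(Doc doc, String url, Page page) {
        doc.fields.put("url", url);
        doc.fields.put("title", page.title);
        doc.fields.put("content", page.text);
        return doc;
    }

    public static void main(String[] args) {
        Doc doc = filter(new Doc(), "https://example.com/", new Page("Example", "Hello"));
        // Print the fields that the sketch filter added.
        System.out.println(doc.fields);
    }
}
```

Returning the document that was passed in lets several filters be chained, each adding its own fields; whether the real implementation returns the same instance is not stated here.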