Class FSDocumentSelector

java.lang.Object
org.apache.tika.batch.fs.FSDocumentSelector
All Implemented Interfaces:
org.apache.tika.extractor.DocumentSelector

public class FSDocumentSelector extends Object implements org.apache.tika.extractor.DocumentSelector
Selector that chooses files based on their file name and their size, as determined by TikaCoreProperties.RESOURCE_NAME_KEY and Metadata.CONTENT_LENGTH.

The checks run in a fixed order. The excludeFileName pattern is applied first, if it is not null. The includeFileName pattern is applied next, if it is not null. Finally, the size limit is applied if it is above 0.
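
The order described above can be sketched in plain Java. This is a minimal illustration, not the Tika source: the class name FileSelectorSketch and its select(String, long) signature are hypothetical, and it takes the file name and size directly instead of reading them from a Metadata object. It also assumes that patterns are tested with Matcher.find(), that a null file name skips both pattern checks, and that each size bound is inclusive and ignored unless it is above 0.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical stand-in that mirrors the documented order of checks:
 * exclude pattern, then include pattern, then size limits.
 * It is NOT the Tika implementation.
 */
final class FileSelectorSketch {

    private final Pattern includeFileName; // null means "no include filter"
    private final Pattern excludeFileName; // null means "no exclude filter"
    private final long minFileSizeBytes;   // assumption: ignored unless above 0
    private final long maxFileSizeBytes;   // assumption: ignored unless above 0

    FileSelectorSketch(Pattern includeFileName, Pattern excludeFileName,
                       long minFileSizeBytes, long maxFileSizeBytes) {
        this.includeFileName = includeFileName;
        this.excludeFileName = excludeFileName;
        this.minFileSizeBytes = minFileSizeBytes;
        this.maxFileSizeBytes = maxFileSizeBytes;
    }

    /** Returns true if a file with this name and size passes every active check. */
    boolean select(String fileName, long sizeBytes) {
        // 1. Exclude pattern: a match rejects the file outright.
        if (excludeFileName != null && fileName != null
                && excludeFileName.matcher(fileName).find()) {
            return false;
        }
        // 2. Include pattern: a file that does not match is rejected.
        if (includeFileName != null && fileName != null
                && !includeFileName.matcher(fileName).find()) {
            return false;
        }
        // 3. Size limits: each bound is checked only when it is above 0.
        if (minFileSizeBytes > 0 && sizeBytes < minFileSizeBytes) {
            return false;
        }
        if (maxFileSizeBytes > 0 && sizeBytes > maxFileSizeBytes) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Keep PDFs up to 1,000,000 bytes, skipping names that start with '~'.
        FileSelectorSketch selector = new FileSelectorSketch(
                Pattern.compile("\\.pdf$"), Pattern.compile("^~"), 0L, 1_000_000L);

        System.out.println(selector.select("report.pdf", 2_048L));   // true
        System.out.println(selector.select("~report.pdf", 2_048L));  // false: excluded
        System.out.println(selector.select("notes.txt", 2_048L));    // false: not included
        System.out.println(selector.select("big.pdf", 5_000_000L));  // false: too large
    }
}
```

Because the exclude check runs before the include check, a file name that matches both patterns is rejected.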

  • Constructor Details

    • FSDocumentSelector

      public FSDocumentSelector(Pattern includeFileName, Pattern excludeFileName, long minFileSizeBytes, long maxFileSizeBytes)
  • Method Details

    • select

      public boolean select(org.apache.tika.metadata.Metadata metadata)
      Specified by:
      select in interface org.apache.tika.extractor.DocumentSelector