Class FSDocumentSelector

  • All Implemented Interfaces:
    org.apache.tika.extractor.DocumentSelector

    public class FSDocumentSelector
    extends Object
    implements org.apache.tika.extractor.DocumentSelector
    Selector that chooses files based on their file name and their size, as determined by TikaCoreProperties.RESOURCE_NAME_KEY and Metadata.CONTENT_LENGTH.

    The excludeFileName pattern is applied first (if it is not null), then the includeFileName pattern (if it is not null). Finally, the size limits given by minFileSizeBytes and maxFileSizeBytes are applied, if they are above 0.
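
    The order of checks described above can be sketched with the standard library alone. This is not the Tika implementation: SelectorSketch, its meta helper, and the plain Map standing in for Metadata are hypothetical. The sketch also assumes that patterns are tested with Matcher.find(), that a missing file name or length skips the corresponding check, and that each size bound is enforced only when it is above 0.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in that mirrors the documented order of checks.
// It is NOT the Tika class and does not depend on Tika.
final class SelectorSketch {
    // Assumed stand-ins for TikaCoreProperties.RESOURCE_NAME_KEY and Metadata.CONTENT_LENGTH.
    static final String NAME_KEY = "resourceName";
    static final String LENGTH_KEY = "Content-Length";

    private final Pattern includeFileName; // may be null
    private final Pattern excludeFileName; // may be null
    private final long minFileSizeBytes;
    private final long maxFileSizeBytes;

    SelectorSketch(Pattern includeFileName, Pattern excludeFileName,
                   long minFileSizeBytes, long maxFileSizeBytes) {
        this.includeFileName = includeFileName;
        this.excludeFileName = excludeFileName;
        this.minFileSizeBytes = minFileSizeBytes;
        this.maxFileSizeBytes = maxFileSizeBytes;
    }

    boolean select(Map<String, String> metadata) {
        String name = metadata.get(NAME_KEY);

        // 1. Exclude pattern first, if one was supplied.
        if (excludeFileName != null && name != null
                && excludeFileName.matcher(name).find()) {
            return false;
        }
        // 2. Include pattern next, if one was supplied.
        if (includeFileName != null && name != null
                && !includeFileName.matcher(name).find()) {
            return false;
        }
        // 3. Size bounds last; a bound of 0 or less is treated as "no limit" (assumption).
        long size = parseLength(metadata.get(LENGTH_KEY));
        if (size >= 0) {
            if (minFileSizeBytes > 0 && size < minFileSizeBytes) {
                return false;
            }
            if (maxFileSizeBytes > 0 && size > maxFileSizeBytes) {
                return false;
            }
        }
        return true;
    }

    // Returns -1 when the length is missing or not a number.
    static long parseLength(String raw) {
        if (raw == null) {
            return -1L;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    // Convenience helper that builds a metadata map for one file.
    static Map<String, String> meta(String name, String length) {
        Map<String, String> m = new HashMap<>();
        m.put(NAME_KEY, name);
        m.put(LENGTH_KEY, length);
        return m;
    }

    public static void main(String[] args) {
        SelectorSketch s = new SelectorSketch(
                Pattern.compile("\\.pdf$"), Pattern.compile("^~"), 10L, 1000L);
        System.out.println(s.select(meta("report.pdf", "500")));
        System.out.println(s.select(meta("~report.pdf", "500")));
    }
}
```

    Because the exclude pattern is checked before the include pattern, a name that matches both is rejected in this sketch.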

    • Constructor Detail

      • FSDocumentSelector

        public FSDocumentSelector(Pattern includeFileName,
                                  Pattern excludeFileName,
                                  long minFileSizeBytes,
                                  long maxFileSizeBytes)
    • Method Detail

      • select

        public boolean select(org.apache.tika.metadata.Metadata metadata)
        Specified by:
        select in interface org.apache.tika.extractor.DocumentSelector