Class SimpleFileScanner

  • All Implemented Interfaces:
    java.lang.Iterable<Document>, java.lang.Runnable, java.util.Collection<Document>, java.util.concurrent.BlockingQueue<Document>, java.util.Queue<Document>, Active, Configurable, DeferredBuilding, Scanner, Step, FileScanner

    public class SimpleFileScanner
    extends ScannerImpl
    implements FileScanner
    Scanner for local filesystems. This scanner periodically does a full walk of the filesystem. No persistent record of files detected during walking is kept, and all files will be visited on each scan, so it is highly recommended to use this with the remembering option turned on unless a regular full re-index is desired. If walking the filesystem takes longer than the scan interval, the time to walk will determine the index latency instead. This scanner will not start a new scan until the current one completes. Files to be processed must fit in JVM memory.
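The stateless full walk described above can be illustrated with plain JDK classes. This is only a minimal sketch, not the scanner's actual implementation: the class `FullWalkSketch` and the method `walkOnce` are hypothetical names, and reading each file whole with `Files.readAllBytes` is an assumption made to show why file size is bounded by available JVM memory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Stream;

/** Hypothetical sketch: a full walk that keeps no record between calls. */
public class FullWalkSketch {

    /**
     * Walks the tree under root and loads every regular file fully into memory.
     * Nothing is remembered from earlier walks, so each call revisits every file.
     */
    public static Map<Path, byte[]> walkOnce(Path root) throws IOException {
        Map<Path, byte[]> found = new LinkedHashMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(Files::isRegularFile).forEach(p -> {
                try {
                    // Assumption: the whole file is held in memory at once.
                    found.put(p, Files.readAllBytes(p));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("scan-sketch");
        Files.write(root.resolve("a.txt"), "hello".getBytes());
        Files.createDirectories(root.resolve("sub"));
        Files.write(root.resolve("sub").resolve("b.txt"), "world".getBytes());

        // Both walks report the same two files because no state is kept.
        System.out.println(walkOnce(root).size());
        System.out.println(walkOnce(root).size());
    }
}
```

Without some form of remembering layered on top, a caller of a walk like this would hand every file downstream again on each pass, which is the full re-index behaviour the description warns about.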
    • Constructor Detail

      • SimpleFileScanner

        protected SimpleFileScanner()
    • Method Detail

      • getScanOperation

        public ScannerImpl.ScanOp getScanOperation()
        Description copied from class: ScannerImpl
        The default scan operation checks the Cassandra database for records marked dirty or restart, and processes those records using the scanner's document fetching logic (empty by default).
        Specified by:
        getScanOperation in interface Scanner
        Specified by:
        getScanOperation in class ScannerImpl
        Returns:
        a Runnable object that locates documents.
      • isScanning

        public boolean isScanning()
        Description copied from interface: Scanner
        True if a new scan may be started. Implementations may choose not to start a new scan until the old one has completed. This value is independent of Active.isActive().
        Specified by:
        isScanning in interface Scanner
        Returns:
        true if a new scan should be started
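One way to avoid overlapping scans is a compare-and-set guard around the scan operation. The sketch below is illustrative only: `ScanGuardSketch`, `tryScan` and `completedScans` are hypothetical names, and it assumes the flag means "a scan is currently in progress", which may not match how this class interprets the value.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: refuse to start a scan while another is running. */
public class ScanGuardSketch {
    private final AtomicBoolean scanning = new AtomicBoolean(false);
    private final AtomicInteger completed = new AtomicInteger();

    /** Assumption: true means a scan is in progress right now. */
    public boolean isScanning() {
        return scanning.get();
    }

    public int completedScans() {
        return completed.get();
    }

    /**
     * Runs scanOp unless a scan is already in progress.
     * Returns true if the operation was run, false if it was skipped.
     */
    public boolean tryScan(Runnable scanOp) {
        // Atomically claim the flag; fails if another scan holds it.
        if (!scanning.compareAndSet(false, true)) {
            return false;
        }
        try {
            scanOp.run();
            completed.incrementAndGet();
            return true;
        } finally {
            // Always release the flag, even if the scan throws.
            scanning.set(false);
        }
    }

    public static void main(String[] args) {
        ScanGuardSketch guard = new ScanGuardSketch();
        boolean[] nested = {true};
        // The inner attempt happens while the outer scan still holds the flag.
        boolean ran = guard.tryScan(() -> nested[0] = guard.tryScan(() -> {}));
        System.out.println(ran);                // true
        System.out.println(nested[0]);          // false
        System.out.println(guard.isScanning()); // false
    }
}
```

With a guard of this kind, a scheduler firing at a fixed interval simply skips ticks that arrive during a long walk, so the walk duration rather than the interval bounds how stale the index can be.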
      • fetchById

        public java.util.Optional<Document> fetchById(java.lang.String id,
                                                      java.lang.String origination)
        Description copied from interface: Scanner
        Load a document based on the document's id.
        Specified by:
        fetchById in interface Scanner
        Parameters:
        id - the id of the document, see also Document.getId()
        origination - A constant indicating the source (scanner or fti) for debugging
        Returns:
        An optional that contains the document if it is possible to retrieve the document by ID
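The Optional-returning contract can be sketched as follows. This is a hedged illustration, not this class's code: `FetchByIdSketch` is a hypothetical stand-in, the document is represented as a plain `String`, and treating the id as a local file path is an assumption made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical sketch: load content by id, or return an empty Optional. */
public class FetchByIdSketch {

    /**
     * Assumption: the id is a path to a local file.
     * The origination value is used only to label log output.
     */
    public static Optional<String> fetchById(String id, String origination) {
        Path path = Paths.get(id);
        if (!Files.isRegularFile(path)) {
            return Optional.empty();
        }
        try {
            byte[] bytes = Files.readAllBytes(path);
            return Optional.of(new String(bytes, StandardCharsets.UTF_8));
        } catch (IOException e) {
            System.err.println("Fetch failed for " + id + " (origination: " + origination + ")");
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fetch-sketch", ".txt");
        Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));

        // A readable file yields a populated Optional.
        System.out.println(fetchById(file.toString(), "demo").orElse("absent"));
        // A missing file yields an empty Optional rather than an exception.
        System.out.println(fetchById(file + ".missing", "demo").orElse("absent"));
    }
}
```

Callers can then branch with `isPresent()` or `orElse(...)` instead of catching exceptions for the ordinary case of a document that can no longer be retrieved.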
      • setScanning

        protected void setScanning(boolean scanning)