Class StepImpl

  • All Implemented Interfaces:
    java.lang.Iterable<Document>, java.lang.Runnable, java.util.Collection<Document>, java.util.concurrent.BlockingQueue<Document>, java.util.Queue<Document>, Active, Configurable, DeferredBuilding, Step
    Direct Known Subclasses:
    ScannerImpl

    public class StepImpl
    extends java.lang.Object
    implements Step
    The class that is used to run DocumentProcessors. It handles each document, ensuring that it is properly received and passed on. This class is not normally overridden; to implement custom processing logic, write a class that implements DocumentProcessor and then build a StepImpl that uses an instance of your processor. Also note that one does not normally call build on a StepImpl or any of its subclasses. Instead, the builder for this class is provided to a PlanImpl.Builder so that the plan can validate the ordering of the steps and assemble the entire plan as an immutable DAG.

    IMPORTANT: no field in this class or its subclasses should be mutated after the builder has been built unless it is sufficiently synchronized. Once built, this class and all subclasses should be thread safe.
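
    The immutability rule above can be illustrated with a minimal sketch. SketchStep and its Builder are hypothetical stand-ins, not the real StepImpl or StepImpl.Builder, whose builder methods are not listed on this page. The sketch only shows the general pattern: configuration is fixed at build time in final fields, and the one piece of state that changes afterwards is held in an atomic.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in; NOT the real StepImpl or its builder.
final class SketchStep {
    private final String name;    // fixed at build time, never reassigned
    private final int batchSize;  // fixed at build time, never reassigned
    // The only state that changes after build(); an atomic keeps it thread safe.
    private final AtomicBoolean active = new AtomicBoolean(false);

    private SketchStep(String name, int batchSize) {
        this.name = name;
        this.batchSize = batchSize;
    }

    String getName() { return name; }
    int getBatchSize() { return batchSize; }
    boolean isActive() { return active.get(); }
    void activate() { active.set(true); }
    void deactivate() { active.set(false); }

    // All configuration happens on the builder, before build() is called.
    static final class Builder {
        private String name = "unnamed";
        private int batchSize = 1;

        Builder named(String name) { this.name = name; return this; }
        Builder batchSize(int size) { this.batchSize = size; return this; }
        SketchStep build() { return new SketchStep(name, batchSize); }
    }
}
```

    The sketch calls build() directly only to stay self-contained; as described above, a real step builder is normally handed to the plan builder, which performs the build.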
    • Method Detail

      • getPatternForStep

        public static java.util.regex.Pattern getPatternForStep​(java.lang.String name)
      • spliterator

        public java.util.Spliterator<Document> spliterator()
        Specified by:
        spliterator in interface java.util.Collection<Document>
        Specified by:
        spliterator in interface java.lang.Iterable<Document>
      • isEmpty

        public boolean isEmpty()
        Specified by:
        isEmpty in interface java.util.Collection<Document>
      • element

        public Document element()
        Specified by:
        element in interface java.util.Queue<Document>
      • poll

        public Document poll​(long timeout,
                             java.util.concurrent.TimeUnit unit)
                      throws java.lang.InterruptedException
        Specified by:
        poll in interface java.util.concurrent.BlockingQueue<Document>
        Throws:
        java.lang.InterruptedException
      • parallelStream

        public java.util.stream.Stream<Document> parallelStream()
        Specified by:
        parallelStream in interface java.util.Collection<Document>
      • take

        public Document take()
                      throws java.lang.InterruptedException
        Specified by:
        take in interface java.util.concurrent.BlockingQueue<Document>
        Throws:
        java.lang.InterruptedException
      • clear

        public void clear()
        Specified by:
        clear in interface java.util.Collection<Document>
      • iterator

        public java.util.Iterator<Document> iterator()
        Specified by:
        iterator in interface java.util.Collection<Document>
        Specified by:
        iterator in interface java.lang.Iterable<Document>
      • containsAll

        public boolean containsAll​(java.util.Collection<?> c)
        Specified by:
        containsAll in interface java.util.Collection<Document>
      • toArray

        public <T> T[] toArray​(T[] a)
        Specified by:
        toArray in interface java.util.Collection<Document>
      • addAll

        public boolean addAll​(java.util.Collection<? extends Document> c)
        Specified by:
        addAll in interface java.util.Collection<Document>
      • remainingCapacity

        public int remainingCapacity()
        Specified by:
        remainingCapacity in interface java.util.concurrent.BlockingQueue<Document>
      • stream

        public java.util.stream.Stream<Document> stream()
        Specified by:
        stream in interface java.util.Collection<Document>
      • offer

        public boolean offer​(Document document,
                             long timeout,
                             java.util.concurrent.TimeUnit unit)
                      throws java.lang.InterruptedException
        Specified by:
        offer in interface java.util.concurrent.BlockingQueue<Document>
        Throws:
        java.lang.InterruptedException
      • offer

        public boolean offer​(Document document)
        Specified by:
        offer in interface java.util.concurrent.BlockingQueue<Document>
        Specified by:
        offer in interface java.util.Queue<Document>
      • poll

        public Document poll()
        Specified by:
        poll in interface java.util.Queue<Document>
      • drainTo

        public int drainTo​(java.util.Collection<? super Document> c,
                           int maxElements)
        Specified by:
        drainTo in interface java.util.concurrent.BlockingQueue<Document>
      • retainAll

        public boolean retainAll​(java.util.Collection<?> c)
        Specified by:
        retainAll in interface java.util.Collection<Document>
      • put

        public void put​(Document document)
                 throws java.lang.InterruptedException
        Attempt to send the document to this step, blocking if the queue for this step is full. This method does NOT guarantee delivery, however, and will return immediately if the destination step is shutting down.
        Specified by:
        put in interface java.util.concurrent.BlockingQueue<Document>
        Parameters:
        document - the element to add
        Throws:
        java.lang.InterruptedException - if interrupted while waiting
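
        One way to get the "block while full, but give up on shutdown" behavior described for put is sketched below. SketchInbox is a hypothetical stand-in and is not how StepImpl is necessarily implemented. The boolean return value is an addition for illustration only, since the documented put returns void.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in illustrating a put that does not guarantee delivery.
final class SketchInbox<T> {
    private final BlockingQueue<T> queue;
    private final AtomicBoolean active = new AtomicBoolean(true);

    SketchInbox(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    void shutDown() { active.set(false); }
    int size() { return queue.size(); }

    // Returns true if the item was queued, false if we gave up due to shutdown.
    boolean put(T item) throws InterruptedException {
        while (active.get()) {
            // Wait in short slices so that a shutdown is noticed promptly.
            if (queue.offer(item, 50, TimeUnit.MILLISECONDS)) {
                return true;
            }
        }
        return false; // shutting down: the item was NOT delivered
    }
}
```

        A caller that needs delivery guarantees therefore cannot rely on put returning normally; it has to track the document's fate some other way.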
      • peek

        public Document peek()
        Specified by:
        peek in interface java.util.Queue<Document>
      • size

        public int size()
        Specified by:
        size in interface java.util.Collection<Document>
      • contains

        public boolean contains​(java.lang.Object o)
        Specified by:
        contains in interface java.util.concurrent.BlockingQueue<Document>
        Specified by:
        contains in interface java.util.Collection<Document>
      • remove

        public boolean remove​(java.lang.Object o)
        Specified by:
        remove in interface java.util.concurrent.BlockingQueue<Document>
        Specified by:
        remove in interface java.util.Collection<Document>
      • removeAll

        public boolean removeAll​(java.util.Collection<?> c)
        Specified by:
        removeAll in interface java.util.Collection<Document>
      • add

        public boolean add​(Document document)
        Specified by:
        add in interface java.util.concurrent.BlockingQueue<Document>
        Specified by:
        add in interface java.util.Collection<Document>
        Specified by:
        add in interface java.util.Queue<Document>
      • forEach

        public void forEach​(java.util.function.Consumer<? super Document> action)
        Specified by:
        forEach in interface java.lang.Iterable<Document>
      • remove

        public Document remove()
        Specified by:
        remove in interface java.util.Queue<Document>
      • toArray

        public java.lang.Object[] toArray()
        Specified by:
        toArray in interface java.util.Collection<Document>
      • removeIf

        public boolean removeIf​(java.util.function.Predicate<? super Document> filter)
        Specified by:
        removeIf in interface java.util.Collection<Document>
      • drainTo

        public int drainTo​(java.util.Collection<? super Document> c)
        Specified by:
        drainTo in interface java.util.concurrent.BlockingQueue<Document>
      • getBatchSize

        public int getBatchSize()
        Description copied from interface: Step
        Get the number of items to process concurrently.
        Specified by:
        getBatchSize in interface Step
        Returns:
        the batch size.
      • getNextSteps

        public NextSteps getNextSteps​(Document doc)
        Description copied from interface: Step
        Get the next step in the plan for the given document
        Specified by:
        getNextSteps in interface Step
        Parameters:
        doc - the document for which a next step should be determined.
        Returns:
        the next steps for the document
      • getPlan

        public Plan getPlan()
        Description copied from interface: Step
        Get the plan instance to which this step belongs.
        Specified by:
        getPlan in interface Step
        Returns:
        the plan (not a man, not a canal, not Panama)
      • activate

        public void activate()
        Description copied from interface: Active
        Begin processing. This is the on switch.
        Specified by:
        activate in interface Active
      • deactivate

        public void deactivate()
        Description copied from interface: Active
        Stop processing. This is the stop switch.
        Specified by:
        deactivate in interface Active
      • isActive

        public boolean isActive()
        Test if the step is active and should be processing. It is a good idea for operations running in the worker thread to check this method in loops and before operations that could block or take a long time. Doing so promotes timely shutdown.
        Specified by:
        isActive in interface Active
        Returns:
        true if processing should continue, false if the worker thread is trying to stop.
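
        The advice to check isActive in loops can be sketched as follows. SketchWorker is a hypothetical stand-in with its own flag, not the real step class. The point is only that the flag is re-read before each unit of work, so a shutdown request takes effect without waiting for the whole batch.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a worker that cooperates with shutdown.
final class SketchWorker {
    private final AtomicBoolean active = new AtomicBoolean(true);

    boolean isActive() { return active.get(); }
    void deactivate() { active.set(false); }

    // Runs tasks in order, re-checking the flag before each one.
    // Returns the number of tasks that actually ran.
    int runWhileActive(List<Runnable> tasks) {
        int ran = 0;
        for (Runnable task : tasks) {
            if (!isActive()) {
                break; // stop promptly instead of starting more work
            }
            task.run();
            ran++;
        }
        return ran;
    }
}
```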
      • sendToNext

        public void sendToNext​(Document doc)
        Description copied from interface: Step
        After processing is complete, send the document on to any subsequent steps if appropriate. This method may inspect the document status. If the document is not dropped, errored, etc., and there are multiple possible destination steps, it should invoke the router to determine the appropriate destinations and submit the results to the indicated steps.
        Specified by:
        sendToNext in interface Step
        Parameters:
        doc - The document for which processing is complete.
      • getOutputDestinationNames

        public java.util.Set<java.lang.String> getOutputDestinationNames()
        Specified by:
        getOutputDestinationNames in interface Step
      • getDownstreamOutputSteps

        public java.util.Set<Step> getDownstreamOutputSteps()
        Description copied from interface: Step
        Identify the downstream steps that must only be executed once per document.
        Specified by:
        getDownstreamOutputSteps in interface Step
        Returns:
        The steps downstream from this one that are neither safe nor idempotent.
      • isOutputStep

        public boolean isOutputStep()
        Specified by:
        isOutputStep in interface Step
      • getNextSteps

        public java.util.LinkedHashMap<java.lang.String,​Step> getNextSteps()
        Description copied from interface: Step
        The steps that are reachable from this step.
        Specified by:
        getNextSteps in interface Step
        Returns:
        A map of steps keyed by their names.
      • getEligibleNextSteps

        public java.util.LinkedHashMap<java.lang.String,​Step> getEligibleNextSteps​(Document d)
        Description copied from interface: Step
        The steps that are reachable from this step and lead to at least one destination valid for the document.
        Specified by:
        getEligibleNextSteps in interface Step
        Returns:
        A map of steps keyed by their names.
      • isActivePriorSteps

        public boolean isActivePriorSteps()
        Description copied from interface: Step
        Determine if any upstream steps are still active. A true result implies that documents may yet be received for processing, and it is not safe to shut down the processing thread for this step.
        Specified by:
        isActivePriorSteps in interface Step
        Returns:
        true if any immediately prior steps are still active
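
        The check described here amounts to an "any match" over the immediate predecessors. The sketch below uses a hypothetical SketchActive interface rather than the real Step or Active types, and it assumes that a step with no predecessors reports false.

```java
import java.util.List;

// Hypothetical stand-in; the real check lives on the step itself.
final class SketchUpstreamCheck {
    interface SketchActive {
        boolean isActive();
    }

    // True if at least one immediately prior step is still active,
    // meaning more documents may yet arrive.
    static boolean anyPriorActive(List<? extends SketchActive> priorSteps) {
        return priorSteps.stream().anyMatch(SketchActive::isActive);
    }
}
```

        Only immediate predecessors are consulted in this sketch; whether an upstream step further back is still active shows up indirectly, because it keeps its own successors from shutting down.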
      • getPriorSteps

        public java.util.List<Step> getPriorSteps()
        Specified by:
        getPriorSteps in interface Step
      • run

        public void run()
        Specified by:
        run in interface java.lang.Runnable
      • getName

        public java.lang.String getName()
        Description copied from interface: Configurable
        A name for this object to distinguish it from other objects. This value is generally supplied by the plan author. Every object in a plan must have a unique name that begins with a letter and contains only letters, digits, underscores and periods.
        Specified by:
        getName in interface Configurable
        Returns:
        The user supplied name for this step
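
        The character rules for names can be expressed as a regular expression. The sketch below is not the library's own validation: SketchNameCheck is a hypothetical helper, it assumes "letters" means ASCII letters (the description does not say), and it does not check uniqueness within a plan.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the library's actual validation may differ.
final class SketchNameCheck {
    // A letter first, then any mix of letters, digits, underscores and periods.
    // Assumption: ASCII letters only.
    private static final Pattern VALID = Pattern.compile("[A-Za-z][A-Za-z0-9_.]*");

    static boolean isValidName(String name) {
        return name != null && VALID.matcher(name).matches();
    }
}
```

        Under these assumptions, a name such as fetch_step.1 is accepted, while 1step (starts with a digit) and my-step (contains a hyphen) are rejected.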
      • getLogger

        protected org.apache.logging.log4j.Logger getLogger()
      • reportException

        protected void reportException​(java.util.Map.Entry<Step,​NextSteps.StepStatusHolder> entry,
                                       java.lang.String message,
                                       java.lang.Object... params)
      • addDeferred

        public void addDeferred​(java.lang.Runnable builderAction)
        Specified by:
        addDeferred in interface DeferredBuilding
      • addPredecessor

        public void addPredecessor​(StepImpl obj)
        Description copied from interface: Step
        Register a step as a predecessor of this step (one that might send documents to this step).
        Specified by:
        addPredecessor in interface Step
        Parameters:
        obj - The step to register as a potential upstream source of documents.
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object