All Classes
Class    Description
AbstractConsumersBuilder
AbstractFSConsumer
AutoDetectParserFactory    Simple class for AutoDetectParser
BasicTikaFSConsumer    Basic FileResourceConsumer that reads files from an input directory and writes content to the output directory.
BasicTikaFSConsumersBuilder
BatchNoRestartError    FileResourceConsumers should throw this if something catastrophic has happened and the BatchProcess should shutdown and not be restarted.
BatchProcess    This is the main processor class for a single process.
BatchProcess.BATCH_CONSTANTS
BatchProcessBuilder    Builds a BatchProcessor from a combination of runtime arguments and the config file.
BatchProcessDriverCLI
ClassLoaderUtil
CommandLineParserBuilder    Reads configurable options from a config file and returns org.apache.commons.cli.Options object to be used in commandline parser.
ConsumersManager    Simple interface around a collection of consumers that allows for initializing and shutting shared resources (e.g.
DefaultContentHandlerFactoryBuilder    Builds BasicContentHandler with type defined by attribute "basicHandlerType" with possible values: xml, html, text, body, ignore.
DurationFormatUtils    Functionality and naming conventions (roughly) copied from org.apache.commons.lang3 so that we didn't have to add another dependency.
FileResource    This is a basic interface to handle a logical "file".
FileResourceConsumer    This is a base class for file consumers.
FileResourceCrawler
FSBatchProcessCLI
FSConsumersManager
FSCrawlerBuilder    Builds either an FSDirectoryCrawler or an FSListCrawler.
FSDirectoryCrawler
FSDirectoryCrawler.CRAWL_ORDER
FSDocumentSelector    Selector that chooses files based on their file name and their size, as determined by TikaCoreProperties.RESOURCE_NAME_KEY and Metadata.CONTENT_LENGTH.
FSFileResource    FileSystem(FS)Resource wraps a file name.
FSListCrawler    Class that "crawls" a list of files.
FSOutputStreamFactory
FSOutputStreamFactory.COMPRESSION
FSProperties
FSUtil    Utility class to handle some common issues when reading from and writing to a file system (FS).
FSUtil.HANDLE_EXISTING
IContentHandlerFactoryBuilder
ICrawlerBuilder
IFileProcessorFutureResult    stub interface to allow for different result types from different processors
Interrupter    Class that waits for input on System.in.
InterrupterBuilder    Builds an Interrupter
InterrupterFutureResult
IParserFactoryBuilder
ObjectFromDOMAndQueueBuilder<T>    Same as ObjectFromDOMAndQueueBuilder, but this is for objects that require access to the shared queue.
ObjectFromDOMBuilder<T>    Interface for things that build objects from a DOM Node and a map of runtime attributes
OutputStreamFactory
ParallelFileProcessingResult
ParserFactory
ParserFactoryBuilder
PropsUtil    Utility class to handle properties.
RecursiveParserWrapperFSConsumer    This runs a RecursiveParserWrapper against an input file and outputs the json metadata to an output file.
ReporterBuilder    Interface for reporter builders
SimpleLogReporterBuilder
StatusReporter    Basic class to use for reporting status from both the crawler and the consumers.
StatusReporterBuilder
StatusReporterFutureResult    Empty class for what a StatusReporter returns when it finishes.
StrawManTikaAppDriver    Simple single-threaded class that calls tika-app against every file in a directory.
StreamOutRPWFSConsumer    This uses the JsonStreamingSerializer to write out a single metadata object at a time.
XMLDOMUtil