Index
All Classes and Interfaces|All Packages|Constant Field Values|Serialized Form
A
- AbstractConsumersBuilder - Class in org.apache.tika.batch.builders
- AbstractConsumersBuilder() - Constructor for class org.apache.tika.batch.builders.AbstractConsumersBuilder
- AbstractFSConsumer - Class in org.apache.tika.batch.fs
- AbstractFSConsumer(ArrayBlockingQueue<FileResource>) - Constructor for class org.apache.tika.batch.fs.AbstractFSConsumer
- ADDED - Static variable in class org.apache.tika.batch.FileResourceCrawler
- AutoDetectParserFactory - Class in org.apache.tika.batch
-
Simple class for AutoDetectParser
- AutoDetectParserFactory() - Constructor for class org.apache.tika.batch.AutoDetectParserFactory
B
- BasicTikaFSConsumer - Class in org.apache.tika.batch.fs
-
Basic FileResourceConsumer that reads files from an input directory and writes content to the output directory.
- BasicTikaFSConsumer(ArrayBlockingQueue<FileResource>, ParserFactory, ContentHandlerFactory, OutputStreamFactory, TikaConfig) - Constructor for class org.apache.tika.batch.fs.BasicTikaFSConsumer
-
Deprecated.
- BasicTikaFSConsumer(ArrayBlockingQueue<FileResource>, Parser, ContentHandlerFactory, OutputStreamFactory) - Constructor for class org.apache.tika.batch.fs.BasicTikaFSConsumer
- BasicTikaFSConsumersBuilder - Class in org.apache.tika.batch.fs.builders
- BasicTikaFSConsumersBuilder() - Constructor for class org.apache.tika.batch.fs.builders.BasicTikaFSConsumersBuilder
- BATCH_PROCESS_EXCEEDED_MAX_ALIVE_TIME - Enum constant in enum org.apache.tika.batch.BatchProcess.BATCH_CONSTANTS
- BATCH_PROCESS_FATAL_MUST_RESTART - Enum constant in enum org.apache.tika.batch.BatchProcess.BATCH_CONSTANTS
- BatchNoRestartError - Error in org.apache.tika.batch
-
FileResourceConsumers should throw this if something catastrophic has happened and the BatchProcess should shut down and not be restarted.
- BatchNoRestartError(String) - Constructor for error org.apache.tika.batch.BatchNoRestartError
- BatchNoRestartError(String, Throwable) - Constructor for error org.apache.tika.batch.BatchNoRestartError
- BatchNoRestartError(Throwable) - Constructor for error org.apache.tika.batch.BatchNoRestartError
- BatchProcess - Class in org.apache.tika.batch
-
This is the main processor class for a single process.
- BatchProcess(FileResourceCrawler, ConsumersManager, StatusReporter, Interrupter) - Constructor for class org.apache.tika.batch.BatchProcess
- BatchProcess.BATCH_CONSTANTS - Enum in org.apache.tika.batch
- BatchProcessBuilder - Class in org.apache.tika.batch.builders
-
Builds a BatchProcessor from a combination of runtime arguments and the config file.
- BatchProcessBuilder() - Constructor for class org.apache.tika.batch.builders.BatchProcessBuilder
- BatchProcessDriverCLI - Class in org.apache.tika.batch
- BatchProcessDriverCLI(String[]) - Constructor for class org.apache.tika.batch.BatchProcessDriverCLI
- build(InputStream) - Method in class org.apache.tika.batch.builders.CommandLineParserBuilder
- build(InputStream, Map<String, String>) - Method in class org.apache.tika.batch.builders.BatchProcessBuilder
-
Builds a BatchProcess from runtime arguments and an input stream of a configuration file.
- build(FileResourceCrawler, ConsumersManager, Node, Map<String, String>) - Method in class org.apache.tika.batch.builders.SimpleLogReporterBuilder
- build(FileResourceCrawler, ConsumersManager, Node, Map<String, String>) - Method in interface org.apache.tika.batch.builders.StatusReporterBuilder
- build(Node, long, Map<String, String>) - Method in class org.apache.tika.batch.builders.InterrupterBuilder
- build(Node, Map<String, String>) - Method in class org.apache.tika.batch.builders.BatchProcessBuilder
-
Builds a FileResourceBatchProcessor from runtime arguments and a document node of a configuration file.
- build(Node, Map<String, String>) - Method in class org.apache.tika.batch.builders.DefaultContentHandlerFactoryBuilder
- build(Node, Map<String, String>) - Method in interface org.apache.tika.batch.builders.IContentHandlerFactoryBuilder
- build(Node, Map<String, String>) - Method in interface org.apache.tika.batch.builders.IParserFactoryBuilder
- build(Node, Map<String, String>) - Method in interface org.apache.tika.batch.builders.ObjectFromDOMBuilder
- build(Node, Map<String, String>) - Method in class org.apache.tika.batch.builders.ParserFactoryBuilder
- build(Node, Map<String, String>) - Method in interface org.apache.tika.batch.builders.ReporterBuilder
- build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in class org.apache.tika.batch.builders.AbstractConsumersBuilder
- build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in interface org.apache.tika.batch.builders.ICrawlerBuilder
- build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in interface org.apache.tika.batch.builders.ObjectFromDOMAndQueueBuilder
- build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in class org.apache.tika.batch.fs.builders.BasicTikaFSConsumersBuilder
- build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in class org.apache.tika.batch.fs.builders.FSCrawlerBuilder
- buildClass(Class<T>, String) - Static method in class org.apache.tika.util.ClassLoaderUtil
- BZIP2 - Enum constant in enum org.apache.tika.batch.fs.FSOutputStreamFactory.COMPRESSION
C
- call() - Method in class org.apache.tika.batch.BatchProcess
-
Runs main execution loop.
- call() - Method in class org.apache.tika.batch.FileResourceConsumer
- call() - Method in class org.apache.tika.batch.FileResourceCrawler
- call() - Method in class org.apache.tika.batch.fs.strawman.StrawManTikaAppDriver
- call() - Method in class org.apache.tika.batch.Interrupter
- call() - Method in class org.apache.tika.batch.StatusReporter
-
Start up the reporter.
- checkForTimedOutMillis(long) - Method in class org.apache.tika.batch.FileResourceConsumer
-
Checks to see if the currentFile being processed (if there is one) should be timed out (still being worked on after staleThresholdMillis).
- checkThisIsAncestorOfOrSameAsThat(File, File) - Static method in class org.apache.tika.batch.fs.FSUtil
-
Deprecated.
- checkThisIsAncestorOfThat(File, File) - Static method in class org.apache.tika.batch.fs.FSUtil
-
Deprecated.
- ClassLoaderUtil - Class in org.apache.tika.util
- ClassLoaderUtil() - Constructor for class org.apache.tika.util.ClassLoaderUtil
- close(Closeable) - Method in class org.apache.tika.batch.FileResourceConsumer
- CommandLineParserBuilder - Class in org.apache.tika.batch.builders
-
Reads configurable options from a config file and returns an org.apache.commons.cli.Options object to be used in a command-line parser.
- CommandLineParserBuilder() - Constructor for class org.apache.tika.batch.builders.CommandLineParserBuilder
- ConsumersManager - Class in org.apache.tika.batch
-
Simple interface around a collection of consumers that allows for initializing and shutting down shared resources.
- ConsumersManager(List<FileResourceConsumer>) - Constructor for class org.apache.tika.batch.ConsumersManager
D
- DEFAULT_MAX_QUEUE_SIZE - Static variable in class org.apache.tika.batch.builders.BatchProcessBuilder
- DefaultContentHandlerFactoryBuilder - Class in org.apache.tika.batch.builders
-
Builds BasicContentHandler with type defined by attribute "basicHandlerType" with possible values: xml, html, text, body, ignore.
- DefaultContentHandlerFactoryBuilder() - Constructor for class org.apache.tika.batch.builders.DefaultContentHandlerFactoryBuilder
- DurationFormatUtils - Class in org.apache.tika.util
-
Functionality and naming conventions (roughly) copied from org.apache.commons.lang3 so that we didn't have to add another dependency.
- DurationFormatUtils() - Constructor for class org.apache.tika.util.DurationFormatUtils
E
- ELAPSED_MILLIS - Static variable in class org.apache.tika.batch.FileResourceConsumer
- execute() - Method in class org.apache.tika.batch.BatchProcessDriverCLI
F
- FILE_EXTENSION - Static variable in interface org.apache.tika.batch.FileResource
- FileResource - Interface in org.apache.tika.batch
-
This is a basic interface to handle a logical "file".
- FileResourceConsumer - Class in org.apache.tika.batch
-
This is a base class for file consumers.
- FileResourceConsumer(ArrayBlockingQueue<FileResource>) - Constructor for class org.apache.tika.batch.FileResourceConsumer
- FileResourceCrawler - Class in org.apache.tika.batch
- FileResourceCrawler(ArrayBlockingQueue<FileResource>, int) - Constructor for class org.apache.tika.batch.FileResourceCrawler
- FINISHED_STRING - Static variable in class org.apache.tika.batch.fs.FSBatchProcessCLI
- flushAndClose(Closeable) - Method in class org.apache.tika.batch.FileResourceConsumer
- formatMillis(long) - Static method in class org.apache.tika.util.DurationFormatUtils
- FS_REL_PATH - Static variable in class org.apache.tika.batch.fs.FSProperties
-
File's relative path (including file name) from a given source root
- FSBatchProcessCLI - Class in org.apache.tika.batch.fs
- FSBatchProcessCLI(String[]) - Constructor for class org.apache.tika.batch.fs.FSBatchProcessCLI
- FSConsumersManager - Class in org.apache.tika.batch.fs
- FSConsumersManager(List<FileResourceConsumer>) - Constructor for class org.apache.tika.batch.fs.FSConsumersManager
- FSCrawlerBuilder - Class in org.apache.tika.batch.fs.builders
-
Builds either an FSDirectoryCrawler or an FSListCrawler.
- FSCrawlerBuilder() - Constructor for class org.apache.tika.batch.fs.builders.FSCrawlerBuilder
- FSDirectoryCrawler - Class in org.apache.tika.batch.fs
- FSDirectoryCrawler(ArrayBlockingQueue<FileResource>, int, Path, Path, FSDirectoryCrawler.CRAWL_ORDER) - Constructor for class org.apache.tika.batch.fs.FSDirectoryCrawler
- FSDirectoryCrawler(ArrayBlockingQueue<FileResource>, int, Path, FSDirectoryCrawler.CRAWL_ORDER) - Constructor for class org.apache.tika.batch.fs.FSDirectoryCrawler
- FSDirectoryCrawler.CRAWL_ORDER - Enum in org.apache.tika.batch.fs
- FSDocumentSelector - Class in org.apache.tika.batch.fs
-
Selector that chooses files based on their file name and their size, as determined by TikaCoreProperties.RESOURCE_NAME_KEY and Metadata.CONTENT_LENGTH.
- FSDocumentSelector(Pattern, Pattern, long, long) - Constructor for class org.apache.tika.batch.fs.FSDocumentSelector
- FSFileResource - Class in org.apache.tika.batch.fs
-
FileSystem(FS)Resource wraps a file name.
- FSFileResource(File, File) - Constructor for class org.apache.tika.batch.fs.FSFileResource
-
Deprecated. To be removed in Tika 2.0.
- FSFileResource(Path, Path) - Constructor for class org.apache.tika.batch.fs.FSFileResource
-
Constructor
- FSListCrawler - Class in org.apache.tika.batch.fs
-
Class that "crawls" a list of files.
- FSListCrawler(ArrayBlockingQueue<FileResource>, int, File, File, String) - Constructor for class org.apache.tika.batch.fs.FSListCrawler
-
Deprecated.
- FSListCrawler(ArrayBlockingQueue<FileResource>, int, Path, Path, Charset) - Constructor for class org.apache.tika.batch.fs.FSListCrawler
-
Constructor for a crawler that reads a list of files to process.
- FSOutputStreamFactory - Class in org.apache.tika.batch.fs
- FSOutputStreamFactory(File, FSUtil.HANDLE_EXISTING, FSOutputStreamFactory.COMPRESSION, String) - Constructor for class org.apache.tika.batch.fs.FSOutputStreamFactory
-
Deprecated.
- FSOutputStreamFactory(Path, FSUtil.HANDLE_EXISTING, FSOutputStreamFactory.COMPRESSION, String) - Constructor for class org.apache.tika.batch.fs.FSOutputStreamFactory
- FSOutputStreamFactory.COMPRESSION - Enum in org.apache.tika.batch.fs
- FSProperties - Class in org.apache.tika.batch.fs
- FSProperties() - Constructor for class org.apache.tika.batch.fs.FSProperties
- FSUtil - Class in org.apache.tika.batch.fs
-
Utility class to handle some common issues when reading from and writing to a file system (FS).
- FSUtil() - Constructor for class org.apache.tika.batch.fs.FSUtil
- FSUtil.HANDLE_EXISTING - Enum in org.apache.tika.batch.fs
G
- getAdded() - Method in class org.apache.tika.batch.FileResourceCrawler
- getAdded() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
- getBoolean(String, Boolean) - Static method in class org.apache.tika.util.PropsUtil
-
Parses v.
- getCauseForTermination() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
- getConsidered() - Method in class org.apache.tika.batch.FileResourceCrawler
- getConsidered() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
-
Returns the number of file resources considered.
- getConsumed() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
- getConsumers() - Method in class org.apache.tika.batch.ConsumersManager
-
Get the consumers
- getConsumersManagerMaxMillis() - Method in class org.apache.tika.batch.ConsumersManager
-
BatchProcess will throw an exception if the ConsumersManager doesn't complete init() or shutdown() within this amount of time.
- getCurrentFile() - Method in class org.apache.tika.batch.FileResourceConsumer
-
Returns the name and start time of a file that is currently being processed.
- getDefaultNumConsumers() - Static method in class org.apache.tika.batch.builders.AbstractConsumersBuilder
- getExitStatus() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
- getFile(String, File) - Static method in class org.apache.tika.util.PropsUtil
-
Deprecated.
- getInputStream(FileResource) - Method in class org.apache.tika.batch.fs.AbstractFSConsumer
- getInt(String, Integer) - Static method in class org.apache.tika.util.PropsUtil
-
Parses v.
- getInt(String, Map<String, String>, Node) - Static method in class org.apache.tika.util.XMLDOMUtil
-
Get an int value.
- getLong(String, Long) - Static method in class org.apache.tika.util.PropsUtil
-
Parses v.
- getLong(String, Map<String, String>, Node) - Static method in class org.apache.tika.util.XMLDOMUtil
-
Get a long value.
- getMetadata() - Method in interface org.apache.tika.batch.FileResource
-
This gets the metadata available before the parsing of the file.
- getMetadata() - Method in class org.apache.tika.batch.fs.FSFileResource
- getNumberHandledExceptions() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
- getNumConsumers(Map<String, String>) - Static method in class org.apache.tika.batch.builders.BatchProcessBuilder
-
numConsumers is needed by both the crawler and the consumers.
- getNumHandledExceptions() - Method in class org.apache.tika.batch.FileResourceConsumer
- getNumResourcesConsumed() - Method in class org.apache.tika.batch.FileResourceConsumer
- getNumRestarts() - Method in class org.apache.tika.batch.BatchProcessDriverCLI
- getOutputEncoding() - Method in class org.apache.tika.batch.fs.BasicTikaFSConsumer
- getOutputEncoding() - Method in class org.apache.tika.batch.fs.RecursiveParserWrapperFSConsumer
- getOutputEncoding() - Method in class org.apache.tika.batch.fs.StreamOutRPWFSConsumer
- getOutputFile(File, String, FSUtil.HANDLE_EXISTING, String) - Static method in class org.apache.tika.batch.fs.FSUtil
-
Deprecated.
- getOutputPath(Path, String, FSUtil.HANDLE_EXISTING, String) - Static method in class org.apache.tika.batch.fs.FSUtil
-
Given an output root and an initial relative path, return the output file according to the HANDLE_EXISTING strategy
- getOutputStream(OutputStreamFactory, FileResource) - Method in class org.apache.tika.batch.fs.AbstractFSConsumer
-
Use this for consistent logging of exceptions.
- getOutputStream(Metadata) - Method in class org.apache.tika.batch.fs.FSOutputStreamFactory
-
This tries to create a file based on the FSUtil.HANDLE_EXISTING value that was passed in during initialization.
- getOutputStream(Metadata) - Method in interface org.apache.tika.batch.OutputStreamFactory
- getParser(TikaConfig) - Method in class org.apache.tika.batch.AutoDetectParserFactory
- getParser(TikaConfig) - Method in class org.apache.tika.batch.ParserFactory
- getPath(String, Path) - Static method in class org.apache.tika.util.PropsUtil
-
Parses v.
- getResourceId() - Method in interface org.apache.tika.batch.FileResource
-
This is only used in logging to identify which file may have caused problems.
- getResourceId() - Method in class org.apache.tika.batch.fs.FSFileResource
- getRoughCountExceptions() - Method in class org.apache.tika.batch.StatusReporter
-
This returns a rough (unsynchronized) count of caught/handled exceptions.
- getString(String, String) - Static method in class org.apache.tika.util.PropsUtil
-
Parses v.
- getXMLifiedLogMsg(String, String, String...) - Method in class org.apache.tika.batch.FileResourceConsumer
- getXMLifiedLogMsg(String, String, Throwable, String...) - Method in class org.apache.tika.batch.FileResourceConsumer
-
Use this for structured output that captures resourceId and other attributes.
- GZIP - Enum constant in enum org.apache.tika.batch.fs.FSOutputStreamFactory.COMPRESSION
H
- handleFirstFileInDirectory(Path) - Method in class org.apache.tika.batch.fs.FSDirectoryCrawler
-
Override this if you have any special handling for the first actual file that the crawler comes across in a directory.
I
- IContentHandlerFactoryBuilder - Interface in org.apache.tika.batch.builders
- ICrawlerBuilder - Interface in org.apache.tika.batch.builders
- IFileProcessorFutureResult - Interface in org.apache.tika.batch
-
Stub interface to allow for different result types from different processors.
- incrementHandledExceptions() - Method in class org.apache.tika.batch.FileResourceConsumer
-
Make sure to call this appropriately!
- init() - Method in class org.apache.tika.batch.ConsumersManager
-
This is called by BatchProcess before submitting the threads
- init() - Method in class org.apache.tika.batch.fs.FSConsumersManager
- Interrupter - Class in org.apache.tika.batch
-
Class that waits for input on System.in.
- Interrupter(long) - Constructor for class org.apache.tika.batch.Interrupter
- InterrupterBuilder - Class in org.apache.tika.batch.builders
-
Builds an Interrupter
- InterrupterBuilder() - Constructor for class org.apache.tika.batch.builders.InterrupterBuilder
- InterrupterFutureResult - Class in org.apache.tika.batch
- InterrupterFutureResult() - Constructor for class org.apache.tika.batch.InterrupterFutureResult
- IO_IS - Static variable in class org.apache.tika.batch.FileResourceConsumer
- IO_OS - Static variable in class org.apache.tika.batch.FileResourceConsumer
- IParserFactoryBuilder - Interface in org.apache.tika.batch.builders
- isActive() - Method in class org.apache.tika.batch.FileResourceCrawler
-
If the crawler stops for any reason, it is no longer active.
- isParseRecursively() - Method in class org.apache.tika.batch.ParserFactory
- isQueueEmpty() - Method in class org.apache.tika.batch.FileResourceCrawler
-
Use sparingly.
- isStillActive() - Method in class org.apache.tika.batch.FileResourceConsumer
-
Returns whether the consumer could still process a file or is still processing a file (ACTIVELY_CONSUMING or ASKED_TO_SHUTDOWN).
- isUserInterrupted() - Method in class org.apache.tika.batch.BatchProcessDriverCLI
L
- LOG - Static variable in class org.apache.tika.batch.FileResourceConsumer
- LOG - Static variable in class org.apache.tika.batch.FileResourceCrawler
M
- main(String[]) - Static method in class org.apache.tika.batch.BatchProcessDriverCLI
- main(String[]) - Static method in class org.apache.tika.batch.fs.FSBatchProcessCLI
- main(String[]) - Static method in class org.apache.tika.batch.fs.strawman.StrawManTikaAppDriver
- mapifyAttrs(Node, Map<String, String>) - Static method in class org.apache.tika.util.XMLDOMUtil
-
This grabs the attributes from a dom node and overwrites those values with those specified by the overwrite map.
- MAX_QUEUE_SIZE_KEY - Static variable in class org.apache.tika.batch.builders.BatchProcessBuilder
N
- NONE - Enum constant in enum org.apache.tika.batch.fs.FSOutputStreamFactory.COMPRESSION
- NUM_CONSUMERS_KEY - Static variable in class org.apache.tika.batch.builders.BatchProcessBuilder
O
- ObjectFromDOMAndQueueBuilder<T> - Interface in org.apache.tika.batch.builders
-
Same as ObjectFromDOMBuilder, but this is for objects that require access to the shared queue.
- ObjectFromDOMBuilder<T> - Interface in org.apache.tika.batch.builders
-
Interface for things that build objects from a DOM Node and a map of runtime attributes
- OOM - Static variable in class org.apache.tika.batch.FileResourceConsumer
- openInputStream() - Method in interface org.apache.tika.batch.FileResource
- openInputStream() - Method in class org.apache.tika.batch.fs.FSFileResource
- org.apache.tika.batch - package org.apache.tika.batch
- org.apache.tika.batch.builders - package org.apache.tika.batch.builders
- org.apache.tika.batch.fs - package org.apache.tika.batch.fs
- org.apache.tika.batch.fs.builders - package org.apache.tika.batch.fs.builders
- org.apache.tika.batch.fs.strawman - package org.apache.tika.batch.fs.strawman
- org.apache.tika.util - package org.apache.tika.util
- OS_ORDER - Enum constant in enum org.apache.tika.batch.fs.FSDirectoryCrawler.CRAWL_ORDER
- OutputStreamFactory - Interface in org.apache.tika.batch
- OVERWRITE - Enum constant in enum org.apache.tika.batch.fs.FSUtil.HANDLE_EXISTING
P
- ParallelFileProcessingResult - Class in org.apache.tika.batch
- ParallelFileProcessingResult(int, int, int, int, double, int, String) - Constructor for class org.apache.tika.batch.ParallelFileProcessingResult
- parse(String, Parser, InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.batch.FileResourceConsumer
-
Utility method to handle logging equivalently among all implementing classes.
- PARSE_ERR - Static variable in class org.apache.tika.batch.FileResourceConsumer
- PARSE_EX - Static variable in class org.apache.tika.batch.FileResourceConsumer
- ParserFactory - Class in org.apache.tika.batch
- ParserFactory() - Constructor for class org.apache.tika.batch.ParserFactory
- ParserFactoryBuilder - Class in org.apache.tika.batch.builders
- ParserFactoryBuilder() - Constructor for class org.apache.tika.batch.builders.ParserFactoryBuilder
- pleaseShutdown() - Method in class org.apache.tika.batch.FileResourceConsumer
-
This politely asks the consumer to shut down.
- PROCESS_COMPLETED_SUCCESSFULLY - Static variable in class org.apache.tika.batch.BatchProcessDriverCLI
- PROCESS_NO_RESTART_EXIT_CODE - Static variable in class org.apache.tika.batch.BatchProcessDriverCLI
- PROCESS_RESTART_EXIT_CODE - Static variable in class org.apache.tika.batch.BatchProcessDriverCLI
-
This relies on special exit values: 254 (do not restart), 0 (ended correctly), and 253 (ended with exception; do restart).
- processFileResource(FileResource) - Method in class org.apache.tika.batch.FileResourceConsumer
-
Main piece of code that needs to be implemented.
- processFileResource(FileResource) - Method in class org.apache.tika.batch.fs.BasicTikaFSConsumer
- processFileResource(FileResource) - Method in class org.apache.tika.batch.fs.RecursiveParserWrapperFSConsumer
- processFileResource(FileResource) - Method in class org.apache.tika.batch.fs.StreamOutRPWFSConsumer
- PropsUtil - Class in org.apache.tika.util
-
Utility class to handle properties.
- PropsUtil() - Constructor for class org.apache.tika.util.PropsUtil
R
- RANDOM - Enum constant in enum org.apache.tika.batch.fs.FSDirectoryCrawler.CRAWL_ORDER
- RecursiveParserWrapperFSConsumer - Class in org.apache.tika.batch.fs
-
This runs a RecursiveParserWrapper against an input file and outputs the json metadata to an output file.
- RecursiveParserWrapperFSConsumer(ArrayBlockingQueue<FileResource>, Parser, ContentHandlerFactory, OutputStreamFactory, MetadataFilter) - Constructor for class org.apache.tika.batch.fs.RecursiveParserWrapperFSConsumer
- RENAME - Enum constant in enum org.apache.tika.batch.fs.FSUtil.HANDLE_EXISTING
- report(String) - Method in class org.apache.tika.batch.StatusReporter
-
Override for different behavior.
- ReporterBuilder - Interface in org.apache.tika.batch.builders
-
Interface for reporter builders
- resolveRelative(Path, String) - Static method in class org.apache.tika.batch.fs.FSUtil
-
Convenience method to ensure that "other" is not an absolute path.
S
- secondsElapsed() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
- select(Metadata) - Method in class org.apache.tika.batch.FileResourceCrawler
- select(Metadata) - Method in class org.apache.tika.batch.fs.FSDocumentSelector
- setConsumersManagerMaxMillis(long) - Method in class org.apache.tika.batch.ConsumersManager
- setDocumentSelector(DocumentSelector) - Method in class org.apache.tika.batch.FileResourceCrawler
- setIsShuttingDown(boolean) - Method in class org.apache.tika.batch.StatusReporter
-
Set whether the main process is in the process of shutting down.
- setMaxAliveTimeSeconds(int) - Method in class org.apache.tika.batch.BatchProcess
-
The maximum amount of time that this process can be alive.
- setMaxConsecWaitInMillis(long) - Method in class org.apache.tika.batch.FileResourceCrawler
- setMaxFilesToAdd(int) - Method in class org.apache.tika.batch.FileResourceCrawler
-
Maximum number of files to add.
- setMaxFilesToConsider(int) - Method in class org.apache.tika.batch.FileResourceCrawler
-
Maximum number of files to consider.
- setOutputEncoding(String) - Method in class org.apache.tika.batch.fs.RecursiveParserWrapperFSConsumer
- setOutputEncoding(String) - Method in class org.apache.tika.batch.fs.StreamOutRPWFSConsumer
- setOutputEncoding(Charset) - Method in class org.apache.tika.batch.fs.BasicTikaFSConsumer
- setParseRecursively(boolean) - Method in class org.apache.tika.batch.ParserFactory
- setPauseOnEarlyTerminationMillis(long) - Method in class org.apache.tika.batch.BatchProcess
-
If there is an early termination via an interrupt or too many timed out consumers or because a consumer or other Runnable threw a Throwable, pause this long before interrupting the consumers and other threads.
- setRedirectForkedProcessToStdOut(boolean) - Method in class org.apache.tika.batch.BatchProcessDriverCLI
-
Typically only used for testing.
- setSleepMillis(long) - Method in class org.apache.tika.batch.StatusReporter
-
Set the amount of time to sleep between reports.
- setStaleThresholdMillis(long) - Method in class org.apache.tika.batch.StatusReporter
-
Set the amount of time in milliseconds to use as the threshold for determining a stale parse.
- setTimeoutCheckPulseMillis(long) - Method in class org.apache.tika.batch.BatchProcess
- setTimeoutThresholdMillis(long) - Method in class org.apache.tika.batch.BatchProcess
-
The amount of time allowed before a consumer should be timed out.
- shutdown() - Method in class org.apache.tika.batch.ConsumersManager
-
This is called by BatchProcess immediately before closing.
- shutdown() - Method in class org.apache.tika.batch.fs.FSConsumersManager
- shutDownNoPoison() - Method in class org.apache.tika.batch.FileResourceCrawler
-
Set to true to shut down the FileResourceCrawler without adding poison.
- SimpleLogReporterBuilder - Class in org.apache.tika.batch.builders
- SimpleLogReporterBuilder() - Constructor for class org.apache.tika.batch.builders.SimpleLogReporterBuilder
- SKIP - Enum constant in enum org.apache.tika.batch.fs.FSUtil.HANDLE_EXISTING
- SKIPPED - Static variable in class org.apache.tika.batch.FileResourceCrawler
- SORTED - Enum constant in enum org.apache.tika.batch.fs.FSDirectoryCrawler.CRAWL_ORDER
- start() - Method in class org.apache.tika.batch.FileResourceCrawler
-
Implement this to control the addition of FileResources.
- start() - Method in class org.apache.tika.batch.fs.FSDirectoryCrawler
- start() - Method in class org.apache.tika.batch.fs.FSListCrawler
- StatusReporter - Class in org.apache.tika.batch
-
Basic class to use for reporting status from both the crawler and the consumers.
- StatusReporter(FileResourceCrawler, ConsumersManager) - Constructor for class org.apache.tika.batch.StatusReporter
-
Initialize with the crawler and consumers
- StatusReporterBuilder - Interface in org.apache.tika.batch.builders
- StatusReporterFutureResult - Class in org.apache.tika.batch
-
Empty class for what a StatusReporter returns when it finishes.
- StatusReporterFutureResult() - Constructor for class org.apache.tika.batch.StatusReporterFutureResult
- STOP_NOW - Static variable in class org.apache.tika.batch.FileResourceCrawler
- StrawManTikaAppDriver - Class in org.apache.tika.batch.fs.strawman
-
Simple single-threaded class that calls tika-app against every file in a directory.
- StrawManTikaAppDriver(Path, Path, int, Path, String[]) - Constructor for class org.apache.tika.batch.fs.strawman.StrawManTikaAppDriver
- StreamOutRPWFSConsumer - Class in org.apache.tika.batch.fs
-
This uses the JsonStreamingSerializer to write out a single metadata object at a time.
- StreamOutRPWFSConsumer(ArrayBlockingQueue<FileResource>, Parser, ContentHandlerFactory, OutputStreamFactory, MetadataFilter) - Constructor for class org.apache.tika.batch.fs.StreamOutRPWFSConsumer
T
- TIMED_OUT - Static variable in class org.apache.tika.batch.FileResourceConsumer
- toString() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
- tryToAdd(FileResource) - Method in class org.apache.tika.batch.FileResourceCrawler
U
- usage() - Method in class org.apache.tika.batch.fs.FSBatchProcessCLI
- usage() - Static method in class org.apache.tika.batch.fs.strawman.StrawManTikaAppDriver
V
- valueOf(String) - Static method in enum org.apache.tika.batch.BatchProcess.BATCH_CONSTANTS
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.batch.fs.FSDirectoryCrawler.CRAWL_ORDER
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.batch.fs.FSOutputStreamFactory.COMPRESSION
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.batch.fs.FSUtil.HANDLE_EXISTING
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum org.apache.tika.batch.BatchProcess.BATCH_CONSTANTS
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.apache.tika.batch.fs.FSDirectoryCrawler.CRAWL_ORDER
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.apache.tika.batch.fs.FSOutputStreamFactory.COMPRESSION
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.apache.tika.batch.fs.FSUtil.HANDLE_EXISTING
-
Returns an array containing the constants of this enum type, in the order they are declared.
W
- wasTimedOut() - Method in class org.apache.tika.batch.FileResourceCrawler
-
Returns whether the crawler timed out while trying to add a resource to the queue.
X
- XMLDOMUtil - Class in org.apache.tika.util
- XMLDOMUtil() - Constructor for class org.apache.tika.util.XMLDOMUtil
Z
- ZIP - Enum constant in enum org.apache.tika.batch.fs.FSOutputStreamFactory.COMPRESSION