- canRun() - Static method in class org.apache.tika.parser.journal.GrobidRESTParser
-
- CaptionObject - Class in org.apache.tika.parser.captioning
-
A model for caption objects from graphics and texts; it typically includes a
human-readable sentence, the language of the sentence, and a confidence score.
- CaptionObject(String, String, double) - Constructor for class org.apache.tika.parser.captioning.CaptionObject
-
- Cell - Interface in org.apache.tika.parser.microsoft
-
Cell of content.
- cell(String, String, XSSFComment) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
-
- CellDecorator - Class in org.apache.tika.parser.microsoft
-
Cell decorator.
- CellDecorator(Cell) - Constructor for class org.apache.tika.parser.microsoft.CellDecorator
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- CharsetDetector - Class in org.apache.tika.parser.txt
-
CharsetDetector provides a facility for detecting the
charset or encoding of character data in an unknown format.
- CharsetDetector() - Constructor for class org.apache.tika.parser.txt.CharsetDetector
-
Constructor
- CharsetDetector(int) - Constructor for class org.apache.tika.parser.txt.CharsetDetector
-
- CharsetMatch - Class in org.apache.tika.parser.txt
-
This class represents a charset that has been identified by a CharsetDetector
as a possible encoding for a set of input data.
- check(Metadata) - Method in class org.apache.tika.parser.pdf.AccessChecker
-
- checkAvail() - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Ping lucene-geo-gazetteer API
- checkBit(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
- CHM_ITSF_V2_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_ITSF_V3_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_ITSP_V1_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_LZXC_MIN_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_LZXC_RESETTABLE_V1_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_LZXC_V2_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_PMGI_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_PMGI_MARKER - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_PMGL_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_SIGNATURE_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_VER_1 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_VER_2 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_VER_3 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_WINDOW_SIZE_BLOCK - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- ChmAccessor<T> - Interface in org.apache.tika.parser.chm.accessor
-
Defines an accessor interface
- ChmAssert - Class in org.apache.tika.parser.chm.assertion
-
Contains chm extractor assertions
- ChmAssert() - Constructor for class org.apache.tika.parser.chm.assertion.ChmAssert
-
- ChmBlockInfo - Class in org.apache.tika.parser.chm.lzx
-
A container that contains chm block information such as: i.
- ChmCommons - Class in org.apache.tika.parser.chm.core
-
- ChmCommons.EntryType - Enum in org.apache.tika.parser.chm.core
-
Represents entry types: uncompressed, compressed
- ChmCommons.IntelState - Enum in org.apache.tika.parser.chm.core
-
Represents intel file states during decompression
- ChmCommons.LzxState - Enum in org.apache.tika.parser.chm.core
-
Represents lzx states: started decoding, not started decoding
- ChmConstants - Class in org.apache.tika.parser.chm.core
-
- ChmDirectoryListingSet - Class in org.apache.tika.parser.chm.accessor
-
Holds chm listing entries
- ChmDirectoryListingSet(byte[], ChmItsfHeader, ChmItspHeader) - Constructor for class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Constructs chm directory listing set
- ChmExtractor - Class in org.apache.tika.parser.chm.core
-
Extracts text from chm file.
- ChmExtractor(InputStream) - Constructor for class org.apache.tika.parser.chm.core.ChmExtractor
-
- ChmItsfHeader - Class in org.apache.tika.parser.chm.accessor
-
The header:
0000: char[4] 'ITSF'
0004: DWORD 3 (Version number)
0008: DWORD Total header length, including header section table and following data.
- ChmItsfHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
- ChmItspHeader - Class in org.apache.tika.parser.chm.accessor
-
Directory header. The directory starts with a header; its format is as follows:
0000: char[4] 'ITSP'
0004: DWORD Version number 1
0008: DWORD Length of the directory header
000C: DWORD $0a (unknown)
0010: DWORD $1000 Directory chunk size
0014: DWORD "Density" of quickref section, usually 2
0018: DWORD Depth of the index tree: 1 if there is no index, 2 if there is one level of PMGI chunks
001C: DWORD Chunk number of root index chunk, -1 if there is none (though at least one file has 0 despite there being no index chunk, probably a bug)
0020: DWORD Chunk number of first PMGL (listing) chunk
0024: DWORD Chunk number of last PMGL (listing) chunk
0028: DWORD -1 (unknown)
002C: DWORD Number of directory chunks (total)
0030: DWORD Windows language ID
0034: GUID {5D02926A-212E-11D0-9DF9-00A0C922E6EC}
0044: DWORD $54 (This is the length again)
0048: DWORD -1 (unknown)
004C: DWORD -1 (unknown)
0050: DWORD -1 (unknown)
- ChmItspHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
- ChmLzxBlock - Class in org.apache.tika.parser.chm.lzx
-
Decompresses a chm block.
- ChmLzxBlock(int, byte[], long, ChmLzxBlock) - Constructor for class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- ChmLzxcControlData - Class in org.apache.tika.parser.chm.accessor
-
::DataSpace/Storage//ControlData This file contains $20 bytes of
information on the compression.
- ChmLzxcControlData() - Constructor for class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
- ChmLzxcResetTable - Class in org.apache.tika.parser.chm.accessor
-
LZXC reset table, used to ensure decompression.
- ChmLzxcResetTable() - Constructor for class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
- ChmLzxState - Class in org.apache.tika.parser.chm.lzx
-
- ChmLzxState(int) - Constructor for class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- ChmParser - Class in org.apache.tika.parser.chm
-
- ChmParser() - Constructor for class org.apache.tika.parser.chm.ChmParser
-
- ChmParsingException - Exception in org.apache.tika.parser.chm.exception
-
- ChmParsingException(String) - Constructor for exception org.apache.tika.parser.chm.exception.ChmParsingException
-
- ChmPmgiHeader - Class in org.apache.tika.parser.chm.accessor
-
Description (note: this chunk does not always exist). An index chunk has the following format:
0000: char[4] 'PMGI'
0004: DWORD Length of quickref/free area at end of directory chunk
0008: Directory index entries (to quickref/free area)
The quickref area in a PMGI is the same as in a PMGL.
The format of a directory index entry is as follows:
BYTE: length of name
BYTEs: name (UTF-8 encoded)
ENCINT: directory listing chunk which starts with name
Encoded Integers, aka ENCINT: an ENCINT is a variable-length integer.
- ChmPmgiHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
- ChmPmglHeader - Class in org.apache.tika.parser.chm.accessor
-
Description: there are two types of directory chunks: index chunks and
listing chunks.
- ChmPmglHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- ChmSection - Class in org.apache.tika.parser.chm.lzx
-
- ChmSection(byte[]) - Constructor for class org.apache.tika.parser.chm.lzx.ChmSection
-
- ChmSection(byte[], byte[]) - Constructor for class org.apache.tika.parser.chm.lzx.ChmSection
-
- ChmWrapper - Class in org.apache.tika.parser.chm.core
-
- ChmWrapper() - Constructor for class org.apache.tika.parser.chm.core.ChmWrapper
-
- ClassParser - Class in org.apache.tika.parser.asm
-
Parser for Java .class files.
- ClassParser() - Constructor for class org.apache.tika.parser.asm.ClassParser
-
- clone() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- close() - Method in class org.apache.tika.parser.mp4.DirectFileReadDataSource
-
- closeStyleTags(XHTMLContentHandler, Deque<FormattingUtils.Tag>) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
-
Closes all formatting tags.
- CommonsDigester - Class in org.apache.tika.parser.utils
-
Implementation of DigestingParser.Digester
that relies on commons.codec.digest.DigestUtils to calculate digest hashes.
- CommonsDigester(int, String) - Constructor for class org.apache.tika.parser.utils.CommonsDigester
-
Include a string representing the comma-separated algorithms to run: e.g.
- CommonsDigester(int, CommonsDigester.DigestAlgorithm...) - Constructor for class org.apache.tika.parser.utils.CommonsDigester
-
- CommonsDigester.DigestAlgorithm - Enum in org.apache.tika.parser.utils
-
- COMP_OBJ - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Some other kind of embedded document, in a CompObj container within another OLE2 document
- compareTo(CSVResult) - Method in class org.apache.tika.parser.csv.CSVResult
-
Sorts in descending order of confidence
- compareTo(CharsetMatch) - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Compare to other CharsetMatch objects.
- CompositeTagHandler - Class in org.apache.tika.parser.mp3
-
Takes an array of
ID3Tags in preference order, and when asked for
a given tag, will return it from the first
ID3Tags that has it.
- CompositeTagHandler(ID3Tags[]) - Constructor for class org.apache.tika.parser.mp3.CompositeTagHandler
-
- CompressorParser - Class in org.apache.tika.parser.pkg
-
Parser for various compression formats.
- CompressorParser() - Constructor for class org.apache.tika.parser.pkg.CompressorParser
-
- CompressorParserOptions - Interface in org.apache.tika.parser.pkg
-
Interface for setting options for the
CompressorParser by passing them
via the
ParseContext.
- confidence - Variable in class org.apache.tika.parser.recognition.RecognisedObject
-
Confidence score
- config - Variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- configure(ParseContext) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- configure(PDF2XHTML) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Configures the given pdf2XHTML.
- configureExtractor(POIXMLTextExtractor, Locale) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
-
- configureExtractor(POIXMLTextExtractor, Locale) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
-
- contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
-
- containsEmail(String) - Static method in class org.apache.tika.parser.mail.MailUtil
-
Checks whether the chunk looks like it contains an email.
- CONTENT - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CONTROL_DATA - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- converttoInt(byte[]) - Static method in class org.apache.tika.parser.image.ICNSType
-
- convertToJSONArray(JSONObject, String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Converts JSON Object to JSON Array
- convertToJSONObject(String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Parses a JSON String and converts it to a JSON Object
- copyOfRange(byte[], int, int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
- CoreNLPNERecogniser - Class in org.apache.tika.parser.ner.corenlp
-
This class offers an implementation of
NERecogniser based on
CRF classifiers from Stanford CoreNLP.
- CoreNLPNERecogniser() - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
- CoreNLPNERecogniser(String) - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
Creates a NERecogniser by loading the model from the given path.
- createDecryptStream(InputStream, Key) - Method in class org.apache.tika.parser.hwp.HwpTextExtractorV5
-
- createFrameIfPresent(InputStream) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the next ID3v2 Frame in
the file, or null if the next batch of data
doesn't correspond to an ID3v2 header.
- createOneNoteDocumentFromDirectFileResource(OneNoteDirectFileResource) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
-
Create a OneNoteDocument object.
- CSVParams - Class in org.apache.tika.parser.csv
-
- CSVResult - Class in org.apache.tika.parser.csv
-
- CSVResult(double, MediaType, Character) - Constructor for class org.apache.tika.parser.csv.CSVResult
-
- CTAKES_META_PREFIX - Static variable in class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- CTAKESAnnotationProperty - Enum in org.apache.tika.parser.ctakes
-
This enumeration includes the properties that an IdentifiedAnnotation object can provide.
- CTAKESConfig - Class in org.apache.tika.parser.ctakes
-
- CTAKESConfig() - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
-
Default constructor.
- CTAKESConfig(InputStream) - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
-
Loads properties from InputStream and then tries to close InputStream.
- CTAKESContentHandler - Class in org.apache.tika.parser.ctakes
-
Class used to extract biomedical information while parsing.
- CTAKESContentHandler(ContentHandler, Metadata, CTAKESConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- CTAKESContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- CTAKESContentHandler() - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
Default constructor.
- CTAKESParser - Class in org.apache.tika.parser.ctakes
-
CTAKESParser decorates a
Parser and leverages
CTAKESContentHandler to extract biomedical information from
clinical text using Apache cTAKES.
- CTAKESParser() - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
-
Wraps the default Parser
- CTAKESParser(TikaConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
-
Wraps the default Parser for this Config
- CTAKESParser(Parser) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
-
Wraps the specified Parser
- CTAKESSerializer - Enum in org.apache.tika.parser.ctakes
-
Enumeration for types of cTAKES (UIMA) CAS serializer supported by cTAKES.
- CTAKESUtils - Class in org.apache.tika.parser.ctakes
-
This class provides methods to extract biomedical information from plain text
using
CTAKESContentHandler that relies on Apache cTAKES.
- CTAKESUtils() - Constructor for class org.apache.tika.parser.ctakes.CTAKESUtils
-
- GDALParser - Class in org.apache.tika.parser.gdal
-
- GDALParser() - Constructor for class org.apache.tika.parser.gdal.GDALParser
-
- GENERAL_EMBEDDED - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
General embedded document type within an OLE2 container
- GENRES - Static variable in interface org.apache.tika.parser.mp3.ID3Tags
-
List of predefined genres.
- GeoGazetteerClient - Class in org.apache.tika.parser.geo.topic.gazetteer
-
- GeoGazetteerClient(String) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Pass the URL at which lucene-geo-gazetteer is available - e.g.
- GeoGazetteerClient(GeoParserConfig) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
- GeographicInformationParser - Class in org.apache.tika.parser.geoinfo
-
- GeographicInformationParser() - Constructor for class org.apache.tika.parser.geoinfo.GeographicInformationParser
-
- geoInfoType - Static variable in class org.apache.tika.parser.geoinfo.GeographicInformationParser
-
- GeoParser - Class in org.apache.tika.parser.geo.topic
-
- GeoParser() - Constructor for class org.apache.tika.parser.geo.topic.GeoParser
-
- GeoParserConfig - Class in org.apache.tika.parser.geo.topic
-
- GeoParserConfig() - Constructor for class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- GeoTag - Class in org.apache.tika.parser.geo.topic
-
- GeoTag() - Constructor for class org.apache.tika.parser.geo.topic.GeoTag
-
- get() - Method in enum org.apache.tika.parser.strings.StringsEncoding
-
- get7BitsInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
AKA a Synchsafe integer.
- getAccessChecker() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getAdmin1Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getAdmin2Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getAeDescriptorPath() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the path to XML descriptor for AnalysisEngine.
- getAlbum() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getAlbum() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getAlbumArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The Artist for the overall album / compilation of albums
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have album-wide artists,
so returns null.
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getAlignedLenTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getAlignedTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getAllDetectableCharsets() - Static method in class org.apache.tika.parser.txt.CharsetDetector
-
Get the names of all charsets supported by CharsetDetector class.
- getAllNameEntitiesfromInput(InputStream) - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
-
- getAllTagHandlers(InputStream, ContentHandler) - Static method in class org.apache.tika.parser.mp3.Mp3Parser
-
Scans the MP3 frames for ID3 tags, and creates ID3Tag Handlers
for each supported set of tags.
- getAnalysisEngine(String, String, String) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns a new UIMA Analysis Engine (AE).
- getAnnotationProperty(IdentifiedAnnotation, CTAKESAnnotationProperty) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns the annotation value based on the given annotation type.
- getAnnotationProps() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- getAnnotationPropsAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns a string containing a comma-separated list of
CTAKESAnnotationProperty names that will be included in cTAKES metadata.
- getApiUri(Metadata) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- getApiUri(Metadata) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- getApiUri(Metadata) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTVideoRecogniser
-
- getApplyRotation() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The Artist for the track
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getAverageCharTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getBestNameEntity() - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
-
- getBigInteger(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getBitRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the bit rate in bits per second.
- getBitsPerPixel() - Method in class org.apache.tika.parser.image.ICNSType
-
- getBlock_len() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns block's length
- getBlockAddress() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Returns block addresses
- getBlockCount() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets a block count
- getBlockidx_intvl() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns block index interval
- getBlockLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets a block length
- getBlockLength() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getBlockNext() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getBlockNumber() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getBlockPrev() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getBlockRemaining() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getBlockType() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getByte() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getCatchIntermediateIOExceptions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getCenter() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- getChannels() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the number of channels (1=mono, 2=stereo)
- getCharset() - Method in class org.apache.tika.parser.csv.CSVParams
-
- getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Deprecated.
- getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData, ChmBlockInfo) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
- getChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
- getChmDirList() - Method in class org.apache.tika.parser.chm.core.ChmExtractor
-
- getChmDirList() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmItsfHeader() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmItspHeader() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmLzxcControlData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmLzxcResetTable() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getClassName() - Method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
-
- getColorspace() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getCommand() - Method in class org.apache.tika.parser.gdal.GDALParser
-
- getComment(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Builds up the ID3 comment, by parsing and extracting
the comment string parts from the given data.
- getComments() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getComments() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
Retrieves the comments, if any.
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getCompilation() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getCompilation() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have compilations,
so returns null.
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
ID3v22 doesn't have compilations,
so returns null.
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getComposer() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getComposer() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have composers,
so returns null.
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getCompressedLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets compressed length
- getConcatenatePhoneticRuns() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getConfidence() - Method in class org.apache.tika.parser.csv.CSVResult
-
- getConfidence() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getConfidence() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get an indication of the confidence in the charset detected.
- getContent() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContent(int, int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContent(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentMetaParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.DcXMLParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
-
- getContentLength() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContentParser() - Method in class org.apache.tika.parser.epub.EpubParser
-
- getContentParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- getControlDataIndex() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Returns the control data index that is located in the list
- getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getCountryCode() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getData() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getData() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getDataOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Returns data offset
- getDataOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns data offset
- getDateFormatOverride() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getDecorationName() - Method in class org.apache.tika.parser.ctakes.CTAKESParser
-
- getDefaultConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- getDelimiter() - Method in class org.apache.tika.parser.csv.CSVParams
-
- getDelimiter() - Method in class org.apache.tika.parser.csv.CSVResult
-
- getDensity() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getDepth() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getDescription() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the description, if present
- getDetectableCharsets() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
- getDetectAngles() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getDir_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns directory uuid
- getDirectoryListingEntryList() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Returns chm directory listing entry list
- getDirLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns directory length
- getDirOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns directory offset
- getDisc() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getDisc() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The number of the disc this belongs to, within the set
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have disc numbers,
so returns null.
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getDocument() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
-
Returns the opened document.
- getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
-
- getDuration() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Returns the duration in milliseconds.
- getEnableAutoSpace() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getEnableAutoSpace() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getEncint() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getEncoding() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the character encoding of the strings that are to be found.
- getEndBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the end block index
- getEndOffset() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the end offset index
- getEntityTypes() - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in interface org.apache.tika.parser.ner.NERecogniser
-
Gets a set of entity types whose names are recognisable by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
-
- getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- getEntityTypes() - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- getEntriesToCopy() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- getEntryType() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Returns ChmCommons.EntryType (COMPRESSED or UNCOMPRESSED)
- getExtendedHeader() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getExtension() - Method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
- getExtractAcroFormContent() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractActions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractAllAlternativesFromMSG() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getExtractAllAlternativesFromMSG() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getExtractAnnotationText() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getExtractAnnotationText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractBookmarksText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractFontNames() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractInlineImages() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractMacros() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getExtractMacros() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getExtractMarkedContent() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractScripts() - Method in class org.apache.tika.parser.html.HtmlParser
-
- getExtractUniqueInlineImagesOnly() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getFilePath() - Method in class org.apache.tika.parser.strings.FileConfig
-
Returns the "file" installation folder.
- getFileProg() - Static method in class org.apache.tika.parser.strings.StringsParser
-
- getFilter() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getFlags() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getFormattedNumber(Paragraph) - Method in class org.apache.tika.parser.microsoft.ListManager
-
Get the formatted number for a given paragraph
- getFormattedNumber(XWPFParagraph) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
-
- getFormattedNumber(BigInteger, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
-
- getFramesRead() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getFreeSpace() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Returns pmgi free space
- getFreeSpace() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getGazetteerRestEndpoint() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- getGenre() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getGenre() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
-
- getHadStarted() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getHeader_len() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns header length
- getHeaderLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns itsf header length
- getHeight() - Method in class org.apache.tika.parser.image.ICNSType
-
- getId() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getIfXFAExtractOnlyXFA() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getIlvl() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- getImageMagickPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getIncludeDeletedContent() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getIncludeDeletedContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeDeletedText() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- getIncludeDeletedText() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- getIncludeHeadersAndFooters() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeMissingRows() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeMoveFromContent() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getIncludeMoveFromContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeMoveFromText() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- getIncludeMoveFromText() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- getIncludeShapeBasedContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeSlideMasterContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeSlideNotes() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIndex() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
-
- getIndex_depth() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns an index depth
- getIndex_head() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns an index head
- getIndex_root() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns index root
- getIndexCopyFromStart() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- getIndexCopyToStart() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- getIndexOfContent() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getIndexOfResetData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getIndexOfResetTable() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getIniBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns an initial block index
- getInputStream() - Method in class org.apache.tika.parser.utils.DataURIScheme
-
- getInstance() - Static method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- getInt(byte[]) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getInt2(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getInt3(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getIntelCurrentPossition() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getIntelFileSize() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getIntelState() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getJCas(AnalysisEngine) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns a new JCas appropriate for the given Analysis Engine.
- getJustFileName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getLabel() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getLabelLang() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getLang_id() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns language id
- getLangId() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns language ID
- getLanguage(long) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
Returns textual representation of LangID
- getLanguage() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the language, if present
- getLanguage() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getLanguage() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get the ISO code for the language of the detected charset.
- getLastModified() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns last modified date of the chm file
- getLatitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getLayer() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the audio layer code.
- getLeft() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getLeft() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- getLength() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- getLength() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Returns the frame length in bytes.
- getLength() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getLengthTreeLengtsTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getLengthTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getLinearizedDictionary(PDDocument) - Static method in class org.apache.tika.parser.pdf.PDFPreflightParser
-
Copied verbatim from PDFBox.
According to the PDF Reference, a linearized PDF contains a dictionary as its first object (the linearized dictionary), and
only this one in the first section.
- getLocations(List<String>) - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Calls API of lucene-geo-gazetteer to search location name in gazetteer.
- getLongitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getLzxBlockLength() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getLzxBlockOffset() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getLzxBlocksCache() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
Return a list of the main parts of the document, used
when searching for embedded resources.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
-
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
-
In PowerPoint files, slides have things embedded in them,
as well as slide drawings that contain the images
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
-
This returns all items that might contain embedded objects:
main document, headers, footers, comments, etc.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
-
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator
-
In PowerPoint files, slides have things embedded in them,
as well as slide drawings that contain the images
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
In Excel files, sheets have things embedded in them,
as well as sheet drawings that contain the images
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator
-
Include main body and anything else that can
have an attachment/embedded object
- getMainTreeElements() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getMainTreeLengtsTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getMainTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getMajorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getMarkLimit() - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
- getMarkLimit() - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
-
- getMarkLimit() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
- getMarkLimit() - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
-
- getMaxBytesForEmbeddedObject() - Static method in class org.apache.tika.parser.rtf.RTFParser
-
Deprecated.
- getMaxFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getMaxMainMemoryBytes() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
The maximum amount of memory to use when loading a pdf into a PDDocument.
- getMaxXMPMMHistory() - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
-
- getMediaType() - Method in class org.apache.tika.parser.csv.CSVParams
-
- getMediaType() - Method in class org.apache.tika.parser.csv.CSVResult
-
- getMediaType() - Method in class org.apache.tika.parser.utils.DataURIScheme
-
- getMessageClass(String) - Static method in class org.apache.tika.parser.microsoft.OutlookExtractor
-
- getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns an array of metadata whose values will be analyzed using cTAKES.
- getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
Returns metadata that includes cTAKES annotations.
- getMetadataAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns a string containing a comma-separated list of metadata whose values will be analyzed using cTAKES.
- getMetadataExtractor() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getMetadataExtractor() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
-
POIXMLTextExtractor.getMetadataTextExtractor() not yet supported
for OOXML by POI.
- getMetaParser() - Method in class org.apache.tika.parser.epub.EpubParser
-
- getMetaParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- getMinFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getMinLength() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the minimum sequence length (characters) to print.
- getMinorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getMinSize() - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
Returns the minimum size of a character sequence to be extracted.
- getMSB() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
-
- getName() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Returns an entry name
- getName() - Method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
-
- getName() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
-
- getName() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getName() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get the name of the detected charset.
- getNameLength() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Returns an entry name length
- getNamespace() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- getNerModelUrl() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- getNum_blocks() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns number of blocks
- getNumberOfLevels() - Method in class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
-
- getNumId() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- getOcrDPI() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Dots per inch used to render the page image for OCR
- getOcrImageFormatName() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
String representation of the image format used to render
the page image for OCR (examples: png, tiff, jpeg)
- getOcrImageQuality() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image quality used to render the page image for OCR.
- getOcrImageScale() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getOcrImageType() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image type used to render the page image for OCR.
- getOcrStrategy() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getOffset() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- getOtherTesseractConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getOutputStream() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- getOutputType() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getPageSegMode() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getPageSeparator() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getPart() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- getPDDocument(InputStream, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getPDDocument(Path, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getPDDocument(InputStream, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFPreflightParser
-
- getPDDocument(Path, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFPreflightParser
-
- getPDFParserConfig() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getPreserveInterwordSpacing() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getPrevContent() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getR0() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getR1() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getR2() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getReader(InputStream, String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Autodetect the charset of an inputStream, and return a Java Reader
to access the converted input data.
- getReader() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a java.io.Reader for reading the Unicode character data corresponding
to the original byte data supplied to the Charset detect operation.
- getResetInterval() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns reset interval
- getResetTableIndex() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Return index of reset table
- getResize() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getRight() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- getSampleRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the sampling rate, in Hz
- getSeparatorChar() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the separator character used for annotation properties.
- getSerializerType() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the type of cTAKES (UIMA) serializer used to write the CAS.
- getSetKCMS() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns a signature of itsf header
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns a signature of the header
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a signature of control data block
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Returns pmgi signature if exists
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getSize() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a size of control data
- getSize() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
-
- getSortByPosition() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getSortByPosition() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getSpacingTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getStartBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the start block index
- getStartIndex() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getStartOffset() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the start offset index
- getState() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getStream_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns stream uuid
- getString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the String at the given
offset and length.
- getString(byte[], String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Autodetect the charset of an inputStream, and return a String
containing the converted input data.
- getString() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a Java String from Unicode character data corresponding
to the original byte data supplied to the Charset detect operation.
- getString(int) - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a Java String from Unicode character data corresponding
to the original byte data supplied to the Charset detect operation.
- getStringsPath() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the "strings" installation folder.
- getStringsProg() - Static method in class org.apache.tika.parser.strings.StringsParser
-
- getStripMarkup() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
- getStyleClass() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
-
- getStyleID() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- getStyleName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFStylesShim
-
- getSuffix(InputStream, int) - Static method in class org.apache.tika.parser.mp3.LyricsHandler
-
Reads and returns the last length bytes from the
given stream.
- getSupportedMimes() - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- getSupportedMimes() - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
-
The mimes supported by this recogniser
- getSupportedMimes() - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- getSupportedMimes() - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.apple.AppleSingleFileParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.asm.ClassParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.audio.AudioParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.audio.MidiParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.chm.ChmParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.code.SourceCodeParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.Pkcs7Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.TSDParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.csv.TextAndCSVParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dbf.DBFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dwg.DWGParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.envi.EnviHeaderParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubContentParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.executable.ExecutableParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.feed.FeedParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.AdobeFontMetricParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.TrueTypeParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.gdal.GDALParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geoinfo.GeographicInformationParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.grib.GribParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.hdf.HDFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.html.HtmlParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.hwp.HwpV5Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.BPGParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ICNSParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ImageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.PSDParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.TiffParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.WebPParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.isatab.ISArchiveParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork18PackageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.IWorkPackageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.journal.JournalParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.jpeg.JpegParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mail.RFC822Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mat.MatParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mbox.MboxParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mbox.OutlookPSTParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.EMFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.JackcessParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.MSOwnerFileParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.OfficeParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.OldExcelParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.TNEFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.WMFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp3.Mp3Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp4.MP4Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ner.NamedEntityParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.netcdf.NetCDFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.CompressorParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.PackageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.RarParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pot.PooledTimeSeriesParser
-
Returns the set of media types supported by this parser when used with the
given parse context.
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.prt.PRTParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.rtf.RTFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sas.SAS7BDATParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
Returns the types supported
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.StringsParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.txt.TXTParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.video.FLVParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.QuattroProParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xliff.XLIFF12Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xliff.XLZParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.XMLProfiler
-
- getSuppressDuplicateOverlappingText() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getSuppressDuplicateOverlappingText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getSwath() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getSyncBits(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getSystem_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns system uuid
- getTableOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets a table offset
- getTag() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getTagsPresent() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
Does the file contain this kind of tag?
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getTagString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the (possibly null padded) String at the given offset and
length.
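A minimal sketch of reading a null-padded string from a byte range, using only the JDK. The class name `NullPaddedStrings`, the method `read`, the ISO-8859-1 charset and the choice to stop at the first NUL byte are all assumptions made for illustration; this is not the code in `ID3v2Frame`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Tika.
final class NullPaddedStrings {
    private NullPaddedStrings() {}

    /** Decodes up to length bytes starting at offset, stopping at the first NUL byte. */
    static String read(byte[] data, int offset, int length) {
        if (offset < 0 || length < 0 || offset > data.length) {
            throw new IllegalArgumentException("offset or length out of range");
        }
        // Never read past the end of the array, even if length claims more.
        int limit = (int) Math.min((long) data.length, (long) offset + length);
        int end = offset;
        while (end < limit && data[end] != 0) {
            end++;
        }
        // Assumption: single-byte Latin-1 text; real ID3v2 frames may use other encodings.
        return new String(data, offset, end - offset, StandardCharsets.ISO_8859_1);
    }
}
```

Calling `NullPaddedStrings.read(bytes, 0, 4)` on the bytes `a b NUL NUL` yields the two-character string `ab`.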
- getTessdataPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getTesseractPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getText() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the text, if present
- getTextDocument() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
Retrieves the built TextDocument
- getTimeout() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getTimeout() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the maximum time (in seconds) to wait for the "strings" command
to terminate.
- getTitle() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getTitle() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getTotal() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getTrackingMetadata() - Method in class org.apache.tika.parser.mbox.MboxParser
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getTrackNumber() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The number of the track within the album / recording
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getType() - Method in class org.apache.tika.parser.image.ICNSType
-
- getType() - Method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
- getType() - Method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
-
- getType() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- getType() - Method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
- getTypeFromVal(int) - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
-
- getUMLSPass() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the UMLS password.
- getUMLSUser() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the UMLS username.
- getUncompressedLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets uncompressed length
- getUnderline() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- getUnknown() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets unknown
- getUnknown0008() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getUnknown_000c() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns the unknown_000c value
- getUnknown_000c() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 000c unknown bytes
- getUnknown_0024() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 0024 unknown bytes
- getUnknown_002c() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 002c unknown bytes
- getUnknown_0044() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 0044 unknown bytes
- getUnknown_18() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns unknown 18 bytes
- getUnknownLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns unknown length
- getUnknownOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns unknown offset
- getUseSAXDocxExtractor() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getUseSAXDocxExtractor() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getUseSAXPptxExtractor() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns itsf header version
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns version of itsp header
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a version of control data block
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Returns the version
- getVersion() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
- getVersionCode() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the version code.
- getWidth() - Method in class org.apache.tika.parser.image.ICNSType
-
- getWindow() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getWindowPosition() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getWindowSize() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a window size
- getWindowSize(int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
LZX supports window sizes of 2^15 (32 KB) through 2^21 (2 MB). Returns X,
where the window size is 2^X.
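The relationship between an LZX window size and its exponent can be sketched as follows. The class name `LzxWindowSketch`, the method `exponentOf` and the strict range and power-of-two validation are assumptions for illustration; whether `ChmCommons` validates its argument this way is not stated here.

```java
// Hypothetical helper for illustration only; not part of Tika.
final class LzxWindowSketch {
    private static final int MIN_EXPONENT = 15; // 2^15 = 32 KB
    private static final int MAX_EXPONENT = 21; // 2^21 = 2 MB

    private LzxWindowSketch() {}

    /** Returns X such that windowSize == 2^X, for 15 <= X <= 21. */
    static int exponentOf(int windowSize) {
        // A power of two has exactly one bit set.
        if (windowSize <= 0 || Integer.bitCount(windowSize) != 1) {
            throw new IllegalArgumentException("not a power of two: " + windowSize);
        }
        // The position of that single bit is the exponent.
        int exponent = Integer.numberOfTrailingZeros(windowSize);
        if (exponent < MIN_EXPONENT || exponent > MAX_EXPONENT) {
            throw new IllegalArgumentException("outside the LZX range: 2^" + exponent);
        }
        return exponent;
    }
}
```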
- getWindowSize() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getWindowsPerReset() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns windows per reset
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
-
Parses the document into a sequence of XHTML SAX events sent to the
given content handler.
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
-
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- getYear() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getYear() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- GlobalIdTableEntry3FNDX - Class in org.apache.tika.parser.microsoft.onenote
-
- GlobalIdTableEntry3FNDX() - Constructor for class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- GlobalIdTableEntryFNDX - Class in org.apache.tika.parser.microsoft.onenote
-
- GlobalIdTableEntryFNDX() - Constructor for class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
-
- GRIB_MIME_TYPE - Static variable in class org.apache.tika.parser.grib.GribParser
-
- GribParser - Class in org.apache.tika.parser.grib
-
- GribParser() - Constructor for class org.apache.tika.parser.grib.GribParser
-
- GrobidNERecogniser - Class in org.apache.tika.parser.ner.grobid
-
- GrobidNERecogniser() - Constructor for class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
- GrobidRESTParser - Class in org.apache.tika.parser.journal
-
- GrobidRESTParser() - Constructor for class org.apache.tika.parser.journal.GrobidRESTParser
-
- ICNS_1024x1024_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_128x128_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_128x128_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_128x128_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_128x128_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x12_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x12_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x12_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_256x256_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_256x256_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_1BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_48x48_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_48x48_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_48x48_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_48x48_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_48x48_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_512x512_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_64x64_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_MIME_TYPE - Static variable in class org.apache.tika.parser.image.ICNSParser
-
- ICNSParser - Class in org.apache.tika.parser.image
-
A basic parser class for Apple ICNS icon files
- ICNSParser() - Constructor for class org.apache.tika.parser.image.ICNSParser
-
- ICNSType - Class in org.apache.tika.parser.image
-
Holds details on Apple ICNS icons
- Icu4jEncodingDetector - Class in org.apache.tika.parser.txt
-
- Icu4jEncodingDetector() - Constructor for class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
- id - Variable in class org.apache.tika.parser.recognition.RecognisedObject
-
Identifier for this object
- id - Variable in class org.apache.tika.parser.rtf.ListDescriptor
-
- ID3Comment(String) - Constructor for class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Creates an ID3 v1 style comment tag
- ID3Comment(String, String, String) - Constructor for class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Creates an ID3 v2 style comment tag
- ID3Tags - Interface in org.apache.tika.parser.mp3
-
Interface that defines the common interface for ID3 tag parsers,
such as ID3v1 and ID3v2.3.
- ID3Tags.ID3Comment - Class in org.apache.tika.parser.mp3
-
Represents a comment in ID3 (especially ID3 v2), which is
made up of several parts
- ID3TagsAndAudio() - Constructor for class org.apache.tika.parser.mp3.Mp3Parser.ID3TagsAndAudio
-
- ID3v1Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 1 Tag information from an MP3 file,
if available.
- ID3v1Handler(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.ID3v1Handler
-
- ID3v1Handler(byte[]) - Constructor for class org.apache.tika.parser.mp3.ID3v1Handler
-
Creates from the last 128 bytes of a stream.
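For orientation, the ID3v1 tag is a fixed 128-byte block at the end of the file: the marker `TAG`, then 30 bytes each of title, artist and album, 4 bytes of year, 30 bytes of comment and 1 genre byte. The sketch below decodes that layout with the JDK only. The class name `Id3v1Sketch`, the map keys and the Latin-1 decoding are assumptions for illustration; this is not the `ID3v1Handler` implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of Tika.
final class Id3v1Sketch {
    private Id3v1Sketch() {}

    /** Parses the text fields of an ID3v1 tag, or returns empty if the marker is missing. */
    static Optional<Map<String, String>> parse(byte[] tail) {
        if (tail == null || tail.length != 128
                || tail[0] != 'T' || tail[1] != 'A' || tail[2] != 'G') {
            return Optional.empty();
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", field(tail, 3, 30));
        fields.put("artist", field(tail, 33, 30));
        fields.put("album", field(tail, 63, 30));
        fields.put("year", field(tail, 93, 4));
        fields.put("comment", field(tail, 97, 30));
        // The genre byte at index 127 is left out of this sketch.
        return Optional.of(fields);
    }

    /** Reads a fixed-width field, stopping at the first NUL and trimming space padding. */
    private static String field(byte[] data, int offset, int length) {
        int end = offset;
        while (end < offset + length && data[end] != 0) {
            end++;
        }
        return new String(data, offset, end - offset, StandardCharsets.ISO_8859_1).trim();
    }
}
```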
- ID3v22Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 2.2 Tag information from an MP3 file,
if available.
- ID3v22Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v22Handler
-
- ID3v23Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 2.3 Tag information from an MP3 file,
if available.
- ID3v23Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v23Handler
-
- ID3v24Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 2.4 Tag information from an MP3 file,
if available.
- ID3v24Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v24Handler
-
- ID3v2Frame - Class in org.apache.tika.parser.mp3
-
A frame of ID3v2 data, which is then passed to a handler to
be turned into useful data.
- ID3v2Frame.RawTag - Class in org.apache.tika.parser.mp3
-
- ID3v2Frame.RawTagIterator - Class in org.apache.tika.parser.mp3
-
Iterates over id3v2 raw tags.
- ID3v2Frame.TextEncoding - Class in org.apache.tika.parser.mp3
-
- IdentityHtmlMapper - Class in org.apache.tika.parser.html
-
Alternative HTML mapping rules that pass the input HTML as-is without any
modifications.
- IdentityHtmlMapper() - Constructor for class org.apache.tika.parser.html.IdentityHtmlMapper
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- ImageMetadataExtractor - Class in org.apache.tika.parser.image
-
Uses the Metadata Extractor library to read EXIF and IPTC image metadata
and map to Tika fields.
- ImageMetadataExtractor(Metadata) - Constructor for class org.apache.tika.parser.image.ImageMetadataExtractor
-
- ImageMetadataExtractor(Metadata, ImageMetadataExtractor.DirectoryHandler...) - Constructor for class org.apache.tika.parser.image.ImageMetadataExtractor
-
- ImageParser - Class in org.apache.tika.parser.image
-
- ImageParser() - Constructor for class org.apache.tika.parser.image.ImageParser
-
- increaseFramesRead() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- incrementLevel(int, AbstractListManager.LevelTuple[]) - Method in class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
-
Apply this to every numbered paragraph in order.
- indexOf(byte[], byte[]) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
Searches for a pattern in a byte[]
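A byte-pattern search of this kind can be sketched with a straightforward scan. The class name `ByteSearchSketch`, the parameter names and the conventions of returning -1 when nothing matches and 0 for an empty pattern are assumptions for illustration, borrowed from `String.indexOf`; the behaviour of `ChmCommons.indexOf` may differ.

```java
// Hypothetical helper for illustration only; not part of Tika.
final class ByteSearchSketch {
    private ByteSearchSketch() {}

    /** Returns the index of the first occurrence of pattern in text, or -1 if absent. */
    static int indexOf(byte[] text, byte[] pattern) {
        // Try every start position that leaves room for the whole pattern.
        for (int start = 0; start <= text.length - pattern.length; start++) {
            int matched = 0;
            while (matched < pattern.length && text[start + matched] == pattern[matched]) {
                matched++;
            }
            if (matched == pattern.length) {
                return start;
            }
        }
        return -1;
    }
}
```

The naive scan is quadratic in the worst case, which is usually acceptable for short signatures such as four-byte header markers.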
- indexOf(List<DirectoryListingEntry>, String) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
Searches for some pattern in the directory listing entry list
- indexOfResetTableBlock(byte[], byte[]) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
Returns an index of the reset table
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- initialize(URL) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
Initializes this parser
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
-
No-op
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
No-op
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.pdf.PDFParser
-
This is a no-op.
- initialize(Map<String, Param>) - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
-
This is the hook for configuring the recogniser
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTVideoRecogniser
-
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
- inputFilterEnabled() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Test whether or not input filtering is enabled.
- INSTANCE - Static variable in class org.apache.tika.parser.html.DefaultHtmlMapper
-
- INSTANCE - Static variable in class org.apache.tika.parser.html.IdentityHtmlMapper
-
- intelE8Decoding() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- IptcAnpaParser - Class in org.apache.tika.parser.iptc
-
Parser for IPTC ANPA News Wire Feeds
- IptcAnpaParser() - Constructor for class org.apache.tika.parser.iptc.IptcAnpaParser
-
- ISArchiveParser - Class in org.apache.tika.parser.isatab
-
- ISArchiveParser() - Constructor for class org.apache.tika.parser.isatab.ISArchiveParser
-
Default constructor.
- ISArchiveParser(String) - Constructor for class org.apache.tika.parser.isatab.ISArchiveParser
-
Constructor that accepts the pathname of the ISArchive folder.
- ISATabUtils - Class in org.apache.tika.parser.isatab
-
- ISATabUtils() - Constructor for class org.apache.tika.parser.isatab.ISATabUtils
-
- isAudioHeader(int, int, int, int) - Static method in class org.apache.tika.parser.mp3.AudioFrame
-
Does this appear to be a 4 byte audio frame header?
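As a rough illustration of such a check: an MPEG audio frame header starts with 11 set sync bits, followed by version, layer, bitrate and sample-rate fields, some of whose values are reserved. The sketch below tests only those properties. The class name `MpegHeaderSketch` and the exact set of checks are assumptions; `AudioFrame.isAudioHeader` may apply different or additional rules.

```java
// Hypothetical helper for illustration only; not part of Tika.
final class MpegHeaderSketch {
    private MpegHeaderSketch() {}

    /** Each argument is one unsigned header byte in the range 0..255. */
    static boolean looksLikeFrameHeader(int b1, int b2, int b3, int b4) {
        // Frame sync: all 8 bits of byte 1 and the top 3 bits of byte 2 are set.
        if (b1 != 0xFF || (b2 & 0xE0) != 0xE0) {
            return false;
        }
        int version = (b2 >> 3) & 0x03;         // 01 is reserved
        int layer = (b2 >> 1) & 0x03;           // 00 is reserved
        int bitrateIndex = (b3 >> 4) & 0x0F;    // 1111 is invalid
        int sampleRateIndex = (b3 >> 2) & 0x03; // 11 is reserved
        // b4 (channel mode, copyright, emphasis) is not examined in this sketch.
        return version != 0x01 && layer != 0x00
                && bitrateIndex != 0x0F && sampleRateIndex != 0x03;
    }
}
```

For example, the bytes `FF FB 90 00` (MPEG-1 Layer III) pass the check, while the first bytes of an ID3v2 tag (`49 44 33 ...`) do not.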
- isAvailable() - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- isAvailable() - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- isAvailable() - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
- isAvailable() - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
- isAvailable() - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
- isAvailable() - Method in interface org.apache.tika.parser.ner.NERecogniser
-
Checks if this Named Entity recogniser is available for service
- isAvailable() - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
-
- isAvailable() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
-
- isAvailable() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- isAvailable() - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- isAvailable() - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
-
Is this service available?
- isAvailable() - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- isAvailable() - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- isBase64() - Method in class org.apache.tika.parser.utils.DataURIScheme
-
- isBold() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- isCatchIntermediateIOExceptions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- isComplete() - Method in class org.apache.tika.parser.csv.CSVParams
-
- isDiscardElement(String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
-
- isDiscardElement(String) - Method in interface org.apache.tika.parser.html.HtmlMapper
-
Checks whether all content within the given HTML element should be
discarded instead of including it in the parse output.
- isDiscardElement(String) - Method in class org.apache.tika.parser.html.HtmlParser
-
- isDiscardElement(String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
-
- isEmpty(String) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
- isEmpty() - Method in class org.apache.tika.parser.csv.CSVParams
-
- isEnableImageProcessing() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- isHeading() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
-
- isIncludeMarkup() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- isItalics() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- isListenForAllRecords() - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
-
Returns true if this parser is configured to listen
for all records instead of just the specified few.
- isMatchingElement(String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- isMatchingParentElement(String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- isMetadataField(String) - Static method in class org.apache.tika.parser.image.MetadataFields
-
- isMetadataField(Property) - Static method in class org.apache.tika.parser.image.MetadataFields
-
- isMimetype() - Method in class org.apache.tika.parser.strings.FileConfig
-
Returns true if the mime option is enabled.
- isMSB() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
-
- isPrettyPrint() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns true if formatted output is enabled, false otherwise.
- isSerialize() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns true if CAS serialization is enabled, false otherwise.
- isStrikeThrough() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- isStyle - Variable in class org.apache.tika.parser.rtf.ListDescriptor
-
- isText() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns true if content text analysis is enabled, false otherwise.
- isTracking() - Method in class org.apache.tika.parser.mbox.MboxParser
-
- isUnordered(int) - Method in class org.apache.tika.parser.rtf.ListDescriptor
-
- ITSF - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- ITSP - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- IWORK13_COMMON_ENTRY - Static variable in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
-
All iWork 13 files contain this, so we can detect based on it
- IWork13PackageParser - Class in org.apache.tika.parser.iwork.iwana
-
- IWork13PackageParser() - Constructor for class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
-
- IWork13PackageParser.IWork13DocumentType - Enum in org.apache.tika.parser.iwork.iwana
-
- IWork18PackageParser - Class in org.apache.tika.parser.iwork.iwana
-
For now, this parser isn't even registered.
- IWork18PackageParser() - Constructor for class org.apache.tika.parser.iwork.iwana.IWork18PackageParser
-
- IWork18PackageParser.IWork18DocumentType - Enum in org.apache.tika.parser.iwork.iwana
-
- IWORK_COMMON_ENTRY - Static variable in class org.apache.tika.parser.iwork.IWorkPackageParser
-
All iWork files contain one of these, so we can detect based on it
- IWORK_CONTENT_ENTRIES - Static variable in class org.apache.tika.parser.iwork.IWorkPackageParser
-
Which files within an iWork file contain the actual content?
- IWorkPackageParser - Class in org.apache.tika.parser.iwork
-
A parser for the IWork container files.
- IWorkPackageParser() - Constructor for class org.apache.tika.parser.iwork.IWorkPackageParser
-
- IWorkPackageParser.IWORKDocumentType - Enum in org.apache.tika.parser.iwork
-
- salvageCopy(InputStream, File) - Static method in class org.apache.tika.parser.utils.ZipSalvager
-
This streams the broken zip and rebuilds a new zip that
is at least a valid zip file.
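The general idea of streaming a damaged zip and rebuilding a valid one can be sketched with the JDK's `java.util.zip` classes. This is a swapped-in illustration under stated assumptions, not the `ZipSalvager` code: the class name `ZipSalvageSketch`, the method `salvage` and the policy of stopping at the first read error are all hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only; not part of Tika.
final class ZipSalvageSketch {
    private ZipSalvageSketch() {}

    /** Copies every entry that can still be read; returns the number of entries started. */
    static int salvage(InputStream broken, OutputStream repaired) throws IOException {
        int started = 0;
        byte[] buffer = new byte[8192];
        // ZipInputStream reads local entry headers in order, so it does not
        // need the central directory at the end of the file to be intact.
        ZipInputStream in = new ZipInputStream(broken);
        ZipOutputStream out = new ZipOutputStream(repaired);
        try {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                out.putNextEntry(new ZipEntry(entry.getName()));
                started++;
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                out.closeEntry();
            }
        } catch (IOException e) {
            // Assumption: stop at the first failure and keep whatever was copied,
            // including a possibly partial last entry.
        }
        // finish() closes any open entry and writes a fresh central directory.
        out.finish();
        out.flush();
        return started;
    }
}
```

Because the output is written by `ZipOutputStream`, the result has a consistent central directory even when the last salvaged entry is incomplete.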
- salvageCopy(File, File) - Static method in class org.apache.tika.parser.utils.ZipSalvager
-
- SAS7BDATParser - Class in org.apache.tika.parser.sas
-
Processes the SAS7BDAT data columnar database file used by SAS and
other similar languages.
- SAS7BDATParser() - Constructor for class org.apache.tika.parser.sas.SAS7BDATParser
-
- SDA - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Draw
- SDC - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Calc
- SDD - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Impress
- SDW - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Writer
- searchGeoNames(ArrayList<String>) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- secondaryParser - Variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- SentimentAnalysisParser - Class in org.apache.tika.parser.sentiment
-
This parser classifies documents based on the sentiment of the document.
- SentimentAnalysisParser() - Constructor for class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
- serialize(JCas, CTAKESSerializer, boolean, OutputStream) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Serializes a CAS in the given format.
- setAccessChecker(AccessChecker) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setAdmin1Code(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setAdmin2Code(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setAeDescriptorPath(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the path to the XML descriptor for AnalysisEngine.
- setAlignedLenTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setAlignedTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setAnnotationProps(CTAKESAnnotationProperty[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- setAnnotationProps(String[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- setApplyRotation(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Sets whether or not a rotation value should be calculated and passed to ImageMagick.
- setApplyRotation(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setAverageCharTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setAverageCharTolerance(float)
- setBlock_len(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets block length
- setBlockAddress(long[]) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets block addresses
- setBlockCount(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets a block count
- setBlockidx_intvl(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets block index interval
- setBlockLength(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setBlockLlen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets a block length
- setBlockNext(int) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setBlockPrev(int) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setBlockRemaining(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setBlockType(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setBold(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setByteArrayMaxOverride(int) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
WARNING: this sets a static variable in POI.
- setCatchIntermediateIOExceptions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
The PDFBox parser will throw an IOException if there is
a problem with a stream.
- setCenter(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- setCharset(Charset) - Method in class org.apache.tika.parser.csv.CSVParams
-
- setChmDirList(ChmDirectoryListingSet) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmItsfHeader(ChmItsfHeader) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmItspHeader(ChmItspHeader) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmLzxcControlData(ChmLzxcControlData) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmLzxcResetTable(ChmLzxcResetTable) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setColorspace(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setColorspace(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setCommand(String) - Method in class org.apache.tika.parser.gdal.GDALParser
-
- setCompressedLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets compressed length
- setConcatenatePhoneticRuns(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setConcatenatePhoneticRuns(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Microsoft Excel files can sometimes contain phonetic (furigana) strings.
- setConfidence(double) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setContentLength(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- setContentParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
-
- setContentParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
-
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
-
- setControlDataIndex(int) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Sets control data index
- setCountryCode(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setData(byte[]) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setDataOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets data offset
- setDateFormatOverride(String) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setDateFormatOverride(String) - Method in class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
-
- setDateOverrideFormat(String) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
A user may wish to override the date formats in xls and xlsx files.
- setDeclaredEncoding(String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the declared encoding for charset detection.
- setDelimiter(Character) - Method in class org.apache.tika.parser.csv.CSVParams
-
- setDensity(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setDensity(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setDepth(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setDepth(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setDetectableCharset(String, boolean) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
- setDetectAngles(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setDir_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets directory uuid
- setDirectoryListingEntryList(List<DirectoryListingEntry>) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Sets chm directory listing entry list
- setDirLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets directory length
- setDirOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets directory offset
- setDocumentLocator(Locator) - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- setDocumentLocator(Locator) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- setEnableAutoSpace(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setEnableAutoSpace(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), the parser should estimate
where spaces should be inserted between words.
- setEnableImageProcessing(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the value to true if processing is to be enabled.
- setEnableImageProcessing(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setEncoding(StringsEncoding) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the character encoding of the strings that are to be found.
- setEntriesToCopy(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- setEntryType(ChmCommons.EntryType) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- setExtractAcroFormContent(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), extract content from AcroForms
at the end of the document.
- setExtractActions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Whether or not to extract PDActions from the file.
- setExtractAllAlternatives(boolean) - Method in class org.apache.tika.parser.mail.RFC822Parser
-
Until version 1.17, Tika handled all body parts as embedded objects (see TIKA-2478).
- setExtractAllAlternativesFromMSG(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
Some .msg files can contain body content in html, rtf and/or text.
- setExtractAllAlternativesFromMSG(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Some .msg files can contain body content in html, rtf and/or text.
- setExtractAnnotationText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setExtractAnnotationText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), text in annotations will be
extracted.
- setExtractBookmarksText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, extract bookmarks (document outline) text.
- setExtractFontNames(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Extract font names into a metadata field
- setExtractInlineImages(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, extract inline embedded OBXImages.
- setExtractMacros(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setExtractMacros(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Sets whether or not MSOffice parsers should extract macros.
- setExtractMarkedContent(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If the PDF contains marked content, try to extract text and its marked structure.
- setExtractScripts(boolean) - Method in class org.apache.tika.parser.html.HtmlParser
-
Whether or not to extract contents in script entities.
- setExtractUniqueInlineImagesOnly(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Multiple pages within a PDF file might refer to the same underlying image.
- setFilePath(String) - Method in class org.apache.tika.parser.strings.FileConfig
-
Sets the "file" installation folder.
- setFilter(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setFilter(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setFramesRead(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setFreeSpace(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Sets pmgi free space
- setFreeSpace(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setGazetteerRestEndpoint(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
Configure REST endpoint for lucene-geo-gazetteer
- setGuid(GUID) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
-
- setHadStarted(ChmCommons.LzxState) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setHeader_len(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets itsp header length
- setHeaderLen(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets itsf header length
- setId(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setIfXFAExtractOnlyXFA(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If false (the default), extract content from the full PDF
as well as the XFA form.
- setIlvl(int) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- setImageMagickPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the path to the ImageMagick executable directory, needed if it is not on system path.
- setImageMagickPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Sets whether or not the parser should include deleted content.
- setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
-
Whether or not to include deleted content.
- setIncludeHeadersAndFooters(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to include headers and footers.
- setIncludeMarkup(boolean) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- setIncludeMissingRows(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
For table-like formats, and tables within other formats, should
missing rows in sparse tables be output where detected?
The default is to only output rows defined within the file, which
avoids lots of blank lines, but means layout isn't preserved.
- setIncludeMoveFromContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setIncludeMoveFromContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
With track changes on, when a section is moved, the content
is stored in both the "moveFrom" section and in the "moveTo" section.
- setIncludeShapeBasedContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setIncludeShapeBasedContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
In Excel and Word, there can be text stored within drawing shapes.
- setIncludeSlideMasterContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to include contents from any of the three
types of masters -- slide, notes, handout -- in a .ppt or ppt[xm] file.
- setIncludeSlideNotes(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to process slide notes content.
- setIndex(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
-
- setIndex_depth(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets an index depth
- setIndex_head(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets an index head
- setIndex_root(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets an index root
- setIndexCopyFromStart(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- setIndexCopyToStart(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- setIndexOfContent(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setIndexOfResetData(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setIndexOfResetTable(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setInitializableProblemHandler(InitializableProblemHandler) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setIntelCurrentPossition(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setIntelFileSize(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setIntelState(ChmCommons.IntelState) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setItalics(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setLabel(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setLabelLang(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setLang_id(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets language id
- setLangId(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets language_id
- setLanguage(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set tesseract language dictionary to be used.
- setLanguage(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setLastModified(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets last modified date of the chm file
- setLatitude(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setLeft(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- setLength(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- setLengthTreeLengtsTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setLengthTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setListenForAllRecords(boolean) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
-
Specifies whether this parser should listen for all
records or just for the specified few.
- setLongitude(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setLzxBlockLength(long) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setLzxBlockOffset(long) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setLzxBlocksCache(List<ChmLzxBlock>) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setMain(String, String, String) - Method in class org.apache.tika.parser.geo.topic.GeoTag
-
- setMainTreeElements(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setMainTreeLengtsTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setMainTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setMarkLimit(int) - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
How far into the stream to read for charset detection.
- setMarkLimit(int) - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
-
How far into the stream to read for charset detection.
- setMarkLimit(int) - Method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
- setMarkLimit(int) - Method in class org.apache.tika.parser.pkg.ZipContainerDetector
-
If this is less than 0, the file will be spooled to disk,
and detection will run on the full file.
- setMarkLimit(int) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
How far into the stream to read for charset detection.
- setMarkLimit(int) - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
-
How far into the stream to read for charset detection.
- setMaxBytesForEmbeddedObject(int) - Static method in class org.apache.tika.parser.rtf.RTFParser
-
- setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the maximum file size to submit to OCR.
- setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setMaxMainMemoryBytes(long) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setMaxMainMemoryBytes(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setMaxMainMemoryBytes(long) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setMaxXMPMMHistory(int) - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
-
Maximum number of events to extract from the
event history in the XMP Media Management (XMPMM) section.
- setMediaType(MediaType) - Method in class org.apache.tika.parser.csv.CSVParams
-
- setMemoryLimitInKb(int) - Method in class org.apache.tika.parser.pkg.CompressorParser
-
- setMemoryLimitInKb(int) - Method in class org.apache.tika.parser.rtf.RTFParser
-
- setMetadata(String[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the metadata whose values will be analyzed using cTAKES.
- setMetaParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
-
- setMetaParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- setMimetype(boolean) - Method in class org.apache.tika.parser.strings.FileConfig
-
Sets the mime option.
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the minimum file size to submit to OCR.
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setMinLength(int) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the minimum sequence length (characters) to print.
- setMinSize(int) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
Sets the minimum size of a character sequence to be extracted.
- setName(String) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Sets entry name
- setName(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setNameLength(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Sets an entry name length
- setNERModelPath(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- setNerModelUrl(URL) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- setNum_blocks(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets the number of blocks contained in the chm file
- setNumId(int) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- setOcrDPI(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Dots per inch used to render the page image for OCR.
- setOcrImageFormatName(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setOcrImageQuality(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image quality used to render the page image for OCR.
- setOcrImageScale(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setOcrImageType(String) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setOcrImageType(ImageType) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image type used to render the page image for OCR.
- setOcrImageType(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image type used to render the page image for OCR.
- setOcrStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setOcrStrategy(PDFParserConfig.OCR_STRATEGY) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Which strategy to use for OCR
- setOcrStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Which strategy to use for OCR
- setOffset(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- setOutputStream(OutputStream) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- setOutputType(TesseractOCRConfig.OUTPUT_TYPE) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set output type from ocr process.
- setOutputType(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setOutputType(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setPageSegMode(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set tesseract page segmentation mode.
- setPageSegMode(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setPageSeparator(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
The page separator to use in plain text output.
- setPDFParserConfig(PDFParserConfig) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setPersonAndEmail(String, Property, Property, Metadata) - Static method in class org.apache.tika.parser.mail.MailUtil
-
This tries to split a "from" or "to" value into a person field and an email field.
- setPreserveInterwordSpacing(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Whether or not to maintain interword spacing.
- setPreserveInterwordSpacing(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setPrettyPrint(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables the formatted output for serializer.
- setR0(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setR1(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setR2(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setRecogniser(String) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- setResetInterval(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a reset interval
- setResetTableIndex(int) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Sets reset table index
- setResize(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setResize(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setRight(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- setSeparatorChar(char) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the separator character used for annotation properties.
- setSerialize(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables CAS serialization.
- setSerializerType(CTAKESSerializer) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the type of cTAKES (UIMA) serializer used to write CAS.
- setSetKCMS(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Whether to call System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider").
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets itsf header signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets itsp signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a signature of control data block
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Sets pmgi signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setSize(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a size of control data
- setSortByPosition(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setSortByPosition(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, sort text tokens by their x/y position
before extracting text.
- setSpacingTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setSpacingTolerance(float)
- setStartIndex(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setStream_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets stream uuid
- setStrike(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setStringsPath(String) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the "strings" installation folder.
- setStripMarkup(boolean) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
Whether or not to attempt to strip html-ish markup
from the stream before sending it to the underlying
detector.
- setStyleID(String) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- setSuppressDuplicateOverlappingText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setSuppressDuplicateOverlappingText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, the parser should try to remove duplicated
text over the same region.
- setSwath(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- setSystem_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets system uuid
- setTableOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets a table offset
- setTessdataPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the path to the 'tessdata' folder, which contains language files and config files.
- setTessdataPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setTesseractPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the path to the Tesseract executable's directory, needed if it is not on system path.
- setTesseractPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setText(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables content text analysis using cTAKES.
- setText(byte[]) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the input text (byte) data whose charset is to be detected.
- setText(InputStream) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the input text (byte) data whose charset is to be detected.
- setTimeout(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set maximum time (seconds) to wait for the OCR process to terminate.
- setTimeout(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setTimeout(int) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the maximum time (in seconds) to wait for the "strings" command to
terminate.
- setTotal(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- setTracking(boolean) - Method in class org.apache.tika.parser.mbox.MboxParser
-
- setTrustedPageSeparator(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setUMLSPass(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the UMLS password.
- setUMLSUser(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the UMLS username.
- setUncompressedLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets uncompressed length
- setUnderline(String) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setUnknown(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets an unknown
- setUnknown0008(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setUnknown_000c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets unknown_000c
- setUnknown_000c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets 000c unknown bytes. "Unknown" here means that those who
reverse-engineered the chm format do not know what its purpose is.
- setUnknown_0024(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets 0024 unknown bytes
- setUnknown_002c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets 002c unknown bytes
- setUnknown_0044(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets 0044 unknown bytes
- setUnknown_18(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets unknown 18 bytes
- setUnknownLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets unknown length
- setUnknownOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets unknown offset
- setUseSAXDocxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setUseSAXDocxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Use the experimental SAX-based streaming DOCX parser?
If set to false, the classic parser will be used; if true,
the new experimental parser will be used.
- setUseSAXPptxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setUseSAXPptxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Use the experimental SAX-based streaming PPTX parser?
If set to false, the classic parser will be used; if true,
the new experimental parser will be used.
- setVersion(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets itsf version
- setVersion(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets a version of itsp header
- setVersion(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets version of control data block
- setVersion(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets the version
- setWindow(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setWindowPosition(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setWindowSize(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a window size
- setWindowSize(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setWindowsPerReset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets windows per reset
- sheetParts - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- SheetTextAsHTML(OfficeParserConfig, XHTMLContentHandler) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
-
- SIGNATURE_RELATIONSHIP - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
-
- size() - Method in class org.apache.tika.parser.mp4.DirectFileReadDataSource
-
- skippedEntity(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- SLDWORKS - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
SolidWorks CAD file
- SourceCodeParser - Class in org.apache.tika.parser.code
-
Generic Source code parser for Java, Groovy, C++.
- SourceCodeParser() - Constructor for class org.apache.tika.parser.code.SourceCodeParser
-
- SourceCodeParser(EncodingDetector) - Constructor for class org.apache.tika.parser.code.SourceCodeParser
-
- SpreadsheetMLParser - Class in org.apache.tika.parser.microsoft.xml
-
Parses SpreadsheetML 2003 format Excel files.
- SpreadsheetMLParser() - Constructor for class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- SQLite3Parser - Class in org.apache.tika.parser.jdbc
-
This is the main class for parsing SQLite3 files.
- SQLite3Parser() - Constructor for class org.apache.tika.parser.jdbc.SQLite3Parser
-
Checks to see if class is available for org.sqlite.JDBC.
- StandardHtmlEncodingDetector - Class in org.apache.tika.parser.html.charsetdetector
-
An encoding detector that tries to respect the spirit of the HTML spec
part 12.2.3 "The input byte stream", or at least the part that is compatible with
the implementation of Tika.
- StandardHtmlEncodingDetector() - Constructor for class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
- start(BundleContext) - Method in class org.apache.tika.parser.internal.Activator
-
- START_PMGL - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- startBookmark(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startBookmark(String, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startDocument() - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- startDocument() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- startDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- startDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- startDocument() - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
-
- startEditedSection(String, Date, OOXMLWordAndPowerPointTextHandler.EditType) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startEditedSection(String, Date, OOXMLWordAndPowerPointTextHandler.EditType) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.AttributeMetadataHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- startParagraph(ParagraphProperties) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startParagraph(ParagraphProperties) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
-
- startRow(int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
-
- startSDT() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startSDT() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startsWith(byte[], String) - Static method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
- startTable() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startTable() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startTableCell() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startTableCell() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startTableRow() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startTableRow() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- stop(BundleContext) - Method in class org.apache.tika.parser.internal.Activator
-
- StreamingZipContainerDetector - Class in org.apache.tika.parser.pkg
-
- StreamingZipContainerDetector() - Constructor for class org.apache.tika.parser.pkg.StreamingZipContainerDetector
-
- StringsConfig - Class in org.apache.tika.parser.strings
-
Configuration for the "strings" (or strings-alternative) command.
- StringsConfig() - Constructor for class org.apache.tika.parser.strings.StringsConfig
-
Default constructor.
- StringsConfig(InputStream) - Constructor for class org.apache.tika.parser.strings.StringsConfig
-
Loads properties from InputStream and then tries to close InputStream.
- StringsEncoding - Enum in org.apache.tika.parser.strings
-
Character encoding of the strings that are to be found using the "strings" command.
- StringsParser - Class in org.apache.tika.parser.strings
-
Parser that uses the "strings" (or strings-alternative) command to find the
printable strings in an object file or other binary file
(application/octet-stream).
- StringsParser() - Constructor for class org.apache.tika.parser.strings.StringsParser
-
- stringToAsciiBytes(String) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- STYLE_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- SUMMARY_PROPERTY_PREFIX - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
-
- SummaryExtractor - Class in org.apache.tika.parser.microsoft
-
Extractor for Common OLE2 (HPSF) metadata
- SummaryExtractor(Metadata) - Constructor for class org.apache.tika.parser.microsoft.SummaryExtractor
-
- SUPPORTED_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
-
- SUPPORTED_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
-
- SVG_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- SXSLFPowerPointExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
-
SAX/Streaming pptx extractor
- SXSLFPowerPointExtractorDecorator(Metadata, ParseContext, XSLFEventBasedPowerPointExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
-
- SXWPFWordExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
-
This is an experimental, alternative extractor for docx files.
- SXWPFWordExtractorDecorator(Metadata, ParseContext, XWPFEventBasedWordExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
-
- SYS_PROP_NER_IMPL - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.EntryType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.IntelState
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.LzxState
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.FormattingUtils.Tag
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.Error
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.EditType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.pdf.PDFParserConfig.OCR_STRATEGY
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.strings.StringsEncoding
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.utils.CommonsDigester.DigestAlgorithm
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.EntryType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.IntelState
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.LzxState
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.FormattingUtils.Tag
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.onenote.Error
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.EditType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.pdf.PDFParserConfig.OCR_STRATEGY
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.strings.StringsEncoding
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.utils.CommonsDigester.DigestAlgorithm
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- VERBATIM - Static variable in class org.apache.tika.parser.chm.core.ChmCommons
-
- VSD - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Microsoft Visio