- setAccessChecker(AccessChecker) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setAverageCharTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setAverageCharTolerance(float)
- setCatchIntermediateIOExceptions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
The PDFBox parser will throw an IOException if there is
a problem with a stream.
- setDetectAngles(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setDropThreshold(float) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setDropThreshold(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setDropThreshold(float)
- setEnableAutoSpace(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
If true (the default), the parser should estimate
where spaces should be inserted between words.
- setEnableAutoSpace(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), the parser should estimate
where spaces should be inserted between words.
- setExtractAcroFormContent(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), extract content from AcroForms
at the end of the document.
- setExtractActions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Whether or not to extract PDActions from the file.
- setExtractAnnotationText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
If true (the default), text in annotations will be
extracted.
- setExtractAnnotationText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), text in annotations will be
extracted.
- setExtractBookmarksText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, extract bookmarks (document outline) text.
- setExtractFontNames(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Extract font names into a metadata field.
- setExtractInlineImages(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, extract the literal inline embedded image XObjects (PDImageXObjects).
- setExtractMarkedContent(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If the PDF contains marked content, try to extract text and its marked structure.
- setExtractUniqueInlineImagesOnly(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Multiple pages within a PDF file might refer to the same underlying image.
- setIfXFAExtractOnlyXFA(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If false (the default), extract content from the full PDF
as well as the XFA form.
- setMaxMainMemoryBytes(long) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setMaxMainMemoryBytes(long) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setOcrDPI(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Dots per inch used to render the page image for OCR.
- setOcrImageFormatName(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setOcrImageQuality(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image quality used to render the page image for OCR.
- setOcrImageType(String) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setOcrImageType(ImageType) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image type used to render the page image for OCR.
- setOcrImageType(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image type used to render the page image for OCR.
- setOcrRenderingStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setOcrRenderingStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setOcrRenderingStrategy(PDFParserConfig.OCR_RENDERING_STRATEGY) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
When rendering the page for OCR, whether to include the rendering of the electronic text (ALL)
or to run OCR only on the images and vector graphics (NO_TEXT).
- setOcrStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setOcrStrategy(PDFParserConfig.OCR_STRATEGY) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Which strategy to use for OCR.
- setOcrStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Which strategy to use for OCR.
- setOcrStrategyAuto(String) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setOcrStrategyAuto(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setPDFParserConfig(PDFParserConfig) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setSetKCMS(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Whether to call System.setProperty("sun.java2d.cmm",
"sun.java2d.cmm.kcms.KcmsServiceProvider").
- setSortByPosition(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
If true, sort text tokens by their x/y position
before extracting text.
- setSortByPosition(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, sort text tokens by their x/y position
before extracting text.
- setSpacingTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setSpacingTolerance(float)
- setSuppressDuplicateOverlappingText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
If true, the parser should try to remove duplicated
text over the same region.
- setSuppressDuplicateOverlappingText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, the parser should try to remove duplicated
text over the same region.
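
The setSortByPosition(boolean) entries above describe sorting text tokens by their x/y position before extraction. As a rough illustration of that idea only, the sketch below orders tokens top-to-bottom and then left-to-right. It is not Tika's or PDFBox's implementation: the PositionSortSketch class, its Token type, and the assumption that y grows downward from the top of the page are all hypothetical and exist just for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only: NOT Tika's or PDFBox's implementation.
public class PositionSortSketch {

    // Hypothetical token type: a piece of text plus the x/y of its origin,
    // with y assumed to be measured downward from the top of the page.
    static final class Token {
        final String text;
        final float x;
        final float y;

        Token(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    // Returns a new list ordered top-to-bottom, then left-to-right.
    // The input list is left unchanged.
    static List<Token> sortByPosition(List<Token> tokens) {
        List<Token> sorted = new ArrayList<>(tokens);
        sorted.sort(Comparator.<Token>comparingDouble(t -> t.y)
                .thenComparingDouble(t -> t.x));
        return sorted;
    }

    public static void main(String[] args) {
        // Tokens in an arbitrary "stream" order, as they might appear in a file.
        List<Token> streamOrder = Arrays.asList(
                new Token("world", 60f, 10f),
                new Token("line", 60f, 30f),
                new Token("Hello", 10f, 10f),
                new Token("Second", 10f, 30f));

        StringBuilder sb = new StringBuilder();
        for (Token t : sortByPosition(streamOrder)) {
            sb.append(t.text).append(' ');
        }
        System.out.println(sb.toString().trim()); // prints "Hello world Second line"
    }
}
```

A real text stripper has to cope with rotated text, columns, and tokens whose baselines differ slightly, so an exact comparison on y as used here would be too strict in practice; the sketch only shows the ordering principle.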