Package org.apache.poi.ooxml.extractor
Class POIXMLExtractorFactory
java.lang.Object
org.apache.poi.ooxml.extractor.POIXMLExtractorFactory
- All Implemented Interfaces:
ExtractorProvider
Figures out the correct POITextExtractor for your supplied
document, and returns it.
Note 1 - will fail for many file formats if the POI Scratchpad jar is not present on the runtime classpath
Note 2 - rather than using this, for most cases you would be better off switching to Apache Tika instead!
-
Constructor Summary
Constructors
POIXMLExtractorFactory()
-
Method Summary
Modifier and Type | Method | Description
boolean | accepts |
POITextExtractor | create(File f, String password) | Create Extractor via file
POITextExtractor | create(InputStream inp, String password) | Create Extractor via InputStream
POIXMLTextExtractor | create(OPCPackage pkg) | Tries to determine the actual type of file and produces a matching text-extractor for it.
POITextExtractor | create(DirectoryNode poifsDir, String password) | Create Extractor from POIFS node
static Boolean | getAllThreadsPreferEventExtractors() | Should all threads prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is to use the thread level setting, which defaults to false.
static boolean | getPreferEventExtractor() | Should this thread use event based extractors if available? Checks the all-threads one first, then thread specific.
static boolean | getThreadPrefersEventExtractors() | Should this thread prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is false.
static void | setAllThreadsPreferEventExtractors(Boolean preferEventExtractors) | Should all threads prefer event based over usermodel based extractors? If set, will take preference over the Thread level setting.
static void | setThreadPrefersEventExtractors(boolean preferEventExtractors) | Should this thread prefer event based over usermodel based extractors? Will only be used if the All Threads setting is null.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.poi.extractor.ExtractorProvider
identifyEmbeddedResources
-
Constructor Details
-
POIXMLExtractorFactory
public POIXMLExtractorFactory()
-
-
Method Details
-
accepts
- Specified by:
accepts in interface ExtractorProvider
-
getThreadPrefersEventExtractors
public static boolean getThreadPrefersEventExtractors()
Should this thread prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is false.
-
getAllThreadsPreferEventExtractors
public static Boolean getAllThreadsPreferEventExtractors()
Should all threads prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is null, which defers to the thread level setting (itself false by default).
-
setThreadPrefersEventExtractors
public static void setThreadPrefersEventExtractors(boolean preferEventExtractors)
Should this thread prefer event based over usermodel based extractors? Will only be used if the All Threads setting is null.
-
setAllThreadsPreferEventExtractors
public static void setAllThreadsPreferEventExtractors(Boolean preferEventExtractors)
Should all threads prefer event based over usermodel based extractors? If set (non-null), takes precedence over the thread level setting.
-
getPreferEventExtractor
public static boolean getPreferEventExtractor()
Should this thread use event based extractors if available? Checks the all-threads setting first, then the thread-specific one.
-
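The precedence between the all-threads and per-thread settings described above can be sketched with the JDK alone. This is an illustrative model, not the POI source: the class name EventPreferenceSketch and its shortened method names are hypothetical, and the use of a ThreadLocal plus a volatile Boolean is an assumption about one reasonable way to get the documented behaviour.

```java
// Hypothetical stand-in for the documented semantics; NOT the POI implementation.
final class EventPreferenceSketch {
    // Per-thread preference; each thread starts at false.
    private static final ThreadLocal<Boolean> threadPrefers =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    // All-threads preference; null means "defer to the per-thread value".
    private static volatile Boolean allThreadsPrefer = null;

    static boolean getThreadPrefers() { return threadPrefers.get(); }

    static Boolean getAllThreadsPrefer() { return allThreadsPrefer; }

    static void setThreadPrefers(boolean prefer) { threadPrefers.set(prefer); }

    static void setAllThreadsPrefer(Boolean prefer) { allThreadsPrefer = prefer; }

    // Checks the all-threads setting first, then the thread-specific one.
    static boolean getPreferEventExtractor() {
        Boolean all = allThreadsPrefer;
        return (all != null) ? all : threadPrefers.get();
    }

    public static void main(String[] args) {
        setThreadPrefers(true);                        // this thread opts in
        System.out.println(getPreferEventExtractor()); // true
        setAllThreadsPrefer(Boolean.FALSE);            // non-null global value wins
        System.out.println(getPreferEventExtractor()); // false
        setAllThreadsPrefer(null);                     // back to the thread setting
        System.out.println(getPreferEventExtractor()); // true
    }
}
```

In this model, passing null to the all-threads setter is what hands control back to the per-thread flag, which matches the note that the thread setting "will only be used if the All Threads setting is null".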
create
Description copied from interface: ExtractorProvider
Create Extractor via file
- Specified by:
create in interface ExtractorProvider
- Parameters:
f - the file
password - the password or null if not encrypted
- Returns:
- the extractor
- Throws:
IOException - if file can't be read or parsed
-
create
Description copied from interface: ExtractorProvider
Create Extractor via InputStream
- Specified by:
create in interface ExtractorProvider
- Parameters:
inp - the stream
password - the password or null if not encrypted
- Returns:
- the extractor
- Throws:
IOException - if stream can't be read or parsed
-
create
Tries to determine the actual type of file and produces a matching text-extractor for it.
- Parameters:
pkg - An OPCPackage.
- Returns:
- A POIXMLTextExtractor for the given file.
- Throws:
IOException - If an error occurs while reading the file
IllegalArgumentException - If no matching file type could be found.
-
create
- Throws:
IOException
-
create
Description copied from interface: ExtractorProvider
Create Extractor from POIFS node
- Specified by:
create in interface ExtractorProvider
- Parameters:
poifsDir - the node
password - the password or null if not encrypted
- Returns:
- the extractor
- Throws:
IOException - if node can't be parsed
-