Package org.apache.poi.ooxml.extractor
Class POIXMLExtractorFactory
java.lang.Object
  org.apache.poi.ooxml.extractor.POIXMLExtractorFactory

All Implemented Interfaces:
ExtractorProvider
public final class POIXMLExtractorFactory extends java.lang.Object implements ExtractorProvider
Figures out the correct POITextExtractor for your supplied document, and returns it.
Note 1 - will fail for many file formats if the POI Scratchpad jar is not present on the runtime classpath.
Note 2 - for most cases you would be better off switching to Apache Tika rather than using this class.
Constructor Summary
Constructor                 Description
POIXMLExtractorFactory()
-
Method Summary
Modifier and Type, Method, Description

boolean accepts(FileMagic fm)

POITextExtractor create(java.io.File f, java.lang.String password)
    Create Extractor via file

POITextExtractor create(java.io.InputStream inp, java.lang.String password)
    Create Extractor via InputStream

POIXMLTextExtractor create(OPCPackage pkg)
    Tries to determine the actual type of file and produces a matching text-extractor for it.

POITextExtractor create(DirectoryNode poifsDir, java.lang.String password)
    Create Extractor from POIFS node

POITextExtractor create(POIFSFileSystem fs)

static java.lang.Boolean getAllThreadsPreferEventExtractors()
    Should all threads prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is to use the thread level setting, which defaults to false.

static boolean getPreferEventExtractor()
    Should this thread use event based extractors if available? Checks the all-threads one first, then thread specific.

static boolean getThreadPrefersEventExtractors()
    Should this thread prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is false.

static void setAllThreadsPreferEventExtractors(java.lang.Boolean preferEventExtractors)
    Should all threads prefer event based over usermodel based extractors? If set, will take preference over the Thread level setting.

static void setThreadPrefersEventExtractors(boolean preferEventExtractors)
    Should this thread prefer event based over usermodel based extractors? Will only be used if the All Threads setting is null.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.poi.extractor.ExtractorProvider
identifyEmbeddedResources
-
Method Detail
-
accepts
public boolean accepts(FileMagic fm)
Specified by:
accepts in interface ExtractorProvider
-
getThreadPrefersEventExtractors
public static boolean getThreadPrefersEventExtractors()
Should this thread prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is false.
-
getAllThreadsPreferEventExtractors
public static java.lang.Boolean getAllThreadsPreferEventExtractors()
Should all threads prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is to use the thread level setting, which defaults to false.
-
setThreadPrefersEventExtractors
public static void setThreadPrefersEventExtractors(boolean preferEventExtractors)
Should this thread prefer event based over usermodel based extractors? Will only be used if the All Threads setting is null.
-
setAllThreadsPreferEventExtractors
public static void setAllThreadsPreferEventExtractors(java.lang.Boolean preferEventExtractors)
Should all threads prefer event based over usermodel based extractors? If set, will take preference over the Thread level setting.
-
getPreferEventExtractor
public static boolean getPreferEventExtractor()
Should this thread use event based extractors if available? Checks the all-threads setting first, then the thread-specific one.
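The lookup order described for the preference methods (all-threads setting first, thread-level setting only when the all-threads value is null) can be illustrated with a small stand-alone model. This is a sketch only, not POI's source code: the class name PreferenceModel, its fields, and its shortened method names are hypothetical, and the use of a ThreadLocal for the per-thread flag is an assumption made for illustration.

```java
// Hypothetical stand-alone model of the documented precedence rules.
// It does not use or reproduce Apache POI code.
class PreferenceModel {
    // Per-thread flag; false by default, as documented for the thread-level setting.
    // (Storing it in a ThreadLocal is an assumption of this sketch.)
    private static final ThreadLocal<Boolean> threadPreference =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    // All-threads override; null means "not set, fall back to the thread-level flag".
    private static volatile Boolean allThreadsPreference = null;

    public static void setThreadPrefers(boolean value) {
        threadPreference.set(value);
    }

    public static void setAllThreadsPrefer(Boolean value) {
        allThreadsPreference = value;
    }

    // Checks the all-threads value first, then the thread-specific one.
    public static boolean getPrefer() {
        Boolean all = allThreadsPreference;
        return (all != null) ? all : threadPreference.get();
    }

    public static void main(String[] args) {
        System.out.println(getPrefer());     // false: nothing set yet
        setThreadPrefers(true);
        System.out.println(getPrefer());     // true: thread-level flag is used
        setAllThreadsPrefer(Boolean.FALSE);
        System.out.println(getPrefer());     // false: all-threads value wins
        setAllThreadsPrefer(null);
        System.out.println(getPrefer());     // true: back to the thread-level flag
    }
}
```

In this model, passing null to the all-threads setter restores per-thread control, which mirrors why the documented all-threads getter and setter use java.lang.Boolean rather than boolean.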
-
create
public POITextExtractor create(java.io.File f, java.lang.String password) throws java.io.IOException
Description copied from interface: ExtractorProvider
Create Extractor via file
Specified by:
create in interface ExtractorProvider
Parameters:
f - the file
password - the password or null if not encrypted
Returns:
the extractor
Throws:
java.io.IOException - if file can't be read or parsed
-
create
public POITextExtractor create(java.io.InputStream inp, java.lang.String password) throws java.io.IOException
Description copied from interface: ExtractorProvider
Create Extractor via InputStream
Specified by:
create in interface ExtractorProvider
Parameters:
inp - the stream
password - the password or null if not encrypted
Returns:
the extractor
Throws:
java.io.IOException - if stream can't be read or parsed
-
create
public POIXMLTextExtractor create(OPCPackage pkg) throws java.io.IOException
Tries to determine the actual type of file and produces a matching text-extractor for it.
Parameters:
pkg - An OPCPackage.
Returns:
A POIXMLTextExtractor for the given file.
Throws:
java.io.IOException - If an error occurs while reading the file
java.lang.IllegalArgumentException - If no matching file type could be found.
-
create
public POITextExtractor create(POIFSFileSystem fs) throws java.io.IOException
Throws:
java.io.IOException
-
create
public POITextExtractor create(DirectoryNode poifsDir, java.lang.String password) throws java.io.IOException
Description copied from interface: ExtractorProvider
Create Extractor from POIFS node
Specified by:
create in interface ExtractorProvider
Parameters:
poifsDir - the node
password - the password or null if not encrypted
Returns:
the extractor
Throws:
java.io.IOException - if node can't be parsed