Class XWPFEventBasedWordExtractor
java.lang.Object
org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
- All Implemented Interfaces:
Closeable, AutoCloseable, POITextExtractor, POIXMLTextExtractor
Experimental class that is based on POI's XSSFEventBasedExcelExtractor
-
Constructor Summary
Constructors
- XWPFEventBasedWordExtractor(OPCPackage container)
Method Summary
Modifier and Type, Method, Description
- getCoreProperties(): Returns the core document properties
- getCustomProperties(): Returns the custom document properties
- getDocument(): Returns opened document
- getExtendedProperties(): Returns the extended document properties
- getFilesystem()
- getPackage(): Returns the opened OPCPackage that contains the document
- getText(): Retrieves all the text from the document.
- boolean isCloseFilesystem()
- void setCloseFilesystem(boolean b)
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.poi.ooxml.extractor.POIXMLTextExtractor
checkMaxTextSize, close, getMetadataTextExtractor
-
Constructor Details
-
XWPFEventBasedWordExtractor
public XWPFEventBasedWordExtractor(OPCPackage container) throws XmlException, OpenXML4JException, IOException
-
-
Method Details
-
getPackage
Description copied from interface: POIXMLTextExtractor
Returns the opened OPCPackage that contains the document
- Specified by: getPackage in interface POIXMLTextExtractor
- Returns: the opened OPCPackage
-
getCoreProperties
Description copied from interface: POIXMLTextExtractor
Returns the core document properties
- Specified by: getCoreProperties in interface POIXMLTextExtractor
- Returns: the core document properties
-
getExtendedProperties
Description copied from interface: POIXMLTextExtractor
Returns the extended document properties
- Specified by: getExtendedProperties in interface POIXMLTextExtractor
- Returns: the extended document properties
-
getCustomProperties
Description copied from interface: POIXMLTextExtractor
Returns the custom document properties
- Specified by: getCustomProperties in interface POIXMLTextExtractor
- Returns: the custom document properties
-
getDocument
Description copied from interface: POIXMLTextExtractor
Returns opened document
- Specified by: getDocument in interface POITextExtractor
- Specified by: getDocument in interface POIXMLTextExtractor
- Returns: the opened document
-
getText
Description copied from interface: POITextExtractor
Retrieves all the text from the document. How cells, paragraphs, etc. are separated in the text is implementation specific - see the javadocs for a specific project for details.
- Specified by: getText in interface POITextExtractor
- Returns: All the text from the document
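To illustrate what "event based" extraction means here, the following is a minimal sketch that uses only the JDK's ZIP and SAX classes. It is not Tika's implementation: the names EventBasedDocxSketch, TextHandler and extractText are hypothetical, and ending each paragraph with a newline is an assumption, since separation is implementation specific. The sketch reads only w:t and w:p elements from word/document.xml and ignores tabs, breaks, headers, footnotes and other parts.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Tika or POI.
final class EventBasedDocxSketch {
    // WordprocessingML main namespace used by .docx body elements.
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Collects text as SAX events arrive, without building a document tree.
    static final class TextHandler extends DefaultHandler {
        private final StringBuilder out = new StringBuilder();
        private boolean inText;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (W_NS.equals(uri) && "t".equals(localName)) {
                inText = true; // entering a w:t text run
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inText) {
                out.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (!W_NS.equals(uri)) {
                return;
            }
            if ("t".equals(localName)) {
                inText = false;
            } else if ("p".equals(localName)) {
                out.append('\n'); // assumption: one line per paragraph
            }
        }

        String text() {
            return out.toString();
        }
    }

    // Streams the main document part of a .docx through the SAX handler.
    static String extractText(InputStream docx)
            throws IOException, SAXException, ParserConfigurationException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        // Reject DOCTYPE declarations to avoid external entity processing.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            for (ZipEntry e; (e = zip.getNextEntry()) != null; ) {
                if ("word/document.xml".equals(e.getName())) {
                    TextHandler handler = new TextHandler();
                    factory.newSAXParser().parse(new InputSource(zip), handler);
                    return handler.text();
                }
            }
        }
        throw new IOException("word/document.xml not found");
    }
}
```

Because the handler only appends characters as they stream past, memory use grows with the extracted text rather than with a full in-memory document model, which is the usual motivation for an event-based extractor.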
-
setCloseFilesystem
public void setCloseFilesystem(boolean b)
- Specified by: setCloseFilesystem in interface POITextExtractor
- Parameters: b - true (default), if underlying resources/filesystem should be closed on POITextExtractor.close()
-
isCloseFilesystem
public boolean isCloseFilesystem()
- Specified by: isCloseFilesystem in interface POITextExtractor
- Returns: true, if resources/filesystem should be closed on POITextExtractor.close()
-
getFilesystem
- Specified by: getFilesystem in interface POITextExtractor
- Returns: The underlying resources/filesystem
-