Package org.apache.poi.xwpf.extractor
Class XWPFWordExtractor
java.lang.Object
org.apache.poi.xwpf.extractor.XWPFWordExtractor
- All Implemented Interfaces:
Closeable, AutoCloseable, POITextExtractor, POIXMLTextExtractor
Helper class to extract text from an OOXML Word file
-
Field Summary
Fields -
Constructor Summary
Constructors
XWPFWordExtractor(OPCPackage container)
XWPFWordExtractor(XWPFDocument document)
-
Method Summary
Modifier and Type, Method, Description
void appendBodyElementText
void appendParagraphText(StringBuilder text, XWPFParagraph paragraph)
getDocument - Returns opened document
getFilesystem
getText() - Retrieves all the text from the document.
boolean isCloseFilesystem()
void setCloseFilesystem(boolean doCloseFilesystem)
void setConcatenatePhoneticRuns(boolean concatenatePhoneticRuns) - Should we concatenate phonetic runs in extraction.
void setFetchHyperlinks(boolean fetch) - Should we also fetch the hyperlinks, when fetching the text content? Default is to only output the hyperlink label, and not the contents
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.poi.ooxml.extractor.POIXMLTextExtractor
checkMaxTextSize, close, getCoreProperties, getCustomProperties, getExtendedProperties, getMetadataTextExtractor, getPackage
-
Field Details
-
SUPPORTED_TYPES
-
-
Constructor Details
-
XWPFWordExtractor
- Throws:
IOException
-
XWPFWordExtractor
-
-
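As a rough illustration of the two constructors listed above, the sketch below builds a small document in memory, serializes it, and then reads it back once through an XWPFDocument and once through an OPCPackage. The class name ExtractorConstructionSketch and its helper methods are hypothetical, and the code assumes a POI release in which the OPCPackage constructor declares only IOException, as on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class ExtractorConstructionSketch {

    /** Serializes a one-paragraph document to .docx bytes (a stand-in for a real file). */
    static byte[] sampleDocx(String paragraphText) {
        try (XWPFDocument document = new XWPFDocument();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            document.createParagraph().createRun().setText(paragraphText);
            document.write(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Uses the XWPFWordExtractor(XWPFDocument) constructor. */
    static String extractViaDocument(byte[] docx) {
        // Closing the extractor releases the underlying package by default.
        try (XWPFWordExtractor extractor =
                 new XWPFWordExtractor(new XWPFDocument(new ByteArrayInputStream(docx)))) {
            return extractor.getText();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Uses the XWPFWordExtractor(OPCPackage) constructor. */
    static String extractViaPackage(byte[] docx) {
        try (XWPFWordExtractor extractor =
                 new XWPFWordExtractor(OPCPackage.open(new ByteArrayInputStream(docx)))) {
            return extractor.getText();
        } catch (IOException | InvalidFormatException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] docx = sampleDocx("Hello from XWPF");
        System.out.println(extractViaDocument(docx));
        System.out.println(extractViaPackage(docx));
    }
}
```

Both paths should yield the same text; which one fits better probably depends on whether the caller already holds an XWPFDocument or only the raw package.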
Method Details
-
setFetchHyperlinks
public void setFetchHyperlinks(boolean fetch)
Should we also fetch the hyperlinks, when fetching the text content? Default is to only output the hyperlink label, and not the contents.
-
setConcatenatePhoneticRuns
public void setConcatenatePhoneticRuns(boolean concatenatePhoneticRuns)
Should we concatenate phonetic runs in extraction. Default is true.
- Parameters:
concatenatePhoneticRuns - If phonetic runs should be concatenated
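The following sketch shows one way the extraction switches setFetchHyperlinks and setConcatenatePhoneticRuns might be applied before calling getText(). The class ExtractorOptionsSketch and its helpers are hypothetical. The sample document contains neither hyperlinks nor phonetic runs, so both settings are expected to produce the same text here; a difference would only show up on documents that contain such content.

```java
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class ExtractorOptionsSketch {

    /** Builds a one-paragraph in-memory document (illustrative input only). */
    static XWPFDocument sampleDocument(String paragraphText) {
        XWPFDocument document = new XWPFDocument();
        document.createParagraph().createRun().setText(paragraphText);
        return document;
    }

    /** Extracts text with the given option values applied. */
    static String extract(XWPFDocument document,
                          boolean fetchHyperlinks,
                          boolean concatenatePhoneticRuns) {
        // The extractor is deliberately not closed: by default, closing it
        // would also close the caller's document.
        XWPFWordExtractor extractor = new XWPFWordExtractor(document);
        extractor.setFetchHyperlinks(fetchHyperlinks);
        extractor.setConcatenatePhoneticRuns(concatenatePhoneticRuns);
        return extractor.getText();
    }

    public static void main(String[] args) {
        XWPFDocument document = sampleDocument("plain text");
        System.out.println(extract(document, true, true));
        System.out.println(extract(document, false, false));
    }
}
```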
-
getText
Description copied from interface: POITextExtractor
Retrieves all the text from the document. How cells, paragraphs etc. are separated in the text is implementation specific - see the javadocs for a specific project for details.
- Specified by:
getText in interface POITextExtractor
- Returns:
All the text from the document
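Because the separator between paragraphs is implementation specific, it can help to inspect the raw string rather than assume a format. The sketch below (the class GetTextSketch and its methods are hypothetical) extracts a multi-paragraph document and makes line breaks and tabs visible for inspection.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class GetTextSketch {

    /** Creates one paragraph per argument and returns the extractor's full text. */
    static String extractParagraphs(String... paragraphs) {
        XWPFDocument document = new XWPFDocument();
        for (String paragraph : paragraphs) {
            document.createParagraph().createRun().setText(paragraph);
        }
        try (XWPFWordExtractor extractor = new XWPFWordExtractor(document)) {
            return extractor.getText();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Replaces control characters with visible escapes so separators can be inspected. */
    static String showSeparators(String text) {
        return text.replace("\r", "\\r").replace("\n", "\\n").replace("\t", "\\t");
    }

    public static void main(String[] args) {
        System.out.println(showSeparators(extractParagraphs("First", "Second")));
    }
}
```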
-
appendBodyElementText
-
appendParagraphText
-
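This page gives no description for appendParagraphText, so the following is only a sketch based on the signature shown in the method summary: it assumes the method appends the text of a single paragraph to the supplied StringBuilder. The class PerParagraphSketch and its helpers are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

public class PerParagraphSketch {

    /** Builds an in-memory document with one paragraph per argument. */
    static XWPFDocument sampleDocument(String... paragraphs) {
        XWPFDocument document = new XWPFDocument();
        for (String paragraph : paragraphs) {
            document.createParagraph().createRun().setText(paragraph);
        }
        return document;
    }

    /** Collects the extracted text of each body paragraph separately. */
    static List<String> textPerParagraph(XWPFDocument document) {
        // Not closed here, so the caller's document stays usable afterwards.
        XWPFWordExtractor extractor = new XWPFWordExtractor(document);
        List<String> result = new ArrayList<>();
        for (XWPFParagraph paragraph : document.getParagraphs()) {
            StringBuilder buffer = new StringBuilder();
            extractor.appendParagraphText(buffer, paragraph);
            result.add(buffer.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        for (String text : textPerParagraph(sampleDocument("First", "Second"))) {
            System.out.println("[" + text + "]");
        }
    }
}
```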
getDocument
Description copied from interface: POIXMLTextExtractor
Returns opened document
- Specified by:
getDocument in interface POITextExtractor
- Specified by:
getDocument in interface POIXMLTextExtractor
- Returns:
the opened document
-
setCloseFilesystem
public void setCloseFilesystem(boolean doCloseFilesystem)
- Specified by:
setCloseFilesystem in interface POITextExtractor
- Parameters:
doCloseFilesystem - true (default), if underlying resources/filesystem should be closed on POITextExtractor.close()
-
isCloseFilesystem
public boolean isCloseFilesystem()
- Specified by:
isCloseFilesystem in interface POITextExtractor
- Returns:
true, if resources/filesystem should be closed on POITextExtractor.close()
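A possible use of the close flag is to keep working with a document after the extractor has been closed. The sketch below assumes a POI release that provides setCloseFilesystem and isCloseFilesystem as documented here. The class CloseFilesystemSketch and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class CloseFilesystemSketch {

    /** Reports the flag value of a freshly created extractor. */
    static boolean defaultCloseFlag() {
        // The extractor is left unclosed; the in-memory document needs no cleanup here.
        return new XWPFWordExtractor(new XWPFDocument()).isCloseFilesystem();
    }

    /** Extracts text, closes the extractor, then keeps editing the same document. */
    static int extractThenKeepEditing(String paragraphText) {
        try (XWPFDocument document = new XWPFDocument()) {
            document.createParagraph().createRun().setText(paragraphText);
            try (XWPFWordExtractor extractor = new XWPFWordExtractor(document)) {
                // Leave the document open when the extractor is closed.
                extractor.setCloseFilesystem(false);
                extractor.getText();
            }
            // The document is still owned by this method and can be modified.
            document.createParagraph().createRun().setText("added after extraction");
            return document.getParagraphs().size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("default flag: " + defaultCloseFlag());
        System.out.println("paragraphs: " + extractThenKeepEditing("original"));
    }
}
```

With the flag left at its default, the caller would normally let the extractor's close() release the underlying resources instead of closing the document separately.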
-
getFilesystem
- Specified by:
getFilesystem in interface POITextExtractor
- Returns:
The underlying resources/filesystem
-