Package org.apache.poi.xssf.extractor
Class XSSFEventBasedExcelExtractor
java.lang.Object
org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor
- All Implemented Interfaces:
Closeable, AutoCloseable, POITextExtractor, POIXMLTextExtractor, ExcelExtractor
- Direct Known Subclasses:
XSSFBEventBasedExcelExtractor
public class XSSFEventBasedExcelExtractor
extends Object
implements POIXMLTextExtractor, ExcelExtractor
Text extractor for OOXML Excel files
that uses SAX event-based parsing.
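A minimal usage sketch, assuming Apache POI (poi-ooxml) is on the classpath. The class name BasicExtractionSketch and the helper writeSampleWorkbook() are hypothetical and exist only to make the example self-contained; the extractor calls themselves are the path constructor and getText() documented on this page.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.poi.openxml4j.exceptions.OpenXML4JException;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.xmlbeans.XmlException;

public class BasicExtractionSketch {

    /** Hypothetical helper: writes a one-sheet workbook to a temp file for the demo. */
    static File writeSampleWorkbook() throws IOException {
        File file = File.createTempFile("basic-extraction", ".xlsx");
        file.deleteOnExit();
        try (XSSFWorkbook workbook = new XSSFWorkbook();
             OutputStream out = new FileOutputStream(file)) {
            Row row = workbook.createSheet("Greetings").createRow(0);
            row.createCell(0).setCellValue("hello");
            row.createCell(1).setCellValue("world");
            workbook.write(out);
        }
        return file;
    }

    /** Opens the file by path, extracts all text, and closes the extractor again. */
    static String extractText(String path)
            throws IOException, OpenXML4JException, XmlException {
        try (XSSFEventBasedExcelExtractor extractor = new XSSFEventBasedExcelExtractor(path)) {
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws Exception {
        File sample = writeSampleWorkbook();
        System.out.print(extractText(sample.getPath()));
    }
}
```

Because the extractor is Closeable, try-with-resources releases the underlying package once the text has been read.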
-
Constructor Summary
Constructors
XSSFEventBasedExcelExtractor(String path)
XSSFEventBasedExcelExtractor(OPCPackage container)
-
Method Summary
Modifier and Type, Method, Description
getCoreProperties() - Returns the core document properties
getCustomProperties() - Returns the custom document properties
getDocument() - Returns opened document
getExtendedProperties() - Returns the extended document properties
getFilesystem()
boolean getFormulasNotResults()
boolean getIncludeCellComments()
boolean getIncludeHeadersFooters()
boolean getIncludeSheetNames()
boolean getIncludeTextBoxes()
getLocale()
getPackage() - Returns the opened OPCPackage container.
getText() - Processes the file and returns the text
boolean isCloseFilesystem()
void processSheet(XSSFSheetXMLHandler.SheetContentsHandler sheetContentsExtractor, Styles styles, Comments comments, SharedStrings strings, InputStream sheetInputStream) - Processes the given sheet
void setCloseFilesystem(boolean doCloseFilesystem)
void setConcatenatePhoneticRuns(boolean concatenatePhoneticRuns) - Concatenate text from <rPh> text elements in SharedStringsTable. Default is true.
void setFormulasNotResults(boolean formulasNotResults) - Should we return the formula itself, and not the result it produces? Default is false.
void setIncludeCellComments(boolean includeCellComments) - Should cell comments be included? Default is false.
void setIncludeHeadersFooters(boolean includeHeadersFooters) - Should headers and footers be included? Default is true.
void setIncludeSheetNames(boolean includeSheetNames) - Should sheet names be included? Default is true.
void setIncludeTextBoxes(boolean includeTextBoxes) - Should text from textboxes be included? Default is true.
void setLocale(Locale locale)
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.poi.ooxml.extractor.POIXMLTextExtractor
checkMaxTextSize, close, getMetadataTextExtractor
-
Constructor Details
-
XSSFEventBasedExcelExtractor
public XSSFEventBasedExcelExtractor(String path) throws XmlException, OpenXML4JException, IOException
-
XSSFEventBasedExcelExtractor
public XSSFEventBasedExcelExtractor(OPCPackage container) throws XmlException, OpenXML4JException, IOException
-
-
Method Details
-
setIncludeSheetNames
public void setIncludeSheetNames(boolean includeSheetNames)
Sets whether sheet names are included in the extracted text. Default is true.
- Specified by:
- setIncludeSheetNames in interface ExcelExtractor
- Parameters:
- includeSheetNames - true if the sheet names should be included
-
getIncludeSheetNames
public boolean getIncludeSheetNames()
- Returns:
- whether to include sheet names
- Since:
- 3.16-beta3
-
setFormulasNotResults
public void setFormulasNotResults(boolean formulasNotResults)
Sets whether formula cells yield the formula itself rather than the result it produces. Default is false.
- Specified by:
- setFormulasNotResults in interface ExcelExtractor
- Parameters:
- formulasNotResults - true if the formula itself is returned
-
getFormulasNotResults
public boolean getFormulasNotResults()
- Returns:
- whether to include formulas but not results
- Since:
- 3.16-beta3
-
setIncludeTextBoxes
public void setIncludeTextBoxes(boolean includeTextBoxes)
Sets whether text from text boxes is included. Default is true.
-
getIncludeTextBoxes
public boolean getIncludeTextBoxes()
- Returns:
- whether or not to extract textboxes
- Since:
- 3.16-beta3
-
setIncludeCellComments
public void setIncludeCellComments(boolean includeCellComments)
Sets whether cell comments are included in the extracted text. Default is false.
- Specified by:
- setIncludeCellComments in interface ExcelExtractor
- Parameters:
- includeCellComments - true if cell comments should be included
-
getIncludeCellComments
public boolean getIncludeCellComments()
- Returns:
- whether cell comments should be included
- Since:
- 3.16-beta3
-
setConcatenatePhoneticRuns
public void setConcatenatePhoneticRuns(boolean concatenatePhoneticRuns)
Sets whether text from <rPh> (phonetic run) elements in the SharedStringsTable is concatenated. Default is true.
- Parameters:
- concatenatePhoneticRuns - true if runs should be concatenated, false otherwise
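The switches above are set on the extractor before getText() is called. The following sketch, assuming Apache POI (poi-ooxml) is on the classpath, toggles two of them. The class ExtractorOptionsSketch and its helper writeSampleWorkbook() are hypothetical names introduced for illustration; the formula cell is evaluated before saving so that the file carries a cached value alongside the formula.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.poi.openxml4j.exceptions.OpenXML4JException;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.xmlbeans.XmlException;

public class ExtractorOptionsSketch {

    /** Hypothetical helper: writes a tiny workbook with one formula cell to a temp file. */
    static File writeSampleWorkbook() throws IOException {
        File file = File.createTempFile("options-sketch", ".xlsx");
        file.deleteOnExit();
        try (XSSFWorkbook workbook = new XSSFWorkbook();
             OutputStream out = new FileOutputStream(file)) {
            Row row = workbook.createSheet("Inventory").createRow(0);
            row.createCell(0).setCellValue(2);
            row.createCell(1).setCellValue(3);
            Cell sum = row.createCell(2);
            sum.setCellFormula("A1+B1");
            // Cache the result so the saved file stores a value next to the formula.
            workbook.getCreationHelper().createFormulaEvaluator().evaluateFormulaCell(sum);
            workbook.write(out);
        }
        return file;
    }

    /** Extracts text with the given sheet-name and formula settings applied. */
    static String extract(File file, boolean sheetNames, boolean formulas)
            throws IOException, OpenXML4JException, XmlException {
        try (XSSFEventBasedExcelExtractor extractor =
                     new XSSFEventBasedExcelExtractor(file.getPath())) {
            extractor.setIncludeSheetNames(sheetNames);
            extractor.setFormulasNotResults(formulas);
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws Exception {
        File sample = writeSampleWorkbook();
        System.out.println("Defaults:");
        System.out.print(extract(sample, true, false));
        System.out.println("No sheet names, formulas instead of results:");
        System.out.print(extract(sample, false, true));
    }
}
```

The other setters (setIncludeCellComments, setIncludeTextBoxes, setConcatenatePhoneticRuns) follow the same pattern: configure first, then call getText().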
-
setLocale
-
getLocale
- Returns:
- locale
- Since:
- 3.16-beta3
-
getPackage
Returns the opened OPCPackage container.
- Specified by:
- getPackage in interface POIXMLTextExtractor
- Returns:
- the opened OPCPackage
-
getCoreProperties
Returns the core document properties
- Specified by:
- getCoreProperties in interface POIXMLTextExtractor
- Returns:
- the core document properties
-
getExtendedProperties
Returns the extended document properties
- Specified by:
- getExtendedProperties in interface POIXMLTextExtractor
- Returns:
- the extended document properties
-
getCustomProperties
Returns the custom document properties
- Specified by:
- getCustomProperties in interface POIXMLTextExtractor
- Returns:
- the custom document properties
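The property accessors can be used without extracting any cell text. The sketch below, assuming Apache POI (poi-ooxml) is on the classpath, reads the creator field from the core properties. DocumentPropertiesSketch and writeSampleWorkbook(String) are hypothetical names used only to build a file with a known creator.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.poi.openxml4j.exceptions.OpenXML4JException;
import org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.xmlbeans.XmlException;

public class DocumentPropertiesSketch {

    /** Hypothetical helper: writes a workbook whose core properties name the given creator. */
    static File writeSampleWorkbook(String creator) throws IOException {
        File file = File.createTempFile("properties-sketch", ".xlsx");
        file.deleteOnExit();
        try (XSSFWorkbook workbook = new XSSFWorkbook();
             OutputStream out = new FileOutputStream(file)) {
            workbook.createSheet("Data");
            workbook.getProperties().getCoreProperties().setCreator(creator);
            workbook.write(out);
        }
        return file;
    }

    /** Reads the creator from the core document properties via the extractor. */
    static String readCreator(File file)
            throws IOException, OpenXML4JException, XmlException {
        try (XSSFEventBasedExcelExtractor extractor =
                     new XSSFEventBasedExcelExtractor(file.getPath())) {
            return extractor.getCoreProperties().getCreator();
        }
    }

    public static void main(String[] args) throws Exception {
        File sample = writeSampleWorkbook("Jane Doe");
        System.out.println(readCreator(sample));
    }
}
```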
-
getText
Processes the file and returns the text
- Specified by:
- getText in interface ExcelExtractor
- Specified by:
- getText in interface POITextExtractor
- Returns:
- All the text from the document
-
getDocument
Description copied from interface: POIXMLTextExtractor
Returns opened document
- Specified by:
- getDocument in interface POITextExtractor
- Specified by:
- getDocument in interface POIXMLTextExtractor
- Returns:
- the opened document
-
setCloseFilesystem
public void setCloseFilesystem(boolean doCloseFilesystem)
- Specified by:
- setCloseFilesystem in interface POITextExtractor
- Parameters:
- doCloseFilesystem - true (default), if underlying resources/filesystem should be closed on POITextExtractor.close()
-
isCloseFilesystem
public boolean isCloseFilesystem()
- Specified by:
- isCloseFilesystem in interface POITextExtractor
- Returns:
- true, if resources/filesystem should be closed on POITextExtractor.close()
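When the caller opens the OPCPackage and wants to keep using it after the extractor is closed, the close-filesystem flag can be turned off. The sketch below, assuming Apache POI (poi-ooxml) is on the classpath, runs two extractors in sequence over one package. SharedPackageSketch and writeSampleWorkbook() are hypothetical names; closing the read-only package with revert() at the end is this sketch's choice, not something this page prescribes.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.poi.openxml4j.exceptions.OpenXML4JException;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;
import org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.xmlbeans.XmlException;

public class SharedPackageSketch {

    /** Hypothetical helper: writes a one-cell workbook to a temp file for the demo. */
    static File writeSampleWorkbook() throws IOException {
        File file = File.createTempFile("shared-package", ".xlsx");
        file.deleteOnExit();
        try (XSSFWorkbook workbook = new XSSFWorkbook();
             OutputStream out = new FileOutputStream(file)) {
            workbook.createSheet("Notes").createRow(0).createCell(0).setCellValue("hello");
            workbook.write(out);
        }
        return file;
    }

    /** Runs two extractors over one caller-owned package and returns the second result. */
    static String extractTwice(File file)
            throws IOException, OpenXML4JException, XmlException {
        OPCPackage pkg = OPCPackage.open(file, PackageAccess.READ);
        try {
            try (XSSFEventBasedExcelExtractor first = new XSSFEventBasedExcelExtractor(pkg)) {
                first.setCloseFilesystem(false); // keep pkg open when 'first' is closed
                first.getText();
            }
            // The package was left open above, so a second extractor can reuse it.
            try (XSSFEventBasedExcelExtractor second = new XSSFEventBasedExcelExtractor(pkg)) {
                second.setCloseFilesystem(false);
                return second.getText();
            }
        } finally {
            pkg.revert(); // the caller owns the package and closes it without saving
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.print(extractTwice(writeSampleWorkbook()));
    }
}
```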
-
getFilesystem
- Specified by:
- getFilesystem in interface POITextExtractor
- Returns:
- The underlying resources/filesystem
-