Package org.apache.poi.hssf.extractor
Class OldExcelExtractor
java.lang.Object
org.apache.poi.hssf.extractor.OldExcelExtractor
- All Implemented Interfaces: Closeable, AutoCloseable, POITextExtractor
A text extractor for Excel files that are too old for HSSFWorkbook to handle. This includes Excel 95 files and very old (pre-OLE2) Excel files, such as Excel 4 files.
Returns much (but not all) of the textual content of the file. The output is suitable for indexing by something like Apache Lucene, or for use by Apache Tika, but is not really intended for display to the user.
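To make the "pre-OLE2" distinction concrete, here is a minimal sketch (not part of POI) that checks whether the first bytes of a file match the OLE2 compound-document signature. The class name Ole2Signature and the method startsWithOle2Signature are hypothetical; the only format fact relied on is the 8-byte OLE2 header signature D0 CF 11 E0 A1 B1 1A E1. A file that fails this check may be a raw BIFF stream such as an Excel 4 file, although the check alone does not prove that a file is an Excel file at all.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Apache POI.
final class Ole2Signature {
    // The 8-byte signature found at the start of an OLE2 compound document.
    private static final byte[] SIGNATURE = {
        (byte) 0xD0, (byte) 0xCF, (byte) 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, (byte) 0x1A, (byte) 0xE1
    };

    private Ole2Signature() {
    }

    /** Returns true if the given bytes begin with the OLE2 signature. */
    static boolean startsWithOle2Signature(byte[] header) {
        if (header == null || header.length < SIGNATURE.length) {
            return false; // too short to contain the signature
        }
        // Compare only the leading 8 bytes; anything after them is ignored.
        return Arrays.equals(Arrays.copyOf(header, SIGNATURE.length), SIGNATURE);
    }
}
```

A caller would read the first eight bytes of the file and pass them in. As described above, OldExcelExtractor accepts both kinds of input, so a check like this is only useful for diagnostics.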
Constructor Summary
Constructors:
- OldExcelExtractor(InputStream input)
- OldExcelExtractor(DirectoryNode directory)
Method Summary
Methods (modifier and type, method, description):
- int getBiffVersion(): The Biff version, largely corresponding to the Excel version
- int getFileType(): The kind of the file, one of BOFRecord.TYPE_WORKSHEET, BOFRecord.TYPE_CHART, BOFRecord.TYPE_EXCEL_4_MACRO or BOFRecord.TYPE_WORKSPACE_FILE
- getMetadataTextExtractor(): Returns another text extractor, which is able to output the textual content of the document metadata / properties, such as author and title.
- getText(): Retrieves the text contents of the file, as best we can for these old file formats
- boolean isCloseFilesystem()
- static void main
- void setCloseFilesystem(boolean doCloseFilesystem)
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.poi.extractor.POITextExtractor: close

Constructor Details

OldExcelExtractor
- Throws: IOException

OldExcelExtractor
- Throws: IOException

OldExcelExtractor
- Throws: IOException

OldExcelExtractor
- Throws: IOException

Method Details

main
- Throws: IOException

getBiffVersion
public int getBiffVersion()
The Biff version, largely corresponding to the Excel version
- Returns: the Biff version
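As a rough illustration of how the returned value relates to Excel releases, the sketch below maps a BIFF version number to a readable label. The class BiffVersionLabels and its describe method are hypothetical, and the mapping itself (BIFF2 through BIFF5 for Excel 2 through Excel 5/95) is an assumption based on general BIFF naming rather than something this page states.

```java
// Hypothetical helper for illustration only; not part of Apache POI.
final class BiffVersionLabels {
    private BiffVersionLabels() {
    }

    /** Maps a BIFF version number to an approximate Excel release label. */
    static String describe(int biffVersion) {
        switch (biffVersion) {
            case 2:
                return "Excel 2.x";
            case 3:
                return "Excel 3.0";
            case 4:
                return "Excel 4.0";
            case 5:
                return "Excel 5.0 / Excel 95";
            default:
                // Anything else is reported rather than guessed at.
                return "unknown (BIFF" + biffVersion + ")";
        }
    }
}
```

With an extractor instance in hand, this would be called as BiffVersionLabels.describe(extractor.getBiffVersion()).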

getFileType
public int getFileType()
The kind of the file, one of BOFRecord.TYPE_WORKSHEET, BOFRecord.TYPE_CHART, BOFRecord.TYPE_EXCEL_4_MACRO or BOFRecord.TYPE_WORKSPACE_FILE
- Returns: the file type
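Because the file type comes back as a plain int, callers would normally compare it against the BOFRecord constants named in the description. The sketch below shows one way to turn the value into a label. FileTypeLabels is a hypothetical helper, and its numeric values are assumptions taken from the BIFF BOF record layout and assumed to mirror the BOFRecord constants; real code should reference BOFRecord.TYPE_WORKSHEET and its siblings directly rather than hard-coded numbers.

```java
// Hypothetical helper for illustration only; not part of Apache POI.
final class FileTypeLabels {
    // Assumed to mirror the BOFRecord.TYPE_* constants (values are an assumption).
    static final int TYPE_WORKSHEET = 0x10;
    static final int TYPE_CHART = 0x20;
    static final int TYPE_EXCEL_4_MACRO = 0x40;
    static final int TYPE_WORKSPACE_FILE = 0x100;

    private FileTypeLabels() {
    }

    /** Maps a file type value to a short readable label. */
    static String describe(int fileType) {
        switch (fileType) {
            case TYPE_WORKSHEET:
                return "worksheet";
            case TYPE_CHART:
                return "chart";
            case TYPE_EXCEL_4_MACRO:
                return "Excel 4 macro sheet";
            case TYPE_WORKSPACE_FILE:
                return "workspace file";
            default:
                // Report unrecognised values in hex instead of guessing.
                return "other (0x" + Integer.toHexString(fileType) + ")";
        }
    }
}
```

A typical call would be FileTypeLabels.describe(extractor.getFileType()), for example to log what kind of sheet was processed.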

getText
Retrieves the text contents of the file, as best we can for these old file formats
- Specified by: getText in interface POITextExtractor
- Returns: the text contents of the file

getMetadataTextExtractor
Description copied from interface: POITextExtractor
Returns another text extractor, which is able to output the textual content of the document metadata / properties, such as author and title.
- Specified by: getMetadataTextExtractor in interface POITextExtractor
- Returns: the metadata and text extractor

setCloseFilesystem
public void setCloseFilesystem(boolean doCloseFilesystem)
- Specified by: setCloseFilesystem in interface POITextExtractor
- Parameters: doCloseFilesystem - true (default), if underlying resources/filesystem should be closed on POITextExtractor.close()

isCloseFilesystem
public boolean isCloseFilesystem()
- Specified by: isCloseFilesystem in interface POITextExtractor
- Returns: true, if resources/filesystem should be closed on POITextExtractor.close()

getFilesystem
- Specified by: getFilesystem in interface POITextExtractor
- Returns: The underlying resources/filesystem

getDocument
- Specified by: getDocument in interface POITextExtractor
- Returns: the processed document