Package org.apache.poi.hssf.extractor
Class OldExcelExtractor
java.lang.Object
    org.apache.poi.hssf.extractor.OldExcelExtractor
All Implemented Interfaces:
Closeable, AutoCloseable
public class OldExcelExtractor extends Object implements Closeable
A text extractor for Excel files that are too old for HSSFWorkbook to handle. This includes Excel 95 and very old (pre-OLE2) Excel files, such as Excel 4 files.
Returns much (but not all) of the textual content of the file. The output is suitable for indexing by something like Apache Lucene, or for use by Apache Tika, but is not really intended for display to the user.
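A minimal usage sketch, assuming Apache POI is on the classpath. The class name `OldExcelTextDump` and the method `extractText` are hypothetical names chosen for illustration; only the `OldExcelExtractor(File)` constructor, `getText()` and `close()` come from this page. Because the class implements Closeable, try-with-resources can be used to release the underlying file.

```java
import java.io.File;
import java.io.IOException;

import org.apache.poi.hssf.extractor.OldExcelExtractor;

/** Hypothetical helper for illustration; not part of Apache POI. */
final class OldExcelTextDump {

    private OldExcelTextDump() {}

    /** Opens an old-format Excel file and returns whatever text can be recovered. */
    static String extractText(File file) throws IOException {
        // try-with-resources calls close() even if getText() throws
        try (OldExcelExtractor extractor = new OldExcelExtractor(file)) {
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: OldExcelTextDump <file.xls>");
            return;
        }
        System.out.println(extractText(new File(args[0])));
    }
}
```

The returned string is meant for indexing or further processing, so a caller would typically hand it to a search index rather than render it directly.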
-
-
Constructor Summary
Constructors
OldExcelExtractor(File f)
OldExcelExtractor(InputStream input)
OldExcelExtractor(DirectoryNode directory)
OldExcelExtractor(POIFSFileSystem fs)
-
Method Summary
Modifier and Type, Method - Description
void close()
int getBiffVersion() - The BIFF version, largely corresponding to the Excel version.
int getFileType() - The kind of the file, one of BOFRecord.TYPE_WORKSHEET, BOFRecord.TYPE_CHART, BOFRecord.TYPE_EXCEL_4_MACRO or BOFRecord.TYPE_WORKSPACE_FILE.
String getText() - Retrieves the text contents of the file, as best we can for these old file formats.
static void main(String[] args)
-
-
-
Constructor Detail
-
OldExcelExtractor
public OldExcelExtractor(InputStream input) throws IOException
- Throws:
IOException
-
OldExcelExtractor
public OldExcelExtractor(File f) throws IOException
- Throws:
IOException
-
OldExcelExtractor
public OldExcelExtractor(POIFSFileSystem fs) throws IOException
- Throws:
IOException
-
OldExcelExtractor
public OldExcelExtractor(DirectoryNode directory) throws IOException
- Throws:
IOException
-
-
Method Detail
-
main
public static void main(String[] args) throws IOException
- Throws:
IOException
-
getBiffVersion
public int getBiffVersion()
The BIFF version, largely corresponding to the Excel version.
- Returns:
the BIFF version
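As a rough illustration of how a BIFF version relates to an Excel release, the sketch below maps version numbers to release names. It assumes that getBiffVersion() returns the plain BIFF number (for example 4 for an Excel 4 file), which this page does not spell out. The class `BiffVersions` and its mapping table are hypothetical and based on general knowledge of the format, not on Apache POI itself.

```java
/** Hypothetical helper for illustration; not part of Apache POI. */
final class BiffVersions {

    private BiffVersions() {}

    /**
     * Maps a BIFF version number to the Excel release(s) commonly associated with it.
     * Assumption: the caller passes the plain BIFF number, e.g. 4 for BIFF4.
     */
    static String describe(int biffVersion) {
        switch (biffVersion) {
            case 2:
                return "Excel 2.x";
            case 3:
                return "Excel 3.0";
            case 4:
                return "Excel 4.0";
            case 5:
            case 7:
                // BIFF5 and BIFF7 are near-identical and usually treated together
                return "Excel 5.0 / Excel 95";
            case 8:
                return "Excel 97-2003";
            default:
                return "unknown BIFF version " + biffVersion;
        }
    }

    public static void main(String[] args) {
        for (int v : new int[] {2, 3, 4, 5, 8, 0}) {
            System.out.println("BIFF" + v + " -> " + describe(v));
        }
    }
}
```

With an open extractor, a call such as `BiffVersions.describe(extractor.getBiffVersion())` would give a human-readable label for logging.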
-
getFileType
public int getFileType()
The kind of the file, one of BOFRecord.TYPE_WORKSHEET, BOFRecord.TYPE_CHART, BOFRecord.TYPE_EXCEL_4_MACRO or BOFRecord.TYPE_WORKSPACE_FILE.
- Returns:
the file type
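The returned value can be compared against the BOFRecord constants listed above. The following sketch, assuming Apache POI is on the classpath, turns the value into a label; the class `FileTypeNames` and the label strings are hypothetical and exist only for illustration.

```java
import org.apache.poi.hssf.record.BOFRecord;

/** Hypothetical helper for illustration; not part of Apache POI. */
final class FileTypeNames {

    private FileTypeNames() {}

    /** Returns a short label for a value obtained from OldExcelExtractor.getFileType(). */
    static String describe(int fileType) {
        if (fileType == BOFRecord.TYPE_WORKSHEET) {
            return "worksheet";
        } else if (fileType == BOFRecord.TYPE_CHART) {
            return "chart";
        } else if (fileType == BOFRecord.TYPE_EXCEL_4_MACRO) {
            return "Excel 4 macro sheet";
        } else if (fileType == BOFRecord.TYPE_WORKSPACE_FILE) {
            return "workspace file";
        }
        // Any value outside the four documented constants
        return "other (" + fileType + ")";
    }
}
```

A caller holding an open extractor could, for example, skip text indexing when `FileTypeNames.describe(extractor.getFileType())` reports a chart.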
-
getText
public String getText()
Retrieves the text contents of the file, as best we can for these old file formats.
- Returns:
the text contents of the file
-
close
public void close()
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
-
-