public class Latin1StringsParser
extends org.apache.tika.parser.AbstractParser
AutoDetectParser parser = new AutoDetectParser();
parser.setFallback(new Latin1StringsParser());
Currently the parser makes a best-effort attempt to extract Latin1 strings, used by Western European languages, encoded with the ISO-8859-1, UTF-8 or UTF-16 charsets, which may be mixed within the same file. The implementation is optimized for fast parsing in a single pass.
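The idea of a single-pass scan with a minimum sequence size can be illustrated with a minimal sketch. It is not the implementation of Latin1StringsParser: the class name `PrintableStringsSketch`, its methods, and the choice of which byte values count as printable are all assumptions made for illustration. It handles ISO-8859-1 bytes only and does not attempt the UTF-8 or UTF-16 handling described above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical class, NOT the Latin1StringsParser code.
final class PrintableStringsSketch {

    // Assumption: printable ASCII, tab, and the ISO-8859-1 range 0xA0-0xFF count as text.
    static boolean isPrintableLatin1(int b) {
        return (b >= 0x20 && b <= 0x7E) || b == '\t' || (b >= 0xA0 && b <= 0xFF);
    }

    // Scans the buffer once and returns every printable run of at least minSize bytes.
    static List<String> extract(byte[] data, int minSize) {
        List<String> runs = new ArrayList<>();
        int start = -1; // index where the current run began, or -1 when outside a run
        // Iterate one step past the end so a run that reaches the last byte is flushed.
        for (int i = 0; i <= data.length; i++) {
            boolean printable = i < data.length && isPrintableLatin1(data[i] & 0xFF);
            if (printable) {
                if (start < 0) {
                    start = i;
                }
            } else if (start >= 0) {
                if (i - start >= minSize) {
                    runs.add(new String(data, start, i - start, StandardCharsets.ISO_8859_1));
                }
                start = -1;
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // Binary noise surrounding three printable runs: "Tika", "ok", and "caf" + 0xE9.
        byte[] data = {0x00, 0x01, 'T', 'i', 'k', 'a', 0x00, 'o', 'k', 0x02,
                       'c', 'a', 'f', (byte) 0xE9, 0x00};
        System.out.println(extract(data, 4));
    }
}
```

With a minimum size of 4, the two-byte run `ok` is dropped while the two four-byte runs are kept. Raising the minimum reduces noise from short accidental matches in binary data, at the cost of missing short genuine strings.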
| Constructor and Description |
|---|
| Latin1StringsParser() |
| Modifier and Type | Method and Description |
|---|---|
| int | getMinSize() Returns the minimum size of a character sequence to be extracted. |
| Set<org.apache.tika.mime.MediaType> | getSupportedTypes(org.apache.tika.parser.ParseContext arg0) |
| void | parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context) |
| void | setMinSize(int minSize) Sets the minimum size of a character sequence to be extracted. |
public int getMinSize()
public void setMinSize(int minSize)
Parameters: minSize - the minimum size of a character sequence
public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext arg0)
public void parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context) throws IOException, SAXException
Throws: IOException, SAXException
See Also: Parser.parse(java.io.InputStream, org.xml.sax.ContentHandler, org.apache.tika.metadata.Metadata, org.apache.tika.parser.ParseContext)
Copyright © 2007–2025 The Apache Software Foundation. All rights reserved.