Class XmlCharsetDetector
- java.lang.Object
-
- org.hortonmachine.gears.utils.style.sld.XmlCharsetDetector
-
public class XmlCharsetDetector extends Object
Provides methods that can be used to detect the charset of an XML document and (optionally) return a reader that is aware of this charset and can correctly decode the document's data.
-
-
Constructor Summary
Constructor    Description
XmlCharsetDetector()
-
Method Summary
Modifier and Type    Method    Description
static Reader    createReader(InputStream istream, EncodingInfo encInfo)    Creates a new reader on top of the given InputStream using existing (external) encoding information.
static Reader    getCharsetAwareReader(InputStream istream)    Use this variant when you aren't interested in encoding data, and just want to get a suitable reader for incoming request.
static Reader    getCharsetAwareReader(InputStream istream, EncodingInfo encInfo)    Based on Xerces-J code, this method will try its best to return a reader which is able to decode content of incoming XML document properly.
static EncodingInfo    getEncodingName(byte[] b4, int count)    Returns the IANA encoding name that is auto-detected from the bytes specified, with the endian-ness of that encoding where appropriate.
protected static String    getXmlEncoding(Reader reader)    Gets the encoding of the xml request made to the dispatcher.
-
-
-
Field Detail
-
LOGGER
protected static Logger LOGGER
-
-
Method Detail
-
getCharsetAwareReader
public static Reader getCharsetAwareReader(InputStream istream, EncodingInfo encInfo) throws IOException, UnsupportedCharsetException
Based on Xerces-J code, this method tries its best to return a reader that can properly decode the content of the incoming XML document. To achieve this, it first infers the general encoding scheme of the document and then uses this information to extract the actual charset from the XML declaration. In any recoverable error situation, a default UTF-8 reader is created.
- Parameters:
istream - Byte stream (most probably obtained with HttpServletRequest.getInputStream) that gives access to the XML document in question.
encInfo - Instance of EncodingInfo where information about the detected charset will be stored. You can then use it, for example, to form a response encoded with this charset.
- Throws:
IOException - in case of any unrecoverable I/O errors.
UnsupportedCharsetException - InputStreamReader's constructor will probably throw this exception if the inferred charset of the XML document is not supported by the current JVM.
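The two-step detection described above can be sketched with JDK classes only. This is an illustrative approximation, not the library's implementation: the class name CharsetAwareReaderSketch, its methods, the 512-byte peek size, and the regular expression are all assumptions, and the scheme guess deliberately ignores UCS-4 and EBCDIC input.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from XmlCharsetDetector.
class CharsetAwareReaderSketch {
    private static final int PEEK = 512; // assumed upper bound for the XML declaration

    // Matches the encoding pseudo-attribute of a leading XML declaration,
    // tolerating a decoded byte order mark in front of it.
    private static final Pattern ENCODING = Pattern.compile(
        "^\\uFEFF?<\\?xml\\s[^>]*?encoding\\s*=\\s*([\"'])([A-Za-z][A-Za-z0-9._-]*)\\1");

    // Step 1: guess the encoding scheme from the leading bytes (simplified).
    static Charset guessScheme(byte[] b, int n) {
        if (n >= 2) {
            int b0 = b[0] & 0xFF, b1 = b[1] & 0xFF;
            if ((b0 == 0xFE && b1 == 0xFF) || (b0 == 0x00 && b1 == 0x3C)) {
                return StandardCharsets.UTF_16BE;
            }
            if ((b0 == 0xFF && b1 == 0xFE) || (b0 == 0x3C && b1 == 0x00)) {
                return StandardCharsets.UTF_16LE;
            }
        }
        return StandardCharsets.UTF_8; // default, also used for ASCII-compatible input
    }

    // Step 2: read the declaration using the guessed scheme, then build the reader.
    // Charset.forName throws UnsupportedCharsetException for unknown charsets.
    // A byte order mark, if present, is not stripped from the returned reader.
    static Reader open(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, PEEK);
        byte[] head = new byte[PEEK];
        int n = 0;
        while (n < PEEK) {
            int r = pin.read(head, n, PEEK - n);
            if (r < 0) break;
            n += r;
        }
        if (n > 0) pin.unread(head, 0, n); // give the peeked bytes back to the stream

        Charset charset = guessScheme(head, n);
        Matcher m = ENCODING.matcher(new String(head, 0, n, charset));
        // A declared "UTF-16" carries no byte order, so the guessed scheme is kept.
        if (m.find() && !m.group(2).equalsIgnoreCase("UTF-16")) {
            charset = Charset.forName(m.group(2));
        }
        return new InputStreamReader(pin, charset);
    }

    // Convenience helper for trying the sketch on an in-memory document.
    static String decodeAll(byte[] data) {
        try (Reader r = open(new ByteArrayInputStream(data))) {
            StringBuilder sb = new StringBuilder();
            for (int c = r.read(); c != -1; c = r.read()) sb.append((char) c);
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The pushback stream is the design choice worth noting: the bytes consumed while peeking are returned to the stream, so the final reader still sees the document from its first byte.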
-
getCharsetAwareReader
public static Reader getCharsetAwareReader(InputStream istream) throws IOException, UnsupportedCharsetException
Use this variant when you aren't interested in encoding data and just want to get a suitable reader for the incoming request.
- Parameters:
istream - See getCharsetAwareReader(InputStream, EncodingInfo).
- Throws:
IOException
UnsupportedCharsetException
-
createReader
public static Reader createReader(InputStream istream, EncodingInfo encInfo) throws IllegalArgumentException, UnsupportedEncodingException
Creates a new reader on top of the given InputStream using existing (external) encoding information. Unlike getCharsetAwareReader, this method never tries to detect the charset or encoding scheme of the InputStream's data. This also means that it must be provided with a valid EncodingInfo instance, which may be obtained, for example, from a previous getCharsetAwareReader(InputStream, EncodingInfo) call.
- Parameters:
istream - byte stream containing textual (presumably XML) data
encInfo - correctly initialized object which holds information about the charset of the above byte stream's contents.
- Throws:
IllegalArgumentException - if the charset name is not specified
UnsupportedEncodingException - when the specified charset is not supported by the platform, or due to an invalid byte order for ISO-10646-UCS-2|4 charsets.
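The contract above (no detection, a charset name supplied from outside, and the two documented exceptions) can be approximated with a plain InputStreamReader. The sketch below is an assumption-laden illustration: KnownCharsetReaderSketch and its methods are hypothetical names, a bare String stands in for EncodingInfo, and the byte-order checks for the UCS charsets are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;

// Hypothetical sketch; a String replaces the library's EncodingInfo here.
class KnownCharsetReaderSketch {
    // Wraps the stream in a reader for an externally supplied charset name.
    static Reader open(InputStream in, String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null || charsetName.isEmpty()) {
            throw new IllegalArgumentException("charset name is not specified");
        }
        // This constructor throws UnsupportedEncodingException when the
        // named charset is not available on the running JVM.
        return new InputStreamReader(in, charsetName);
    }

    // Convenience helper for trying the sketch on in-memory bytes.
    static String decodeAll(byte[] data, String charsetName) {
        try (Reader r = open(new ByteArrayInputStream(data), charsetName)) {
            StringBuilder sb = new StringBuilder();
            for (int c = r.read(); c != -1; c = r.read()) sb.append((char) c);
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```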
-
getEncodingName
public static EncodingInfo getEncodingName(byte[] b4, int count)
Returns the IANA encoding name that is auto-detected from the bytes specified, with the endianness of that encoding where appropriate. Note that the encoding obtained this way is only the encoding scheme of the request, i.e. step 1 of the detection process. To learn the exact charset of the request data, you should also perform step 2: read the XML declaration and get the value of its encoding pseudo-attribute.
- Parameters:
b4 - The first four bytes of the input.
count - The number of bytes actually read.
- Returns:
- Instance of EncodingInfo encapsulating all encoding-related data.
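Step 1 can be illustrated with the byte patterns listed in Appendix F of the XML 1.0 specification. The following is a sketch under stated assumptions, not the method's actual code: EncodingSchemeSketch and guessScheme are hypothetical names, the result is a plain Java charset name rather than an EncodingInfo, and UCS-4 byte order marks and the unusual UCS-4 byte orders are not covered.

```java
// Hypothetical sketch of first-four-bytes detection; names are illustrative.
class EncodingSchemeSketch {
    static String guessScheme(byte[] b4, int count) {
        if (count < 2) return "UTF-8"; // too little data, use the default

        int b0 = b4[0] & 0xFF, b1 = b4[1] & 0xFF;
        if (b0 == 0xFE && b1 == 0xFF) return "UTF-16BE"; // UTF-16 BOM, big endian
        if (b0 == 0xFF && b1 == 0xFE) return "UTF-16LE"; // UTF-16 BOM, little endian
        if (count < 3) return "UTF-8";

        int b2 = b4[2] & 0xFF;
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) return "UTF-8"; // UTF-8 BOM
        if (count < 4) return "UTF-8";

        int b3 = b4[3] & 0xFF;
        // "<" encoded in four bytes: UCS-4 (Java calls these charsets UTF-32BE/LE)
        if (b0 == 0x00 && b1 == 0x00 && b2 == 0x00 && b3 == 0x3C) return "UTF-32BE";
        if (b0 == 0x3C && b1 == 0x00 && b2 == 0x00 && b3 == 0x00) return "UTF-32LE";
        // "<?" encoded in two bytes per character: UTF-16 without a BOM
        if (b0 == 0x00 && b1 == 0x3C && b2 == 0x00 && b3 == 0x3F) return "UTF-16BE";
        if (b0 == 0x3C && b1 == 0x00 && b2 == 0x3F && b3 == 0x00) return "UTF-16LE";
        // "<?xm" in EBCDIC
        if (b0 == 0x4C && b1 == 0x6F && b2 == 0xA7 && b3 == 0x94) return "CP037";

        return "UTF-8"; // ASCII-compatible input or nothing recognizable
    }
}
```

The result only narrows the encoding down to a family. For ASCII-compatible input in particular, the XML declaration still has to be read to find the exact charset.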
-
getXmlEncoding
protected static String getXmlEncoding(Reader reader)
Gets the encoding of the XML request made to the dispatcher. This works by reading the temp file where the request is stored and looking for the encoding specified in the header that should be present in all XML files. This call should only be made after the temp file has been set. If no encoding is found, or if an I/O error is encountered, null is returned.
- Parameters:
reader - This character stream is supposed to contain XML data (i.e. it should start with a valid XML declaration).
- Returns:
- The encoding specified in the XML header read from the supplied character stream.
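Extracting the encoding from the XML declaration might look like the sketch below. It is only an approximation of the behavior described above: XmlDeclarationSketch, findEncoding, the regular expression, and the 1024-character limit are assumptions, not the library's code.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and the 1024-character limit are assumptions.
class XmlDeclarationSketch {
    // Captures the value of the encoding pseudo-attribute of a leading
    // XML declaration, quoted with either single or double quotes.
    private static final Pattern ENCODING = Pattern.compile(
        "^<\\?xml\\s[^>]*?encoding\\s*=\\s*([\"'])([A-Za-z][A-Za-z0-9._-]*)\\1");

    // Returns the declared encoding, or null if none is found or reading fails.
    static String findEncoding(Reader reader) {
        try {
            char[] buf = new char[1024];
            int n = 0;
            // read() may return fewer characters than requested, so loop
            // until the buffer is full or the stream ends.
            while (n < buf.length) {
                int r = reader.read(buf, n, buf.length - n);
                if (r < 0) break;
                n += r;
            }
            Matcher m = ENCODING.matcher(new String(buf, 0, n));
            return m.find() ? m.group(2) : null;
        } catch (IOException e) {
            return null;
        }
    }
}
```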
-
-