package charset
Type Members
- trait BitsCharset extends Serializable
Charset enhanced with features allowing it to work with Daffodil's Bit-wise DataInputStream and DataOutputStream.
Daffodil uses BitsCharset as its primary abstraction for dealing with character sets, which enables it to support character sets where the code units are smaller than 1 byte.
Note that BitsCharset is NOT derived from java.nio.charset.Charset, nor are BitsCharsetDecoder or BitsCharsetEncoder derived from java.nio.charset.CharsetDecoder or CharsetEncoder respectively. This is partly because these Java classes have many final methods that make it impossible for us to implement what we need by extending them. More importantly, we need much lower-level control over how characters are decoded and what kind of information is returned during decode operations. Getting that information within the limitations of the Java Charset API becomes an encumbrance. Replacing it with our own charset decoders greatly simplifies the code and allows for future enhancements as needed.
- final class BitsCharset3BitDFI336DUI001Definition extends BitsCharsetDefinition
- final class BitsCharset3BitDFI746DUI002Definition extends BitsCharsetDefinition
- final class BitsCharset3BitDFI747DUI001Definition extends BitsCharsetDefinition
- final class BitsCharset4BitDFI746DUI002Definition extends BitsCharsetDefinition
- final class BitsCharset5BitDFI1661DUI001Definition extends BitsCharsetDefinition
- final class BitsCharset5BitDFI769DUI002Definition extends BitsCharsetDefinition
- final class BitsCharset5BitPackedLSBFDefinition extends BitsCharsetDefinition
- final class BitsCharset6BitDFI264DUI001Definition extends BitsCharsetDefinition
- sealed abstract class BitsCharset6BitDFI311DUI002Base extends BitsCharsetNonByteSize
- final class BitsCharset6BitDFI311DUI002Definition extends BitsCharsetDefinition
- final class BitsCharset6BitICAOAircraftIDDefinition extends BitsCharsetDefinition
- final class BitsCharsetAISPayloadArmoringDefinition extends BitsCharsetDefinition
- final class BitsCharsetASCIIDefinition extends BitsCharsetDefinition
- final class BitsCharsetBase4LSBFDefinition extends BitsCharsetDefinition
- final class BitsCharsetBase4MSBFDefinition extends BitsCharsetDefinition
- final class BitsCharsetBinaryLSBFDefinition extends BitsCharsetDefinition
- final class BitsCharsetBinaryMSBFDefinition extends BitsCharsetDefinition
- abstract class BitsCharsetDecoder extends AnyRef
- abstract class BitsCharsetDecoderByteSize extends BitsCharsetDecoder
Base class for byte based decoders
Provides methods to get a single byte. Also handles logic related to the encoding error policy and replacement characters. An implementing class only needs to use the provided methods to get one or more bytes, convert them to a char, and validate the code point.
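The division of labor described above can be sketched as follows. This is only an illustration under stated assumptions, not Daffodil's code: ErrorPolicy, ByteSizeDecoderSketch, UsAsciiSketch and byteToChar are hypothetical names, and the real base class reads from a bit-wise input stream rather than a byte array.

```scala
object ByteDecoderSketch {
  // Hypothetical stand-ins for the two encoding error policies.
  sealed trait ErrorPolicy
  case object Replace extends ErrorPolicy
  case object Error extends ErrorPolicy

  // The base class owns byte fetching and error handling.
  abstract class ByteSizeDecoderSketch(policy: ErrorPolicy) {
    // Subclasses only map one unsigned byte to a char, or None if invalid.
    protected def byteToChar(b: Int): Option[Char]

    def decode(bytes: Array[Byte]): String = {
      val sb = new StringBuilder
      for (raw <- bytes) {
        val b = raw & 0xff
        byteToChar(b) match {
          case Some(c) => sb.append(c)
          case None =>
            policy match {
              case Replace => sb.append('\uFFFD') // Unicode replacement character
              case Error =>
                throw new IllegalArgumentException(f"malformed byte 0x$b%02X")
            }
        }
      }
      sb.toString
    }
  }

  // A concrete decoder supplies only the validation and conversion step.
  final class UsAsciiSketch(policy: ErrorPolicy) extends ByteSizeDecoderSketch(policy) {
    protected def byteToChar(b: Int): Option[Char] =
      if (b < 0x80) Some(b.toChar) else None
  }
}
```

With the Replace policy, an out-of-range byte becomes U+FFFD; with Error, decode throws instead.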
- abstract class BitsCharsetDecoderCreatesSurrogates extends BitsCharsetDecoderByteSize
Some encodings need state, but only for storing the low surrogate of a surrogate pair. This class encapsulates that logic. A class that extends this class must implement decodeOneUnicodeChar, which should decode one char; if the result is a high/low surrogate pair, it should call setLowSurrogate on the low surrogate and return the high surrogate.
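The pending-low-surrogate idea can be sketched with plain JDK calls. This is a minimal illustration, assuming the input is already a sequence of code points; CodePointToChars and decodeAll are hypothetical names and do not mirror the actual decodeOneUnicodeChar signature.

```scala
object SurrogateStateSketch {
  // Emits one UTF-16 char per call, holding back the low surrogate when needed.
  final class CodePointToChars(codePoints: Iterator[Int]) {
    // The only state: a low surrogate waiting to be returned.
    private var pendingLow: Option[Char] = None

    def hasNext: Boolean = pendingLow.isDefined || codePoints.hasNext

    def nextChar(): Char = pendingLow match {
      case Some(low) =>
        pendingLow = None
        low
      case None =>
        val cp = codePoints.next()
        if (Character.isSupplementaryCodePoint(cp)) {
          // Store the low half, return the high half now.
          pendingLow = Some(Character.lowSurrogate(cp))
          Character.highSurrogate(cp)
        } else cp.toChar
    }
  }

  def decodeAll(codePoints: Seq[Int]): String = {
    val it = new CodePointToChars(codePoints.iterator)
    val sb = new StringBuilder
    while (it.hasNext) sb.append(it.nextChar())
    sb.toString
  }
}
```

A supplementary code point therefore takes two calls to nextChar, while a code point in the Basic Multilingual Plane takes one.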
- class BitsCharsetDecoderIBM037 extends BitsCharsetDecoderByteSize
- class BitsCharsetDecoderISO88591 extends BitsCharsetDecoderByteSize
- class BitsCharsetDecoderMalformedException extends ThinException
- trait BitsCharsetDecoderState extends AnyRef
- class BitsCharsetDecoderUSASCII extends BitsCharsetDecoderByteSize
- class BitsCharsetDecoderUTF16BE extends BitsCharsetDecoderCreatesSurrogates
- class BitsCharsetDecoderUTF16LE extends BitsCharsetDecoderCreatesSurrogates
- class BitsCharsetDecoderUTF32BE extends BitsCharsetDecoderCreatesSurrogates
- class BitsCharsetDecoderUTF32LE extends BitsCharsetDecoderCreatesSurrogates
- class BitsCharsetDecoderUTF8 extends BitsCharsetDecoderCreatesSurrogates
- class BitsCharsetDecoderUnalignedCharDecodeException extends ThinException
- abstract class BitsCharsetDefinition extends AnyRef
These are the classes which must be dynamically loaded in order to add a charset implementation to Daffodil. All charsets must implement this class and be added to the org.apache.daffodil.processors.charset.BitsCharsetDefinition file in daffodil-io/src/main/resources/META-INF/services. name() must return a fully capitalized string.
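One reason to require upper-case names is that lookups can then be made case-insensitive by normalizing the requested name. The sketch below illustrates that idea only; CharsetDefinitionSketch and RegistrySketch are hypothetical, the definitions are listed by hand, and the real discovery step (files under META-INF/services are the standard input to java.util.ServiceLoader) is not reproduced here.

```scala
import java.util.Locale

object CharsetRegistrySketch {
  // Hypothetical stand-in for a charset definition; enforces the upper-case rule.
  abstract class CharsetDefinitionSketch(val name: String) {
    require(name == name.toUpperCase(Locale.ROOT), s"name must be upper case: $name")
  }

  final class HexMsbfDefinitionSketch extends CharsetDefinitionSketch("X-DFDL-HEX-MSBF")
  final class OctalMsbfDefinitionSketch extends CharsetDefinitionSketch("X-DFDL-OCTAL-MSBF")

  // Hypothetical registry: indexes definitions by their upper-case name.
  final class RegistrySketch(defs: Seq[CharsetDefinitionSketch]) {
    private val byName: Map[String, CharsetDefinitionSketch] =
      defs.map(d => d.name -> d).toMap

    // Normalize the query so that "x-dfdl-hex-msbf" finds "X-DFDL-HEX-MSBF".
    def find(name: String): Option[CharsetDefinitionSketch] =
      byName.get(name.toUpperCase(Locale.ROOT))
  }

  val registry = new RegistrySketch(
    Seq(new HexMsbfDefinitionSketch, new OctalMsbfDefinitionSketch)
  )
}
```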
- final class BitsCharsetEBCDIC_CP_USDefinition extends BitsCharsetDefinition
- abstract class BitsCharsetEncoder extends IsResetMixin
- final class BitsCharsetHexLSBFDefinition extends BitsCharsetDefinition
- final class BitsCharsetHexMSBFDefinition extends BitsCharsetDefinition
- final class BitsCharsetIBM037Definition extends BitsCharsetDefinition
- final class BitsCharsetISO885918BitPackedLSBFDefinition extends BitsCharsetDefinition
- final class BitsCharsetISO885918BitPackedMSBFDefinition extends BitsCharsetDefinition
- final class BitsCharsetISO88591Definition extends BitsCharsetDefinition
- trait BitsCharsetJava extends BitsCharset
Implements BitsCharset based on encapsulation of a regular JavaCharset.
- trait BitsCharsetNonByteSize extends BitsCharset
Some encodings are not byte-oriented.
If we know the correspondence from integers to characters, and we can express that as a string, then everything else can be derived.
This class is explicitly not a java.nio.charset.Charset. It is a BitsCharset, which is not a compatible type with a java.nio.charset.Charset on purpose so we don't confuse the two.
The problem is that java.nio.charset.Charset is designed in such a way that one cannot implement a proxy class that redirects methods to another class. This is due to all the final methods on the class.
So instead we do the opposite. We implement our own BitsCharset API, but implement the behavior in terms of a proxy JavaCharsetDecoder and proxy JavaCharsetEncoder that drive the decodeLoop and encodeLoop. This way we don't have to re-implement all the error handling and flush/end logic.
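To make the "correspondence expressed as a string" idea concrete, here is a minimal sketch that derives a decoder from nothing more than a decode string and a code unit width. It is an assumption-laden illustration rather than the actual implementation: SmallCharset and decodeMSBF are hypothetical names, only most-significant-bit-first order is handled, and the two tables shown are illustrative, not the exact tables Daffodil uses.

```scala
object SubByteCharsetSketch {
  // decodeString(i) is the character for code unit value i.
  final case class SmallCharset(decodeString: String, bitWidth: Int) {
    require(bitWidth >= 1 && bitWidth <= 8, "code unit must be 1 to 8 bits")
    require(decodeString.length == (1 << bitWidth), "one character per code unit value")

    // Decodes nChars code units, reading bits most significant bit first.
    def decodeMSBF(data: Array[Byte], nChars: Int): String = {
      require(nChars.toLong * bitWidth <= data.length.toLong * 8, "not enough bits")
      val sb = new StringBuilder
      var bitPos = 0
      for (_ <- 0 until nChars) {
        var unit = 0
        for (_ <- 0 until bitWidth) {
          val byte = data(bitPos / 8) & 0xff
          val bit = (byte >> (7 - bitPos % 8)) & 1
          unit = (unit << 1) | bit
          bitPos += 1
        }
        sb.append(decodeString.charAt(unit))
      }
      sb.toString
    }
  }

  // Illustrative tables: 3-bit octal digits and 4-bit hex digits.
  val octal: SmallCharset = SmallCharset("01234567", 3)
  val hex: SmallCharset = SmallCharset("0123456789ABCDEF", 4)
}
```

Because code units need not end on byte boundaries, three octal characters occupy 9 bits and span two bytes.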
- final class BitsCharsetNonByteSizeDecoder extends BitsCharsetDecoder
- final class BitsCharsetNonByteSizeEncoder extends BitsCharsetEncoder
- final class BitsCharsetOctalLSBFDefinition extends BitsCharsetDefinition
- final class BitsCharsetOctalMSBFDefinition extends BitsCharsetDefinition
- final class BitsCharsetUSASCII6BitPackedDefinition extends BitsCharsetDefinition
- final class BitsCharsetUSASCII6BitPackedLSBFDefinition extends BitsCharsetDefinition
- final class BitsCharsetUSASCII6BitPackedMSBFDefinition extends BitsCharsetDefinition
- final class BitsCharsetUSASCII7BitPackedDefinition extends BitsCharsetDefinition
- final class BitsCharsetUSASCIIDefinition extends BitsCharsetDefinition
- final class BitsCharsetUTF16BEDefinition extends BitsCharsetDefinition
- final class BitsCharsetUTF16Definition extends BitsCharsetDefinition
- final class BitsCharsetUTF16LEDefinition extends BitsCharsetDefinition
- final class BitsCharsetUTF32BEDefinition extends BitsCharsetDefinition
- final class BitsCharsetUTF32Definition extends BitsCharsetDefinition
- final class BitsCharsetUTF32LEDefinition extends BitsCharsetDefinition
- final class BitsCharsetUTF8Definition extends BitsCharsetDefinition
- final class BitsCharsetWrappingJavaCharsetEncoder extends BitsCharsetEncoder
Implements BitsCharsetEncoder by encapsulating a standard JavaCharsetEncoder
- class CharacterSetAlignmentError extends Exception
- sealed abstract class CoderInfo extends AnyRef
- case class DecoderInfo(coder: BitsCharsetDecoder, encodingMandatoryAlignmentInBitsArg: Int, maybeCharWidthInBitsArg: MaybeInt) extends CoderInfo with Product with Serializable
- trait EncoderDecoderMixin extends LocalBufferMixin
- case class EncoderInfo(coder: BitsCharsetEncoder, replacingCoder: BitsCharsetEncoder, reportingCoder: BitsCharsetEncoder, encodingMandatoryAlignmentInBitsArg: Int, maybeCharWidthInBitsArg: MaybeInt) extends CoderInfo with Product with Serializable
- trait IsResetMixin extends AnyRef
- final class ProxyJavaCharsetEncoder extends CharsetEncoder
Hijack a JavaCharsetEncoder to drive the encodeLoop.
This avoids us reimplementing all the error handling and flush/end logic.
TODO: Similar to our decoders, we should create custom encoders. Then we wouldn't need all this complex code related to proxying java charsets.
- Attributes
- protected
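The benefit of driving encodeLoop through java.nio.charset.CharsetEncoder can be seen in a small sketch. This is not ProxyJavaCharsetEncoder itself: SketchAsciiEncoder and encodeWithReplacement are hypothetical, and the loop handles plain 7-bit ASCII only. The point is that the subclass supplies just the per-character loop, while the JDK base class handles buffer growth, flushing and the configured error action.

```scala
import java.nio.{ByteBuffer, CharBuffer}
import java.nio.charset.{CharsetEncoder, CoderResult, CodingErrorAction, StandardCharsets}

object EncodeLoopSketch {
  // Hypothetical encoder: only encodeLoop is written by hand.
  final class SketchAsciiEncoder
      extends CharsetEncoder(StandardCharsets.US_ASCII, 1.0f, 1.0f) {

    override protected def encodeLoop(in: CharBuffer, out: ByteBuffer): CoderResult = {
      while (in.hasRemaining) {
        val c = in.get(in.position()) // peek without consuming
        // Report the problem; the base class applies the error action.
        if (c > 0x7f) return CoderResult.unmappableForLength(1)
        if (!out.hasRemaining) return CoderResult.OVERFLOW
        out.put(c.toByte)
        in.position(in.position() + 1)
      }
      CoderResult.UNDERFLOW
    }
  }

  // Encodes text, letting the base class substitute '?' for unmappable chars.
  def encodeWithReplacement(text: String): Array[Byte] = {
    val encoder = new SketchAsciiEncoder
    encoder.onUnmappableCharacter(CodingErrorAction.REPLACE)
    val out = encoder.encode(CharBuffer.wrap(text))
    val bytes = new Array[Byte](out.remaining())
    out.get(bytes)
    bytes
  }
}
```

Note that the replacement byte is written by CharsetEncoder, not by the loop, which is the logic the proxy approach avoids reimplementing.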
Value Members
- object BitsCharset3BitDFI336DUI001 extends BitsCharsetNonByteSize
- object BitsCharset3BitDFI746DUI002 extends BitsCharsetNonByteSize
- object BitsCharset3BitDFI747DUI001 extends BitsCharsetNonByteSize
- object BitsCharset4BitDFI746DUI002 extends BitsCharsetNonByteSize
- object BitsCharset5BitDFI1661DUI001 extends BitsCharsetNonByteSize
- object BitsCharset5BitDFI769DUI002 extends BitsCharsetNonByteSize
- object BitsCharset5BitPackedLSBF extends BitsCharsetNonByteSize
X-DFDL-5-BIT-PACKED-LSBF occupies only 5 bits with each code unit.
- object BitsCharset6BitDFI264DUI001 extends BitsCharsetNonByteSize
X-DFDL-6-BIT-DFI-264-DUI-001, special 6 bit encoding
- object BitsCharset6BitDFI311DUI002 extends BitsCharset6BitDFI311DUI002Base
- object BitsCharset6BitICAOAircraftID extends BitsCharset6BitDFI311DUI002Base
- object BitsCharsetAISPayloadArmoring extends BitsCharsetNonByteSize
Special purpose. This is not used for decoding anything. The encoder is used to convert strings using the characters allowed, into binary data using the AIS Payload Armoring described here:
http://catb.org/gpsd/AIVDM.html#_aivdm_aivdo_payload_armoring
Converting a string of N bytes yields 6N bits.
The decoder can be used for unit testing, but the point of this class is to make the encoder available for use in un-doing the AIS Payload armoring when parsing, and performing this armoring when unparsing.
When encoding from an 8-bit charset such as ASCII or ISO-8859-1, this can only encode characters that stay within the 64 allowed characters. dfdl:encodingErrorPolicy='error' would check this (once implemented); otherwise, wherever this is used, the checking needs to be done separately somehow.
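The 6N-bit relationship follows from the armoring rule in the linked AIVDM description: subtract 48 from the character's ASCII value and, if the result is greater than 40, subtract a further 8. The sketch below applies that rule to produce a bit string; sixBitValue and toBits are hypothetical helper names, and this is not the charset's actual encoder.

```scala
object AisArmoringSketch {
  // Maps one armored payload character to its 6-bit value (0 to 63).
  def sixBitValue(c: Char): Int = {
    require(
      (c >= '0' && c <= 'W') || (c >= '`' && c <= 'w'),
      s"not an AIS payload character: $c"
    )
    val v = c - 48
    if (v > 40) v - 8 else v
  }

  // Renders a payload as a string of '0' and '1', six bits per character.
  def toBits(payload: String): String =
    payload.map { c =>
      val bits = sixBitValue(c).toBinaryString
      "0" * (6 - bits.length) + bits
    }.mkString
}
```

The require call plays the role of the separate check mentioned above: characters outside the 64 allowed ones are rejected instead of being silently mis-encoded.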
- object BitsCharsetBase4LSBF extends BitsCharsetNonByteSize
Base 4 aka Quaternary
- object BitsCharsetBase4MSBF extends BitsCharsetNonByteSize
- object BitsCharsetBinaryLSBF extends BitsCharsetNonByteSize
X-DFDL-BITS-LSBF occupies only 1 bit with each code unit.
- object BitsCharsetBinaryMSBF extends BitsCharsetNonByteSize
X-DFDL-BITS-MSBF occupies only 1 bit with each code unit.
- object BitsCharsetDefinitionRegistry
- object BitsCharsetHexLSBF extends BitsCharsetNonByteSize
X-DFDL-HEX-LSBF occupies only 4 bits with each code unit.
- object BitsCharsetHexMSBF extends BitsCharsetNonByteSize
X-DFDL-HEX-MSBF occupies only 4 bits with each code unit.
- object BitsCharsetIBM037 extends BitsCharsetJava
- object BitsCharsetISO88591 extends BitsCharsetJava
- object BitsCharsetISO885918BitPackedLSBF extends BitsCharsetNonByteSize
X-DFDL-ISO-88591-8-BIT-PACKED-LSB-FIRST occupies only 8 bits with each code unit.
- object BitsCharsetISO885918BitPackedMSBF extends BitsCharsetNonByteSize
X-DFDL-ISO-88591-8-BIT-PACKED-MSB-FIRST occupies only 8 bits with each code unit.
- object BitsCharsetOctalLSBF extends BitsCharsetNonByteSize
X-DFDL-OCTAL-LSBF occupies only 3 bits with each code unit.
- object BitsCharsetOctalMSBF extends BitsCharsetNonByteSize
X-DFDL-OCTAL-MSBF occupies only 3 bits with each code unit.
- object BitsCharsetUSASCII extends BitsCharsetJava
- object BitsCharsetUSASCII6BitPackedLSBF extends BitsCharsetNonByteSize
X-DFDL-US-ASCII-6-BIT-PACKED occupies only 6 bits with each code unit.
- object BitsCharsetUSASCII6BitPackedMSBF extends BitsCharsetNonByteSize
- object BitsCharsetUSASCII7BitPacked extends BitsCharsetNonByteSize
X-DFDL-US-ASCII-7-BIT-PACKED occupies only 7 bits with each code unit.
- object BitsCharsetUTF16BE extends BitsCharsetJava
- object BitsCharsetUTF16LE extends BitsCharsetJava
- object BitsCharsetUTF32BE extends BitsCharsetJava
- object BitsCharsetUTF32LE extends BitsCharsetJava
- object BitsCharsetUTF8 extends BitsCharsetJava
- object CharsetUtils
- object StandardBitsCharsets
Provides BitsCharset objects corresponding to the usual java charsets found in StandardCharsets.