Package org.dishevelled.bio.sequence
Class Sequences
java.lang.Object
org.dishevelled.bio.sequence.Sequences
Utility methods on sequences.
- Since:
- 1.1
- Author:
- Michael Heuer
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
static String | decode(ByteBuffer bytes, int length) | Decode the specified byte buffer as an unambiguous DNA sequence of the specified length as a string.
static <T extends Appendable> T | decode(ByteBuffer bytes, int length, T appendable) | Decode the specified byte buffer as an unambiguous DNA sequence of the specified length to the specified appendable.
static String | decodeWithAmbiguity(ByteBuffer bytes, int length) | Decode the specified byte buffer as a DNA sequence with ambiguity symbols of the specified length as a string.
static <T extends Appendable> T | decodeWithAmbiguity(ByteBuffer bytes, int length, T appendable) | Decode the specified byte buffer as a DNA sequence with ambiguity symbols of the specified length to the specified appendable.
static String | decodeWithNs(ByteBuffer bytes, int length) | Decode the specified byte buffer as a DNA sequence with N ambiguity symbols of the specified length as a string.
static <T extends Appendable> T | decodeWithNs(ByteBuffer bytes, int length, T appendable) | Decode the specified byte buffer as a DNA sequence with N ambiguity symbols of the specified length to the specified appendable.
static ByteBuffer | encode(String sequence) | Encode the specified unambiguous DNA sequence to a new byte buffer.
static ByteBuffer | encode(String sequence, ByteBuffer bytes) | Encode the specified unambiguous DNA sequence to the specified byte buffer.
static ByteBuffer | encodeWithAmbiguity(String sequence) | Encode the specified DNA sequence with ambiguity symbols to a new byte buffer.
static ByteBuffer | encodeWithAmbiguity(String sequence, ByteBuffer bytes) | Encode the specified DNA sequence with ambiguity symbols to the specified byte buffer.
static ByteBuffer | encodeWithNs(String sequence) | Encode the specified DNA sequence with N ambiguity symbols to a new byte buffer.
static ByteBuffer | encodeWithNs(String sequence, ByteBuffer bytes) | Encode the specified DNA sequence with N ambiguity symbols to the specified byte buffer.
(package private) static String | formatBits(byte b) |
-
Constructor Details
-
Sequences
public Sequences()
-
-
Method Details
-
decode
Decode the specified byte buffer as an unambiguous DNA sequence of the specified length as a string.
- Parameters:
- bytes - byte buffer, must not be null
- length - length, must be at least 0
- Returns:
- the specified byte buffer decoded as an unambiguous DNA sequence of the specified length as a string
- Throws:
- IOException - if an I/O error occurs
-
decode
public static <T extends Appendable> T decode(ByteBuffer bytes, int length, T appendable) throws IOException
Decode the specified byte buffer as an unambiguous DNA sequence of the specified length to the specified appendable.
- Type Parameters:
- T - appendable type
- Parameters:
- bytes - byte buffer, must not be null
- length - length, must be at least 0
- appendable - appendable to decode to, must not be null
- Returns:
- the specified byte buffer decoded as an unambiguous DNA sequence of the specified length to the specified appendable
- Throws:
- IOException - if an I/O error occurs
-
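The appendable overload of decode can be pictured with a short sketch. This is not the library's implementation: the class and the helpers `unpack` and `unpackToString` are hypothetical names, and the 2-bit layout (T - 00, C - 01, A - 10, G - 11, first base in the most significant bits) is taken from the encode documentation on this page. The sketch assumes decoded symbols are upper case.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

final class TwoBitDecodeSketch {
    // Symbols indexed by their 2-bit code: T - 00, C - 01, A - 10, G - 11.
    private static final char[] SYMBOLS = { 'T', 'C', 'A', 'G' };

    /** Hypothetical helper, not part of org.dishevelled.bio.sequence. */
    static <T extends Appendable> T unpack(ByteBuffer bytes, int length, T appendable) throws IOException {
        for (int i = 0; i < length; i++) {
            // Absolute get, so the buffer position is left unchanged.
            int b = bytes.get(i / 4) & 0xff;
            // The first base in each byte occupies the most significant 2 bits.
            int shift = 6 - 2 * (i % 4);
            appendable.append(SYMBOLS[(b >> shift) & 0x3]);
        }
        return appendable;
    }

    /** Hypothetical convenience wrapper that decodes to a string. */
    static String unpackToString(ByteBuffer bytes, int length) {
        try {
            return unpack(bytes, length, new StringBuilder(length)).toString();
        } catch (IOException e) {
            // StringBuilder does not throw, but Appendable.append declares IOException.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteBuffer bytes = ByteBuffer.wrap(new byte[] { 0x1b }); // 00011011
        System.out.println(unpackToString(bytes, 4)); // prints TCAG
    }
}
```

Returning the appendable that was passed in lets a caller decode into a `StringBuilder`, a `Writer`, or any other `Appendable` and keep working with it; `IOException` is declared because `Appendable.append` declares it.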
encode
Encode the specified unambiguous DNA sequence to a new byte buffer. Valid unambiguous DNA sequence symbols are { A, C, G, T, a, c, g, t }. Similar to the twoBit format, the DNA symbols are packed at two bits per base, represented as follows: T - 00, C - 01, A - 10, G - 11. The first base is in the most significant 2 bits of the byte; the last base is in the least significant 2 bits. For example, the sequence TCAG is represented as 00011011.
- Parameters:
- sequence - unambiguous DNA sequence to encode, must not be null
- Returns:
- the specified unambiguous DNA sequence encoded to a new byte buffer
- Throws:
- IllegalArgumentException - if the specified sequence contains any ambiguity symbols
-
encode
Encode the specified unambiguous DNA sequence to the specified byte buffer. Valid unambiguous DNA sequence symbols are { A, C, G, T, a, c, g, t }. Similar to the twoBit format, the DNA symbols are packed at two bits per base, represented as follows: T - 00, C - 01, A - 10, G - 11. The first base is in the most significant 2 bits of the byte; the last base is in the least significant 2 bits. For example, the sequence TCAG is represented as 00011011.
- Parameters:
- sequence - unambiguous DNA sequence to encode, must not be null
- bytes - byte buffer, must not be null
- Returns:
- the specified unambiguous DNA sequence encoded to the specified byte buffer
- Throws:
- IllegalArgumentException - if the specified sequence contains any ambiguity symbols
-
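The two-bit packing described for encode can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `TwoBitPackSketch`, `code`, and `pack` are hypothetical names, and the sketch assumes that a final partial byte is padded with zero bits in its low-order positions, which the documentation above does not specify.

```java
import java.nio.ByteBuffer;

final class TwoBitPackSketch {
    /** Hypothetical helper: 2-bit code for an unambiguous DNA symbol. */
    static int code(char symbol) {
        switch (Character.toUpperCase(symbol)) {
            case 'T': return 0; // 00
            case 'C': return 1; // 01
            case 'A': return 2; // 10
            case 'G': return 3; // 11
            default:
                throw new IllegalArgumentException("not an unambiguous DNA symbol: " + symbol);
        }
    }

    /** Hypothetical helper: pack four bases per byte, first base in the high bits. */
    static ByteBuffer pack(String sequence) {
        // Assumption: unused low-order bits of the last byte stay zero.
        byte[] packed = new byte[(sequence.length() + 3) / 4];
        for (int i = 0; i < sequence.length(); i++) {
            int shift = 6 - 2 * (i % 4);
            packed[i / 4] |= (byte) (code(sequence.charAt(i)) << shift);
        }
        return ByteBuffer.wrap(packed);
    }

    public static void main(String[] args) {
        ByteBuffer bytes = pack("TCAG");
        // Setting bit 8 and dropping it again keeps the leading zeros.
        String bits = Integer.toBinaryString((bytes.get(0) & 0xff) | 0x100).substring(1);
        System.out.println(bits); // prints 00011011
    }
}
```

Because four bases share a byte, the sequence length cannot be recovered from the buffer alone, which is why the decode methods take a separate length argument.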
decodeWithNs
Decode the specified byte buffer as a DNA sequence with N ambiguity symbols of the specified length as a string.
- Parameters:
- bytes - byte buffer, must not be null
- length - length, must be at least 0
- Returns:
- the specified byte buffer decoded as a DNA sequence with N ambiguity symbols of the specified length as a string
- Throws:
- IOException - if an I/O error occurs
-
decodeWithNs
public static <T extends Appendable> T decodeWithNs(ByteBuffer bytes, int length, T appendable) throws IOException
Decode the specified byte buffer as a DNA sequence with N ambiguity symbols of the specified length to the specified appendable.
- Type Parameters:
- T - appendable type
- Parameters:
- bytes - byte buffer, must not be null
- length - length, must be at least 0
- appendable - appendable to decode to, must not be null
- Returns:
- the specified byte buffer decoded as a DNA sequence with N ambiguity symbols of the specified length to the specified appendable
- Throws:
- IOException - if an I/O error occurs
-
encodeWithNs
Encode the specified DNA sequence with N ambiguity symbols to a new byte buffer. Valid symbols for a DNA sequence with N ambiguity symbols are { A, C, G, T, N, a, c, g, t, n }. Similar to the .nib format, the DNA symbols are packed two bases to the byte. The first base is packed in the high-order 4 bits (nibble); the second base is packed in the low-order 4 bits: byte = (base0<<4) + base1. The numerical representations for the bases are T - 0, C - 1, A - 2, G - 3, N - 4.
- Parameters:
- sequence - DNA sequence with N ambiguity symbols to encode, must not be null
- Returns:
- the specified DNA sequence with N ambiguity symbols encoded to a new byte buffer
- Throws:
- IllegalArgumentException - if the specified sequence contains any ambiguity symbols other than { N, n }
-
encodeWithNs
Encode the specified DNA sequence with N ambiguity symbols to the specified byte buffer. Valid symbols for a DNA sequence with N ambiguity symbols are { A, C, G, T, N, a, c, g, t, n }. Similar to the .nib format, the DNA symbols are packed two bases to the byte. The first base is packed in the high-order 4 bits (nibble); the second base is packed in the low-order 4 bits: byte = (base0<<4) + base1. The numerical representations for the bases are T - 0, C - 1, A - 2, G - 3, N - 4.
- Parameters:
- sequence - DNA sequence with N ambiguity symbols to encode, must not be null
- bytes - byte buffer, must not be null
- Returns:
- the specified DNA sequence with N ambiguity symbols encoded to the specified byte buffer
- Throws:
- IllegalArgumentException - if the specified sequence contains any ambiguity symbols other than { N, n }
-
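The formula byte = (base0<<4) + base1 given for encodeWithNs can be worked through in a small sketch. The names `NibblePackSketch`, `code`, and `pack` are hypothetical and not part of this package, and the handling of an odd-length sequence (low nibble of the last byte left as 0) is an assumption, since the documentation above does not say how the last byte is padded.

```java
import java.nio.ByteBuffer;

final class NibblePackSketch {
    /** Hypothetical helper: numerical representation T - 0, C - 1, A - 2, G - 3, N - 4. */
    static int code(char symbol) {
        switch (Character.toUpperCase(symbol)) {
            case 'T': return 0;
            case 'C': return 1;
            case 'A': return 2;
            case 'G': return 3;
            case 'N': return 4;
            default:
                throw new IllegalArgumentException("not one of A, C, G, T, N: " + symbol);
        }
    }

    /** Hypothetical helper: pack two bases per byte, first base in the high nibble. */
    static ByteBuffer pack(String sequence) {
        byte[] packed = new byte[(sequence.length() + 1) / 2];
        for (int i = 0; i < sequence.length(); i += 2) {
            int base0 = code(sequence.charAt(i));
            // Assumption: an odd-length sequence leaves the last low nibble as 0.
            int base1 = (i + 1 < sequence.length()) ? code(sequence.charAt(i + 1)) : 0;
            packed[i / 2] = (byte) ((base0 << 4) + base1);
        }
        return ByteBuffer.wrap(packed);
    }

    public static void main(String[] args) {
        ByteBuffer bytes = pack("ANGT");
        // A = 2, N = 4 gives 0x24; G = 3, T = 0 gives 0x30.
        System.out.println(String.format("%02x %02x", bytes.get(0), bytes.get(1))); // prints 24 30
    }
}
```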
decodeWithAmbiguity
Decode the specified byte buffer as a DNA sequence with ambiguity symbols of the specified length as a string.
- Parameters:
- bytes - byte buffer, must not be null
- length - length, must be at least 0
- Returns:
- the specified byte buffer decoded as a DNA sequence with ambiguity symbols of the specified length as a string
- Throws:
- IOException - if an I/O error occurs
- Since:
- 1.2
-
decodeWithAmbiguity
public static <T extends Appendable> T decodeWithAmbiguity(ByteBuffer bytes, int length, T appendable) throws IOException
Decode the specified byte buffer as a DNA sequence with ambiguity symbols of the specified length to the specified appendable.
- Type Parameters:
- T - appendable type
- Parameters:
- bytes - byte buffer, must not be null
- length - length, must be at least 0
- appendable - appendable to decode to, must not be null
- Returns:
- the specified byte buffer decoded as a DNA sequence with ambiguity symbols of the specified length to the specified appendable
- Throws:
- IOException - if an I/O error occurs
- Since:
- 1.2
-
encodeWithAmbiguity
Encode the specified DNA sequence with ambiguity symbols to a new byte buffer. Per the BAM specification, ambiguity symbols { =, A, a, C, c, M, m, G, g, R, r, S, s, V, v, T, t, W, w, Y, y, H, h, K, k, D, d, B, b, N, n } are mapped to 4-bit values in the range [0, 15], with other characters mapped to N; high nibble first (the first symbol is in the highest 4 bits of the first byte).
- Parameters:
- sequence - DNA sequence with ambiguity symbols to encode, must not be null
- Returns:
- the specified DNA sequence with ambiguity symbols encoded to a new byte buffer
- Since:
- 1.2
-
encodeWithAmbiguity
Encode the specified DNA sequence with ambiguity symbols to the specified byte buffer. Per the BAM specification, ambiguity symbols { =, A, a, C, c, M, m, G, g, R, r, S, s, V, v, T, t, W, w, Y, y, H, h, K, k, D, d, B, b, N, n } are mapped to 4-bit values in the range [0, 15], with other characters mapped to N; high nibble first (the first symbol is in the highest 4 bits of the first byte).
- Parameters:
- sequence - DNA sequence with ambiguity symbols to encode, must not be null
- bytes - byte buffer, must not be null
- Returns:
- the specified DNA sequence with ambiguity symbols encoded to the specified byte buffer
- Since:
- 1.2
-
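The BAM-style mapping used by encodeWithAmbiguity and decodeWithAmbiguity can be sketched with a lookup string, where the index of each symbol in "=ACMGRSVTWYHKDBN" is its 4-bit value. The class `BamNibbleSketch` and its methods are hypothetical and only illustrate the mapping described above. The sketch assumes that decoding yields upper-case symbols and that an odd-length sequence leaves the last low nibble as 0; neither is stated in the documentation above.

```java
import java.nio.ByteBuffer;

final class BamNibbleSketch {
    // Index in this string is the 4-bit value: '=' is 0, 'A' is 1, ..., 'N' is 15.
    private static final String CODES = "=ACMGRSVTWYHKDBN";

    /** Hypothetical helper: 4-bit value for a symbol, case-insensitive. */
    static int code(char symbol) {
        int index = CODES.indexOf(Character.toUpperCase(symbol));
        return index < 0 ? 15 : index; // other characters are mapped to N
    }

    /** Hypothetical helper: pack two symbols per byte, high nibble first. */
    static ByteBuffer pack(String sequence) {
        byte[] packed = new byte[(sequence.length() + 1) / 2];
        for (int i = 0; i < sequence.length(); i++) {
            int shift = (i % 2 == 0) ? 4 : 0;
            packed[i / 2] |= (byte) (code(sequence.charAt(i)) << shift);
        }
        return ByteBuffer.wrap(packed);
    }

    /** Hypothetical helper: unpack the first length symbols as upper case. */
    static String unpack(ByteBuffer bytes, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int b = bytes.get(i / 2) & 0xff;
            int nibble = (i % 2 == 0) ? (b >> 4) : (b & 0x0f);
            sb.append(CODES.charAt(nibble));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'x' is not a valid symbol, so it round-trips as N.
        System.out.println(unpack(pack("acgtRYx"), 7)); // prints ACGTRYN
    }
}
```

Unlike encode and encodeWithNs, this mapping has no invalid input: any unrecognized character becomes N, which matches the absence of an IllegalArgumentException in the documentation above.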
formatBits
-