public class ScoringMatrix
extends ScoringScheme
This class implements a scoring scheme based on a substitution matrix. It is useful for representing the PAM and BLOSUM families of amino acid scoring matrices. Its constructor loads such a matrix from a file (or any other character stream). The following is an extract of a BLOSUM62 scoring matrix file:
A R N D C Q E G H I L K M F P S T W Y V B Z X *
A 4 -1 -2 -2 0 -1 -1 0 -2 -1 -1 -1 -1 -2 -1 1 0 -3 -2 0 -2 -1 0 -4
R -1 5 0 -2 -3 1 0 -2 0 -3 -2 2 -1 -3 -2 -1 -1 -3 -2 -3 -1 0 -1 -4
...
B -2 -1 3 4 -3 0 1 -1 0 -3 -4 0 -3 -3 -2 0 -1 -4 -3 -3 4 1 -1 -4
Z -1 0 0 1 -3 3 4 -2 0 -3 -3 1 -1 -3 -1 0 -1 -3 -2 -2 1 4 -1 -4
X 0 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2 0 0 -2 -1 -1 -1 -1 -1 -4
* -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 1
Matrices are expected to follow this format. They must have one row and one column for each defined character (not necessarily in the same order). Each row and column must start with a distinct character (no repetition), and every row character must have a corresponding column, and vice versa.
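The format rules above can be illustrated with a minimal, self-contained sketch. This is not the library's parser: the class MatrixFormatSketch, its Parsed holder and the parse method are hypothetical names, IllegalArgumentException stands in for the library's own exception type, and comment lines are not handled.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the format rules; not the library's implementation.
final class MatrixFormatSketch {

    /** Headers in file order plus the score table (all names are illustrative). */
    static final class Parsed {
        final String colCodes;
        final String rowCodes;
        final int[][] scores;

        Parsed(String colCodes, String rowCodes, int[][] scores) {
            this.colCodes = colCodes;
            this.rowCodes = rowCodes;
            this.scores = scores;
        }
    }

    static Parsed parse(Reader input) {
        String cols = null;                      // column headers, set by the first line
        StringBuilder rows = new StringBuilder();
        List<int[]> table = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(input)) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) continue;
                String[] tok = line.split("\\s+");
                if (cols == null) {
                    // Header line: one distinct single-character token per column.
                    StringBuilder sb = new StringBuilder();
                    for (String t : tok) {
                        if (t.length() != 1 || sb.indexOf(t) >= 0)
                            throw new IllegalArgumentException("bad or repeated column: " + t);
                        sb.append(t);
                    }
                    cols = sb.toString();
                    continue;
                }
                // Data row: a distinct header character that also names a column,
                // followed by exactly one integer per column.
                String head = tok[0];
                if (head.length() != 1 || rows.indexOf(head) >= 0)
                    throw new IllegalArgumentException("bad or repeated row: " + head);
                if (cols.indexOf(head) < 0)
                    throw new IllegalArgumentException("row without column: " + head);
                if (tok.length != cols.length() + 1)
                    throw new IllegalArgumentException("wrong number of values in row " + head);
                int[] values = new int[cols.length()];
                for (int j = 0; j < values.length; j++)
                    values[j] = Integer.parseInt(tok[j + 1]);
                rows.append(head);
                table.add(values);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Equal counts plus the checks above give a one-to-one row/column mapping.
        if (cols == null || rows.length() != cols.length())
            throw new IllegalArgumentException("matrix is not square");
        return new Parsed(cols, rows.toString(), table.toArray(new int[0][]));
    }

    public static void main(String[] args) {
        Parsed p = parse(new StringReader("A R *\nA 4 -1 -4\nR -1 5 -4\n* -4 -4 1\n"));
        System.out.println(p.rowCodes + " / " + p.colCodes + ", A->R = " + p.scores[0][1]);
    }
}
```

The sample input in main is the A, R and * sub-table of the BLOSUM62 extract shown earlier.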
The value at position (i,j) represents the score of substituting the character of row i for the character of column j. Insertion penalties are specified by the last row, while deletion penalties must be located in the last column (both represented by the special character defined by the INDEL_CHAR constant). Note that only an additive gap cost function is supported. If any of these rules is not followed, the constructor throws an InvalidScoringMatrixException.
If a scoring operation (substitution, insertion or deletion) involves a character not found in the matrix, an IncompatibleScoringSchemeException is thrown.
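As a rough illustration of how the three scoring operations map onto matrix cells, the following sketch hard-codes the A, R and * entries of the BLOSUM62 extract. It assumes the indel character is '*', as the extract suggests. The class MatrixLookupSketch and its members are hypothetical, and IllegalArgumentException stands in for the library's exception.

```java
// Hypothetical sketch of cell lookup; not the library's implementation.
final class MatrixLookupSketch {
    static final char INDEL = '*';           // assumed value of the indel character
    static final String ROW_CODES = "AR*";   // row headers in file order
    static final String COL_CODES = "AR*";   // column headers in file order
    static final int[][] MATRIX = {
        {  4, -1, -4 },   // row A
        { -1,  5, -4 },   // row R
        { -4, -4,  1 },   // row *: insertion penalties
    };

    /** Score of substituting a (row) for b (column). */
    static int scoreSubstitution(char a, char b) {
        int i = ROW_CODES.indexOf(a);
        int j = COL_CODES.indexOf(b);
        if (i < 0 || j < 0)
            throw new IllegalArgumentException("undefined substitution: " + a + " for " + b);
        return MATRIX[i][j];
    }

    /** Insertion penalty: read from the indel row. */
    static int scoreInsertion(char a) {
        return scoreSubstitution(INDEL, a);
    }

    /** Deletion penalty: read from the indel column. */
    static int scoreDeletion(char a) {
        return scoreSubstitution(a, INDEL);
    }

    public static void main(String[] args) {
        System.out.println(scoreSubstitution('A', 'R')); // -1
        System.out.println(scoreInsertion('A'));         // -4
        System.out.println(scoreDeletion('R'));          // -4
    }
}
```

Looking up the row and column indices separately keeps the sketch correct even when rows and columns appear in different orders.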
protected static char INDEL_CHAR
The character that indicates the row and column for insertion and deletion penalties in the matrix.
protected static char COMMENT_CHAR
The character used to start a comment line in the scoring matrix file.
protected java.lang.String col_codes
Stores matrix column headers in the order they were found.
protected java.lang.String row_codes
Stores matrix row headers in the order they were found.
protected int[][] matrix
Stores values for each operation (substitution, insertion or deletion) defined by this matrix.
protected int dimension
Dimension of the (square) matrix.
protected int max_absolute_score
The maximum absolute score that this matrix can return for any substitution, deletion or insertion.
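One plausible way to obtain such a value is a single pass over all cells, including the indel row and column. The sketch below assumes that approach; the class and method names are hypothetical, and the source does not say how or when the library computes this field.

```java
// Hypothetical sketch; the library may compute this value differently.
final class MaxAbsoluteScoreSketch {

    /** Largest absolute value found in any cell of the matrix. */
    static int maxAbsoluteScore(int[][] matrix) {
        int max = 0;
        for (int[] row : matrix) {
            for (int value : row) {
                max = Math.max(max, Math.abs(value));
            }
        }
        return max;
    }

    public static void main(String[] args) {
        int[][] sample = {
            {  4, -1, -4 },
            { -1,  5, -4 },
            { -4, -4,  1 },
        };
        System.out.println(maxAbsoluteScore(sample)); // 5
    }
}
```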
public ScoringMatrix(java.io.Reader input)
Creates a new instance of a substitution matrix loaded from the character stream. The case of characters is significant when subsequently computing their score.
Parameters:
input - character stream from where the matrix is read
Throws:
IOException - if an I/O operation fails when reading from input
InvalidScoringMatrixException - if the matrix does not comply with the specification
public ScoringMatrix(java.io.Reader input, boolean case_sensitive)
Creates a new instance of a substitution matrix loaded from the character stream. If case_sensitive is true, the case of characters is significant when subsequently computing their score; otherwise the case is ignored.
Parameters:
input - character stream from where the matrix is read
case_sensitive - true if the case of characters must be significant, false otherwise
Throws:
IOException - if an I/O operation fails when reading from input
InvalidScoringMatrixException - if the matrix does not comply with the specification
public int scoreSubstitution(char a, char b)
Returns the score of a substitution of character a for character b according to this scoring matrix.
Parameters:
a - first character
b - second character
Returns:
the score of substituting a for b
Throws:
IncompatibleScoringSchemeException - if this substitution is not defined
public int scoreInsertion(char a)
Returns the score of an insertion of character a according to this scoring matrix.
Parameters:
a - character to be inserted
Returns:
the score of inserting a
Throws:
IncompatibleScoringSchemeException - if this character is not recognised
public int scoreDeletion(char a)
Returns the score of a deletion of character a according to this scoring matrix.
Parameters:
a - character to be deleted
Returns:
the score of deleting a
Throws:
IncompatibleScoringSchemeException - if this character is not recognised
public boolean isPartialMatchSupported()
Tells whether this scoring scheme supports partial matches, which it does, although a particular scoring matrix loaded by this instance might not. A partial match occurs when two characters are not equal but are nevertheless regarded as similar by this scoring scheme, which then returns a positive score. This is common for amino acid scoring matrices.
Returns:
true
public int maxAbsoluteScore()
Returns the maximum absolute score that this scoring scheme can return for any substitution, deletion or insertion.
public java.lang.String toString()
Returns a String representation of this scoring matrix.