public class GenotypeTableUtils
Utility methods for comparing, sorting, and counting genotypes.
public static kotlin.Array[] getAllelesSortedByFrequency(kotlin.Array[] data)
Sorts alleles by frequency. Each cell in the given array contains a diploid value, which is split into its two alleles so that each allele is counted individually. The resulting two-dimensional array holds the alleles (bytes) in result[0] and their counts in result[1]. Counts haploid values twice and diploid values once. Higher ploidy levels are not supported.
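The counting described above can be sketched in plain Java. This is an illustration only, not the library's implementation: the class and method names are hypothetical, the result is assumed to be an int[][], each diploid byte is assumed to hold one allele in its high 4 bits and the other in its low 4 bits, and the allele code 0xF is assumed to mean "unknown" and is skipped.

```java
import java.util.ArrayList;
import java.util.List;

public class AlleleFrequencySketch {

    // Assumption: 0xF is the code for an unknown allele.
    static final int UNKNOWN_ALLELE = 0xF;

    /** Returns alleles in result[0] and their counts in result[1], most frequent first. */
    public static int[][] allelesSortedByFrequency(byte[] data) {
        int[] counts = new int[16];
        for (byte diploid : data) {
            counts[(diploid >>> 4) & 0xF]++; // allele stored in the high 4 bits
            counts[diploid & 0xF]++;         // allele stored in the low 4 bits
        }
        // Collect the alleles that were observed, ignoring the unknown code.
        List<Integer> present = new ArrayList<>();
        for (int allele = 0; allele < 16; allele++) {
            if (allele != UNKNOWN_ALLELE && counts[allele] > 0) {
                present.add(allele);
            }
        }
        // Stable sort by descending count; ties keep ascending allele-code order.
        present.sort((x, y) -> Integer.compare(counts[y], counts[x]));
        int[][] result = new int[2][present.size()];
        for (int i = 0; i < present.size(); i++) {
            result[0][i] = present.get(i);
            result[1][i] = counts[present.get(i)];
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] r = allelesSortedByFrequency(new byte[] {0x00, 0x01, 0x11, (byte) 0xFF, 0x03});
        for (int i = 0; i < r[0].length; i++) {
            System.out.println("allele " + r[0][i] + " count " + r[1][i]);
        }
    }
}
```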
data - data
public static kotlin.Array[] getAllelesSortedByFrequency(kotlin.Array[] data,
int site)
Sorts the alleles at a given site by frequency. Each cell in the given array contains a diploid value, which is split into its two alleles so that each allele is counted individually. The resulting two-dimensional array holds the alleles (bytes) in result[0] and their counts in result[1]. Counts haploid values twice and diploid values once. Higher ploidy levels are not supported.
data - data
site - site
public static kotlin.Array[] getAlleles(kotlin.Array[] data,
int site)
public static java.lang.Object[] getAllelesSortedByFrequency(java.lang.String[] data,
int site)
public static java.util.List<java.lang.String> convertNucleotideGenotypesToStringList(kotlin.Array[] data)
Converts the byte representation of genotypes to a list of genotype strings.
data -
public static java.util.List<java.lang.String> getAlleles(java.lang.String[] data,
int site)
public static kotlin.Array[] getAllelesSortedByFrequency(GenotypeTable alignment, int site)
public static java.lang.Object[] getDiploidsSortedByFrequency(GenotypeTable alignment, int site)
public static java.lang.String[] getAlleleStates(java.lang.String[] data,
int maxNumAlleles)
public static GenotypeTable removeSitesBasedOnFreqIgnoreMissing(GenotypeTable aa, double minimumProportion, double maximumProportion, int minimumCount)
Removes sites based on minimum frequency (the count of good bases, INCLUDING GAPS) and on the proportion of good alleles (including gaps) that differ from the consensus.
aa - the AnnotatedAlignment to filter
minimumProportion - minimum proportion of sites different from the consensus
minimumCount - minimum number of sequences with a good base (not N or ?), where GAP IS CONSIDERED A GOOD BASE
public static GenotypeTable filterSitesByBedFile(GenotypeTable input, java.lang.String bedFile, boolean includeSites)
This returns the subset of genotypes specified by the given BED file. Start positions are inclusive and end positions are exclusive. If includeSites is false, this returns everything except the subset specified by the BED file.
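The half-open interval rule and the includeSites switch can be illustrated with a small stand-alone sketch. It is not the library's code: the class and method names are hypothetical, it handles a single interval on a single chromosome, and it assumes site positions are compared to the interval bounds directly, with no conversion between coordinate systems.

```java
import java.util.ArrayList;
import java.util.List;

public class BedIntervalSketch {

    /**
     * Returns the indices of the sites to keep.
     * A site is inside the interval when start <= position < end
     * (start inclusive, end exclusive).
     */
    public static List<Integer> selectSites(int[] positions, int start, int end, boolean includeSites) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < positions.length; i++) {
            boolean inInterval = positions[i] >= start && positions[i] < end;
            // includeSites == true keeps sites inside the interval;
            // includeSites == false keeps everything outside it.
            if (inInterval == includeSites) {
                kept.add(i);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        int[] positions = {5, 10, 15, 20};
        System.out.println(selectSites(positions, 10, 20, true));
        System.out.println(selectSites(positions, 10, 20, false));
    }
}
```

With positions 5, 10, 15, 20 and the interval 10 to 20, position 10 is kept and position 20 is not when includeSites is true, because the end is exclusive.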
input - original genotype
bedFile - BED file specifying subset
includeSites - whether to include sites
public static GenotypeTable filterSitesByChrPos(GenotypeTable input, PositionList positionList, boolean includeSites)
public static GenotypeTable filterSitesByChrPos(GenotypeTable input, java.lang.String filename, boolean includeSites)
public static GenotypeTable keepSitesChrPos(GenotypeTable input, java.lang.String chromosome, java.util.List<java.lang.Integer> position)
public static kotlin.Array[] getIncludedSitesBasedOnFreqIgnoreMissing(GenotypeTable aa, double minimumProportion, double maximumProportion, int minimumCount)
Gets the sites to be included based on minimum frequency (the count of good bases, INCLUDING GAPS) and on the proportion of good sites (INCLUDING GAPS) that differ from the consensus.
aa - the AnnotatedAlignment to filter
minimumProportion - minimum proportion of sites different from the consensus
maximumProportion - maximum proportion of sites different from the consensus
minimumCount - minimum number of sequences with a good base or a gap (but not N or ?)
public static boolean isHeterozygous(byte diploidAllele)
Returns whether diploid allele values are heterozygous. The first 4 bits of the byte hold one allele value; the second 4 bits hold the other allele value.
diploidAllele - alleles
public static boolean isHomozygous(byte diploidAllele)
Returns whether diploid allele values are homozygous. Unknown values return false.
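The two zygosity checks can be sketched as comparisons of the two 4-bit halves of the byte. This is a hedged illustration rather than the library's code: the class name is hypothetical, and the unknown diploid value is assumed to be 0xFF (both halves set to 0xF).

```java
public class ZygositySketch {

    // Assumption: a diploid value of 0xFF means both alleles are unknown.
    static final byte UNKNOWN_DIPLOID = (byte) 0xFF;

    /** True when the two 4-bit allele values differ. */
    public static boolean isHeterozygous(byte diploid) {
        return ((diploid >>> 4) & 0xF) != (diploid & 0xF);
    }

    /** True when the two 4-bit allele values match; unknown values return false. */
    public static boolean isHomozygous(byte diploid) {
        if (diploid == UNKNOWN_DIPLOID) {
            return false;
        }
        return ((diploid >>> 4) & 0xF) == (diploid & 0xF);
    }

    public static void main(String[] args) {
        System.out.println(isHeterozygous((byte) 0x01));
        System.out.println(isHomozygous((byte) 0x11));
        System.out.println(isHomozygous(UNKNOWN_DIPLOID));
    }
}
```

Note that under this assumption the unknown value is neither heterozygous nor homozygous, so the two checks are not strict complements of each other.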
diploidAllele -
public static boolean isEqual(kotlin.Array[] alleles1,
kotlin.Array[] alleles2)
Returns whether two diploid allele values are equal ignoring order.
alleles1 - diploid alleles 1
alleles2 - diploid alleles 2
public static boolean isEqual(byte diploidAllele1,
byte diploidAllele2)
Returns whether two diploid allele values are equal ignoring order.
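An order-insensitive comparison of two diploid bytes might look like the sketch below. The class name is hypothetical, and the layout (one allele in the high 4 bits, the other in the low 4 bits) is taken from the isHeterozygous description; this is an illustration, not the library's implementation.

```java
public class UnorderedEqualSketch {

    /** True when both bytes carry the same pair of alleles, in either order. */
    public static boolean isEqual(byte d1, byte d2) {
        int a1 = (d1 >>> 4) & 0xF;
        int b1 = d1 & 0xF;
        int a2 = (d2 >>> 4) & 0xF;
        int b2 = d2 & 0xF;
        // Same order, or the two alleles swapped.
        return (a1 == a2 && b1 == b2) || (a1 == b2 && b1 == a2);
    }

    public static void main(String[] args) {
        System.out.println(isEqual((byte) 0x01, (byte) 0x10));
        System.out.println(isEqual((byte) 0x01, (byte) 0x11));
    }
}
```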
diploidAllele1 - diploid alleles 1
diploidAllele2 - diploid alleles 2
public static boolean isEqualOrUnknown(kotlin.Array[] alleles1,
kotlin.Array[] alleles2)
Returns whether two diploid allele values are equal ignoring order, where unknown values equal anything.
alleles1 - diploid alleles 1
alleles2 - diploid alleles 2
public static boolean isEqualOrUnknown(byte diploidAllele1,
byte diploidAllele2)
Returns whether two diploid allele values are equal ignoring order, where unknown values equal anything.
diploidAllele1 - diploid alleles 1
diploidAllele2 - diploid alleles 2
public static boolean isPartiallyEqual(byte genotype1,
byte genotype2)
Returns true if at least one allele agrees.
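A partial match can be read as "the two genotypes share at least one allele". The sketch below illustrates that reading under the 4-bit-per-allele layout; the class name is hypothetical, and because the description does not say how unknown alleles are treated, the sketch does not special-case them.

```java
public class PartialEqualSketch {

    /** True when any allele of g1 equals any allele of g2. */
    public static boolean isPartiallyEqual(byte g1, byte g2) {
        int a1 = (g1 >>> 4) & 0xF;
        int b1 = g1 & 0xF;
        int a2 = (g2 >>> 4) & 0xF;
        int b2 = g2 & 0xF;
        return a1 == a2 || a1 == b2 || b1 == a2 || b1 == b2;
    }

    public static void main(String[] args) {
        System.out.println(isPartiallyEqual((byte) 0x01, (byte) 0x12));
        System.out.println(isPartiallyEqual((byte) 0x01, (byte) 0x23));
    }
}
```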
genotype1 -
genotype2 -
public static boolean areEncodingsEqual(java.lang.String[] encodings)
public static byte getDiploidValuePhased(byte a,
byte b)
Combines two allele values into one diploid value. Assumed phased.
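Packing two allele values into one byte while preserving their order can be sketched as follows. The class and method names are hypothetical, and placing allele a in the high 4 bits and allele b in the low 4 bits is an assumption based on the layout described for isHeterozygous.

```java
public class DiploidPackSketch {

    /** Packs allele a into the high 4 bits and allele b into the low 4 bits. */
    public static byte diploidValuePhased(byte a, byte b) {
        return (byte) ((a << 4) | (b & 0xF));
    }

    public static void main(String[] args) {
        byte packed = diploidValuePhased((byte) 0x1, (byte) 0x0);
        // Print as two hex digits so both alleles are visible.
        System.out.println(String.format("%02X", packed & 0xFF));
    }
}
```

Because order is preserved, packing (1, 0) and packing (0, 1) give different bytes, which is what distinguishes a phased value from an unphased one.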
a - allele 1
b - allele 2
public static byte getDiploidValue(byte a,
byte b)
Combines two allele values into one diploid value. Assumed phased.
a - allele 1
b - allele 2
public static byte getUnphasedDiploidValue(byte a,
byte b)
Combines two allele values into one diploid value, with the alleles in alphabetical order.
a - allele 1
b - allele 2
public static byte getUnphasedSortedDiploidValue(byte genotype)
Ensures the diploid value has its alleles in alphabetical order.
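One way to picture the unphased, sorted form is to put the smaller allele code in the high 4 bits. The sketch below assumes that alphabetical order coincides with the numeric order of the allele codes (for example A < C < G < T encoded as increasing values); that mapping, like the class and method names, is an assumption and not taken from the library.

```java
public class UnphasedSortSketch {

    /** Packs two alleles with the smaller code in the high 4 bits. */
    public static byte unphasedDiploidValue(byte a, byte b) {
        int x = a & 0xF;
        int y = b & 0xF;
        return (byte) (x <= y ? (x << 4) | y : (y << 4) | x);
    }

    /** Re-orders an already packed diploid value into the same sorted form. */
    public static byte unphasedSortedDiploidValue(byte genotype) {
        byte high = (byte) ((genotype >>> 4) & 0xF);
        byte low = (byte) (genotype & 0xF);
        return unphasedDiploidValue(high, low);
    }

    public static void main(String[] args) {
        System.out.println(String.format("%02X", unphasedDiploidValue((byte) 0x3, (byte) 0x0) & 0xFF));
        System.out.println(String.format("%02X", unphasedSortedDiploidValue((byte) 0x30) & 0xFF));
    }
}
```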
genotype - diploid genotype
public static byte getUnphasedDiploidValueNoHets(byte g1,
byte g2)
Combines two genotype values into one diploid value. Returns unknown if either parent is heterozygous or unknown, or alleles are swapped.
g1 - genotype 1
g2 - genotype 2
public static kotlin.Array[] getDiploidValues(byte genotype)
Separates a diploid allele value into its two values.
genotype - diploid value
public static BitSet[] calcBitPresenceFromGenotype(kotlin.Array[] genotype, kotlin.Array[] mjA, kotlin.Array[] mnA)
Method for getting TBits rapidly from major and minor allele arrays
genotype -
mjA -
mnA -
public static BitSet calcBitPresenceOfDiploidValueFromGenotype(kotlin.Array[] genotype, byte diploidValue)
Returns a BitSet indicating the presence of the given diploid value in the genotype. Comparisons are unphased, so the order of the two allele values does not matter. If the given diploid value is UNKNOWN, it does not match anything. Bits set to 1 indicate a match.
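The matching rule above can be sketched with java.util.BitSet standing in for the library's own BitSet type. The class and method names are hypothetical, the genotype is assumed to be a byte[] of diploid values, and UNKNOWN is assumed to be the byte 0xFF.

```java
import java.util.BitSet;

public class DiploidPresenceSketch {

    // Assumption: the UNKNOWN diploid value is 0xFF.
    static final byte UNKNOWN_DIPLOID = (byte) 0xFF;

    /** Sets bit i when genotype[i] carries the same two alleles as diploidValue, in either order. */
    public static BitSet presenceOfDiploidValue(byte[] genotype, byte diploidValue) {
        BitSet result = new BitSet(genotype.length);
        if (diploidValue == UNKNOWN_DIPLOID) {
            return result; // UNKNOWN matches nothing, so every bit stays 0
        }
        int a = (diploidValue >>> 4) & 0xF;
        int b = diploidValue & 0xF;
        for (int i = 0; i < genotype.length; i++) {
            int x = (genotype[i] >>> 4) & 0xF;
            int y = genotype[i] & 0xF;
            if ((x == a && y == b) || (x == b && y == a)) {
                result.set(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] genotype = {0x01, 0x10, 0x11, (byte) 0xFF};
        System.out.println(presenceOfDiploidValue(genotype, (byte) 0x01));
    }
}
```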
genotype - genotype
diploidValue - diploid value
public static BitSet calcBitPresenceFromGenotype(kotlin.Array[] genotype, kotlin.Array[] referenceValues)
public static BitSet calcBitUnknownPresenceFromGenotype(kotlin.Array[] genotype)
public static BitSet[] calcBitPresenceFromGenotype15(kotlin.Array[] genotype, kotlin.Array[] mjA, kotlin.Array[] mnA)
Method for getting TBits rapidly from major and minor allele arrays
genotype -
mjA -
mnA -
public static BitSet[] calcBitPresenceFromGenotype(kotlin.Array[] genotype, byte mj, byte mn)
Method for getting Site Bits rapidly from major and minor alleles
genotype -
mj -
mn -
public static BitSet calcBitPresenceFromGenotype(kotlin.Array[] genotype, byte referenceValue)
public static kotlin.Array[] convertGenotypeToFloatProbability(GenotypeTable genotype, boolean sitesByTaxa)
public static kotlin.Array[] convertGenotypeToDoubleProbability(GenotypeTable genotype, boolean sitesByTaxa)