public class LinkageDisequilibrium implements TableReport
This class calculates D' and r^2 estimates of linkage disequilibrium. It also calculates the significance of the LD by either the Fisher Exact test or the multinomial permutation test. This class can work with either normal alignments or annotated alignments. The alignments should be stripped of invariable sites.
enum LinkageDisequilibrium.testDesign
Sets the matrix design for the LD calculation: all by all, sliding window, site by all, or site list.
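To make the designs concrete, the sketch below lists the site pairs that three of them would test. It is illustrative only: `SitePairDesignSketch` and its methods are hypothetical names, not TASSEL code, and the reading of "sliding window" as "each site against the next `windowSize` sites" is an assumption. The site list design is left out because the source does not say how the listed sites are paired.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of TASSEL: lists the site pairs each design would test.
final class SitePairDesignSketch {

    /** All by all: every unordered pair (i, j) with i < j. */
    static List<int[]> allByAll(int numSites) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < numSites; i++) {
            for (int j = i + 1; j < numSites; j++) {
                pairs.add(new int[] {i, j});
            }
        }
        return pairs;
    }

    /** Sliding window (assumed meaning): each site against the next windowSize sites. */
    static List<int[]> slidingWindow(int numSites, int windowSize) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < numSites; i++) {
            int last = Math.min(i + windowSize, numSites - 1);
            for (int j = i + 1; j <= last; j++) {
                pairs.add(new int[] {i, j});
            }
        }
        return pairs;
    }

    /** Site by all: one test site against every other site. */
    static List<int[]> siteByAll(int numSites, int testSite) {
        List<int[]> pairs = new ArrayList<>();
        for (int j = 0; j < numSites; j++) {
            if (j != testSite) {
                pairs.add(new int[] {testSite, j});
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println("all by all:     " + allByAll(5).size() + " pairs");
        System.out.println("sliding window: " + slidingWindow(5, 2).size() + " pairs");
        System.out.println("site by all:    " + siteByAll(5, 2).size() + " pairs");
    }
}
```

The number of pairs grows quadratically with the number of sites for all by all, but only linearly for the sliding window and site by all designs, which is the practical reason to choose between them.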
enum LinkageDisequilibrium.HetTreatment
There are multiple approaches for dealing with heterozygous sites; this sets how they are treated. Haplotype assumes fully phased heterozygous sites (any hets are double counted) and is the fastest approach when the data are fully phased. Homozygous converts all hets to missing. Genotype does a 3x3 genotype analysis (to be implemented).
Two-state estimates of D' and r^2 are reviewed and discussed in Weir 1996.
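For orientation, the sketch below computes the two-state estimates from the four haplotype counts using the standard textbook definitions: D = pAB - pA*pB, r^2 = D^2 / (pA(1-pA)pB(1-pB)), and D' = |D| / Dmax. It is not the body of `calculateDPrime` or `calculateRSqr`. The class and method names are hypothetical, the `minTaxaForEstimate` check is omitted, and returning NaN for monomorphic sites is an assumption of this sketch.

```java
// Hypothetical helper, not TASSEL code: textbook two-state D' and r^2.
final class LdEstimateSketch {

    /** Returns {D', r^2}, or {NaN, NaN} when there are no data or a site is monomorphic. */
    static double[] dPrimeAndRSqr(int countAB, int countAb, int countaB, int countab) {
        double n = countAB + countAb + countaB + countab;
        if (n == 0) {
            return new double[] {Double.NaN, Double.NaN};
        }
        double pA = (countAB + countAb) / n;   // frequency of allele A at site 1
        double pB = (countAB + countaB) / n;   // frequency of allele B at site 2
        double d = countAB / n - pA * pB;      // D = pAB - pA * pB
        double denom = pA * (1 - pA) * pB * (1 - pB);
        if (denom == 0) {
            return new double[] {Double.NaN, Double.NaN};
        }
        // Dmax depends on the sign of D.
        double dMax = d > 0
                ? Math.min(pA * (1 - pB), (1 - pA) * pB)
                : Math.min(pA * pB, (1 - pA) * (1 - pB));
        return new double[] {Math.abs(d) / dMax, d * d / denom};
    }

    public static void main(String[] args) {
        double[] ld = dPrimeAndRSqr(40, 10, 10, 40);
        System.out.println("D' = " + ld[0] + ", r^2 = " + ld[1]);
    }
}
```

With counts AB=40, Ab=10, aB=10, ab=40, both allele frequencies are 0.5 and D = 0.15, so D' is 0.15/0.25 = 0.6 and r^2 is 0.0225/0.0625 = 0.36, up to floating-point rounding.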
Multi-state loci (>=3 states) require an averaging approach. In TASSEL 3 in 2010, Buckler removed these approaches, as their relative magnitudes and meaningfulness have never been clear. Additionally, with the move away from SSRs to SNPs, these methods are less relevant. Researchers should convert to biallelic, either by ignoring rarer classes or by collapsing rarer states.
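The two conversion strategies can be sketched on a single site's states as follows. This is a minimal illustration, not a TASSEL utility: the class name, the use of `char` states, and `'N'` as the missing code are all assumptions. "Ignoring" sets every state beyond the two most common to missing, while "collapsing" merges them into the second most common state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not TASSEL code: reduce a multi-state site to two states.
final class BiallelicCollapseSketch {
    static final char MISSING = 'N'; // assumed missing-data code

    /** Non-missing states ordered from most to least frequent (ties broken by character order). */
    static List<Character> statesByFrequency(char[] states) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (char s : states) {
            if (s != MISSING) {
                counts.merge(s, 1, Integer::sum);
            }
        }
        List<Character> ranked = new ArrayList<>(counts.keySet());
        ranked.sort((x, y) -> counts.get(y) - counts.get(x)); // stable sort keeps tie order
        return ranked;
    }

    /** Ignore rarer classes: states beyond the two most common become missing. */
    static char[] ignoreRarer(char[] states) {
        List<Character> ranked = statesByFrequency(states);
        char[] out = states.clone();
        for (int i = 0; i < out.length; i++) {
            if (ranked.indexOf(out[i]) > 1) {
                out[i] = MISSING;
            }
        }
        return out;
    }

    /** Collapse rarer states: states beyond the two most common are merged into the second. */
    static char[] collapseRarer(char[] states) {
        List<Character> ranked = statesByFrequency(states);
        char[] out = states.clone();
        if (ranked.size() < 3) {
            return out; // already biallelic or monomorphic
        }
        char minor = ranked.get(1);
        for (int i = 0; i < out.length; i++) {
            if (ranked.indexOf(out[i]) > 1) {
                out[i] = minor;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        char[] site = "AAAACCGTN".toCharArray();
        System.out.println(new String(ignoreRarer(site)));
        System.out.println(new String(collapseRarer(site)));
    }
}
```

Ignoring rarer classes reduces the sample size at the site, whereas collapsing keeps every observation but treats distinct rare states as one allele; which trade-off is acceptable depends on the data.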
TODO: Add 3x3 (genotype) mode.
public LinkageDisequilibrium(GenotypeTable alignment, int windowSize, net.maizegenetics.analysis.popgen.LinkageDisequilibrium.testDesign LDType, int testSite, ProgressListener listener, boolean isAccumulativeReport, int numAccumulateIntervals, kotlin.Array[] sitesList, net.maizegenetics.analysis.popgen.LinkageDisequilibrium.HetTreatment hetTreatment)
Constructor for doing LD analysis
alignment - Input alignment with segregating sites
windowSize - Size of sliding window
LDType -
testSite -
listener -
isAccumulativeReport -
numAccumulateIntervals -
sitesList -
hetTreatment -
public void run()
Starts the thread to calculate LD
public static LDResult calculateBitLDForHaplotype(boolean ignoreHets, int minTaxaForEstimate, GenotypeTable alignment, int site1, int site2)
public static LDResult calculateBitLDForHaplotype(int minTaxaForEstimate, int minorCnt, GenotypeTable alignment, int site1, int site2)
public static double calculateDPrime(int countAB,
int countAb,
int countaB,
int countab,
int minTaxaForEstimate)
public static double calculateRSqr(int countAB,
int countAb,
int countaB,
int countab,
int minTaxaForEstimate)
public static LDResult getLDForSitePair(BitSet rMj, BitSet rMn, BitSet cMj, BitSet cMn, int minMinorCnt, int minCnt, float minR2, FisherExact myFisherExact, int site1Index, int site2Index)
Method for estimating LD between a pair of bit sets. Because there can be a large amount of missing data, the minimum minor allele count and the minimum site count ensure that only meaningful results are estimated. The site indices serve only to annotate the LDResult.
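The bit-set approach can be illustrated by intersecting the major and minor allele sets of the two sites to obtain the 2x2 haplotype counts. The sketch below uses `java.util.BitSet` as a stand-in for TASSEL's own `BitSet` type, and the class and method names are hypothetical. How the real method applies `minMinorCnt` and `minCnt` is not stated in the source; here it is assumed that both sites must reach the minor allele count, and the total must reach the minimum count, among taxa scored at both sites.

```java
import java.util.BitSet;

// Hypothetical helper, not TASSEL code: 2x2 haplotype counts from allele bit sets.
final class BitLdCountsSketch {

    /** Number of taxa whose bit is set in both sets. */
    static int andCount(BitSet x, BitSet y) {
        BitSet t = (BitSet) x.clone(); // clone so the inputs are not modified
        t.and(y);
        return t.cardinality();
    }

    /**
     * Returns {nAB, nAb, naB, nab}, or null when the thresholds are not met.
     * rMj/rMn are site 1 major/minor alleles, cMj/cMn are site 2 major/minor alleles.
     */
    static int[] counts(BitSet rMj, BitSet rMn, BitSet cMj, BitSet cMn,
                        int minMinorCnt, int minCnt) {
        int nAB = andCount(rMj, cMj);
        int nAb = andCount(rMj, cMn);
        int naB = andCount(rMn, cMj);
        int nab = andCount(rMn, cMn);
        int total = nAB + nAb + naB + nab;
        int site1Minor = naB + nab; // site 1 minor alleles seen with data at site 2
        int site2Minor = nAb + nab; // site 2 minor alleles seen with data at site 1
        if (total < minCnt || site1Minor < minMinorCnt || site2Minor < minMinorCnt) {
            return null;
        }
        return new int[] {nAB, nAb, naB, nab};
    }

    /** Convenience builder: a BitSet with the given taxon indices set. */
    static BitSet bits(int... indices) {
        BitSet b = new BitSet();
        for (int i : indices) {
            b.set(i);
        }
        return b;
    }

    public static void main(String[] args) {
        // Six taxa; taxon 5 is missing at site 1.
        int[] c = counts(bits(0, 1, 2), bits(3, 4), bits(0, 1, 3), bits(2, 4, 5), 2, 5);
        System.out.println(java.util.Arrays.toString(c));
    }
}
```

Taxa that are missing at either site drop out automatically, because they have no bit set in that site's major or minor set.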
rMj - site 1 major alleles
rMn - site 1 minor alleles
cMj - site 2 major alleles
cMn - site 2 minor alleles
minMinorCnt - minimum minor allele count after intersection
minCnt - minimum count after intersection
minR2 - results below this r2 are ignored for p-value calculation (saves time)
myFisherExact -
site1Index - annotation of LDResult with site indices
site2Index - annotation of LDResult with site indices
public double getPVal(int r,
int c)
Returns the P-value estimate for a given pair of sites. If there are only 2 alleles at each locus, the one-tailed Fisher Exact P-value is returned. If there are more states, the permuted Monte Carlo test is used.
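As a reference point for the two-allele case, a one-tailed Fisher Exact P-value on a 2x2 table can be sketched as the sum of hypergeometric probabilities over tables at least as extreme as the observed one. This is not TASSEL's `FisherExact` class. The names below are hypothetical, and which tail the library reports is not stated in the source, so the right tail (the top-left cell at or above its observed value) is an assumption.

```java
// Hypothetical helper, not TASSEL code: right-tailed Fisher Exact test on a 2x2 table.
final class FisherExactSketch {

    /** Natural log of n!, computed by direct summation (fine for modest n). */
    static double logFactorial(int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) {
            sum += Math.log(i);
        }
        return sum;
    }

    /** Log hypergeometric probability of the table [[a, b], [c, d]] with fixed margins. */
    static double logTableProbability(int a, int b, int c, int d) {
        return logFactorial(a + b) + logFactorial(c + d)
                + logFactorial(a + c) + logFactorial(b + d)
                - logFactorial(a + b + c + d)
                - logFactorial(a) - logFactorial(b) - logFactorial(c) - logFactorial(d);
    }

    /** P(top-left cell >= a) under the null, keeping row and column totals fixed. */
    static double rightTailP(int a, int b, int c, int d) {
        double p = 0.0;
        int steps = Math.min(b, c); // how far a can grow before b or c reaches zero
        for (int k = 0; k <= steps; k++) {
            p += Math.exp(logTableProbability(a + k, b - k, c - k, d + k));
        }
        return Math.min(1.0, p);
    }

    public static void main(String[] args) {
        System.out.println(rightTailP(3, 1, 1, 3));
    }
}
```

For the table [[3, 1], [1, 3]] the two tables in the tail have probabilities 16/70 and 1/70, so the result is 17/70, about 0.243, up to floating-point rounding.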
r - is site 1
c - is site 2
public int getSampleSize(int r,
int c)
Returns the number of gametes included in the LD calculations (after missing data were excluded)
r - is site 1
c - is site 2
public float getDPrime(int r,
int c)
Returns the D' estimate for a given pair of sites
r - is site 1
c - is site 2
public float getRSqr(int r,
int c)
Returns the r^2 estimate for a given pair of sites
r - is site 1
c - is site 2
public int getX(int row)
public int getY(int row)
public int getSiteCount()
Returns the number of sites in the alignment
public GenotypeTable getAlignment()
Returns the annotated alignment if one was used for this LD analysis. It can be used to access locus position information.
public java.lang.String toString()
Returns representation of the LD results as a string
public java.lang.Object[] getTableColumnNames()
public java.lang.Object[] getRow(long row)
public java.lang.String getTableTitle()
public int getColumnCount()
public long getRowCount()
public long getElementCount()
public java.lang.Object getValueAt(long row,
int col)
protected void fireProgress(int percent)