public class FILLINImputationUtils
Basic utility functions to support imputation by blocks.
public static kotlin.Array[] calcAllelePresenceCountsBtwTargetAndDonors(BitSet[] modBitsOfTarget, GenotypeTable donorAlign)
Counts the union and intersection of major and minor alleles between the target genotype and each potential donor genotype. Counts are computed per 64-site block, so they can be used to quickly estimate distance over any set of blocks.
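The per-block counting described above can be illustrated with a minimal sketch. It assumes each genotype's allele presence is packed into a long[] with one long per 64-site block; the class and method names are hypothetical and this is not the TASSEL implementation.

```java
// Hypothetical sketch: per-block union and intersection counts on packed bits.
final class BlockCountSketch {

    /** Sites per block where the allele is present in both target and donor. */
    static int[] intersectionCounts(long[] target, long[] donor) {
        int[] counts = new int[target.length];
        for (int b = 0; b < target.length; b++) {
            counts[b] = Long.bitCount(target[b] & donor[b]);
        }
        return counts;
    }

    /** Sites per block where the allele is present in target, donor, or both. */
    static int[] unionCounts(long[] target, long[] donor) {
        int[] counts = new int[target.length];
        for (int b = 0; b < target.length; b++) {
            counts[b] = Long.bitCount(target[b] | donor[b]);
        }
        return counts;
    }
}
```

Because each block is reduced to a pair of small integers, a distance over any run of blocks can be estimated by summing the stored counts rather than rescanning the sites.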
modBitsOfTarget - major and minor presence bits for the target genotype (must be aligned the same as the donor)
donorAlign - genotypeTable with potential donor genotypes
public static DonorHypoth[] findHomozygousDonorHypoth(int targetTaxon, int firstBlock, int lastBlock, int focusBlock, kotlin.Array[] donor1indices, kotlin.Array[] targetToDonorDistances, int minTestSites, int maxDonorHypotheses)
Simple algorithm that tests every possible haplotype as a homozygous donor to minimize the number of unmatched informative alleles. Currently there is little tie-breaking; longer matches are favored by adding 0.5 errors to the error-rate calculation within DonorHypoth. Distances are calculated between the first and last block. This code is fast relative to calculating the distances themselves, because it works with the precomputed distances.
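A minimal sketch of this selection follows. It assumes the precomputed distances are available as two hypothetical arrays indexed by [donor][block], holding comparable sites and mismatches, and it assumes the error rate has the form (errors + 0.5) / sites. The real layout of targetToDonorDistances and the exact formula inside DonorHypoth are not given here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: rank homozygous donors from precomputed block counts.
final class HomozygousDonorSketch {

    static int[] bestDonors(int[][] tested, int[][] mismatches,
                            int firstBlock, int lastBlock,
                            int minTestSites, int maxDonorHypotheses) {
        List<double[]> scored = new ArrayList<>(); // each entry: {donorIndex, errorRate}
        for (int d = 0; d < tested.length; d++) {
            int sites = 0;
            int errors = 0;
            for (int b = firstBlock; b <= lastBlock; b++) { // lastBlock is inclusive
                sites += tested[d][b];
                errors += mismatches[d][b];
            }
            if (sites < minTestSites) {
                continue; // too few comparable sites to score this donor
            }
            // Assumed form: the extra 0.5 errors shrink as sites grow,
            // so longer matches win ties.
            scored.add(new double[] {d, (errors + 0.5) / sites});
        }
        scored.sort(Comparator.comparingDouble((double[] s) -> s[1]));
        int n = Math.min(maxDonorHypotheses, scored.size());
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            result[i] = (int) scored.get(i)[0];
        }
        return result;
    }
}
```

With two error-free donors, one compared at 128 sites and one at 64, the rates are 0.5/128 and 0.5/64, so the donor with the longer match ranks first.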
targetTaxon - index of the target taxon (only used to annotate DonorHypoth)
firstBlock - index of the first 64-site block
lastBlock - inclusive index of the last 64-site block
focusBlock - index of the focus block (only used to annotate DonorHypoth)
donor1indices - array of donor indices to test
targetToDonorDistances - precomputed block distances
minTestSites - minimum number of comparable sites to be included in the analysis
maxDonorHypotheses - maximum number of donors to retain
public static kotlin.Array[] mostFrequentDonorsAcrossFocusBlocks(DonorHypoth[] allDH, int maxHypotheses)
Produces a sorted list of the most prevalent donors across the donor alignment. It looks at all focus blocks and weights blocks by rank.
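One way to picture rank weighting is sketched below. The weighting scheme (a donor at rank r in a list of n hypotheses scores n - r), the input shape, and all names are assumptions made for illustration; the actual weights used by this method are not documented here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: score donors by their rank within each focus block.
final class DonorRankSketch {

    /** rankedDonorsPerBlock[f] lists donor indices for focus block f, best first. */
    static int[] mostFrequentDonors(int[][] rankedDonorsPerBlock, int maxDonors) {
        Map<Integer, Integer> score = new TreeMap<>(); // donor index -> summed weight
        for (int[] ranked : rankedDonorsPerBlock) {
            for (int rank = 0; rank < ranked.length; rank++) {
                // Assumed weighting: better-ranked donors contribute more.
                score.merge(ranked[rank], ranked.length - rank, Integer::sum);
            }
        }
        List<Map.Entry<Integer, Integer>> entries = new ArrayList<>(score.entrySet());
        // Highest score first; the sort is stable, so ties keep ascending donor order.
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        int n = Math.min(maxDonors, entries.size());
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            result[i] = entries.get(i).getKey();
        }
        return result;
    }
}
```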
allDH - homozygous donor hypotheses across the entire region
maxHypotheses - minimum number of focus blocks that a hypothesis needs to show up in
public static kotlin.Array[] bestDonorsAcrossEntireRegion(kotlin.Array[] targetToDonorDistances, int minTestSites, int maxDonorHypotheses)
Produces a sorted list of the most prevalent donor hypotheses across the donor alignment. It looks at all focus blocks and weights blocks by rank.
targetToDonorDistances - precomputed block distances
minTestSites - minimum number of comparable sites to be included in the analysis
maxDonorHypotheses - maximum number of donor hypotheses to retain
public static int sumOf(int... integers)
public static int sumOf(byte... integers)
public static kotlin.Array[] fillInc(int first, int last)
public static DonorHypoth[] findHeterozygousDonorHypoth(int targetTaxon, kotlin.Array[] mjT, kotlin.Array[] mnT, int firstBlock, int lastBlock, int focusBlock, GenotypeTable donorAlign, int d1, kotlin.Array[] donor2Indices, int maxDonorHypotheses, int minTestSites)
Simple algorithm that does a one-dimensional test of donors. A single fixed donor is combined with each donor in a list of other donors, and the combinations are chosen to minimize the number of unmatched informative alleles.
targetTaxon - index of the target taxon (only used to annotate DonorHypoth)
mjT - masked bitset for the major allele
mnT - masked bitset for the minor allele
firstBlock - index of the first 64-site block
lastBlock - inclusive index of the last 64-site block
focusBlock - index of the focus block (only used to annotate DonorHypoth)
donorAlign - genotypeTable with potential donor genotypes
d1 - fixed donor
donor2Indices - list of second potential donors
maxDonorHypotheses - maximum number of donor hypotheses to retain
minTestSites - minimum number of comparable sites to be included in the analysis
public static DonorHypoth[] findHeterozygousDonorHypoth(int targetTaxon, kotlin.Array[] mjT, kotlin.Array[] mnT, int firstBlock, int lastBlock, int focusBlock, GenotypeTable donorAlign, kotlin.Array[] donor1Indices, kotlin.Array[] donor2Indices, int maxDonorHypotheses, int minTestSites)
Simple algorithm that does a two-dimensional test of donors. Each donor in one list is combined with each donor in a second list, and the combinations are chosen to minimize the number of unmatched informative alleles.
targetTaxon - index of the target taxon (only used to annotate DonorHypoth)
mjT - masked bitset for the major allele
mnT - masked bitset for the minor allele
firstBlock - index of the first 64-site block
lastBlock - inclusive index of the last 64-site block
focusBlock - index of the focus block (only used to annotate DonorHypoth)
donorAlign - genotypeTable with potential donor genotypes
donor1Indices - list of first potential donors
donor2Indices - list of second potential donors
maxDonorHypotheses - maximum number of donor hypotheses to retain
minTestSites - minimum number of comparable sites to be included in the analysis
public static DonorHypoth[] combineDonorHypothArrays(int maxDonorHypotheses, Object... dhs)
Combines arrays of DonorHypoth, sorts them, and returns the best hypotheses, limited to maxDonorHypotheses.
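The combine, sort, and truncate pattern can be sketched as follows. The Hypoth class is a hypothetical stand-in for DonorHypoth that orders by error rate alone; the real class's fields and ordering are not described here, and skipping null entries is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: merge several hypothesis arrays and keep the best few.
final class CombineSketch {

    /** Stand-in for DonorHypoth; lower error rate sorts first. */
    static final class Hypoth implements Comparable<Hypoth> {
        final int donor;
        final double errorRate;

        Hypoth(int donor, double errorRate) {
            this.donor = donor;
            this.errorRate = errorRate;
        }

        @Override
        public int compareTo(Hypoth other) {
            return Double.compare(errorRate, other.errorRate);
        }
    }

    static Hypoth[] combine(int maxDonorHypotheses, Hypoth[]... arrays) {
        List<Hypoth> all = new ArrayList<>();
        for (Hypoth[] array : arrays) {
            if (array == null) {
                continue; // tolerate missing arrays
            }
            for (Hypoth h : array) {
                if (h != null) {
                    all.add(h); // tolerate empty slots
                }
            }
        }
        Collections.sort(all);
        int n = Math.min(maxDonorHypotheses, all.size());
        return all.subList(0, n).toArray(new Hypoth[0]);
    }
}
```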
maxDonorHypotheses - maximum number of donor hypotheses to retain
dhs - arrays of DonorHypoth[]
public static kotlin.Array[] getBlockWithMinMinorCount(kotlin.Array[] mjT, kotlin.Array[] mnT, int focusBlock, int minMinorCnt, int minMajorCnt)
Given a starting 64-site block, expands to the left and right until it hits the minimum minor site count or the minimum major site count in the target taxon.
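The expansion can be sketched as below. This reads the description as "stop once either threshold is met", grows the window toward whichever side is currently shorter, and returns {firstBlock, lastBlock}; all three choices, and the names, are assumptions rather than documented behavior.

```java
// Hypothetical sketch: widen a block window around the focus block.
final class BlockExpandSketch {

    static int[] expand(long[] mjT, long[] mnT, int focusBlock,
                        int minMinorCnt, int minMajorCnt) {
        int start = focusBlock;
        int end = focusBlock;
        int major = Long.bitCount(mjT[focusBlock]);
        int minor = Long.bitCount(mnT[focusBlock]);
        // Keep expanding while neither threshold has been reached.
        while (minor < minMinorCnt && major < minMajorCnt) {
            boolean canLeft = start > 0;
            boolean canRight = end < mjT.length - 1;
            if (!canLeft && !canRight) {
                break; // the window already covers every block
            }
            // Prefer the shorter side so the window stays roughly centered.
            boolean goLeft = canLeft && (!canRight || focusBlock - start < end - focusBlock);
            int added = goLeft ? --start : ++end;
            major += Long.bitCount(mjT[added]);
            minor += Long.bitCount(mnT[added]);
        }
        return new int[] {start, end};
    }
}
```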
mnT - minor allele bit presence in a series of longs
focusBlock - index of the focus block
minMinorCnt - minimum count to stop expanding for minor allele sites
minMajorCnt - minimum count to stop expanding for major allele sites
public static kotlin.Array[] mendelErrorComparison(kotlin.Array[] mjT, kotlin.Array[] mnT, kotlin.Array[] mj1, kotlin.Array[] mn1, kotlin.Array[] mj2, kotlin.Array[] mn2)
Determines the number of sites at which the target (T) sequence cannot be explained by the genotypes of either donor (1 and 2). Only sites where the genotype is available for all three taxa can be tested.
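A bitwise sketch of this comparison is given below. It assumes a site is testable when each taxon has at least one of its major or minor bits set, and it counts a site as an error when the target carries an allele that neither donor carries. The return layout {errors, testedSites} and the names are hypothetical.

```java
// Hypothetical sketch: count target sites unexplained by either donor.
final class MendelErrorSketch {

    static int[] compare(long[] mjT, long[] mnT, long[] mj1, long[] mn1,
                         long[] mj2, long[] mn2) {
        int errors = 0;
        int tested = 0;
        for (int b = 0; b < mjT.length; b++) {
            // Sites with a known genotype in the target and in both donors.
            long testable = (mjT[b] | mnT[b]) & (mj1[b] | mn1[b]) & (mj2[b] | mn2[b]);
            // Target allele present but absent from both donors.
            long majorUnmatched = mjT[b] & ~(mj1[b] | mj2[b]);
            long minorUnmatched = mnT[b] & ~(mn1[b] | mn2[b]);
            errors += Long.bitCount(testable & (majorUnmatched | minorUnmatched));
            tested += Long.bitCount(testable);
        }
        return new int[] {errors, tested};
    }
}
```

For example, a target carrying the minor allele at a site where both donors carry only the major allele counts as one error, while a site where one donor has no call is excluded from the tested total.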
mjT - major allele bits of target
mnT - minor allele bits of target
mj1 - major allele bits of donor 1
mn1 - minor allele bits of donor 1
mj2 - major allele bits of donor 2
mn2 - minor allele bits of donor 2
public static kotlin.Array[] countUnknownAndHeterozygotes(kotlin.Array[] a)
Sums the number of unknown and heterozygous sites in a byte genotype.
a - a byte genotype
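The counting can be sketched as follows, under an assumed encoding: each byte packs two 4-bit alleles, 0xFF marks an unknown diploid call, and a site is heterozygous when its two nibbles differ. The encoding, the return layout {unknown, heterozygous}, and the names are assumptions for illustration.

```java
// Hypothetical sketch: count unknown and heterozygous calls in a byte genotype.
final class GenotypeCountSketch {

    static final byte UNKNOWN = (byte) 0xFF; // assumed code for an unknown call

    static int[] countUnknownAndHets(byte[] genotype) {
        int unknown = 0;
        int hets = 0;
        for (byte g : genotype) {
            if (g == UNKNOWN) {
                unknown++;
            } else if (((g >>> 4) & 0xF) != (g & 0xF)) {
                hets++; // the two packed alleles differ
            }
        }
        return new int[] {unknown, hets};
    }
}
```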