public class FILLINImputationPlugin
extends AbstractPlugin
FILLIN imputation relies on a library of haplotypes and uses nearest neighbor searches followed by HMM Viterbi resolution or block-based resolution. It is the best approach for substantially unrelated taxa in TASSEL. BEAGLE4 is currently a better approach for landraces, while FILLIN outperforms it when a good reference set of haplotypes is available.
The algorithm relies on having a series of donor haplotypes. If phased haplotypes are already known, they can be used directly. If they need to be reconstructed, the FindMergeHaplotypesPlugin can be used to create haplotypes for windows across the genome.
Imputation is done one taxon at a time using the donor haplotypes. The strategy is as follows:
- Every 64 sites is considered a block (a word in bitset terminology, i.e. the length of a long). There is a separate initial search for nearest neighbors centered around each block. The flanks of the scan window vary but always contain the specified number of minor alleles (a key parameter). Minor alleles approximate the information content of the search.
- Calculate the distance between the target taxon and the donor haplotypes for every window, and rank the top 10 donor haplotypes for each 64-site focus block.
- Evaluate by Viterbi whether 1 or 2 haplotypes will explain all the sites across the donor haplotype window. If successful, move to the next region for donor haplotypes.
- Resolve each focus block by nearest neighbors. If the inbred nearest neighbors are not close enough, do a hybrid search seeded with one parent being one of the initial 10 haplotypes. Set bases based on whichever is resolved first, inbred or hybrid.
Error rates are bounded away from zero by adding 0.5 error to all error rates that were observed to be zero.
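The distance and ranking steps of the strategy can be illustrated with a minimal sketch. It is not the plugin's implementation: the `Hap` container, the one-bit-per-site encoding (minor allele plus a non-missing mask), and the method names are assumptions made for illustration. It shows how 64 sites packed into a long can be compared with a single XOR and bit count, how a zero mismatch count is replaced by 0.5 so the error rate stays above zero, and how donors are ranked by that rate.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class BlockDistanceSketch {

    /** Hypothetical container: one bit per site, 64 sites packed into each long. */
    static final class Hap {
        final String name;
        final long[] minor;    // bit set = minor allele at that site
        final long[] present;  // bit set = genotype is not missing at that site

        Hap(String name, long[] minor, long[] present) {
            this.name = name;
            this.minor = minor;
            this.present = present;
        }
    }

    /**
     * Mismatch rate between target and donor over words firstWord..lastWord
     * (inclusive). Only sites present in both are compared. A zero mismatch
     * count is replaced by 0.5 so the rate is bounded away from zero.
     * Returns NaN when no site can be compared.
     */
    static double errorRate(Hap target, Hap donor, int firstWord, int lastWord) {
        int compared = 0;
        int mismatches = 0;
        for (int w = firstWord; w <= lastWord; w++) {
            long both = target.present[w] & donor.present[w];
            compared += Long.bitCount(both);
            mismatches += Long.bitCount((target.minor[w] ^ donor.minor[w]) & both);
        }
        if (compared == 0) {
            return Double.NaN;
        }
        double errors = (mismatches == 0) ? 0.5 : mismatches;
        return errors / compared;
    }

    /** Returns up to maxDonors donors, lowest error rate first. */
    static List<Hap> rankDonors(Hap target, List<Hap> donors,
                                int firstWord, int lastWord, int maxDonors) {
        List<Hap> ranked = new ArrayList<>();
        for (Hap d : donors) {
            // Skip donors that share no called sites with the target.
            if (!Double.isNaN(errorRate(target, d, firstWord, lastWord))) {
                ranked.add(d);
            }
        }
        ranked.sort(Comparator.comparingDouble(
                (Hap d) -> errorRate(target, d, firstWord, lastWord)));
        return new ArrayList<>(ranked.subList(0, Math.min(maxDonors, ranked.size())));
    }
}
```

In this sketch a call such as `rankDonors(target, donors, firstWord, lastWord, 10)` would correspond to keeping the top 10 donor haplotypes for a focus block; the window bounds would come from the flank expansion described above, which is not shown here.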
See also: class FindMergeHaplotypesPlugin
public static GenotypeTable unimpAlign
public static GenotypeTable maskKeyAlign
public static kotlin.Array[] MAFClass
public FILLINImputationPlugin()
public FILLINImputationPlugin(java.awt.Frame parentFrame,
boolean isInteractive)
protected void postProcessParameters()
public java.lang.String getCitation()
public javax.swing.ImageIcon getIcon()
public java.lang.String getButtonName()
public java.lang.String getToolTipText()
public static void main(java.lang.String[] args)
public java.lang.String targetFile()
Input HapMap file of target genotypes to impute. Accepts all file types supported by TASSEL5
public FILLINImputationPlugin targetFile(java.lang.String value)
Set Target file. Input HapMap file of target genotypes to impute. Accepts all file types supported by TASSEL5
value - Target file
public java.lang.String donorDir()
Directory containing donor haplotype files from output of FILLINFindHaplotypesPlugin. All files with '.gc' in the filename will be read in, only those with matching sites are used
public FILLINImputationPlugin donorDir(java.lang.String value)
Set Donor Dir. Directory containing donor haplotype files from output of FILLINFindHaplotypesPlugin. All files with '.gc' in the filename will be read in, only those with matching sites are used
value - Donor Dir
public java.lang.String outputFilename()
Output file; hmp.txt.gz and .hmp.h5 accepted.
public FILLINImputationPlugin outputFilename(java.lang.String value)
Set Output filename. Output file; hmp.txt.gz and .hmp.h5 accepted.
value - Output filename
public java.lang.Integer preferredHaplotypeSize()
Preferred haplotype block size in sites (use same as in FILLINFindHaplotypesPlugin)
public FILLINImputationPlugin preferredHaplotypeSize(java.lang.Integer value)
Set Preferred haplotype size. Preferred haplotype block size in sites (use same as in FILLINFindHaplotypesPlugin)
value - Preferred haplotype size
public java.lang.Double heterozygosityThreshold()
Threshold per taxon heterozygosity for treating taxon as heterozygous (no Viterbi, het thresholds).
public FILLINImputationPlugin heterozygosityThreshold(java.lang.Double value)
Set Heterozygosity threshold. Threshold per taxon heterozygosity for treating taxon as heterozygous (no Viterbi, het thresholds).
value - Heterozygosity threshold
public java.lang.Double maxErrorToImputeOneDonor()
Maximum error rate for applying one haplotype to entire site window
public FILLINImputationPlugin maxErrorToImputeOneDonor(java.lang.Double value)
Set Max error to impute one donor. Maximum error rate for applying one haplotype to entire site window
value - Max error to impute one donor
public java.lang.Double maxCombinedErrorToImputeTwoDonors()
Maximum error rate for applying Viterbi with two haplotypes to entire site window
public FILLINImputationPlugin maxCombinedErrorToImputeTwoDonors(java.lang.Double value)
Set Max combined error to impute two donors. Maximum error rate for applying Viterbi with two haplotypes to entire site window
value - Max combined error to impute two donors
public java.lang.Integer minSitesToTestMatch()
Minimum number of sites to test for IBS between haplotype and target in focus block
public FILLINImputationPlugin minSitesToTestMatch(java.lang.Integer value)
Set Min sites to test match. Minimum number of sites to test for IBS between haplotype and target in focus block
value - Min sites to test match
public java.lang.Integer minNumOfMinorAllelesToCompare()
Minimum number of informative minor alleles in the search window (or 10X major)
public FILLINImputationPlugin minNumOfMinorAllelesToCompare(java.lang.Integer value)
Set Min num of minor alleles to compare. Minimum number of informative minor alleles in the search window (or 10X major)
value - Min num of minor alleles to compare
public java.lang.Integer maxDonorHypotheses()
Maximum number of donor hypotheses to be explored
public FILLINImputationPlugin maxDonorHypotheses(java.lang.Integer value)
Set Max donor hypotheses. Maximum number of donor hypotheses to be explored
value - Max donor hypotheses
public java.lang.Boolean imputeAllHetCalls()
Write all imputed heterozygous calls as such, even if the original file has a homozygous call. (Not recommended for inbred lines.)
public FILLINImputationPlugin imputeAllHetCalls(java.lang.Boolean value)
Set Impute all het calls. Write all imputed heterozygous calls as such, even if the original file has a homozygous call. (Not recommended for inbred lines.)
value - Impute all het calls
public java.lang.Boolean combineTwoHaplotypesAsHeterozygote()
If true, uses combination mode in focus block, else does not impute
public FILLINImputationPlugin combineTwoHaplotypesAsHeterozygote(java.lang.Boolean value)
Set Combine two haplotypes as heterozygote. If true, uses combination mode in focus block, else does not impute
value - Combine two haplotypes as heterozygote
public java.lang.Boolean outputProjectionAlignment()
Create a projection alignment for high density markers
public FILLINImputationPlugin outputProjectionAlignment(java.lang.Boolean value)
Set Output projection alignment. Create a projection alignment for high density markers
value - Output projection alignment
public java.lang.Boolean imputeDonorFile()
Impute the donor file itself
public FILLINImputationPlugin imputeDonorFile(java.lang.Boolean value)
Set Impute donor file. Impute the donor file itself
value - Impute donor file
public java.lang.Boolean supressSystemOut()
Suppress system output
public FILLINImputationPlugin supressSystemOut(java.lang.Boolean value)
Set Suppress system out. Suppress system output
value - Suppress system out
public java.lang.Boolean calculateAccuracy()
Masks input file before imputation and calculates accuracy based on masked genotypes
public FILLINImputationPlugin calculateAccuracy(java.lang.Boolean value)
Set Calculate accuracy. Masks input file before imputation and calculates accuracy based on masked genotypes
value - Calculate accuracy
public java.lang.Double proportionOfGenotypesToMaskIfNoDepth()
Proportion of genotypes to mask for accuracy calculation if depth not available
public FILLINImputationPlugin proportionOfGenotypesToMaskIfNoDepth(java.lang.Double value)
Set Proportion of genotypes to mask if no depth. Proportion of genotypes to mask for accuracy calculation if depth not available
value - Proportion of genotypes to mask if no depth
public java.lang.Integer depthOfGenotypesToMask()
Depth of genotypes to mask for accuracy calculation if depth information available
public FILLINImputationPlugin depthOfGenotypesToMask(java.lang.Integer value)
Set Depth of genotypes to mask. Depth of genotypes to mask for accuracy calculation if depth information available
value - Depth of genotypes to mask
public java.lang.Double proportionOfDepthGenotypesToMask()
Proportion of genotypes of given depth to mask for accuracy calculation if depth available
public FILLINImputationPlugin proportionOfDepthGenotypesToMask(java.lang.Double value)
Set Proportion of depth genotypes to mask. Proportion of genotypes of given depth to mask for accuracy calculation if depth available
value - Proportion of depth genotypes to mask
public java.lang.String optionalKeyToCalculateAccuracy()
Key to calculate accuracy. Genotypes missing (masked) in the target file should be present in the key, with all other sites set to missing. Overrides the other accuracy options if the key is present and all sites and taxa present in the target file are present in it
public FILLINImputationPlugin optionalKeyToCalculateAccuracy(java.lang.String value)
Set Optional key to calculate accuracy. Key to calculate accuracy. Genotypes missing (masked) in the target file should be present in the key, with all other sites set to missing. Overrides the other accuracy options if the key is present and all sites and taxa present in the target file are present in it
value - Optional key to calculate accuracy
public java.lang.Boolean calculateAccuracyWithinMAFCategories()
Calculate R2 accuracy within MAF categories based on donor file
public FILLINImputationPlugin calculateAccuracyWithinMAFCategories(java.lang.Boolean value)
Set Calculate accuracy within MAF categories. Calculate R2 accuracy within MAF categories based on donor file
value - Calculate accuracy within MAF categories
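As an illustration of what an R2 accuracy within MAF categories could look like, the following minimal sketch computes the squared Pearson correlation between known and imputed genotypes for masked calls, grouped by the minor allele frequency of each call's site. It is not the plugin's code: the class name, the 0/1/2 dosage encoding, and the bin boundaries in `MAF_UPPER` are assumptions chosen for the example.

```java
class MafBinnedR2Sketch {

    // Hypothetical upper bounds of the MAF categories (assumption, not the plugin's values).
    static final double[] MAF_UPPER = {0.05, 0.10, 0.20, 0.50};

    /** Index of the first category whose upper bound is >= maf. */
    static int mafClass(double maf) {
        for (int i = 0; i < MAF_UPPER.length; i++) {
            if (maf <= MAF_UPPER[i]) {
                return i;
            }
        }
        return MAF_UPPER.length - 1;
    }

    /**
     * Squared Pearson correlation between known and imputed dosages (0, 1, 2)
     * for each MAF category. siteMaf[i] is the MAF of the site that the i-th
     * masked genotype belongs to. A category with no data or zero variance
     * yields NaN.
     */
    static double[] r2ByMafClass(int[] known, int[] imputed, double[] siteMaf) {
        int k = MAF_UPPER.length;
        double[] n = new double[k];
        double[] sx = new double[k];
        double[] sy = new double[k];
        double[] sxx = new double[k];
        double[] syy = new double[k];
        double[] sxy = new double[k];
        for (int i = 0; i < known.length; i++) {
            int c = mafClass(siteMaf[i]);
            double x = known[i];
            double y = imputed[i];
            n[c]++;
            sx[c] += x;
            sy[c] += y;
            sxx[c] += x * x;
            syy[c] += y * y;
            sxy[c] += x * y;
        }
        double[] r2 = new double[k];
        for (int c = 0; c < k; c++) {
            double cov = n[c] * sxy[c] - sx[c] * sy[c];
            double varX = n[c] * sxx[c] - sx[c] * sx[c];
            double varY = n[c] * syy[c] - sy[c] * sy[c];
            r2[c] = (varX > 0 && varY > 0) ? (cov * cov) / (varX * varY) : Double.NaN;
        }
        return r2;
    }
}
```

Reporting accuracy per category rather than as one pooled number is useful because rare alleles are typically harder to impute, and a pooled figure dominated by common variants can hide that.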