public class MergeDuplicateSNPsPlugin
extends AbstractPlugin
This class is intended to be run directly after DiscoverySNPCallerPlugin, using the HapMap file from that step as input. It finds duplicate SNPs in the HapMap file and merges them if they have the same pair of alleles (not necessarily in the same major/minor order) and if their mismatch rate is no greater than the threshold (-maxMisMat). If -callHets is on, genotypic disagreements are called as heterozygotes; otherwise they are set to 'N' (the default). By default, any remaining unmerged duplicate SNPs (but not indels) are deleted. They can be kept by invoking the -kpUnmergDups option. If the germplasm is not fully inbred and still contains residual heterozygosity (as the maize NAM and IBM populations do), then -callHets should be on and -maxMisMat should be set fairly high (0.1 to 0.2, depending on the amount of heterozygosity). TODO: VCF support has been commented out; this should all be merged into the main pipeline.
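
The merge rule described above can be illustrated with a minimal sketch. This is not the plugin's actual code: the class name, method names, and signatures are hypothetical. It also assumes that each genotype is stored as one IUPAC character per taxon, that the mismatch rate is computed only over taxa called in both duplicates, and that the caller supplies the heterozygote code for the allele pair (for example 'R' for A/G).

```java
// Hypothetical illustration only; none of these names come from the plugin.
final class DuplicateSnpMergeSketch {
    static final char MISSING = 'N';

    /** True if both SNPs carry the same two alleles, in either major/minor order. */
    static boolean sameAllelePair(char maj1, char min1, char maj2, char min2) {
        return (maj1 == maj2 && min1 == min2) || (maj1 == min2 && min1 == maj2);
    }

    /**
     * Assumed definition: fraction of taxa with differing calls, counting
     * only taxa that are non-missing in both duplicates.
     */
    static double mismatchRate(char[] a, char[] b) {
        requireSameLength(a, b);
        int compared = 0;
        int mismatches = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == MISSING || b[i] == MISSING) {
                continue;
            }
            compared++;
            if (a[i] != b[i]) {
                mismatches++;
            }
        }
        return compared == 0 ? 0.0 : (double) mismatches / compared;
    }

    /** The merge condition from the description: same allele pair and rate <= threshold. */
    static boolean canMerge(char maj1, char min1, char maj2, char min2,
                            char[] a, char[] b, double maxMisMat) {
        return sameAllelePair(maj1, min1, maj2, min2) && mismatchRate(a, b) <= maxMisMat;
    }

    /**
     * Merges two duplicate genotype rows. A call missing in one row is taken
     * from the other; a disagreement becomes hetCode if callHets is true,
     * otherwise 'N'.
     */
    static char[] merge(char[] a, char[] b, boolean callHets, char hetCode) {
        requireSameLength(a, b);
        char[] merged = new char[a.length];
        for (int i = 0; i < a.length; i++) {
            if (a[i] == b[i]) {
                merged[i] = a[i];
            } else if (a[i] == MISSING) {
                merged[i] = b[i];
            } else if (b[i] == MISSING) {
                merged[i] = a[i];
            } else {
                merged[i] = callHets ? hetCode : MISSING;
            }
        }
        return merged;
    }

    private static void requireSameLength(char[] a, char[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Genotype rows must cover the same taxa");
        }
    }
}
```

For two duplicate A/G SNPs with calls "AANG" and "AGGG", this sketch compares three taxa and finds one mismatch, so the pair would merge at a threshold of 0.4 but not at 0.2. With heterozygote calling on, the disagreement is written as 'R'; with it off, it is written as 'N'.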