public class TagAgainstAnchor
public TagAgainstAnchor(java.lang.String hapMapHDF5,
java.lang.String tbtHDF5,
java.lang.String blockFileS,
java.lang.String outfileS,
double pThresh,
int minCount,
int coreNum,
int chunkSize)
Constructor to run genetic mapping on one node
hapMapHDF5 - anchor map, in SimpleGenotypeSBit format
tbtHDF5 - TagsByTaxa (TBT) file, in HDF5 format
blockFileS - TBTTagBlockFile, used to block the position where the tag is aligned while scanning
outfileS - output file of the genetic mapping
pThresh - P-value threshold; default: 1e-6
minCount - minimum count with which a tag appears in taxa (in the TBT file); default: 20. Too low a value lacks statistical power
coreNum - number of cores to use; default: -1, which means all cores in a node. When coreNum is set to less than the total number of cores, coreNum cores are used and each core runs one thread
chunkSize - number of tags in a chunk. This determines the running time on a node/computer
public TagAgainstAnchor(java.lang.String hapMapHDF5,
java.lang.String tbtHDF5,
java.lang.String blockFileS,
java.lang.String outfileS,
double pThresh,
int minCount,
int coreNum,
int chunkSize,
int chunkStartIndex,
int chunkEndIndex)
Constructor to run genetic mapping on specific chunk of TBT. This is for "qsub" in clusters.
hapMapHDF5 - anchor map, in SimpleGenotypeSBit format
tbtHDF5 - TagsByTaxa (TBT) file, in HDF5 format
blockFileS - TBTTagBlockFile, used to block the position where the tag is aligned while scanning
outfileS - output file of the genetic mapping
pThresh - P-value threshold; default: 1e-6
minCount - minimum count with which a tag appears in taxa (in the TBT file); default: 20. Too low a value lacks statistical power
coreNum - number of cores to use; default: -1, which means all cores in a node. When coreNum is set to less than the total number of cores, coreNum cores are used and each core runs one thread
chunkSize - number of tags in a chunk. This determines the running time on a node/computer
chunkStartIndex - start index of the chunk
chunkEndIndex - end index of the chunk. Note: the end index is exclusive
public int getChunkNum(int chunkSize)
Return the number of chunks at current chunkSize
chunkSize - number of tags in a chunk
public static int getChunkNum(java.lang.String tbtHDF5,
int chunkSize)
Pre-calculates the number of chunks before submitting genetic mapping jobs with qsub
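The chunk arithmetic behind a qsub submission can be sketched as follows. This is only an illustration under stated assumptions: that the chunk count is the number of tags divided by chunkSize, rounded up, and that chunkStartIndex and chunkEndIndex count chunks (with the end index exclusive, as noted for the chunk constructor). The class ChunkPlan, its methods, and the tag count used in main are hypothetical and are not part of TagAgainstAnchor.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for planning qsub jobs; not part of TagAgainstAnchor. */
final class ChunkPlan {

    /** Number of chunks needed to cover tagCount tags (ceiling division). */
    static int chunkCount(int tagCount, int chunkSize) {
        if (tagCount < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("tagCount must be >= 0 and chunkSize > 0");
        }
        // Use long arithmetic so the rounding term cannot overflow an int.
        return (int) (((long) tagCount + chunkSize - 1) / chunkSize);
    }

    /**
     * Splits chunk indices 0..chunkNum-1 into jobs of at most chunksPerJob chunks.
     * Each int[] holds {chunkStartIndex, chunkEndIndex}; the end index is exclusive.
     */
    static List<int[]> jobRanges(int chunkNum, int chunksPerJob) {
        if (chunkNum < 0 || chunksPerJob <= 0) {
            throw new IllegalArgumentException("chunkNum must be >= 0 and chunksPerJob > 0");
        }
        List<int[]> ranges = new ArrayList<>();
        int start = 0;
        while (start < chunkNum) {
            int end = start + Math.min(chunksPerJob, chunkNum - start);
            ranges.add(new int[] {start, end});
            start = end;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // The tag count is made up; a real run would take the count from the TBT file.
        int chunkNum = chunkCount(1_000_000, 1000);
        for (int[] r : jobRanges(chunkNum, 300)) {
            System.out.println("chunkStartIndex=" + r[0] + " chunkEndIndex=" + r[1]);
        }
    }
}
```

With these made-up numbers the plan has 1000 chunks, split into four jobs covering chunks 0-300, 300-600, 600-900 and 900-1000, each end index being exclusive.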
tbtHDF5 -
chunkSize - number of tags in a chunk. This determines the running time on a node/computer
public void MTMapping(java.lang.String outfileS)
MT genetic mapping
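The coreNum convention described for the constructors (-1 means all cores, otherwise coreNum cores with one thread each) can be illustrated with a small thread-pool sketch. It assumes that "MT" stands for multi-threaded, and that values of coreNum that are zero, negative, or larger than the number of available cores fall back to all cores; the source only specifies the -1 case and the "less than total" case. CoreNumSketch, resolveThreads and runChunks are hypothetical names, and the per-chunk work is a placeholder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration of the coreNum convention; not part of TagAgainstAnchor. */
final class CoreNumSketch {

    /** Maps coreNum to a thread count. Out-of-range values use all cores (an assumption). */
    static int resolveThreads(int coreNum, int availableCores) {
        if (coreNum <= 0 || coreNum > availableCores) {
            return availableCores;
        }
        return coreNum;
    }

    /** Runs one placeholder task per chunk on a fixed pool and returns results in chunk order. */
    static List<Integer> runChunks(int chunkNum, int coreNum) {
        int threads = resolveThreads(coreNum, Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < chunkNum; i++) {
                final int chunkIndex = i;
                // Placeholder work: a real task would map the tags in this chunk.
                futures.add(pool.submit(() -> chunkIndex * 2));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for chunks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("chunk task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runChunks(5, -1));
    }
}
```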
outfileS - output file of the genetic mapping
public double fastTestSites(OpenBitSet obsTdist, OpenBitSet obsMajor, OpenBitSet obsMinor, double maf, cern.jet.random.Binomial binomFunc)
Fast test that uses a ratio to reduce the number of binomial and P-value calculations (a 73% reduction), although the running time is reduced by only 3%
obsTdist -
obsMajor -
obsMinor -
maf -
binomFunc -
public double testSites(OpenBitSet obsTdist, OpenBitSet obsMajor, OpenBitSet obsMinor, double maf, cern.jet.random.Binomial binomFunc)
Tests association using a binomial test, returning the P-value
obsTdist -
obsMajor -
obsMinor -
binomFunc -
public static double testSites(OpenBitSet obsTdist, OpenBitSet obsMajor, OpenBitSet obsMinor, cern.jet.random.Binomial binomFunc)
Tests association using a binomial test, returning the P-value
obsTdist -
obsMajor -
obsMinor -
binomFunc -
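To give a feel for what a bitset-based binomial test of a tag against one anchor site might look like, here is a self-contained sketch. It is not the computation used by testSites, which the documentation above does not spell out. The sketch assumes that each bitset holds one bit per taxon, that the statistic is the number of taxa carrying both the tag and the minor allele, and that the P-value is the upper tail of Binomial(n, maf), where n is the number of taxa carrying the tag together with either allele. It uses java.util.BitSet and a hand-written binomial tail in place of OpenBitSet and cern.jet.random.Binomial; all class and method names are hypothetical.

```java
import java.util.BitSet;

/** Hypothetical sketch of a bitset-based binomial test; not the TagAgainstAnchor code. */
final class BinomialSiteTestSketch {

    /** Number of set bits shared by a and b (neither input is modified). */
    static int overlap(BitSet a, BitSet b) {
        BitSet c = (BitSet) a.clone();
        c.and(b);
        return c.cardinality();
    }

    /** Upper-tail probability P(X >= k) for X ~ Binomial(n, p), with terms built in log space. */
    static double upperTail(int k, int n, double p) {
        if (k <= 0) return 1.0;
        if (k > n) return 0.0;
        if (p <= 0.0) return 0.0;
        if (p >= 1.0) return 1.0;
        double logP = Math.log(p);
        double logQ = Math.log1p(-p);
        // log C(n, k) as the sum of log((n - k + i) / i) for i = 1..k
        double logChoose = 0.0;
        for (int i = 1; i <= k; i++) {
            logChoose += Math.log(n - k + i) - Math.log(i);
        }
        double sum = 0.0;
        for (int i = k; i <= n; i++) {
            sum += Math.exp(logChoose + i * logP + (n - i) * logQ);
            if (i < n) {
                // C(n, i + 1) = C(n, i) * (n - i) / (i + 1)
                logChoose += Math.log(n - i) - Math.log(i + 1);
            }
        }
        return Math.min(1.0, sum);
    }

    /** Assumed test: is the tag seen with the minor allele more often than maf would predict? */
    static double testSite(BitSet tagDist, BitSet major, BitSet minor, double maf) {
        int minorHits = overlap(tagDist, minor);
        int majorHits = overlap(tagDist, major);
        return upperTail(minorHits, minorHits + majorHits, maf);
    }

    public static void main(String[] args) {
        BitSet tag = new BitSet();
        tag.set(0, 4);                 // tag present in taxa 0..3
        BitSet minor = new BitSet();
        minor.set(0, 4);               // minor allele in taxa 0..3
        BitSet major = new BitSet();
        major.set(4, 10);              // major allele in taxa 4..9
        System.out.println(testSite(tag, major, minor, 0.25));
    }
}
```

A P-value computed this way would then be compared against pThresh; the one-sided direction of the test is a choice made for this sketch.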