public class TagsOnPhysicalMapV3 extends AbstractTagsOnPhysicalMap implements TOPMInterface
HDF5 version of TagsOnPhysicalMap. This is the preferred version of the physical map because it uses less memory, loads faster, and is more flexible with mapping positions.
Multiple mapping positions can be stored for each tag. For example, separate aligners could each record their positions in the object. The genetic mapping algorithm could then be used to resolve which is the true mapping position. The variables for the best position carry the prefix "best". The fields in class TagMappingInfoV3 and class TagGeneticMappingInfo are used to store physical positions and the corresponding genetic positions.
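The relationship between the stored hypotheses and the "best" variables can be pictured with a minimal, self-contained sketch. The Hypothesis class, its score field, and the pickBest helper are hypothetical names invented for illustration; they are not TASSEL API, and the real resolution step relies on genetic mapping rather than on a single score.

```java
import java.util.Arrays;
import java.util.List;

final class BestMappingSketch {
    /** Hypothetical stand-in for one stored mapping position of a tag. */
    static final class Hypothesis {
        final String source;  // assumed label, e.g. which aligner produced the hit
        final int chr;
        final int startPos;
        final double score;   // assumption: higher means better supported

        Hypothesis(String source, int chr, int startPos, double score) {
            this.source = source;
            this.chr = chr;
            this.startPos = startPos;
            this.score = score;
        }
    }

    /** Returns the index of the highest-scoring hypothesis, or -1 if the list is empty. */
    static int pickBest(List<Hypothesis> hyps) {
        int best = -1;
        for (int i = 0; i < hyps.size(); i++) {
            if (best < 0 || hyps.get(i).score > hyps.get(best).score) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Hypothesis> hyps = Arrays.asList(
                new Hypothesis("alignerA", 1, 1500, 0.2),
                new Hypothesis("alignerB", 3, 98000, 0.9));
        int bestMapIndex = pickBest(hyps);
        Hypothesis best = hyps.get(bestMapIndex);
        // The "best" variables are copies of the selected hypothesis.
        int bestChr = best.chr;
        int bestStartPos = best.startPos;
        System.out.println(bestMapIndex + " " + bestChr + " " + bestStartPos);
    }
}
```

Keeping the index of the winning hypothesis next to the copied fields lets a reader trace a "best" position back to the hypothesis that produced it.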
protected int mappingNum
Number of physical positions from different aligners, or from one aligner run with different parameters
public TagsOnPhysicalMapV3(java.lang.String theHDF5file)
Constructor from an HDF5 TOPM file
theHDF5file -
public static void createFile(Tags inTags, java.lang.String newHDF5file)
Initialize an HDF5 TOPM from a TagCounts file. The "MAXMAPPING" (set to 0), "TAGLENGTHINLONG", "tags" and "tagLength" entries are added.
inTags - newHDF5file -
public void writeTextMap(java.lang.String tagCountFileS,
java.lang.String outputFileS)
Write a text-format map and genetic map for methods development
outputFileS -
public void writeSubTOPM(java.lang.String outputFileS,
kotlin.Array[] tagIndex)
public java.lang.String[] creatTagMappingInfoDatasets(int startIndex,
int size)
Create datasets in HDF5 holding mapping information, which is used to annotate the TOPM with multiple alignment hypotheses
startIndex - Start index of tag mapping information. This is essentially the current mappingNum
size - the number of datasets which will be created
public java.lang.String[] creatTagGeneticMappingInfoDatasets(int startIndex,
int size)
Create datasets in HDF5 holding genetic mapping information, which is used to test multiple alignment hypotheses
startIndex - size -
public java.lang.String creatTagGeneticMappingInfoGWDataset()
Create a dataset in HDF5 holding genome-wide genetic mapping information, which is used to build a training dataset to predict hypothesis genetic mapping
public void writeBestMappingDataSets(kotlin.Array[] bestStrand,
kotlin.Array[] bestChr,
kotlin.Array[] bestStartPos,
kotlin.Array[] bestEndPos,
kotlin.Array[] bestDivergence,
kotlin.Array[] bestMapP,
kotlin.Array[] bestDcoP,
kotlin.Array[] multimaps,
kotlin.Array[] bestEvidence,
kotlin.Array[] bestMapIndex)
Write best mapping positions from all hypotheses to HDF5 datasets, including strand, chr, pos and number of multimaps
bestStrand - bestChr - bestStartPos - multimaps -
public void writeTagMappingInfoDataSets(java.lang.String[] dataSetNames,
TagMappingInfoV3[] tmiChunk,
int chunkIndex)
Write a TMI buffer/chunk to HDF5 datasets, which is used to annotate the TOPM with multiple alignment hypotheses
dataSetNames -
tmiChunk - TMI chunk [dataSetNames.length]*[chunk_size]
chunkIndex - index of this chunk
public void writeTagGeneticMappingInfoDataSets(java.lang.String[] dataSetNames,
TagGeneticMappingInfo[] tgmiChunk,
int chunkIndex)
Write a TGMI buffer/chunk to HDF5 datasets
dataSetNames - tgmiChunk - chunkIndex -
public void writeTagGeneticMappingInfoGWDataSet(java.lang.String dataSetName,
TagGeneticMappingInfo[] tgmiChunk,
int chunkIndex)
Write a whole-genome genetic mapping TGMI buffer/chunk to HDF5 datasets
dataSetName - tgmiChunk - chunkIndex -
public void setMappingNum(int mappingNum)
Set the mappingNum attribute in HDF5
public int getChunkNum()
Return the total number of chunks
public int getChunkSize()
Return the chunk size (Number of tags in a chunk)
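Because tags are stored in fixed-size chunks, the chunk that holds a given tag and the total number of chunks follow from integer arithmetic. The sketch below assumes that chunks are filled in tag order and that the last chunk may be only partly filled; the class and method names are illustrative and are not TASSEL API.

```java
final class ChunkMathSketch {
    /** Index of the chunk that contains the given tag (assumes chunks are filled in tag order). */
    static int chunkIndexOf(int tagIndex, int chunkSize) {
        return tagIndex / chunkSize;
    }

    /** Position of the tag inside its chunk. */
    static int offsetInChunk(int tagIndex, int chunkSize) {
        return tagIndex % chunkSize;
    }

    /** Number of chunks needed to hold tagCount tags (ceiling division). */
    static int chunkCount(int tagCount, int chunkSize) {
        return (tagCount + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        int chunkSize = 1000;  // assumed value, for illustration only
        System.out.println(chunkIndexOf(2500, chunkSize));
        System.out.println(offsetInChunk(2500, chunkSize));
        System.out.println(chunkCount(2500, chunkSize));
    }
}
```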
public void getFileReadyForClosing()
public int getMappingNum()
Return the number of mapping results
public boolean getIfHasMapping()
Return whether the file has alignment mapping annotation
public boolean getIfHasGeneticMapping()
Return whether the file has genetic mapping test results
public boolean getIfHasGeneticMappingGW()
Return whether the file has genome-wide genetic mapping annotation
public kotlin.Array[] getUniqueMappingOfAligner(int tagIndex,
net.maizegenetics.dna.map.TagMappingInfoV3.Aligner alignerName)
Return the unique mapping (chr and startPosition) from an aligner; return null if the tag has multiple equally good positions or does not align. This is used to block positions for genetic mapping
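The "unique or null" rule described here can be sketched in a self-contained way. The Hit class and its rank field are assumptions made for illustration (lower rank meaning a better hit); how TASSEL actually ranks hits is not stated in this documentation.

```java
import java.util.Arrays;
import java.util.List;

final class UniqueMappingSketch {
    /** Hypothetical alignment hit; a lower rank is assumed to mean a better hit. */
    static final class Hit {
        final int chr;
        final int startPos;
        final int rank;

        Hit(int chr, int startPos, int rank) {
            this.chr = chr;
            this.startPos = startPos;
            this.rank = rank;
        }
    }

    /**
     * Returns {chr, startPos} when exactly one hit has the best rank.
     * Returns null when there are no hits or when the best rank is tied.
     */
    static int[] uniqueMapping(List<Hit> hits) {
        Hit best = null;
        int tiedAtBest = 0;
        for (Hit h : hits) {
            if (best == null || h.rank < best.rank) {
                best = h;
                tiedAtBest = 1;
            } else if (h.rank == best.rank) {
                tiedAtBest++;
            }
        }
        if (best == null || tiedAtBest != 1) {
            return null;
        }
        return new int[] {best.chr, best.startPos};
    }

    public static void main(String[] args) {
        List<Hit> tied = Arrays.asList(new Hit(1, 100, 0), new Hit(5, 900, 0));
        List<Hit> clear = Arrays.asList(new Hit(1, 100, 0), new Hit(5, 900, 1));
        System.out.println(uniqueMapping(tied) == null);
        System.out.println(Arrays.toString(uniqueMapping(clear)));
    }
}
```

Callers need a null check before reading the returned array, since both "no alignment" and "ambiguous alignment" collapse to null.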
tagIndex - alignerName -
public int addVariant(int tagIndex,
byte offset,
byte base)
public int getBestMapIndex(int tagIndex)
public byte getDcoP(int tagIndex)
tagIndex -
public byte getStrand(int tagIndex)
tagIndex -
public byte getDivergence(int tagIndex)
BLAST does not report divergence, so this always returns Byte.MIN_VALUE for BLAST hits
tagIndex -
public int getStartPosition(int tagIndex)
tagIndex -
public int getEndPosition(int tagIndex)
EndPosition of PEEnd1 is probably not the EndPosition of the tag
public byte getEvidence(int tagIndex)
Return the evidence
tagIndex -
public byte getMapP(int tagIndex)
public byte getMultiMaps(int tagIndex)
tagIndex -
public kotlin.Array[] getMappingIndicesOfAligner(net.maizegenetics.dna.map.TagMappingInfoV3.Aligner alignerName)
Return the map indices of an aligner
alignerName -
public kotlin.Array[] getPositionArray(int tagIndex)
public int getReadIndexForPositionIndex(int posIndex)
public TagMappingInfoV3 getMappingInfo(int tagIndex, int mapIndex)
Return tag mapping information of a tag in one map
tagIndex - mapIndex -
public TagMappingInfoV3[] getMappingInfoChunk(int tagIndex)
Return mapping information of a whole chunk. This avoids issues when multiple threads try to get TMI info, specifically for hypothesis genetic mapping
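One way to picture why whole-chunk reads help with multiple threads is the sketch below, in which each worker takes a private copy of a chunk under a lock and then works on that copy alone. The in-memory STORE array, the readChunk helper, and the chunk size are assumptions standing in for the HDF5 file; this is not the TASSEL implementation.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class ChunkPerThreadSketch {
    static final int CHUNK_SIZE = 4;  // assumed value, for illustration only
    static final int[] STORE = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};  // stand-in for on-disk data

    /** Copies one whole chunk under a lock, so callers never share a buffer. */
    static synchronized int[] readChunk(int chunkIndex) {
        int from = chunkIndex * CHUNK_SIZE;
        int to = Math.min(from + CHUNK_SIZE, STORE.length);
        return Arrays.copyOfRange(STORE, from, to);
    }

    /** Processes every chunk in parallel; each task touches only its own copy. */
    static long sumAllChunks() {
        int chunkNum = (STORE.length + CHUNK_SIZE - 1) / CHUNK_SIZE;
        return IntStream.range(0, chunkNum)
                .parallel()
                .mapToLong(c -> {
                    long sum = 0;
                    for (int v : readChunk(c)) {
                        sum += v;
                    }
                    return sum;
                })
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumAllChunks());
    }
}
```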
tagIndex -
public TagGeneticMappingInfo getGeneticMappingInfo(int tagIndex, int geneticMapIndex)
Return tag genetic mapping information of a tag in one genetic map
tagIndex - geneticMapIndex -
public TagGeneticMappingInfo getGeneticMappingInfoGW(int tagIndex)
public kotlin.Array[] getUniquePositions(int chromosome)
public void setMultimaps(int index,
byte multimaps)
public void setChromoPosition(int index,
int chromosome,
byte strand,
int positionMin,
int positionMax)
public void setDivergence(int index,
byte divergence)
public void setMapP(int index,
byte mapP)
public void setMapP(int index,
double mapP)
public void setVariantDef(int tagIndex,
int variantIndex,
byte def)
public void setVariantPosOff(int tagIndex,
int variantIndex,
byte offset)
public void setAllVariantInfo(int tagIndex,
kotlin.Array[] defAndOffset)
Preferred method for setting variant information
tagIndex -
defAndOffset - Two dimensions [0=definition, 1=offset][up to 16 bytes for each SNP]
public long sortTable(boolean byHaplotype)
public void clearVariants()
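The two-row layout described for setAllVariantInfo (row 0 for definitions, row 1 for offsets) can be sketched as follows. The byte element type, the one-column-per-variant reading, and the pack helper are assumptions made for illustration; the signature above does not show the element type.

```java
final class VariantInfoSketch {
    static final int DEFINITION_ROW = 0;
    static final int OFFSET_ROW = 1;

    /** Packs parallel definition and offset arrays into a [2][n] array (assumed layout). */
    static byte[][] pack(byte[] defs, byte[] offsets) {
        if (defs.length != offsets.length) {
            throw new IllegalArgumentException("defs and offsets must have the same length");
        }
        byte[][] defAndOffset = new byte[2][];
        defAndOffset[DEFINITION_ROW] = defs.clone();
        defAndOffset[OFFSET_ROW] = offsets.clone();
        return defAndOffset;
    }

    public static void main(String[] args) {
        // Two hypothetical variants: definition bytes 'A' and 'C' at offsets 3 and 17.
        byte[][] packed = pack(new byte[] {'A', 'C'}, new byte[] {3, 17});
        System.out.println(packed[DEFINITION_ROW].length + " variants packed");
    }
}
```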