| Class | Description |
|---|---|
| AnnotateTOPM | Methods to annotate a TOPM file, including adding mapping information from aligners, adding PE tag positions and genetic positions, and model prediction of the best position. |
| AnnotateTOPMwSAMPlugin | This class reads in SAM mapping results, tests them against an anchor map, and creates an updated HDF5 TOPM file. TODO: add mapping information from Bowtie2; add mapping information from BWA; add mapping information from BLAST?; run genetic to compare hypotheses; call resort. |
| Barcode | Container class for storing information on GBS barcodes. |
| BinaryToTextPlugin | |
| Clusters | |
| CompareGenosBetweenHapMapFilesPlugin | |
| ContigPETagCountPlugin | |
| DiscoverySNPCallerPlugin | This class aligns tags at the same physical location against one another, calls SNPs, and then outputs the SNPs to a HapMap file. It is multi-threaded, which gives a substantial speed increase. |
| FastqToPETagCountPlugin | Derives a PETagCount list for a pair of Fastq files. The forward and backward tags are ordered during processing. Keeps only good reads having a barcode and a cut site and no N's in the useful part of the sequence. For the barcoded end, trims off the barcodes and truncates sequences that (1) have a second cut site, or (2) read into the common adapter. For the unbarcoded end, trims off the barcodes and truncates sequences that (1) have a second cut site, or (2) read into the barcode adapter. |
| FastqToTBTPlugin | This pipeline converts a series of fastq files to TagsByTaxa files (one per fastq file). It requires a list of existing tags (Tags object), which may come from a TagCounts file or TOPM file. |
| FastqToTagCountPlugin | Derives a tagCount list for each fastq file in the input directory. Keeps only good reads having a barcode and a cut site and no N's in the useful part of the sequence. Trims off the barcodes and truncates sequences that (1) have a second cut site, or (2) read into the common adapter. |
| KeepSpecifiedReadsinFastqPlugin | |
| KeepSpecifiedSitesInTOPMPlugin | |
| KmerToTBTPlugin | This pipeline converts a series of fastq files to TagsByTaxa files (one per fastq file). It requires a list of existing tags (Tags object), which may come from a TagCounts file or TOPM file. |
| KmerToTagCountPlugin | Derives a tagCount list for each fastq file in the input directory. Keeps only good reads having a barcode and a cut site and no N's in the useful part of the sequence. Trims off the barcodes and truncates sequences that (1) have a second cut site, or (2) read into the common adapter. |
| MergeDuplicateSNPsPlugin | This class is intended to be run directly after DiscoverySNPCallerPlugin, using the HapMap file from that step as input. It finds duplicate SNPs in the HapMap file and merges them if they have the same pair of alleles (not necessarily in the same maj/min order) and if their mismatch rate is no greater than the threshold (-maxMisMat). If -callHets is on, genotypic disagreements will be called heterozygotes (otherwise they are set to 'N', the default). By default, any remaining unmerged duplicate SNPs (but not indels) will be deleted. They can be kept by invoking the -kpUnmergDups option. If the germplasm is not fully inbred and still contains residual heterozygosity (as the maize NAM or IBM populations do), then -callHets should be on and -maxMisMat should be set fairly high (0.1 to 0.2, depending on the amount of heterozygosity). TODO: the VCF support has been commented out, but this should all be merged into the main pipeline. |
| MergeMultipleTOPMPlugin | |
| MergeMultipleTagCountPlugin | |
| MergePETagCountPlugin | Merges the PETagCounts file of each taxon into one master PETagCounts file (collapsed and sorted). |
| MergeTagsByTaxaFilesByRowPlugin | Merges multiple TagsByTaxa files that are too large to fit in memory when combined. Currently, the output file stores only presence or absence of a tag in a taxon, so the merged count is the boolean OR of the individual counts. The program loops over files to determine their size, creates a RandomAccessFile object large enough to hold the merged data, then loops over each file again to fill the RandomAccessFile. |
| MergeTagsByTaxaFilesPlugin | Merges multiple TagsByTaxa files that are too large to fit in memory when combined. Currently, the output file stores only presence or absence of a tag in a taxon, so the merged count is the boolean OR of the individual counts. The program loops over files to determine their size, creates a RandomAccessFile object large enough to hold the merged data, then loops over each file again to fill the RandomAccessFile. |
| ModifyTBTHDF5Plugin | This pipeline modifies a TagsByTaxa HDF5 file with data organized by taxa. It can: (1) create an empty TBT; (2) merge two TBTs; (3) combine similarly named taxa; (4) pivot a taxa TBT to a tag TBT. |
| PEParseBarcodeRead | |
| PEReadBarcodeResult | Container class for returning the results of a parsed barcoded sequencing read. |
| ParseBarcodeRead | Takes a key file and then sets up the methods to decode a read from the sequencer. The key file describes how barcodes are related to their taxon. Generally, a key file with all flowcells is included, and then the flowcell and lane to be processed are indicated in the constructor. |
| PolymorphismFinder | |
| ProductionPipeline | |
| ProductionPipelineMain | This class is for running the GBS Production Pipeline. It is to be run from within the sTASSEL.jar. The cron job should be set up to run run_pipeline.pl, which has been modified to make this class the main() to be run. The JVM memory settings within run_pipeline.pl should also be adjusted upwards. Cron example: `20 3 * * * cd /workdir/tassel/tassel4-src && /usr/local/bin/perl /workdir/tassel/tassel4-src/run_prod_cron.pl >> /workdir/tassel/tassel4-src/20130808_cron.log 2>&1`. Note from Jeff Glaubitz (20130718): a minor detail is that ProductionSNPCallerPlugin needs the key file name to end with "_key.txt". All the output files are named after the key file, with "_key.txt" replaced by the appropriate extension. User: dkroon. Date: 4/8/13. |
| ProductionSNPCallerPlugin | This plugin converts all of the fastq (and/or qseq) files in the input folder and keyfile to genotypes and adds these to a genotype file in HDF5 format. We refer to this step as the "Production Pipeline". The output format is HDF5 genotypes with allelic depth stored. SNP calling is quantitative, with the option of using either the Glaubitz/Buckler binomial method (pHet/pErr > 1 = het), which is the default, or the Stacks method. Merging of samples with the same LibraryPrepID is handled by GenotypeTableBuilder.addTaxon(), with the genotypes re-called based upon the new depths. Therefore, if you want to keep adding genotypes to the same target HDF5 file in subsequent runs, use the -ko (keep open) option so that the output GenotypeTableBuilder will be mutable, using closeUnfinished() rather than build(). If the target output HDF5 GenotypeTable file doesn't exist, it will be created. Each taxon in the HDF5 file is named "ShortName:LibraryPrepID" and is annotated with "Flowcell_Lanes" (the source sequence data for the current genotype). Requires a TOPM with variants added from a previous "Discovery Pipeline" run, in binary TOPM or HDF5 format (TOPMInterface). TODO: add the Stacks likelihood method to BasicGenotypeMergeRule. |
| QseqToPETagCountPlugin | Derives a PETagCount list for a pair of Qseq files. The forward and backward tags are ordered during processing. Keeps only good reads having a barcode and a cut site and no N's in the useful part of the sequence. For the barcoded end, trims off the barcodes and truncates sequences that (1) have a second cut site, or (2) read into the common adapter. For the unbarcoded end, trims off the barcodes and truncates sequences that (1) have a second cut site, or (2) read into the barcode adapter. |
| QseqToTBTPlugin | This pipeline converts a series of qseq files to TagsByTaxa files (one per qseq file). It requires a list of existing tags (Tags object), which may come from a TagCounts file or TOPM file. |
| QseqToTagCountPlugin | Derives a tagCount list for each qseq file in the qseqDirectory. Keeps only good reads having a barcode and a cut site and no N's in the useful part of the sequence. Trims off the barcodes and truncates sequences that (1) have a second cut site, or (2) read into the common adapter. |
| ReadBarcodeResult | Container class for returning the results of a parsed barcoded sequencing read. |
| SAMConverterPlugin | This class can read in a CBSU TagMapFile into the gbs.TagsOnPhysicalMap data structure. |
| SAMWGMapConverterPlugin | This class reads in SAM mapping results, tests them against an anchor map, and creates an updated HDF5 TOPM file. |
| SNPLogging | |
| SeqToTBTHDF5Plugin | This pipeline converts a series of fastq or qseq files to a single TagsByTaxa HDF5 file. It requires a list of existing tags (Tags object), which may come from a TagCounts file or TOPM file. |
| ShortReadBarcodeResult | Container class for returning the results of a parsed barcoded sequencing read. The read length is stored as a short, so the maximum length is 32767 bp. |
| SimpleGenotypeSBit | Stores sBit for tag genetic mapping and LD detection, optimized for very fast calculation. |
| SmithWaterman | This class implements the classic local alignment algorithm (with a linear gap penalty function) due to T. F. Smith and M. S. Waterman (1981). |
| TOPMSummaryPlugin | |
| TagAgainstAnchor | |
| TagAgainstAnchorHypothesis | |
| TagAgainstAnchorPlugin | Genetic mapping of GBS tags. Steps: (1) use the -pc and -t options to calculate the number of TBT chunks; (2) when -pc is 0, run the program on a cluster, submitting jobs to nodes by specifying chunkStartIndex and chunkEndIndex. |
| TagBlockPosition | Stores the physical positions of tags. The physical position of a tag is used to block the corresponding marker in genetic mapping if the tag maps to a marker derived from the tag itself. Positions come from the TOPM alignment hypothesis or from the best position predicted by machine learning. |
| TagCountToFastqPlugin | |
| TagMatchFinder | Very simple but fast homology finder. Very similar to BLAT, with long word searches. An in-memory index is created from any set of Tags, and this index may then be quickly queried for high-homology hits. |
| TerryPipelines | |
| TextToBinaryPlugin | |
| UNetworkFilter | |
| UTagCountToTagPairPlugin | |
| UTagPairFinder | |
| UTagPairToTOPMPlugin |