public class RepGenPhase2AlignerPlugin
extends AbstractPlugin
This plugin takes an existing repGen db, grabs the tags whose depth meets the minimum specified in the minTagCount parameter, and makes kmer seeds from these tags. Forward and reverse primer sequences are supplied as an input parameter.

When a kmer seed is found on a reference chromosome, a ref sequence is created from 300 bp before the hit to 300 bp after the hit. This value is half the refKmerLen parameter passed by the user. Default refKmerLen is 600. The ref sequence is then searched for the primer pairs. If either both the forward primer and the reverse complement of the reverse primer, or the reverse primer and the reverse complement of the forward primer, are found, a reference tag is created starting at the start of the first occurring primer from the primer pair found in the sequence. If both forward and reverse pairs are found, the ref tag is created based on the best match, defaulting to the forward primer if both are found. The search for additional kmer matches on the chromosome begins at the position on the ref chrom following the end of the second primer in the matched pair.

The kmerLen field should match the length of the kmers stored as tags during the RepGenLoadSeqToDBPlugin step. The default is 150. The refKmerLen() should minimally be the length of the db kmer tags, but can be longer. Our defaults are 150 for kmer tags, and twice this length (300) for the refKmerLen. There are 2 count parameters: minTagCount specifies the minimum depth of a tag for it to be used when creating seed kmers.

This plugin creates and stores the reference tags in the refTag table in the database. Both the tagMapping and the physicalMapPosition tables will be populated with the reference tag information. Once the tables have been populated with the reference information, Smith Waterman is run to align all the non-reference tags in the db against each other, then each non-reference tag against the reference tags, and finally each refTag against all other refTags.
Alignment data is stored in the tagAlignments table. Smith Waterman from the SourceForge neobio project is used to determine the alignment score. Settings for match reward, mismatch penalty and gap penalty may be changed by the user via plugin parameters.
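
To illustrate how the match reward, mismatch penalty and gap penalty settings feed into a Smith Waterman score, here is a minimal, self-contained sketch. It is not the neobio implementation used by the plugin. The class name SmithWatermanSketch, the linear gap model, and the convention that penalties are passed as positive magnitudes and subtracted are all assumptions made for this example; the plugin's own sign convention may differ.

```java
// Hypothetical illustration only; not the neobio code the plugin calls.
final class SmithWatermanSketch {

    /**
     * Returns the best local alignment score between a and b.
     * Assumption: mismatchPenalty and gapPenalty are positive magnitudes
     * that are subtracted, and gaps are scored linearly (one penalty per gap base).
     */
    static int score(String a, String b,
                     int matchReward, int mismatchPenalty, int gapPenalty) {
        int[] prev = new int[b.length() + 1]; // previous row of the DP matrix
        int best = 0;
        for (int i = 1; i <= a.length(); i++) {
            int[] curr = new int[b.length() + 1];
            for (int j = 1; j <= b.length(); j++) {
                boolean same = a.charAt(i - 1) == b.charAt(j - 1);
                int diag = prev[j - 1] + (same ? matchReward : -mismatchPenalty);
                int up = prev[j] - gapPenalty;       // gap in b
                int left = curr[j - 1] - gapPenalty; // gap in a
                // Local alignment: a cell never drops below zero.
                int cell = Math.max(0, Math.max(diag, Math.max(up, left)));
                curr[j] = cell;
                best = Math.max(best, cell);
            }
            prev = curr;
        }
        return best;
    }
}
```

With a match reward of 2 and penalties of 1, two identical 4-base tags score 8, while aligning ACGT against ACT scores 5 (three matches and one gap).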
public RepGenPhase2AlignerPlugin()
public RepGenPhase2AlignerPlugin(java.awt.Frame parentFrame)
public RepGenPhase2AlignerPlugin(java.awt.Frame parentFrame,
boolean isInteractive)
public void postProcessParameters()
public static void writeToFile(java.lang.String chrom,
com.google.common.collect.Multimap<java.lang.String,java.lang.Integer> chromMaximaMap,
int minCount)
public static void writePeakPositions(java.lang.String chrom,
int peak,
java.util.List<java.lang.Integer> peakPositions)
public static void writeChrom9Bits(java.util.List<java.lang.Integer> peakPositions)
public javax.swing.ImageIcon getIcon()
public java.lang.String getButtonName()
public java.lang.String getToolTipText()
public static void main(java.lang.String[] args)
public java.lang.String inputDB()
Input database file with tags and taxa distribution
public RepGenPhase2AlignerPlugin inputDB(java.lang.String value)
Set Input DB. Input database file with tags and taxa distribution
value - Input DB
public java.lang.String refGenome()
Output fastq file to use as input for BWA or bowtie2
public RepGenPhase2AlignerPlugin refGenome(java.lang.String value)
Set Output File. Output fastq file to use as input for BWA or bowtie2
value - Output File
public java.lang.Integer minTagCount()
Minimum count of reads for a tag to be output
public RepGenPhase2AlignerPlugin minTagCount(java.lang.Integer value)
Set Min Count. Minimum count of reads for a tag to be output
value - Min Count
public java.lang.Integer seedLen()
Length of seed kmers
public RepGenPhase2AlignerPlugin seedLen(java.lang.Integer value)
Set Seed kmer length. Length of seeds to use in aligning.
value - seed length
public java.lang.Integer seedWindow()
Length of window between positions when creating seed from DB tags
public RepGenPhase2AlignerPlugin seedWindow(java.lang.Integer value)
Set Seed window length. Length of window between positions when creating seed from DB tags
value - seed length
public java.lang.Integer kmerLen()
Length of kmers as tag sequences in the db
public RepGenPhase2AlignerPlugin kmerLen(java.lang.Integer value)
Set Kmer length. Length of kmers to be stored as tag sequences in the db
value - kmer length
public java.lang.Integer refKmerLen()
Length of kmers as tag sequences in the db
public RepGenPhase2AlignerPlugin refKmerLen(java.lang.Integer value)
Set Kmer length. Length of kmers to be stored as tag sequences in the db
value - kmer length
public java.lang.Integer match_reward()
Parameter sent to Smith Waterman aligner for use in calculating reward when base pairs match.
public RepGenPhase2AlignerPlugin match_reward(java.lang.Integer value)
Set Match Reward Amount. Parameter sent to Smith Waterman aligner for use in calculating reward when base pairs match.
value - Match Reward Amount
public java.lang.Integer mismatch_penalty()
Parameter sent to Smith Waterman aligner for use in calculating penalty when base pairs are mis-matched.
public RepGenPhase2AlignerPlugin mismatch_penalty(java.lang.Integer value)
Set Mismatch Penalty Amount. Parameter sent to Smith Waterman aligner for use in calculating penalty when base pairs are mis-matched.
value - Mismatch Penalty Amount
public java.lang.Integer gap_penalty()
Parameter sent to Smith Waterman aligner for use in calculating penalty when a gap is identified.
public RepGenPhase2AlignerPlugin gap_penalty(java.lang.Integer value)
Set Gap Penalty Amount. Parameter sent to Smith Waterman aligner for use in calculating penalty when a gap is identified.
value - Gap Penalty Amount
public java.lang.String primers()
Tab delimited file that contains the column headers chrom, forward, reverse. The values in each column are the chromosome name, the forward primer sequence and the reverse primer sequence for the specified chromosome.
public RepGenPhase2AlignerPlugin primers(java.lang.String value)
Set Primers. Tab delimited file that contains the column headers chrom, forward, reverse. The values in each column are the chromosome name, the forward primer sequence and the reverse primer sequence for the specified chromosome.
value - Primers
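
The following sketch shows one way the primers file and the primer-pair check from the class description could fit together. It is an illustration under stated assumptions, not the plugin's code: the names PrimerSketch, PrimerPair, parse, reverseComplement and refTagStart are hypothetical, primers are matched exactly with String.indexOf (the plugin may tolerate mismatches when choosing the "best match"), and the forward orientation is tried first.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; none of these names come from the plugin.
final class PrimerSketch {

    /** Forward and reverse primer sequences for one chromosome. */
    static final class PrimerPair {
        final String forward;
        final String reverse;
        PrimerPair(String forward, String reverse) {
            this.forward = forward;
            this.reverse = reverse;
        }
    }

    /** Parses tab-delimited text whose first row is the header chrom, forward, reverse. */
    static Map<String, PrimerPair> parse(String text) {
        Map<String, PrimerPair> byChrom = new LinkedHashMap<>();
        String[] lines = text.split("\\R");
        for (int i = 1; i < lines.length; i++) { // row 0 is the header
            if (lines[i].trim().isEmpty()) continue;
            String[] cols = lines[i].split("\t");
            if (cols.length < 3) {
                throw new IllegalArgumentException("Expected 3 columns: " + lines[i]);
            }
            byChrom.put(cols[0].trim(),
                    new PrimerPair(cols[1].trim().toUpperCase(), cols[2].trim().toUpperCase()));
        }
        return byChrom;
    }

    /** Reverse complement of a DNA sequence; non-ACGT characters become N. */
    static String reverseComplement(String seq) {
        StringBuilder sb = new StringBuilder(seq.length());
        for (int i = seq.length() - 1; i >= 0; i--) {
            switch (seq.charAt(i)) {
                case 'A': sb.append('T'); break;
                case 'T': sb.append('A'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                default:  sb.append('N');
            }
        }
        return sb.toString();
    }

    /**
     * Returns the index in window where a reference tag would start
     * (the start of the first occurring primer of a matched pair), or -1.
     * Assumption: exact matches only, forward orientation preferred.
     */
    static int refTagStart(String window, PrimerPair p) {
        int start = pairStart(window, p.forward, reverseComplement(p.reverse));
        if (start >= 0) return start;
        return pairStart(window, p.reverse, reverseComplement(p.forward));
    }

    // Both members of the pair must be present; the earlier position wins.
    private static int pairStart(String window, String first, String second) {
        int a = window.indexOf(first);
        int b = window.indexOf(second);
        return (a < 0 || b < 0) ? -1 : Math.min(a, b);
    }
}
```

In this sketch the window argument stands in for the ref sequence cut around a seed hit, and the map returned by parse is keyed by the chrom column so that each chromosome is checked against its own primer pair.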