public class RGBSProductionSNPCallerPlugin
extends AbstractPlugin
This plugin converts all of the fastq (and/or qseq) files in the input folder and key file to genotypes and adds these to a genotype file. We refer to this step as the "Production Pipeline".
The output is either HDF5 or VCF genotypes with allelic depth stored. The output file type is determined by the presence of the ".h5" suffix. If the target output is HDF5 and that GenotypeTable file doesn't exist, it will be created.
SNP calling is quantitative, with the option of using either the Glaubitz/Buckler binomial method (the default; a call is heterozygous when pHet/pErr > 1) or the Stacks method.
Merging of samples with the same LibraryPrepID is handled by GenotypeTableBuilder.addTaxon(), with the genotypes re-called based upon the new depths. To keep adding genotypes to the same target HDF5 file in subsequent runs, use the -ko (keep open) option so that the output GenotypeTableBuilder remains mutable, using closeUnfinished() rather than build().
Each taxon in the output file is named "ShortName:LibraryPrepID" and is annotated with "Flowcell_Lanes" (the source sequence data for the current genotype).
Requires a database with variants added from a previous "Discovery Pipeline" run.
References to "tag" are being replaced by references to "kmer", as the pipeline is really a kmer alignment process.
TODO: add the Stacks likelihood method to BasicGenotypeMergeRule.
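The binomial rule mentioned above (heterozygous when pHet/pErr > 1) can be illustrated with a small standalone sketch. This is not the plugin's code: the class BinomialHetCallSketch, the method isHet, and the choice of 0.5 as the expected minor-allele fraction for a heterozygote are assumptions made for illustration, and the actual merge rule in TASSEL may differ in detail. The sketch compares the binomial likelihood of the observed minor-allele depth under a heterozygote model with its likelihood under a sequencing-error model, and shows how a call can change once depths from two samples are merged.

```java
// Hypothetical illustration only; not part of the TASSEL API.
final class BinomialHetCallSketch {

    // Log-likelihood of k minor-allele reads out of n at per-read probability p.
    // The binomial coefficient is omitted because it cancels in the ratio pHet/pErr.
    static double logLikelihood(int k, int n, double p) {
        return k * Math.log(p) + (n - k) * Math.log(1.0 - p);
    }

    // Returns true when pHet/pErr > 1, i.e. log(pHet) > log(pErr).
    // Assumption: a heterozygote yields each allele with probability 0.5,
    // and a homozygote yields the minor allele only through sequencing error.
    static boolean isHet(int depthA, int depthB, double aveSeqErrorRate) {
        if (aveSeqErrorRate <= 0.0 || aveSeqErrorRate >= 1.0) {
            throw new IllegalArgumentException("error rate must be in (0, 1)");
        }
        int n = depthA + depthB;
        int minor = Math.min(depthA, depthB);
        double logPHet = logLikelihood(minor, n, 0.5);
        double logPErr = logLikelihood(minor, n, aveSeqErrorRate);
        return logPHet > logPErr;
    }

    public static void main(String[] args) {
        double err = 0.01;
        // A single sample with depths (3, 0) is called homozygous.
        System.out.println(isHet(3, 0, err));
        // After merging with a second sample (1, 2), the summed depths (4, 2)
        // are re-called and now support a heterozygous call.
        System.out.println(isHet(3 + 1, 0 + 2, err));
    }
}
```

Working in log space avoids underflow at high depth, and re-calling from summed depths mirrors the description of merging samples that share a LibraryPrepID.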
public RGBSProductionSNPCallerPlugin()
public RGBSProductionSNPCallerPlugin(java.awt.Frame parentFrame,
boolean isInteractive)
public void postProcessParameters()
public void setTagLenException()
public javax.swing.ImageIcon getIcon()
public java.lang.String getButtonName()
public java.lang.String getToolTipText()
public TagData runPlugin(DataSet input)
Convenience method to run plugin with one return object.
public java.lang.String inputDirectory()
Input directory containing fastq AND/OR qseq files.
public RGBSProductionSNPCallerPlugin inputDirectory(java.lang.String value)
Set Input Directory. Input directory containing fastq AND/OR qseq files.
value - Input Directory
public java.lang.String keyFile()
Key file listing barcodes distinguishing the samples
public RGBSProductionSNPCallerPlugin keyFile(java.lang.String value)
Set Key File. Key file listing barcodes distinguishing the samples
value - Key File
public java.lang.String inputGBSDatabase()
Input Database file if using SQLite
public RGBSProductionSNPCallerPlugin inputGBSDatabase(java.lang.String value)
Set Input GBS Database. Input Database file if using SQLite
value - Input GBS Database
public java.lang.String outputGenotypesFile()
Output (target) genotypes file to add new genotypes to (new file created if it doesn't exist)
public RGBSProductionSNPCallerPlugin outputGenotypesFile(java.lang.String value)
Set Output Genotypes File. Output (target) genotypes file to add new genotypes to (new file created if it doesn't exist)
value - Output Genotypes File
public java.lang.Double aveSeqErrorRate()
Average sequencing error rate per base (used to decide between heterozygous and homozygous calls)
public RGBSProductionSNPCallerPlugin aveSeqErrorRate(java.lang.Double value)
Set Ave Seq Error Rate. Average sequencing error rate per base (used to decide between heterozygous and homozygous calls)
value - Ave Seq Error Rate
public java.lang.Integer maxDivergence()
Maximum divergence (edit distance) between new read and previously mapped read (Default: 0 = perfect matches only)
public RGBSProductionSNPCallerPlugin maxDivergence(java.lang.Integer value)
Set Max Divergence. Maximum divergence (edit distance) between new read and previously mapped read (Default: 0 = perfect matches only)
value - Max Divergence
public java.lang.Boolean depthToOutput()
Output depth: write depths to the output hdf5 genotypes file
public RGBSProductionSNPCallerPlugin depthToOutput(java.lang.Boolean value)
User sets true or false, indicating if they do or do not want depth information written to the HDF5 file.
value - Write depth to output file
public java.lang.Integer kmerLength()
Maximum Tag Length
public RGBSProductionSNPCallerPlugin kmerLength(java.lang.Integer value)
Set Maximum Tag Length. Set this to the same maximum tag length that was used in GBSSeqToTagDBPlugin when creating the database. If the two values are not equal, inconsistent results may occur.
value - Maximum Tag Length
public java.lang.Double positionQualityScore()
Minimum Position Quality Score
public RGBSProductionSNPCallerPlugin positionQualityScore(java.lang.Double value)
Set Minimum Position Quality Score. This value is used to pull SNPs out of the snpposition table. Only SNPs with quality scores meeting or exceeding the specified value will be processed.
value - Minimum position quality score
public java.lang.Integer batchSize()
Batch size for processing fastq files
public RGBSProductionSNPCallerPlugin batchSize(java.lang.Integer value)
Set number of Fastq files processed simultaneously
value - Batch size
public java.lang.Integer minimumQualityScore()
Minimum quality score within the barcode and read length to be accepted
public RGBSProductionSNPCallerPlugin minimumQualityScore(java.lang.Integer value)
Set Minimum quality score. Minimum quality score within the barcode and read length to be accepted
value - Minimum quality score
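A minimum quality score filter of the kind described for minimumQualityScore() might look like the sketch below. It is an illustration under stated assumptions, not the plugin's implementation: the class MinQualitySketch and method passes are hypothetical, the quality string is assumed to be Phred+33 encoded (as in current fastq files), and a read is assumed to be rejected if any base within the checked span (barcode plus read length) falls below the threshold.

```java
// Hypothetical illustration only; not part of the TASSEL API.
final class MinQualitySketch {

    // Returns true if every base in the first `span` positions of the
    // Phred+33 quality string has a score of at least minQuality.
    // `span` stands in for barcode length plus read (kmer) length.
    static boolean passes(String qualityString, int span, int minQuality) {
        int n = Math.min(span, qualityString.length());
        for (int i = 0; i < n; i++) {
            int phred = qualityString.charAt(i) - 33; // assumed Phred+33 offset
            if (phred < minQuality) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
        System.out.println(passes("IIII#", 4, 20)); // low-quality base lies outside the span
        System.out.println(passes("IIII#", 5, 20)); // low-quality base lies inside the span
    }
}
```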