public class HapBreakpoints_IFLFilePlugin
extends AbstractPlugin
This is defined in tas-1098. This plugin takes a haplotype breakpoint file and creates intermediate files for the hap_breakpoint and breakpoint_set tables. The three tables described here (hap_breakpoint, breakpoint_set, and projection_align) are currently proprietary tables created by the Buckler lab to be added to the GOBII postgres DB. The tables are created from create_hapBrkptTables.sql on cbsudc01 in directory /workdir/lcj34/postgresFiles/gobii_ifl_filesToLoad/gobii_hapbreakpoints

The tables to be created have these entries:

hap_breakpoint table:
    hap_breakpoint_id    int
    taxa (GID)           int
    position_range       int4range (start/stop stored as an integer range)
    donor1 (GID)         int
    donor2 (GID)         int
    breakpoint_set_id    int (maps to the breakpoint_set table)

breakpoint_set table:
    breakpoint_set_id    int
    name                 text
    method               text (method used to create the breakpoint file, e.g. FILLIN)
    mapset_id            int

projection_align table:
    projection_align_id  int
    name                 text
    het_resolution       text
    breakpoint_set_id    int
    dataset_id           int (gives us the donor file)

The thought is that the GOBII IFL scripts will operate successfully on these files. This requires the tables to have been created in the DB beforehand, and mapping files to be created and stored on the DB server so they can be used to process these files for bulk loading.

This method will take the set name and use it to populate the breakpoint table. The index into each table still needs to be created the way GOBII does it, with auto-increment; these live in pg_catalog.

Restrictions: The break file format must be as defined by Ed. The first line must contain 2 tab-delimited values, the first indicating the number of donors and the second indicating the number of "blocks" to process. All lines beginning with a "#" are considered comment lines.

There are 2 mapping files required: one for donors, and one for taxa. In the Ames example, the donor mapping file is the same file used to curate WGS dataset 5, which is named ZeaWGS_hmp321_raw_AGPv3. The taxa mapping file is the file Cinta provides with each subset, e.g. for Ames in Cornell Box.
If the TaxaColumn in the mapping file contains a libraryID (e.g. name:libraryID), the code removes it, leaving taxa set to just the "name" portion. This is because Kelly's pa.txt.gz files do NOT have the library portion, though some of Cinta's files do.

The breakpoint file has donors at the top. Donors are Feb-48, PHG84, PHG83, etc. Their GIDs come from the WGS mapping file Cinta gave for curating that large vcf set.

    1210 2711
    #Donor Haplotypes
    0 Feb-48
    1 PHG84
    2 PHG83

The taxa breakpoint blocks are at the bottom. "taxa" is the first column. It gets its GID from the germplasminformation_Ames_20160810.txt file Cinta provided in Cornell Box with the Ames data. Below, 12E and 37 are taxa.

    #Block are defined chr:startPos:endPos:donor1:donor2 (-1 means no hypothesis)
    12E 1:299497909:299752426:958:1070 1:299777120:300064510:79:958 1:300064521:300335541:1050:1202 1:300351838:300818858:162:7
    37 2:233507851:234268676:1015:1015 2:234268677:234350614:820:820 2:234350639:234411604:950:950 2:234411637:234464003:191:1

The third table, the projection_alignment table, will be populated by another plugin. Users will create their own projection alignment analyses for the breakpoint sets of their choice.

-------------
Sept 14, 2016: Trying with Cinta's "name2" field, as the taxa are not matching from the taxa column of the taxa mapping file. Donors are working.
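The break file rules described above (a two-value tab-delimited first line, "#" comment lines, chr:startPos:endPos:donor1:donor2 blocks, and removal of a ":libraryID" suffix) can be illustrated with a minimal sketch. This is not the plugin's own code: the class BreakFileSketch and all of its methods are hypothetical names, and the sketch assumes that the blocks on a taxon line are separated by whitespace and that the donor fields are plain integers.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of the plugin's API. */
final class BreakFileSketch {

    /** One breakpoint block, written in the file as chr:startPos:endPos:donor1:donor2. */
    static final class Block {
        final String chr;
        final int start;
        final int end;
        final int donor1; // -1 means no hypothesis
        final int donor2; // -1 means no hypothesis

        Block(String chr, int start, int end, int donor1, int donor2) {
            this.chr = chr;
            this.start = start;
            this.end = end;
            this.donor1 = donor1;
            this.donor2 = donor2;
        }
    }

    /** Lines beginning with "#" are comment lines. */
    static boolean isComment(String line) {
        return line.startsWith("#");
    }

    /** Parses the first line: donor count and block count, tab-delimited. */
    static int[] parseHeader(String line) {
        String[] parts = line.trim().split("\t");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected 2 tab-delimited values: " + line);
        }
        return new int[] {
            Integer.parseInt(parts[0].trim()),
            Integer.parseInt(parts[1].trim())
        };
    }

    /** Drops a ":libraryID" suffix, leaving only the name portion. */
    static String stripLibraryId(String taxon) {
        int colon = taxon.indexOf(':');
        return colon < 0 ? taxon : taxon.substring(0, colon);
    }

    /** Parses a single chr:startPos:endPos:donor1:donor2 token. */
    static Block parseBlock(String token) {
        String[] f = token.split(":");
        if (f.length != 5) {
            throw new IllegalArgumentException("expected 5 colon-separated fields: " + token);
        }
        return new Block(f[0], Integer.parseInt(f[1]), Integer.parseInt(f[2]),
                Integer.parseInt(f[3]), Integer.parseInt(f[4]));
    }

    /** Parses a taxon line; assumes the taxon name comes first, then whitespace-separated blocks. */
    static List<Block> parseTaxonBlocks(String line) {
        String[] tokens = line.trim().split("\\s+");
        List<Block> blocks = new ArrayList<>();
        for (int i = 1; i < tokens.length; i++) {
            blocks.add(parseBlock(tokens[i]));
        }
        return blocks;
    }
}
```

The start and end values of each block are what would end up in the position_range column; how the plugin actually formats that range in its intermediate file is not stated here, so the sketch stops at parsing.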
public HapBreakpoints_IFLFilePlugin(java.awt.Frame parentFrame,
boolean isInteractive)
public HapBreakpoints_IFLFilePlugin()
protected void postProcessParameters()
public java.util.HashMap<java.lang.String,java.lang.String> createNameGidMap(java.lang.String mappingFile,
java.lang.String taxaField)
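This page gives no description for createNameGidMap. Going only by the class description, a name-to-GID lookup built from a tab-delimited mapping file might look like the sketch below. Everything here is an assumption made for illustration: the class NameGidMapSketch, the method buildNameGidMap, the presence of a header row, and the gidField parameter naming the GID column. It takes the file's lines directly rather than a path so that it stays self-contained.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

/** Hypothetical sketch; the real createNameGidMap may work differently. */
final class NameGidMapSketch {

    /**
     * Builds a name -> GID map from tab-delimited lines whose first line is a header.
     * taxaField and gidField are header names; gidField is an assumed parameter.
     */
    static HashMap<String, String> buildNameGidMap(List<String> lines,
                                                   String taxaField,
                                                   String gidField) {
        if (lines.isEmpty()) {
            throw new IllegalArgumentException("mapping file is empty");
        }
        List<String> header = Arrays.asList(lines.get(0).split("\t"));
        int taxaCol = header.indexOf(taxaField);
        int gidCol = header.indexOf(gidField);
        if (taxaCol < 0 || gidCol < 0) {
            throw new IllegalArgumentException("missing column " + taxaField + " or " + gidField);
        }

        HashMap<String, String> nameToGid = new HashMap<>();
        for (String line : lines.subList(1, lines.size())) {
            String[] cols = line.split("\t", -1); // -1 keeps trailing empty columns
            if (cols.length <= Math.max(taxaCol, gidCol)) {
                continue; // skip blank or short rows
            }
            String name = cols[taxaCol];
            int colon = name.indexOf(':');
            if (colon >= 0) {
                name = name.substring(0, colon); // drop the ":libraryID" portion
            }
            nameToGid.put(name, cols[gidCol]);
        }
        return nameToGid;
    }
}
```

Keying the map on the name with the library portion removed mirrors the rule in the class description, so that a break file taxon without a library suffix still finds its GID.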
public javax.swing.ImageIcon getIcon()
public java.lang.String getButtonName()
public java.lang.String getToolTipText()
public static void main(java.lang.String[] args)
public java.lang.String breakFile()
Full path to file containing breakpoint blocks for projection alignment
public HapBreakpoints_IFLFilePlugin breakFile(java.lang.String value)
Set Breakpoint File. Full path to file containing breakpoint blocks for projection alignment
value - Breakpoint File
public java.lang.String setName()
Name to be given to this set of breakpoints. This name will be stored in the breakpointSet table.
public HapBreakpoints_IFLFilePlugin setName(java.lang.String value)
Set Breakpoint Set Name. Name to be given to this set of breakpoints. This name will be stored in the breakpointSet table.
value - Breakpoint Set Name
public java.lang.String mapset()
Name of the mapset to which these breakpoints refer. Must match an existing name in the mapset table, e.g. AGPV3
public HapBreakpoints_IFLFilePlugin mapset(java.lang.String value)
Set mapset. Name of the mapset to which these breakpoints refer. Must match an existing name in the mapset table, e.g. AGPV3
value - mapset
public java.lang.String src_dataset()
Name of the dataset from which these breakpoints were created. Must match an existing name in the dataset table.
public HapBreakpoints_IFLFilePlugin src_dataset(java.lang.String value)
Set src_dataset. Name of the dataset from which these breakpoints were created. Must match an existing name in the dataset table.
value - src_dataset
public java.lang.String outputDir()
Full path name of the directory to which output files will be written. Must end with a /
public HapBreakpoints_IFLFilePlugin outputDir(java.lang.String value)
Set Path of output directory. Full path name of the directory to which output files will be written. Must end with a /
value - Path of output directory
public java.lang.String method()
Method used to create the breakpoints, e.g. FILLIN, beagle, etc.
public HapBreakpoints_IFLFilePlugin method(java.lang.String value)
Set Breakpoint Method. Method used to create the breakpoints, e.g. FILLIN, beagle, etc.
value - Breakpoint Method
public java.lang.String donorMapFile()
Tab-delimited file containing a TaxaColumn name to be used to map breakFile donors with a GID. This may be the same file that was used for adding germplasm and marker data.
public HapBreakpoints_IFLFilePlugin donorMapFile(java.lang.String value)
Set mappingFile. Tab-delimited file containing a TaxaColumn name to be used to map breakFile donors with a GID. This may be the same file that was used for adding germplasm and marker data.
value - mappingFile
public java.lang.String taxaMapFile()
Tab-delimited file containing a TaxaColumn name to be used to map breakFile taxa with a GID. This may be a different file than was used for adding germplasm and marker data.
public HapBreakpoints_IFLFilePlugin taxaMapFile(java.lang.String value)
Set mappingFile. Tab-delimited file containing a TaxaColumn name to be used to map breakFile taxa with a GID. This may be a different file than was used for adding germplasm and marker data.
value - mappingFile