public class PreProcessGOBIIMappingFilePlugin
extends AbstractPlugin
This plugin should be run before creating the intermediate files for marker and dnarun. It serves three purposes, all driven by the mapping file created for the dataset:

1. Identify duplicate and missing germplasm/dnasample entries, create intermediate files for the germplasm and dnasample tables, and load any missing entries. Duplicates are skipped.
2. Identify duplicate libraryPrepIds and write the list of duplicates to a file.
3. Provide mapping data to load new marker/dnarun related tables: create the intermediate files and load them via the GOBII IFL scripts.

The first two purposes require querying the database. Missing entries are defined as follows:

germplasm table: From the db, get the list of distinct MGIDs (they should all be distinct). Compare this list to the MGIDs in the file. For any MGID from the file that does not appear in the db list, create a line in the *.germplasm intermediate file used to add values.

dnasample table: From the db, get a list of dnasample names. Each name is a string composed of the components GID:plate:well. For each entry in the input file, create the concatenated string GID:plate:well and compare it to the list from the db. For any name that does not appear, create a line in the *.dnasample intermediate file for loading. This file needs the "name" field to be the concatenation GID:plate:well, because that value is unique and the GOBII dnasample.dupmap looks only at the name field. The code field can hold the MGID if that needs to be stored (which I think it does). The file takes an "external code" column instead of germplasm_id, because that column maps to the external_code field in the germplasm table when GOBII IFL looks up the germplasm_id in the db. The file also needs project_name, which comes from the mapping file.

dnarun table: From the db, get a list of all dnasample.name fields. These should be distinct library prep ids. Compare them to the libraryPrepIds from the mapping file. If there are duplicates, write them to a file to show the biologist.
NOTES: GOBII uses dnasample.name and dnasample.num to determine duplicates. BL is not populating dnasample.num, so "num" has been removed from the dnasample.dupmap file when running this. For some reason, when "num" was present but all of its values were "null", the script treated the values as different, and all dnasamples ended up duplicated when the file was sent through the GOBII scripts. With that line removed, the scripts checked only the "name" field and the project id, and the load worked.

For step 3: the intermediate files are created by MarkerDNARunMGID_fromHMPIFIFIlePLugin.java. Note that the dnasample and germplasm entries must be loaded into the db before the marker/dnarun intermediate files are loaded, or the necessary db ids will not be found.
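The comparisons described above (missing germplasm/dnasample entries and duplicate libraryPrepIds) can be sketched as plain set operations. This is only an illustration, not the plugin's actual code: the class MappingChecks and its method names are hypothetical, the database values are assumed to have been read into a Set already, and treating an id repeated within the mapping file as a duplicate is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of the plugin's API.
final class MappingChecks {

    // Build the dnasample name as the concatenation GID:plate:well.
    static String dnasampleName(String gid, String plate, String well) {
        return gid + ":" + plate + ":" + well;
    }

    // Values from the mapping file that are absent from the database,
    // in first-seen order. A value repeated in the file is reported once.
    static List<String> missingEntries(Set<String> fromDb, List<String> fromFile) {
        Set<String> missing = new LinkedHashSet<>();
        for (String value : fromFile) {
            if (!fromDb.contains(value)) {
                missing.add(value);
            }
        }
        return new ArrayList<>(missing);
    }

    // Ids from the mapping file that already exist in the database or
    // (assumption) occur more than once within the file itself.
    static List<String> duplicateIds(Set<String> fromDb, List<String> fromFile) {
        Set<String> seen = new HashSet<>(fromDb);
        Set<String> duplicates = new LinkedHashSet<>();
        for (String id : fromFile) {
            if (!seen.add(id)) {
                duplicates.add(id);
            }
        }
        return new ArrayList<>(duplicates);
    }

    public static void main(String[] args) {
        // Made-up example values, standing in for db query results and file rows.
        Set<String> dbNames = new HashSet<>(Arrays.asList("101:P1:A01"));
        List<String> fileNames = Arrays.asList(
                dnasampleName("101", "P1", "A01"),
                dnasampleName("102", "P1", "A02"));
        System.out.println("missing dnasamples: " + missingEntries(dbNames, fileNames));

        Set<String> dbIds = new HashSet<>(Arrays.asList("L1"));
        List<String> fileIds = Arrays.asList("L1", "L2", "L2", "L3");
        System.out.println("duplicate libraryPrepIds: " + duplicateIds(dbIds, fileIds));
    }
}
```

The same missingEntries call would apply to MGIDs for the germplasm table; only the source of the two collections changes.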
public PreProcessGOBIIMappingFilePlugin(java.awt.Frame parentFrame,
boolean isInteractive)
public PreProcessGOBIIMappingFilePlugin()
public javax.swing.ImageIcon getIcon()
public java.lang.String getButtonName()
public java.lang.String getToolTipText()
public java.lang.String dbConfigFile()
DB connection config file
public PreProcessGOBIIMappingFilePlugin dbConfigFile(java.lang.String value)
Set dbConfigFile. DB connection config file
value - dbConfigFile
public java.lang.String datasetName()
Name of dataset whose marker and dnarun IDs are to be pulled
public PreProcessGOBIIMappingFilePlugin datasetName(java.lang.String value)
Set dataset name. Name of dataset whose marker and dnarun IDs are to be pulled
value - dataset name
public java.lang.String mappingFile()
Tab-delimited file containing the columns taxaColumn, name, MGID, GID, libraryID, plate_code, well, species, type, project_id, experiment_name, platform_name, reference_name and dataset_name
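A quick pre-flight check of the mapping file header can catch a missing column before the plugin runs. The sketch below is illustrative only: the class MappingHeaderCheck is hypothetical, and it assumes the header uses the documented column names literally and case-sensitively (including "taxaColumn"), which the plugin itself may not require.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical pre-flight check; not part of the plugin's API.
final class MappingHeaderCheck {

    // Column names as listed in the mappingFile description.
    // Assumption: the header matches these names exactly.
    static final List<String> EXPECTED = Arrays.asList(
            "taxaColumn", "name", "MGID", "GID", "libraryID", "plate_code",
            "well", "species", "type", "project_id", "experiment_name",
            "platform_name", "reference_name", "dataset_name");

    // Return the expected columns that are absent from a tab-delimited header line.
    static List<String> missingColumns(String headerLine) {
        List<String> present = new ArrayList<>();
        for (String column : headerLine.split("\t", -1)) {
            present.add(column.trim());
        }
        List<String> missing = new ArrayList<>(EXPECTED);
        missing.removeAll(present);
        return missing;
    }

    public static void main(String[] args) {
        // A deliberately incomplete header, for demonstration.
        String header = "taxaColumn\tname\tMGID\tGID";
        System.out.println("missing columns: " + missingColumns(header));
    }
}
```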
public PreProcessGOBIIMappingFilePlugin mappingFile(java.lang.String value)
Set mappingFile. Tab-delimited file containing the columns taxaColumn, name, MGID, GID, libraryID, plate_code, well, species, type, project_id, experiment_name, platform_name, reference_name and dataset_name
value - mappingFile
public java.lang.String outputDir()
Full path name of output directory, must end with a /
public PreProcessGOBIIMappingFilePlugin outputDir(java.lang.String value)
Set Path of output directory. Full path name of output directory, must end with a /
value - Path of output directory
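Because outputDir must end with a /, a caller may want to normalize the path before passing it in. A minimal sketch, assuming a hypothetical helper named OutputDirCheck that is not part of the plugin:

```java
// Hypothetical helper; not part of the plugin's API.
final class OutputDirCheck {

    // Append a trailing "/" only when the path does not already end with one.
    static String withTrailingSlash(String dir) {
        return dir.endsWith("/") ? dir : dir + "/";
    }

    public static void main(String[] args) {
        // Made-up path, for demonstration only.
        System.out.println(withTrailingSlash("/data/gobii/out"));
    }
}
```

The result can then be handed to the outputDir(java.lang.String) setter documented above.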