public class TaxaListIOUtils
Utilities for reading and writing IdGroup and PedigreeIdGroups.
public static com.google.common.collect.Multimap<java.lang.String,net.maizegenetics.taxa.Taxon> getMapOfTaxonByAnnotation(TaxaList taxaList, java.lang.String annotation)
Create a Multimap of all the taxa associated with a particular annotation value.
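The grouping described above can be sketched in plain Java. This is a minimal illustration, not the TASSEL implementation: the taxa list is modeled as a hypothetical map of taxon name to annotation key/values, a Map of Lists stands in for the Guava Multimap, and the class name, method name, and sample data are all assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: a taxa list is modeled as
// taxon name -> (annotation key -> values). Not the TASSEL API.
class TaxonGroupingSketch {

    // Groups taxon names under each value they carry for the given annotation key.
    static Map<String, List<String>> groupByAnnotation(
            Map<String, Map<String, List<String>>> taxa, String annotation) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, List<String>>> taxon : taxa.entrySet()) {
            List<String> values = taxon.getValue().getOrDefault(annotation, List.of());
            for (String value : values) {
                grouped.computeIfAbsent(value, k -> new ArrayList<>()).add(taxon.getKey());
            }
        }
        return grouped;
    }

    // Made-up sample data for illustration only.
    static Map<String, Map<String, List<String>>> sampleTaxa() {
        Map<String, Map<String, List<String>>> taxa = new LinkedHashMap<>();
        taxa.put("B73", Map.of("GermType", List.of("Inbred")));
        taxa.put("Mo17", Map.of("GermType", List.of("Inbred")));
        taxa.put("B73xMo17", Map.of("GermType", List.of("Hybrid")));
        return taxa;
    }

    public static void main(String[] args) {
        System.out.println(groupByAnnotation(sampleTaxa(), "GermType"));
    }
}
```

Taxa that lack the annotation key simply do not appear in the result in this sketch.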
taxaList - input taxa list with associated annotations
annotation - annotation key used to create the multimap; the values of this key become the keys of the resulting Multimap
public static java.util.Optional<java.util.Map> getUniqueMapOfTaxonByAnnotation(TaxaList taxaList, java.lang.String annotation)
Create a Map of all the taxa associated with a particular annotation value. If there would be a duplicate mapping, then an Optional.empty() is returned.
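The duplicate-handling rule above can be sketched as follows. This is an illustration under stated assumptions, not the library code: the taxa list is again a hypothetical map of taxon name to annotation key/values, and the class name, method name, and the "Accession" annotation are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in: taxon name -> (annotation key -> values). Not the TASSEL API.
class UniqueTaxonMapSketch {

    // Maps each annotation value to a single taxon name, or returns
    // Optional.empty() as soon as one value would map to two entries.
    static Optional<Map<String, String>> uniqueMapByAnnotation(
            Map<String, Map<String, List<String>>> taxa, String annotation) {
        Map<String, String> unique = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, List<String>>> taxon : taxa.entrySet()) {
            for (String value : taxon.getValue().getOrDefault(annotation, List.of())) {
                // putIfAbsent returns the previous mapping when the key already exists.
                if (unique.putIfAbsent(value, taxon.getKey()) != null) {
                    return Optional.empty();
                }
            }
        }
        return Optional.of(unique);
    }

    // Made-up sample data for illustration only.
    static Map<String, Map<String, List<String>>> sampleTaxa() {
        Map<String, Map<String, List<String>>> taxa = new LinkedHashMap<>();
        taxa.put("B73", Map.of("GermType", List.of("Inbred"), "Accession", List.of("A1")));
        taxa.put("Mo17", Map.of("GermType", List.of("Inbred"), "Accession", List.of("A2")));
        return taxa;
    }

    public static void main(String[] args) {
        System.out.println(uniqueMapByAnnotation(sampleTaxa(), "Accession").isPresent());
        System.out.println(uniqueMapByAnnotation(sampleTaxa(), "GermType").isPresent());
    }
}
```

Callers would check `isPresent()` before using the map, since a shared annotation value such as GermType cannot yield a one-to-one mapping.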
taxaList - input taxa list with associated annotations
annotation - annotation key used to create the map; the values of this key become the keys of the resulting map
public static TaxaList subsetTaxaListByAnnotation(TaxaList baseTaxaList, java.lang.String annotation, java.lang.String annoValue)
Returns a subsetted taxa list based on annotation value. For example, return all taxa where GermType=Inbred.
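The GermType=Inbred example can be sketched in plain Java. The model below (taxon name to annotation key/values), the class and method names, and the sample data are assumptions made for illustration; it also assumes that a taxon matches when any of its values for the key equals the tested value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: taxon name -> (annotation key -> values). Not the TASSEL API.
class TaxaSubsetSketch {

    // Keeps the names of taxa whose values for the annotation key include annoValue.
    static List<String> subsetByAnnotation(
            Map<String, Map<String, List<String>>> taxa, String annotation, String annoValue) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, List<String>>> taxon : taxa.entrySet()) {
            if (taxon.getValue().getOrDefault(annotation, List.of()).contains(annoValue)) {
                kept.add(taxon.getKey());
            }
        }
        return kept;
    }

    // Made-up sample data for illustration only.
    static Map<String, Map<String, List<String>>> sampleTaxa() {
        Map<String, Map<String, List<String>>> taxa = new LinkedHashMap<>();
        taxa.put("B73", Map.of("GermType", List.of("Inbred")));
        taxa.put("B73xMo17", Map.of("GermType", List.of("Hybrid")));
        taxa.put("Mo17", Map.of("GermType", List.of("Inbred")));
        return taxa;
    }

    public static void main(String[] args) {
        System.out.println(subsetByAnnotation(sampleTaxa(), "GermType", "Inbred"));
    }
}
```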
baseTaxaList - base annotated taxa list
annotation - annotation name (key)
annoValue - annotation value being tested for
public static TaxaList retainSpecificAnnotations(TaxaList baseTaxaList, java.lang.String[] annotationsToKeep)
Creates a new taxa list with the taxa only retaining annotations within a specified list. All taxa are retained, only the annotations are changed.
baseTaxaList -
annotationsToKeep - the annotation keys to retain
public static TaxaList removeSpecificAnnotations(TaxaList baseTaxaList, java.lang.String[] annotationsToRemove)
Creates a new taxa list with the taxa retaining annotations EXCEPT those specified by the list. All taxa are retained, only the annotations are changed.
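The difference between retaining and removing annotation keys can be sketched as below. The data model (taxon name to annotation key/values), the class and method names, and the sample annotations are assumptions for illustration only; the sketch copies the input rather than modifying it, mirroring the "creates a new taxa list" wording.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: taxon name -> (annotation key -> values). Not the TASSEL API.
class AnnotationKeyFilterSketch {

    // Copies every taxon, keeping only (retain == true) or dropping only
    // (retain == false) the listed annotation keys. All taxa are kept.
    static Map<String, Map<String, List<String>>> filterKeys(
            Map<String, Map<String, List<String>>> taxa, Set<String> keys, boolean retain) {
        Map<String, Map<String, List<String>>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, List<String>>> taxon : taxa.entrySet()) {
            Map<String, List<String>> copy = new LinkedHashMap<>(taxon.getValue());
            if (retain) {
                copy.keySet().retainAll(keys);
            } else {
                copy.keySet().removeAll(keys);
            }
            result.put(taxon.getKey(), copy);
        }
        return result;
    }

    // Made-up sample data for illustration only.
    static Map<String, Map<String, List<String>>> sampleTaxa() {
        Map<String, Map<String, List<String>>> taxa = new LinkedHashMap<>();
        taxa.put("B73", Map.of("GermType", List.of("Inbred"), "Set", List.of("ISU")));
        return taxa;
    }

    public static void main(String[] args) {
        System.out.println(filterKeys(sampleTaxa(), Set.of("GermType"), true));
        System.out.println(filterKeys(sampleTaxa(), Set.of("GermType"), false));
    }
}
```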
baseTaxaList -
annotationsToRemove - the annotation keys to remove
public static java.util.Set<java.lang.String> allAnnotationKeys(TaxaList baseTaxaList)
Provides the set of all annotation keys found in any of the taxa.
baseTaxaList -
public static void exportAnnotatedTaxaListTable(TaxaList taxa, java.lang.String filename)
public static TaxaList importAnnotatedTaxaList(java.lang.String filename)
public static TaxaList readTaxaAnnotationFile(java.lang.String fileName, java.lang.String taxaNameField, java.util.Map<java.lang.String,java.lang.String> filters, boolean mergeSameNames)
Returns an annotated TaxaList from a text annotation file in matrix format. The file is tab delimited. The first row in the file containing the field taxaNameField is the header row. taxaNameField identifies the taxon name; all other fields are user defined. The fields become the keys of the taxa annotations. Quantitative fields should be tagged with a "#" sign, e.g. <#INBREEDF>. Multiple values are supported per key; additional values can be given either in an additional column or by using ";" to delimit values with the same key.
Filters is a map of filters to apply. The keys are field names, and the values are what those fields are tested against for equality. Only taxa rows that satisfy the filters are retained.
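The matrix layout described above can be illustrated with a small stand-alone parser. This is a sketch under assumptions, not the TASSEL reader: it works on in-memory lines instead of a file, uses bare field names in the header, keeps keys verbatim (it does not model any special handling of the "#" tag), and applies no filters. The class name, method name, and sample rows are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the tab-delimited annotation layout. Not the TASSEL API.
class AnnotationTableSketch {

    // Returns taxon name -> (annotation key -> values).
    static Map<String, Map<String, List<String>>> parse(List<String> lines, String taxaNameField) {
        Map<String, Map<String, List<String>>> taxa = new LinkedHashMap<>();
        String[] header = null;
        int nameColumn = -1;
        for (String line : lines) {
            String[] cells = line.split("\t", -1);
            if (header == null) {
                // The first row that contains taxaNameField is treated as the header.
                int index = Arrays.asList(cells).indexOf(taxaNameField);
                if (index >= 0) {
                    header = cells;
                    nameColumn = index;
                }
                continue;
            }
            if (cells.length <= nameColumn) {
                continue; // row too short to hold a taxon name
            }
            Map<String, List<String>> annotations = new LinkedHashMap<>();
            for (int i = 0; i < header.length && i < cells.length; i++) {
                if (i == nameColumn) {
                    continue;
                }
                // Repeated columns and ";"-delimited cells both add values to the same key.
                for (String value : cells[i].split(";")) {
                    if (!value.isEmpty()) {
                        annotations.computeIfAbsent(header[i], k -> new ArrayList<>()).add(value);
                    }
                }
            }
            taxa.put(cells[nameColumn], annotations);
        }
        return taxa;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "NAME\tGERMTYPE\tSET\tSPECIES\tSPECIES",
                "B73\tInbred\tISU;IBMFounder\tZea\tmays");
        System.out.println(parse(lines, "NAME"));
    }
}
```

In this sketch the SET cell contributes two values through the ";" delimiter, and the two SPECIES columns contribute two values through column repetition.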
Produces:
class Taxon, and these constant fields are all upper case.
fileName - file name with complete path
taxaNameField - field name with the taxon name
filters - map of filters that determines which rows to retain as the file is processed
class Taxon
public static java.util.ArrayList<net.maizegenetics.taxa.Taxon> readTaxaAnnotationFileAL(java.lang.String fileName, java.lang.String taxaNameField, java.util.Map<java.lang.String,java.lang.String> filters)
public static TaxaList readTaxaAnnotationFile(java.lang.String fileName, java.lang.String taxaNameField)
Returns an annotated TaxaList from a text annotation file in matrix format. The file is tab delimited. The first row in the file containing the field taxaNameField is the header row. taxaNameField identifies the taxon name; all other fields are user defined. The fields become the keys of the taxa annotations. Quantitative fields should be tagged with a "#" sign, e.g. <#INBREEDF>. Multiple values are supported per key; additional values can be given either in an additional column or by using ";" to delimit values with the same key.
Produces:
class Taxon, and these constant fields are all upper case.
fileName - file name with complete path
taxaNameField - field name with the taxon name
class Taxon
public static boolean doesTaxonHaveAllAnnotations(Taxon taxon, java.util.Map<java.lang.String,java.lang.String> filters)
Tests whether a taxon has all of the annotation values in the map.
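A minimal sketch of such a test, assuming each filter entry is a key/value pair that must match one of the taxon's values for that key. The annotation model (key to list of values) and the class and method names are hypothetical, and the treatment of an empty filter map as "matches" is an assumption of this sketch.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: a taxon's annotations as key -> values. Not the TASSEL API.
class AnnotationFilterSketch {

    // True only if, for every filter entry, the taxon has that value under that key.
    static boolean hasAllAnnotations(
            Map<String, List<String>> annotations, Map<String, String> filters) {
        for (Map.Entry<String, String> filter : filters.entrySet()) {
            List<String> values = annotations.getOrDefault(filter.getKey(), List.of());
            if (!values.contains(filter.getValue())) {
                return false;
            }
        }
        return true; // an empty filter map matches every taxon in this sketch
    }

    public static void main(String[] args) {
        Map<String, List<String>> b73 = Map.of(
                "GermType", List.of("Inbred"),
                "Set", List.of("ISU", "IBMFounder"));
        System.out.println(hasAllAnnotations(b73, Map.of("GermType", "Inbred", "Set", "ISU")));
        System.out.println(hasAllAnnotations(b73, Map.of("GermType", "Hybrid")));
    }
}
```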
taxon -
filters -
public static com.google.common.collect.SetMultimap<java.lang.String,java.lang.String> parseVCFHeadersIntoMap(java.lang.String s)
Parses a VCF header with the taxa names and annotations into a multimap. The taxon name is returned under the "ID" key, as used by the VCF format.
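The idea can be sketched in plain Java, under the assumption that the header string has the VCF meta-information shape `##SAMPLE=<ID=...,KEY=value,...>`. The exact input accepted by the real method is not stated here, so the input format, the class and method names, and the sample line are assumptions. A Map of Sets stands in for the Guava SetMultimap, and the naive comma split does not handle quoted values that contain commas.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of key=value extraction from a VCF sample header. Not the TASSEL API.
class VcfHeaderSketch {

    // Collects every KEY=value pair found between '<' and '>' into key -> set of values.
    static Map<String, Set<String>> parse(String s) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        int open = s.indexOf('<');
        int close = s.lastIndexOf('>');
        if (open < 0 || close <= open) {
            return result; // no bracketed section to parse
        }
        for (String pair : s.substring(open + 1, close).split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip fragments without a key
            }
            String key = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            result.computeIfAbsent(key, k -> new LinkedHashSet<>()).add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up header line for illustration only.
        System.out.println(parse("##SAMPLE=<ID=B73,GERMTYPE=Inbred,SET=ISU,SET=IBMFounder>"));
    }
}
```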
s -
public static java.util.List<java.lang.String> readTissueAnnotationFile(java.lang.String fileName, java.lang.String tissueNameField)
This method takes a key file and creates a SortedSet containing the tissue values. The set will be null if no tissues are present.
fileName - name of the key file containing the Tissue header
tissueNameField - field name