public class HaplotypeClusterer
HaplotypeClusterer clusters haplotypes by distance as defined in the Haplotype class. This class can be extended to apply different clustering rules.
public HaplotypeClusterer(java.util.List<kotlin.Array[]> haplotypes)
haplotypes - a List of haplotype or genotype sequences to be clustered
public HaplotypeClusterer(java.util.ArrayList<net.maizegenetics.analysis.clustering.Haplotype> haplotypes)
haplotypes - a List of Haplotypes to be clustered
public void makeClusters()
Groups a list of Haplotypes into clusters. Clusters are created so that all Haplotypes in a cluster have 0 pairwise distance. Because of missing data a Haplotype can be assigned to more than one cluster.
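The effect of missing data on cluster membership can be illustrated with a minimal sketch. It is not the TASSEL implementation: the class name, the use of Strings for haplotypes, and the choice of 'N' as the missing-data code are all assumptions made for illustration. Distance here counts only sites where both haplotypes are non-missing and differ, so a haplotype with a missing call can sit at distance 0 from two haplotypes that differ from each other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; not part of TASSEL.
class ZeroDistanceClusteringSketch {
    // Assumption: 'N' marks a missing call. The real Haplotype class may encode this differently.
    static final char MISSING = 'N';

    // Counts sites where both sequences are non-missing and differ.
    // Assumes both sequences have the same length.
    static int distance(String a, String b) {
        int d = 0;
        for (int i = 0; i < a.length(); i++) {
            char x = a.charAt(i);
            char y = b.charAt(i);
            if (x != MISSING && y != MISSING && x != y) d++;
        }
        return d;
    }

    // Adds each haplotype to every existing cluster whose members are all at
    // distance 0 from it, or starts a new cluster if no such cluster exists.
    static List<List<String>> makeClusters(List<String> haplotypes) {
        List<List<String>> clusters = new ArrayList<>();
        for (String h : haplotypes) {
            boolean placed = false;
            for (List<String> cluster : clusters) {
                boolean fits = true;
                for (String member : cluster) {
                    if (distance(h, member) != 0) { fits = false; break; }
                }
                if (fits) { cluster.add(h); placed = true; }
            }
            if (!placed) {
                List<String> fresh = new ArrayList<>();
                fresh.add(h);
                clusters.add(fresh);
            }
        }
        return clusters;
    }

    public static void main(String[] args) {
        // "ACNT" is consistent with both "ACGT" and "ACTT", so it joins both clusters.
        System.out.println(makeClusters(Arrays.asList("ACGT", "ACTT", "ACNT")));
        // prints [[ACGT, ACNT], [ACTT, ACNT]]
    }
}
```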
public HaplotypeCluster removeFirstHaplotypes(int maxdistance)
Removes all haplotypes within maxdistance of the haplotype of the first cluster. After the haplotypes have been removed clusters are remade and sorted.
maxdistance - all haplotypes at a distance of maxdistance or less from the haplotype of the first cluster are removed from the haplotypeList
public void sortClusters()
Sorts clusters according to HaplotypeCluster sort order.
public int getNumberOfClusters()
public kotlin.Array[] getClusterSizes()
public kotlin.Array[] getClusterScores()
After the initial cluster formation, a Haplotype's score equals 1 / (number of clusters to which it belongs). Merging does not update the cluster score.
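The scoring rule can be sketched as follows. The class name and the use of String identifiers for haplotypes are hypothetical, and the sketch assumes that a cluster's score is the sum of its members' scores; that assumption is consistent with the statement elsewhere in this class that cluster size equals cluster score once a cluster's haplotypes belong to no other cluster, but it is not confirmed by the text.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of TASSEL.
class ClusterScoreSketch {
    // Returns one score per cluster. Each haplotype contributes
    // 1 / (number of clusters containing it) to every cluster it belongs to.
    // Assumption: cluster score = sum of member haplotype scores.
    static double[] clusterScores(List<List<String>> clusters) {
        Map<String, Integer> membership = new HashMap<>();
        for (List<String> cluster : clusters) {
            for (String h : cluster) membership.merge(h, 1, Integer::sum);
        }
        double[] scores = new double[clusters.size()];
        for (int i = 0; i < clusters.size(); i++) {
            for (String h : clusters.get(i)) scores[i] += 1.0 / membership.get(h);
        }
        return scores;
    }

    public static void main(String[] args) {
        // h3 belongs to both clusters, so it contributes 0.5 to each.
        List<List<String>> clusters = Arrays.asList(
                Arrays.asList("h1", "h3"),
                Arrays.asList("h2", "h3"));
        System.out.println(Arrays.toString(clusterScores(clusters)));
        // prints [1.5, 1.5]
    }
}
```

Under this assumption the scores sum to the number of distinct haplotypes (3 in the example), which is why a shared haplotype is not double counted.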
public java.util.ArrayList<net.maizegenetics.analysis.clustering.HaplotypeCluster> getClusterList()
public static int clusterDistanceDistinctHaplotypes(HaplotypeCluster cluster0, HaplotypeCluster cluster1)
cluster0 - a cluster
cluster1 - another cluster
public static double clusterDistanceDistinctHaplotypeProportion(HaplotypeCluster cluster0, HaplotypeCluster cluster1)
cluster0 - a cluster
cluster1 - another cluster
public static int clusterDistanceClusterHaplotypeDiff(HaplotypeCluster cluster0, HaplotypeCluster cluster1)
cluster0 - a cluster
cluster1 - another cluster
public static double clusterDistanceClusterDiffProportion(HaplotypeCluster cluster0, HaplotypeCluster cluster1)
cluster0 - a cluster
cluster1 - another cluster
public static int clusterDistanceMaxPairDiff(HaplotypeCluster cluster0, HaplotypeCluster cluster1)
cluster0 - a cluster
cluster1 - another cluster
public static double clusterDistanceAveragePairDiff(HaplotypeCluster cluster0, HaplotypeCluster cluster1)
cluster0 - a cluster
cluster1 - another cluster
public static int clusterDistanceTotalPairDiff(HaplotypeCluster cluster0, HaplotypeCluster cluster1)
cluster0 - a cluster
cluster1 - another cluster
public static java.util.ArrayList<net.maizegenetics.analysis.clustering.HaplotypeCluster> getMergedClusters(java.util.ArrayList<net.maizegenetics.analysis.clustering.HaplotypeCluster> candidateClusters,
int maxdiff)
Merges clusters whose maximum pairwise difference is less than maxdiff. Clusters are tested sequentially. That is, if two clusters are merged, they become the new head cluster against which remaining clusters are tested for merging.
candidateClusters - an ArrayList of HaplotypeClusters
maxdiff - the maximum pairwise difference used as the merge threshold
public void mergeClusters(int maxdiff)
Merges clusters whose maximum pairwise difference is less than maxdiff. Clusters are tested sequentially. That is, if two clusters are merged, they become the new head cluster against which remaining clusters are tested for merging.
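The sequential, head-cluster behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the TASSEL code: the class and method names are hypothetical, haplotypes are plain Strings with 'N' assumed to mark missing data, and the comparison uses "less than or equal to maxdiff", following the doMerge() description, although the text above says "less than".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for illustration only; not part of TASSEL.
class SequentialMergeSketch {
    // Assumption: 'N' marks a missing call.
    static final char MISSING = 'N';

    // Counts sites where both equal-length sequences are non-missing and differ.
    static int distance(String a, String b) {
        int d = 0;
        for (int i = 0; i < a.length(); i++) {
            char x = a.charAt(i);
            char y = b.charAt(i);
            if (x != MISSING && y != MISSING && x != y) d++;
        }
        return d;
    }

    // Largest distance between any haplotype in c0 and any haplotype in c1.
    static int maxPairDiff(List<String> c0, List<String> c1) {
        int max = 0;
        for (String a : c0) {
            for (String b : c1) max = Math.max(max, distance(a, b));
        }
        return max;
    }

    // Takes the first remaining cluster as the head, merges into it every later
    // cluster within maxdiff of the (possibly already enlarged) head, then repeats
    // with the next unmerged cluster. The input list is copied, not modified.
    static List<List<String>> mergeSequentially(List<List<String>> candidates, int maxdiff) {
        List<List<String>> remaining = new ArrayList<>();
        for (List<String> c : candidates) remaining.add(new ArrayList<>(c));
        List<List<String>> merged = new ArrayList<>();
        while (!remaining.isEmpty()) {
            List<String> head = remaining.remove(0);
            Iterator<List<String>> it = remaining.iterator();
            while (it.hasNext()) {
                List<String> other = it.next();
                if (maxPairDiff(head, other) <= maxdiff) {
                    head.addAll(other);
                    it.remove();
                }
            }
            merged.add(head);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> candidates = Arrays.asList(
                Arrays.asList("AAAA"), Arrays.asList("AAAT"),
                Arrays.asList("AATT"), Arrays.asList("TTTT"));
        System.out.println(mergeSequentially(candidates, 1));
        // prints [[AAAA, AAAT], [AATT], [TTTT]]
    }
}
```

In the example, AATT is within 1 of AAAT but is not merged, because it is tested against the enlarged head cluster and differs from AAAA at 2 sites.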
maxdiff - the maximum pairwise difference used as the merge threshold
public static boolean doMerge(HaplotypeCluster c0, HaplotypeCluster c1, int maxdiff)
Tests whether two clusters are less than or equal to maxdiff distant. Uses clusterDistanceMaxPairDiff() to calculate distance.
c0 - a cluster
c1 - another cluster
maxdiff - the largest distance at which the two clusters are still merged
public static void mergeTwoClusters(HaplotypeCluster c0, HaplotypeCluster c1)
Merges two clusters, c0 and c1.
c0 - a cluster
c1 - another cluster
public void removeClusterHaplotypesFromOtherClusters(int clusterIndex)
For this cluster, remove all of its haplotypes from other clusters, adjust the cluster scores, and re-sort the clusters. As a result, cluster size will equal cluster score for this cluster.
clusterIndex - the index of a cluster in the cluster list
public void moveAllPossibleHaplotypesToCluster(int clusterIndex,
boolean fromClustersWithHigherIndexOnly,
int maxdiff)
For the indexed cluster, move all haplotypes consistent with the cluster haplotype to this cluster from any other cluster.
clusterIndex - the index of a cluster in the cluster list
public void moveAllHaplotypesToBiggestCluster(int maxdiff)
For each cluster, from largest to smallest, move all consistent haplotypes from other clusters to that cluster.
public void removeHeterozygousClusters(int maxHetSites)
public void recalculateScores()
Recalculates the scores of the clusters in the cluster list. Removes any clusters with a score of 0.
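A minimal sketch of recalculating scores and dropping zero-score clusters is given here. The class name, the String haplotype identifiers, and the assumption that a cluster's score is the sum over its members of 1 / (number of clusters containing that member) are all illustrative; under that assumption, a cluster scores 0 exactly when it has no haplotypes left.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of TASSEL.
class RecalculateScoresSketch {
    // Recomputes a score for every cluster, removes clusters whose score is 0
    // from the given list in place, and returns the scores of the survivors
    // in list order.
    static List<Double> recalculate(List<List<String>> clusters) {
        // Count how many clusters each haplotype currently belongs to.
        Map<String, Integer> membership = new HashMap<>();
        for (List<String> cluster : clusters) {
            for (String h : cluster) membership.merge(h, 1, Integer::sum);
        }
        List<Double> scores = new ArrayList<>();
        Iterator<List<String>> it = clusters.iterator();
        while (it.hasNext()) {
            double score = 0;
            for (String h : it.next()) score += 1.0 / membership.get(h);
            if (score == 0) {
                it.remove(); // an emptied cluster contributes nothing
            } else {
                scores.add(score);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        List<List<String>> clusters = new ArrayList<>();
        clusters.add(new ArrayList<>(Arrays.asList("h1", "h2")));
        clusters.add(new ArrayList<>()); // emptied by an earlier removal step
        clusters.add(new ArrayList<>(Arrays.asList("h2")));
        System.out.println(recalculate(clusters) + " " + clusters);
        // prints [1.5, 0.5] [[h1, h2], [h2]]
    }
}
```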