Class WardLinkage

  • All Implemented Interfaces:
    AgglomerationMethod

    public final class WardLinkage
    extends java.lang.Object
    implements AgglomerationMethod
    The "Ward", "inner squared distance", "sum of squares", "error sum of squares", or "minimum variance" method. This method fuses those two clusters that result in the smallest increase in the total within-group error sum of squares. This quantity is defined as the sum of squared deviation of each object from the centroid of its own cluster. In contrast to the other methods that use prior criteria, this method is based on a posterior fusion criterion. [The data analysis handbook. By Ildiko E. Frank, Roberto Todeschini] Used only for Euclidean distance! The general form of the Lance-Williams matrix-update formula: d[(i,j),k] = ai*d[i,k] + aj*d[j,k] + b*d[i,j] + g*|d[i,k]-d[j,k]| For the "Ward" method: ai = (ci+ck)/(ci+cj+ck) aj = (cj+ck)/(ci+cj+ck) b = -ck/(ci+cj+ck) g = 0 Thus: d[(i,j),k] = (ci+ck)/(ci+cj+ck)*d[i,k] + (cj+ck)/(ci+cj+ck)*d[j,k] - ck/(ci+cj+ck)*d[i,j] = ( (ci+ck)*d[i,k] + (cj+ck)*d[j,k] - ck*d[i,j] ) / (ci+cj+ck)
    • Constructor Summary

      Constructors 
      Constructor Description
      WardLinkage()  
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      double computeDissimilarity​(double dik, double djk, double dij, int ci, int cj, int ck)
      Compute the dissimilarity between the newly formed cluster (i,j) and the existing cluster k.
      java.lang.String toString()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Constructor Detail

      • WardLinkage

        public WardLinkage()
    • Method Detail

      • computeDissimilarity

        public double computeDissimilarity​(double dik,
                                           double djk,
                                           double dij,
                                           int ci,
                                           int cj,
                                           int ck)
        Description copied from interface: AgglomerationMethod
        Compute the dissimilarity between the newly formed cluster (i,j) and the existing cluster k.
        Specified by:
        computeDissimilarity in interface AgglomerationMethod
        Parameters:
        dik - dissimilarity between clusters i and k
        djk - dissimilarity between clusters j and k
        dij - dissimilarity between clusters i and j
        ci - cardinality of cluster i
        cj - cardinality of cluster j
        ck - cardinality of cluster k
        Returns:
        dissimilarity between cluster (i,j) and cluster k.
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object