In a message dated 7/17/2011 8:16:27 P.M. Central Daylight Time, weh8@verizon.net writes: The RCC matrix yields a distance, similar to a GD, but not quite the same, between pairs of haplotypes of ALL the members inputted to the Excel Data Analysis Correlation Kit function (not the CORREL one).

If you folks want to try this, it's an add-on to Excel, available on your installation CD or in the installation file on your hard disk if that's how you received the program. Look for "Use the Analysis ToolPak to perform complex data analysis" in the help files. It tells how to load the add-on. I did this in an older version, Excel 2000, and just recently in a new version of Office 2010 which came pre-loaded on the hard drive. It does not work in the freebie version of the program.

In the help files for the Data Analysis ToolPak it says under Correlation:

The CORREL and PEARSON worksheet functions both calculate the correlation coefficient between two measurement variables when measurements on each variable are observed for each of N subjects. (Any missing observation for any subject causes that subject to be ignored in the analysis.) The Correlation analysis tool is particularly useful when there are more than two measurement variables for each of N subjects. It provides an output table, a correlation matrix, that shows the value of CORREL (or PEARSON) applied to each possible pair of measurement variables. The correlation coefficient, like the covariance, is a measure of the extent to which two measurement variables "vary together." Unlike the covariance, the correlation coefficient is scaled so that its value is independent of the units in which the two measurement variables are expressed. (For example, if the two measurement variables are weight and height, the value of the correlation coefficient is unchanged if weight is converted from pounds to kilograms.) The value of any correlation coefficient must be between -1 and +1 inclusive.
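For anyone who would rather see the idea outside Excel, here is a rough Python sketch of what the help text describes: the ordinary Pearson coefficient applied to every pair of columns, with the results collected in a table. It is only an illustration under my own assumptions, not the ToolPak's actual code and not Bill's RCC calculation. The names `pearson` and `correlation_half_matrix` are made up for this example, and the marker values are invented, not real haplotypes. Only the lower half is filled in, since the coefficient for (A, B) is the same as for (B, A).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists.

    Assumes neither list is constant (a constant list has zero variance
    and would cause a division by zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_half_matrix(columns):
    """Apply pearson() to each pair of columns, lower triangle only.

    `columns` maps a sample name to its list of values. The row for the
    i-th sample holds its coefficients against samples 0..i, so the
    diagonal (a sample against itself) is included.
    """
    names = list(columns)
    return {
        a: {b: pearson(columns[a], columns[b]) for b in names[: i + 1]}
        for i, a in enumerate(names)
    }


# Invented marker values for three hypothetical samples (illustration only).
samples = {
    "A": [13, 24, 14, 11, 11, 14],
    "B": [13, 24, 14, 10, 11, 14],
    "C": [13, 25, 15, 11, 12, 15],
}

matrix = correlation_half_matrix(samples)
for name, row in matrix.items():
    print(name, "  ".join(f"{r:.4f}" for r in row.values()))
```

Each printed row is one sample, and each entry in it is that sample's coefficient against an earlier sample (or itself, at the end of the row). Rescaling a column, as in the pounds-to-kilograms example, leaves its coefficients unchanged.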
Doesn't that mean the ToolPak also uses the CORREL function, or something identical? What it does do is compare every sample in the spreadsheet to every other sample and place the results in a matrix like the one the McGee utility produces. The results are actually a half-matrix. Anyone who has ever used the McGee utility knows how easy it is to spot related clusters in the full matrix. I think Bill's method uses the same technique. I have no comment on the validity of the approach, knowing nothing of statistics.

John