RootsWeb.com Mailing Lists
    1. [R-M222] M222 Tree
    2. Sandy Paterson
    3. Yesterday I posted two tables showing the RCCs I got between Ewing 26605 and 61 other 111-marker testees, over 37, 67 and 111 markers. The first table was wrong; I am happy that the second table is correct. The second table suggests that the RCCs for Ewing 26605 are overstated over 37 markers compared to both the 67-marker and the 111-marker results.

       The trouble is that the comparison involved only one Ewing.

       So what I then did was to calculate RCCs over 37 and 67 markers for all of the M222 Ewings I have on file with 67-marker tests. There are 19. Each of the 19 Ewings was compared to all of the M222 non-Ewings, a total of 19 x 552 RCC calculations. The results are summarised as follows:

       Mean RCC over 37 markers   42.62
       Mean RCC over 67 markers   35.91

       I haven't done a hypothesis test, but I'd suggest that the difference is significant, and that it strongly suggests that RCCs between Ewing and non-Ewing (and hence TMRCAs) are overstated over 37 markers. I haven't yet had time to check this for other surnames, but I may get round to it later this morning.

       The other concern I have about the exercise is that, in my opinion, any method of TMRCA estimation that ignores the difference in impact between rare matches (off-modal matches) and common matches (on-modal matches) is seriously flawed.

       My website is finally fully functional, so I can now refer to

       http://www.tmrca.com/?page_id=11

       where this is discussed. Lookups of estimated TMRCAs can be done by navigating to the section called 'Live Lookups'.

       At this stage, I consider only single-step mutations of one step up and one step down, with probabilities of m/2 each, where m is the assumed (marker-specific) mutation rate. I've started working on allowing for 2-step mutations, however, since empirical evidence is starting to appear suggesting that about 3.5% of all Y-STR mutations are multi-step.

       Sandy
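
The single-step model described in this message (one step up or one step down, each with probability m/2 per generation) can be sketched as below. This is only an illustration under stated assumptions, not the code behind tmrca.com: the function names are hypothetical, multi-step mutations are ignored, and any rate passed in is whatever the reader assumes for the marker in question.

```python
from collections import defaultdict


def step_distribution(m, generations):
    """Distribution of the net repeat-count change on one marker.

    m           -- assumed per-generation mutation rate for the marker
    generations -- number of father-to-son transmissions
    Returns a dict mapping net offset (in repeats) to its probability.
    """
    dist = {0: 1.0}
    for _ in range(generations):
        nxt = defaultdict(float)
        for offset, p in dist.items():
            nxt[offset] += p * (1.0 - m)      # no mutation
            nxt[offset + 1] += p * (m / 2.0)  # one step up
            nxt[offset - 1] += p * (m / 2.0)  # one step down
        dist = dict(nxt)
    return dist


def match_probability(m, generations):
    """Probability that two lines, each `generations` transmissions from
    a common ancestor, show the same value on the marker."""
    dist = step_distribution(m, generations)
    # The lines match when both have drifted by the same net offset.
    return sum(p * p for p in dist.values())
```

Because the up and down probabilities are equal, the match probability for two lines g generations from their common ancestor should equal the probability of zero net change over 2g generations, which gives a quick consistency check on the sketch.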

    07/10/2011 03:04:10
    1. Re: [R-M222] M222 Tree
    2. Bill Howard
    3. Sandy, I have suggested a number of times that you should send me your spreadsheet where you have the details of comparisons like this one, and you have not done so.

       I have also suggested, off-line, that you approached the earlier haplotype analysis in the wrong way -- the one in which you compared a few haplotypes with only one Ewing. I showed even then that if you compare the differences between your 37-, 67- and 111-marker RCCs, and assign defensible standard deviations to both your comparisons AND the single Ewing comparison haplotype, you get differences among your results that are within about one standard deviation of the average of the three runs. My calculation shows that, using your own data, your original conclusion, claiming that they were different, was incorrect.

       Then I suggested that you compare each haplotype string with each of ALL the other haplotype strings -- an easy job if you use a correlation app in any statistical application. You did this, apparently, in your note below, but only with a few more Ewings. That was not what I suggested. You admitted that your original approach was wrong, and I proved, using your own figures, that if you assign reasonable SDs to the data, the three results are statistically the same.

       Now, you did not take my suggestion to do a comparison over all the haplotypes, but compared them with a few more Ewings. That is still not the right thing to do. You should be comparing all the haplotypes with each other. 19 Ewings are not enough. You did not calculate any value for the standard deviations of the result you got. Few statisticians would say that the numbers are different without an assignment of SDs to the differences you found.

       Again, I challenge you, publicly now, to send me your data in a format I can use, and describe your method. The devil is often in the details of the way this is done -- your handling of the zero values, the way the Ewings were selected and their small number, etc.
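
The request for standard deviations can be made concrete with a paired summary: for every haplotype pair, take the 37-marker RCC minus the 67-marker RCC, then report the mean difference together with its spread. The sketch below is a hedged illustration only; the function name and the returned keys are hypothetical, and the per-pair RCC lists are not given in this thread, so none are filled in here.

```python
import math
import statistics


def paired_summary(rcc_37, rcc_67):
    """Summarise paired RCC differences (37-marker minus 67-marker).

    rcc_37, rcc_67 -- equal-length lists; position i in both lists
                      refers to the same pair of haplotypes.
    """
    if len(rcc_37) != len(rcc_67):
        raise ValueError("the two lists must describe the same pairs")
    diffs = [a - b for a, b in zip(rcc_37, rcc_67)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)   # sample SD, needs n >= 2
    se = sd_diff / math.sqrt(n)         # standard error of the mean
    # Ratio of the mean difference to its standard error; undefined
    # when every difference is identical.
    ratio = mean_diff / se if se > 0 else math.nan
    return {"n": n, "mean_diff": mean_diff, "sd_diff": sd_diff,
            "se": se, "ratio": ratio}
```

One caution on reading the output: when the same haplotype appears in many pairs, as in a 19 x 552 comparison, the pairs are not independent, so a standard error computed this way is probably too small and the ratio too large.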

       Also, your contention is wrong when you state that any method of TMRCA estimation that ignores the difference in impact between rare matches (off-modal matches) and common matches (on-modal matches) is seriously flawed. Here you are certainly comparing apples with oranges. A correlation coefficient merely estimates the degree of difference between pairs of haplotypes, and the calibration considers ALL the marker values, including the ones you say are on- and off-modal. In addition, modal values are worthless in my RCC approach, anyway. They have very little value. The RCC time scale ignores them, and rightly so, since a modal is a mathematical construct that is virtual at best when it comes to the evolution of a progenitor's haplotype (which is NOT the modal in most instances) down the various lines to the present group of testees. I do not see that your website is relevant to this discussion.

       Finally, about the association of genetic distance (GD) with RCC -- I have run many strings of haplotypes and have changed various marker values by 1, 2, 3, and compared many sets with each other. They show that a change of 1 in GD can cause a change in RCC of about 3, depending on which marker (low vs high) is changed. Table 1 in my published paper in the JoGG <http://mysite.verizon.net/weh8/Howard1.pdf> confirms those more extensive calculations. I stand by this association of GD with RCC and maintain that RCC contains more valuable information because it applies to every marker value, not just citing how many of them have changed.

       Send me your methodology and the details of your results, Sandy. Until then, my priorities will have to be elsewhere.

       With best regards,
       - Bye from Bill Howard

    07/10/2011 04:53:48
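
The two checks discussed in this thread (comparing every haplotype with every other, and seeing how a one-step change in a single marker moves RCC) can be sketched as below. The thread does not state the RCC formula, so the transform used here, 10,000 x (1/r - 1) with r the Pearson correlation between two haplotypes' marker values, is an assumption that should be checked against the JoGG paper linked above. The function names are hypothetical, and the handling of zero or missing marker values, which the thread flags as a detail that matters, is left out.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length marker-value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rcc(x, y, scale=10_000.0):
    """ASSUMED definition of RCC: scale * (1/r - 1). Verify before use."""
    return scale * (1.0 / pearson(x, y) - 1.0)


def all_pairs_rcc(haplotypes):
    """RCC for every unordered pair in a dict of {kit_id: marker_values}."""
    return {(a, b): rcc(haplotypes[a], haplotypes[b])
            for a, b in combinations(sorted(haplotypes), 2)}


def one_step_effect(haplotype, marker_index, step=1):
    """RCC between a haplotype and a copy with one marker shifted by `step`."""
    changed = list(haplotype)
    changed[marker_index] += step
    return rcc(haplotype, changed)
```

Calling `one_step_effect` once on a marker with a low repeat count and once on a marker with a high repeat count, using real haplotypes, is one way to see how much the RCC change for a genetic distance of 1 depends on which marker moved.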