Richard,

> -----Original Message-----
> From: y-dna-projects-bounces@rootsweb.com [mailto:y-dna-projects-bounces@rootsweb.com] On Behalf Of RT
> Sent: Wednesday, August 11, 2010 1:59 PM
> To: y-dna-projects@rootsweb.com; rt-sails@comcast.net
> Subject: Re: [Y-DNA-projects] WAMH vis-a-vis CMA
>
> I can't say anything specific, other than that there's a HUGE possible range for TMRCA.

I agree with you that the confidence intervals on TMRCA's are huge, so huge that, IMO, they are useless for genealogical purposes, as I discuss on this web page:

http://dgmweb.net/DNA/y-dna-projects/TMRCA.shtml

> This article by Ken Nordtvedt is very relevant, but people have tended to ignore it. I feel it presents a mathematical basis for the "purely logical" process many people use of looking at off-modal markers to identify a lineage.
> http://www.jogg.info/42/files/Nordtvedt.htm

However, I would take exception to your statement that a "mathematical" (i.e., statistical) basis is stronger than a logical one. The situation is quite the reverse. If you can arrive at a conclusion based on established facts and a valid logical deduction, it's a far stronger proof than a statistical one. As a simple example...

Logical argument: A is taller than B. B is taller than C. Therefore, A must be taller than C.

Statistical argument: A has a 90% probability of being taller than B. B has a 90% probability of being taller than C. Therefore, A is probably taller than C, but might not be.

Cladistic analysis is a matter of deducing the polarity of traits (ancestral vs. derived), then finding the most logical order of their appearance to form a cladogram (a phylogenetic tree). The most common method of determining the polarity of traits is through outgroup comparison, though there are other ways.
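As a rough illustration of outgroup comparison for a two-state trait such as a SNP, here is a minimal Python sketch. The function name, the sample names, and the allele data are all invented for illustration, and the sketch assumes the outgroup retains the ancestral state.

```python
# Hypothetical sketch: infer the polarity of a two-state trait by
# outgroup comparison. All names and data here are invented.

def polarity_by_outgroup(ingroup_states, outgroup_state):
    """Return (ancestral, derived) for a trait with at most two states.

    Assumption: the outgroup retains the ancestral state, so any other
    state seen in the ingroup is treated as derived.
    """
    states = set(ingroup_states) | {outgroup_state}
    if len(states) > 2:
        raise ValueError("more than two states: polarity is ambiguous")
    derived = states - {outgroup_state}
    return outgroup_state, (derived.pop() if derived else None)


# Invented data: one SNP position typed in four ingroup men,
# plus the state observed in the chosen outgroup.
ingroup = {"man1": "G", "man2": "A", "man3": "A", "man4": "G"}
ancestral, derived = polarity_by_outgroup(ingroup.values(), "G")

# Men carrying the derived state would group together on the cladogram.
carriers = sorted(name for name, state in ingroup.items() if state == derived)
print(ancestral, derived, carriers)
```

The function refuses to answer when more than two states are present, which is the situation a multi-state marker can create.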

There is a fundamental difference here between trying to construct a haplotree based on SNP mutations and one based on STR mutations, mainly because the polarity of a SNP mutation is much easier to deduce. It is easier to deduce because SNPs usually have only one of two states and SNP mutations are relatively rare. In contrast, it's difficult to determine the polarity of an STR mutation because, not only are STR mutations relatively common (so choosing the right outgroup is difficult), STRs can have many states (e.g., just because someone is 12 at a marker doesn't mean the ancestral value was 11 -- it might have been 13), and reversals are largely undetectable.

It is possible to build a useful STR cladogram for individual families in genealogical time because the paper genealogy can tell you the polarity of the mutations, provided you can test enough cousins to "triangulate" on the location of all the mutations in the family.

Ken is using (or appears to me to be using) statistical haplotype "resemblance" to form his groups, without reference to trait polarity, which means he is not engaged in cladistics and his trees are not cladograms, except when confined to SNPs alone. (Ken and I have been arguing this point *for years*, both on GENEALOGY-DNA and on Y-DNA-HAPLOGROUP-I.) You may find his statistical results useful, and I do, but statistical results based on resemblance cannot be as reliably true as a logical cladistic analysis would be.

Diana
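
To illustrate the STR ambiguity described above, a minimal Python sketch follows. It assumes a simple model in which each mutation adds or removes one repeat; that model and all function names are assumptions made for illustration, not something stated in the message.

```python
# Hypothetical sketch of why STR polarity is hard to read from one value.
# Assumption: each mutation changes the repeat count by exactly +1 or -1.

def candidate_ancestral_values(observed, max_steps=1):
    """Repeat counts that could lead to `observed` within max_steps net steps."""
    return [observed + step for step in range(-max_steps, max_steps + 1)]


def apply_mutations(start, steps):
    """Apply a sequence of +1/-1 mutations to a starting repeat count."""
    value = start
    for step in steps:
        value += step
    return value


# A man who tests 12 at a marker: the ancestor may have been 11, 12, or 13.
candidates = candidate_ancestral_values(12)
print(candidates)  # [11, 12, 13]

# A reversal (12 -> 13 -> 12) leaves the same value as no mutation at all.
unchanged = apply_mutations(12, [])
reverted = apply_mutations(12, [+1, -1])
print(unchanged == reverted)  # True
```

The second comparison shows why reversals are largely undetectable from the tested value alone: two mutations and zero mutations give the same result.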

Thank you for the insights, Diana.

You mention

"It is possible to build a useful STR cladogram for individual families in genealogical time because the paper genealogy can tell you the polarity of the mutations, provided you can test enough cousins to "triangulate" on the location of all the mutations in the family."

That is, provided that the paper genealogy can tell you the polarity of the mutations. This is very often not the case in real life, and I assume it is not the case in Ralph's example. He did not suggest a logical cladistic analysis.

Re "However, I would take exception to your statement that a "mathematical" (i.e., statistical) basis is stronger than a logical one. The situation is quite the reverse,"

I didn't actually say that. Rather, (clearly) both are helpful. But the more obscure mathematical treatment has been ignored by most. In particular, I think that you are implicitly relying on statistics in your logical argument. It's OK, no big deal.

thanks!
Richard Thrift