RootsWeb.com Mailing Lists
    1. Re: [R-M222] Odd Sample
    2. Iain Kennedy
    3. Did you really mean he tested at DNAH? Are you sure it wasn't the 12-marker Genographic project? His number has the prefix N for Genographic; DNAH immigrants were supposed to have an H.

Iain

> From: Lochlan@aol.com
> Date: Thu, 7 Jul 2011 19:16:00 -0400
> To: iarooster@earthlink.net; dna-r1b1c7@rootsweb.com
> Subject: Re: [R-M222] Odd Sample
>
> In a message dated 7/7/2011 5:28:21 A.M. Central Daylight Time, iarooster@earthlink.net writes:
>
> He's had the deep clade test at FTDNA or elsewhere? The reason I'm asking is that I had someone deep clade tested at DNAH, and they didn't test for M-222. Or at least they didn't at the time of this test several years ago.
>
> His SNP results are listed in his personal page at FTDNA which means they did the testing.
>
> John
>
> R1b1c7 Research and Links:
> http://clanmaclochlainn.com/R1b1c7/
> -------------------------------
> To unsubscribe from the list, please send an email to DNA-R1B1C7-request@rootsweb.com with the word 'unsubscribe' without the quotes in the subject and the body of the message

    07/08/2011 01:38:14
    1. Re: [R-M222] How is M222 defined?
    2. In a message dated 7/7/2011 7:00:26 A.M. Central Daylight Time, weh8@verizon.net writes:

> There has been considerable discussion both on- and off-line about how the M222 SNP is defined. First, I understand that its early definition depended on the first 12 markers. Next, we have the deep clade test of FTDNA with a proprietary approach we know little about. Next, there are discussions of how the markers agree or disagree with the modal values of the deep clade test, but only with respect to the first 12 markers of the FTDNA string.

Bill, a few others have chimed in on various aspects of these assumptions, but I have to add myself that they are wrong.

> First, I understand that its early definition depended on the first 12 markers.

25 markers was the norm at FTDNA when David Wilson first announced his discovery of what he called the NW Irish cluster. He was soon followed by the Trinity team, working with 17 markers, who called the same cluster the Irish Modal Haplotype (IMH).

> Next, we have the deep clade test of FTDNA with a proprietary approach we know little about.

This has already been covered. It's not proprietary and details are readily available.

> Next, there are discussions of how the markers agree or disagree with the modal values of the deep clade test, but only with respect to the first 12 markers of the FTDNA string.

This has never been true anywhere that I know of and never in the M222 project. David Wilson has had a full modal haplotype online at Ysearch as long as I can remember. It is true that FTDNA awards their Niall icon based on an exact match at 12 markers. Perhaps this is what you're thinking of. David's M222 modal is M5UKQ on Ysearch.

> I include now a table that shows the percentage of M222 testees that have mutations at the various points in the haplotype. For example, those with 454 had a constant value of the modal for 454, and less than 50% of the testees had the modal for the two CDYs.

That's already been done.
http://clanmaclochlainn.com/R1b1c7/M222repeat.htm

If this thread continues much longer I'll attempt to get David Wilson back on board. He is no longer a member of the list, but he will surely have some strong opinions on some of the statements made lately. He began the M222 project and was the one who decided that SNP testing was not necessary for inclusion in the project. Why? Because membership in the clade was easily detected via STRs alone.

Paul Conroy threw out the Wilson sample as an example of a false M222 based on STRs. I'm glad someone else mentioned that. I have mentioned it five times in the last month, to no response. The Clark sample in the M222 negative section is almost identical. David Wilson listed these two as possibly pre-M222. There is one other similar sample (Hannan) listed in the M222 negative section. He probably should go into the pre-M222 section as well.

The Wilson sample has an interesting history. David Wilson was also the admin of the Wilson DNA project. He assumed this sample was M222+ and was shocked to learn it was M222- by SNP test. At David's urging, FTDNA retested the sample several times. It came back negative every time.

There are a few suspicious parts of both the Clark and Wilson samples. Both have 385 = 12-14, which is unusual in M222. The Clark sample has 15-15-16-17 at 464. That is highly unusual in M222. I see nothing unusual in the Hannan sample at all.

None of these samples stands out as unusual in terms of genetic distance. Each has matches within the project at 6, 7-20 GDs, which is no different from most other cluster members. That's no different from the Conroy sample. If I were presented with any of these samples and had to choose whether to admit them to the project based on STRs alone, I probably would, without an SNP test. I think David Wilson would do the same.
So I will be the first to admit there could be some samples in the project which appear to be M222 but are not. The question is how many?

What Bill has not mentioned to the list is a theory that the bimodality he mentioned is caused by M222 project samples that are M222-. The twin humps his graph shows seem to be equally large on each side. That would seem to indicate there are roughly as many M222- in the project as M222+. Is that possible?

Not long ago the count of project members with SNP tests was 334. That's out of 668, or half the project. I can't guess how many of the untested members might be M222-. Not too many is my guess.

The way the project is currently organized shouldn't cause any confusion. FTDNA clearly indicates those with SNP tests. Those who want to include only the SNP-tested samples in an analysis can do so easily. Dropping the half of the project without SNP tests would greatly reduce the surname diversity available. Is that a wise thing to do?

I for one have no interest in getting an SNP test. I belong to a cluster of McLaughlins, four or five of whom have been SNP tested and are M222+. As far as I'm concerned an SNP test for me would be a waste of money. I won't spend the extra money just to keep the purists happy.

On the other hand there could be a few samples in the project that are not M222 positive. It would be interesting to find out for sure. Right now we can only guess. That is a conundrum.

John

    07/07/2011 04:16:31
    1. Re: [R-M222] How is M222 defined?
    2. Bill Howard
    3. Paul,

Not so, and it depends what your goal is in the analysis. Operationally, groups of what you call M222+ and M222- are indistinguishable with respect to their marker sequences. Both groups have nearly the same average number of markers that match each marker modal. That being said, one can try to find the date of origin of the whole group, and of the two separately. If one finds dates that are statistically identical, then operationally the groups that have been M222-SNP tested and the ones that have not been tested lead to the same origin, just as long as the haplotypes share at least 27 of the 37 values with the modal values of their individual DYS markers. More broadly, one can compute the TMRCA for any group of haplotypes whether or not they belong to a particular group like M222.

In the M222- example you cite, 37784 is not even in the M222- group. It is a red herring since it shares only 25 markers with the modal values of the M222 group. It should be in the group of unknowns, not in the M222- group. I would not have considered it in the M222 group. To prove it, compare your sequence with the ones I gave in my initial posting of the question about how M222 is defined. The marker values you cite below, namely 12, 14, 18, 9, 24, 19, 10, 16, 15, 17, 37, and 38, do not agree between your list of the Wilson haplotype and my list of modal values. Those 12 markers are different and only 25 are the same. Only about 68% of the markers (25 of 37) match the modal values; my criterion was that at least 73% (27 of 37) should match. Wilson is not even in the M222- group.

I agree with your contention that the quip "Garbage in, Garbage out" is correct, but it doesn't apply here. Speaking of quips, when I taught at U. Michigan in the early days of programming, there was another expression that we called Kirk's Law (after a graduate student named Kirk who was enamored with programming): "Numbers that have been through a computer are worth more than numbers that haven't!" (grin).

- Bye from Bill

On Jul 7, 2011, at 8:22 PM, Paul Conroy wrote:

> Bill,
>
> But you are mixing apples and oranges in your data (M222+ and M222-) - so your results will not be accurate.
>
> Look at this M222- guy below, his STR's seem like M222+, yet he is not part of that haplogroup. Your method would include people like this.
>
> M222-
> 37784 Wilson Unknown Origin R1b1a2 13 25 14 11 12-14 12 12 12 13 14 29 18 9-9 11 11 24 15 19 30 15-16-16-17 10 11 19-23 16 15 17 17 37-38 12 12
>
> I work as a programmer, and we have a saying: "Garbage in, Garbage out" - think about it.
>
> Cheers,
> Paul
>
> On Thu, Jul 7, 2011 at 8:05 PM, Bill Howard <weh8@verizon.net> wrote:
>
>> Yes, David. I agree. And correspondence with John McLaughlin has shown that he is in agreement that any individual who is reasonably close to the R:M222 37-marker modal is so likely to test positive for M222 that it may be a waste of money to do the test.
>>
>> My email that got this dialogue started was an attempt on my part to see how likely that might be. What I found was that if 27 or more of the 37 markers in a haplotype agreed with the modal value of the M222 group (which was self-defined by a large group of M222s), then it would be safe to conclude that the haplotype belonged to the M222 group. Granted, it was "guaranteed" by SNP testing, but the agreement between the haplotypes that WERE NOT SNP tested with those that were was so close as to be indistinguishable. Then I gave a set of those 'self defined' DYS modal values in my posting so that others would see what I meant and could do comparisons on their own.
>>
>> Which DYS modal values are more important?
Obviously they will be the slow >> mutating markers for the far-ago progenitors and not the fast ones. The fast >> mutating markers can mutate back and forth so that they are only useful in >> comparing haplotypes that have a recent MRCA. In this way we can analyze the >> differences that define clusters on my phylogenetic tree and see which >> 'fingerprints' cause the differences among the clusters on the tree. >> >> On Jul 7, 2011, at 7:48 PM, David Ewing wrote: >> >>> R:M222 is 'defined' on the basis of the presence of the SNP M222. The >> definition has nothing to do with STR testing; ie, it does not depend at all >> on testing the markers we construct modals with. Iain correctly points out >> that David Wilson first identified a STR profile he thought was >> characteristic of NW Ireland that eventually proved to be associated with >> R:M222, but I do not believe that Wilson was aware of the SNP at that time >> and I do not think he tested for it. >>> >>> As it happens, since a SNP occurs in a single individual and is passed on >> to his descendants in perpetuity, and since he also passes his STR markers >> to his descendants and they will only gradually and slowly mutate away from >> the ancestral values through the generations, STR profiles of the >> descendants of the man who had the SNP will also be similar--very similar in >> the early generations, and then of gradually diminishing similarity as >> generations go on. >>> >>> The only way to be pure-d-double-L certain that an individual is in >> R:M222 is to test him for the M222 SNP. But any individual who is reasonably >> close to the R:M222 37-marker modal is so likely to test positive for M222 >> that it may be a waste of money to do the test. >>> >>> David Ewing >>> >>> On Thu, Jul 7, 2011 at 6:00 AM, Bill Howard <weh8@verizon.net> wrote: >>> There has been considerable discussion both on- and off-line about how >> the M222 SNP is defined. 
>>> First, I understand that its early definition depended on the first 12 >> markers. >>> Next, we have the deep clade test of FTDNA with a proprietary approach we >> know little about. >>> Next, there are discussions of how the markers agree or disagree with the >> modal values of the deep clade test, but only with respect to the first 12 >> markers of the FTDNA string. >>> >>> And now, here's my "take" on the situation. >>> I received from John McLaughlin a large set of markers that he noted were >> in the M222 group. Some had been SNP tested and some had not. >>> I did a study of ALL 37 markers (not just the first 12) and I determined >> the modal value of each DYS site. >>> I then went back and determined for EACH TESTEE the number of times each >> of his own particular markers matched the modal of that same marker for the >> M222 sample John sent me. >>> I then made a graph of the percentage of each testee's marker set that >> matched the overall marker set. >>> I found that virtually ALL markers in the testee set that John sent had >> 73% or more markers that agreed with the set of M222 modals -- not the first >> 12, but all 37 of them. >>> >>> The modal values I found for all 37 markers are the following, in the >> sequence given by FTDNA postings: >>> 13 25 14 11 11 13 12 12 12 >> 13 14 29 17 9 10 11 11 25 15 >> 18 30 15 16 16 17 11 11 19 >> 23 17 16 18 17 38 39 12 12 >>> >>> Of the 683 M222s in the group, all matched 73% of that sequence (at least >> 27 of the 37 markers). The average was 85.2% and both the median and the >> mode was 83.8%. One testee, 26917 (MacKenzie) matched 100% of the modals. >>> >>> I also found that if you made a testee plot of the number of markers that >> matched the M222 modal against their frequency of occurrence for all the 683 >> testees, the plot between 26 and 37 markers was bimodal, with two peaks. One >> peak was at 31 markers and the other peak was at 33 markers. 
A statistician >> might say that the departures from a Gaussian are not significant and that >> there are NOT two peaks, but I think it is arguable. When I do the same >> plot using 320 testees which are among a set with a larger number of >> SNP-tested testees, the bimodality is more pronounced but still >> statistically inconclusive. The two peaks are sharper and appear at the same >> place on the histogram. >>> >>> So, what do I conclude with all this? >>> First, that we cannot go by just the first 12 markers. We have more at >> our disposal to study. >>> Second, while we refer to the M222 SNP test of FTDNA, we realize that we >> take their results on faith about their criterion of who should be included >> in the M222 group. >>> Third, my analysis shows that you can safely (?) put a testee into the >> M222 group IF 73% or more of his 37 markers agree with the modal values of >> all 37 (not 12) markers. That is a practical working criterion for M222 >> inclusion in the group. I have given the modals, above, so now anyone can >> compare a haplotype with it and make your own conclusion. That criterion >> correlates well with FTDNA's M222 SNP-tested group. >>> >>> Now, we must realize that there are extreme variations in the mutation >> rates of the markers and that's why less than 100% of the testees are in the >> M222 group. The mutation rates vary by a factor of almost 400 between the >> fastest and slowest mutating DYS sites. Why does 26917 MacKenzie have a 100% >> match? Well, statistically, out of 683 testees whose markers are mutating >> over the time from the M222 progenitor to the present, you would expect one >> line not to vary at all, and that line has led to 26917 MacKenzie. In fact, >> his haplotype may provide a clue or a means to tease out some of the >> mutations that have taken place over time. That's an exercise still to be >> done. 
Now, when you have a set of fast to slow mutating DYS sites, you >> should be comparing the DIFFERENCES in marker values along the mutating >> lines. I include now a table that shows the percentage of M222 testees that >> have mutations at the various points in the haplotype. For example, those >> with 454 had a constant ! >> value of the modal for 454, and less than 50% of the testees had the modal >> for the two CDYs. >>> >>> DYS %Y >>> DYS454 100% >>> DYS426 99% >>> DYS388 99% >>> DYS459a 99% >>> YCAIIa 98% >>> DYS438 98% >>> DYS393 98% >>> DYS455 98% >>> DYS448 96% >>> DYS392 95% >>> DYS385a 95% >>> DYS459b 93% >>> DYS19 93% >>> DYS437 92% >>> DYS464a 90% >>> DYS442 90% >>> Y-GATA-H4 89% >>> DYS385b 88% >>> YCAIIb 88% >>> DYS389i 88% >>> DYS447 87% >>> DYS464b 87% >>> DYS464c 86% >>> DYS464d 85% >>> DYS390 85% >>> DYS607 85% >>> DYS391 83% >>> DYS389ii 80% >>> DYS439 79% >>> DYS570 77% >>> DYS458 74% >>> DYS449 71% >>> DYS460 70% >>> DYS456 68% >>> DYS576 58% >>> CDYb 46% >>> CDYa 42% >>> >>> Now, with the modal values, and with the table just above, you could >> analyze the slow moving markers among the haplotypes and see what happens. >> The fast moving markers are useful only for small values of RCC, whereas the >> slow moving markers will give insight about what was happening to the marker >> strings nearer the time of the progenitor - the higher values of RCC. >>> >>> So, my fourth conclusion is that the sequence of junctions on the >> phylogenetic tree, calibrated in terms of RCC values, will probably give >> valuable information not only on how the DNA clusters (which later evolve >> into surname groups) actually evolved over time but give us valuable >> fingerprints that differentiate one cluster from another (and at RCC values >> less than 20, the TMRCAs of the progenitor who was at the junction point >> that leads to different surnames. A clever programmer might help here! The >> data are available (!). 
>>> - Bye from Bill Howard
>>>
>>> --
>>> Notice: This email is not secure, and is not for use by patients or for healthcare purposes in general.
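For readers who want to repeat the comparison Bill describes (count how many of a testee's 37 marker values equal the M222 modal, then apply the "27 of 37" working criterion), here is a minimal Python sketch. The modal string and the Wilson 37784 haplotype are copied from the messages in this thread. The function names, the `MIN_MATCHES` constant, and the position-by-position comparison of multi-copy markers such as 464 are illustrative assumptions, not anyone's published method or an FTDNA tool.

```python
# Minimal sketch of the "27 of 37" modal-match check discussed in this thread.
# Function names and MIN_MATCHES are hypothetical; only the two marker strings
# come from the posted messages.

M222_MODAL = (
    "13 25 14 11 11 13 12 12 12 13 14 29 17 9 10 11 11 25 15 "
    "18 30 15 16 16 17 11 11 19 23 17 16 18 17 38 39 12 12"
)
WILSON_37784 = (
    "13 25 14 11 12-14 12 12 12 13 14 29 18 9-9 11 11 24 15 19 30 "
    "15-16-16-17 10 11 19-23 16 15 17 17 37-38 12 12"
)
MIN_MATCHES = 27  # Bill's working criterion: at least 27 of 37 (about 73%)


def parse_haplotype(text):
    """Turn a 37-marker string into a list of ints.

    Hyphens join the copies of multi-copy markers (385, 459, 464, ...),
    so they are treated as separators just like spaces.
    """
    values = [int(v) for v in text.replace("-", " ").split()]
    if len(values) != 37:
        raise ValueError(f"expected 37 values, got {len(values)}")
    return values


def count_modal_matches(haplotype, modal):
    """Count positions where the haplotype equals the modal value."""
    return sum(h == m for h, m in zip(haplotype, modal))


def passes_str_criterion(haplotype, modal, min_matches=MIN_MATCHES):
    """Apply the STR-only inclusion rule (an assumption-laden heuristic)."""
    return count_modal_matches(haplotype, modal) >= min_matches


if __name__ == "__main__":
    modal = parse_haplotype(M222_MODAL)
    wilson = parse_haplotype(WILSON_37784)
    matches = count_modal_matches(wilson, modal)
    print(f"{matches} of 37 match ({matches / 37:.1%})")  # 25 of 37 match (67.6%)
    print("passes criterion:", passes_str_criterion(wilson, modal))  # False
```

Run as a script, this reports 25 matches for the Wilson sample, which agrees with Bill's count and falls short of the 27-marker threshold. Counting exact matches ignores how far a value sits from the modal, so it is not the same as a genetic-distance calculation.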

    07/07/2011 03:39:42
    1. Re: [R-M222] How is M222 defined?
    2. Paul Conroy
    3. Bill,

But you are mixing apples and oranges in your data (M222+ and M222-) - so your results will not be accurate.

Look at this M222- guy below, his STR's seem like M222+, yet he is not part of that haplogroup. Your method would include people like this.

M222-
37784 Wilson Unknown Origin R1b1a2 13 25 14 11 12-14 12 12 12 13 14 29 18 9-9 11 11 24 15 19 30 15-16-16-17 10 11 19-23 16 15 17 17 37-38 12 12

I work as a programmer, and we have a saying: "Garbage in, Garbage out" - think about it.

Cheers,
Paul

On Thu, Jul 7, 2011 at 8:05 PM, Bill Howard <weh8@verizon.net> wrote:

> Yes, David. I agree. And correspondence with John McLaughlin has shown that he is in agreement that any individual who is reasonably close to the R:M222 37-marker modal is so likely to test positive for M222 that it may be a waste of money to do the test.
>
> My email that got this dialogue started was an attempt on my part to see how likely that might be. What I found was that if 27 or more of the 37 markers in a haplotype agreed with the modal value of the M222 group (which was self-defined by a large group of M222s), then it would be safe to conclude that the haplotype belonged to the M222 group. Granted, it was "guaranteed" by SNP testing, but the agreement between the haplotypes that WERE NOT SNP tested with those that were was so close as to be indistinguishable. Then I gave a set of those 'self defined' DYS modal values in my posting so that others would see what I meant and could do comparisons on their own.
>
> Which DYS modal values are more important? Obviously they will be the slow mutating markers for the far-ago progenitors and not the fast ones. The fast mutating markers can mutate back and forth so that they are only useful in comparing haplotypes that have a recent MRCA. In this way we can analyze the differences that define clusters on my phylogenetic tree and see which 'fingerprints' cause the differences among the clusters on the tree.
> > On Jul 7, 2011, at 7:48 PM, David Ewing wrote: > > > R:M222 is 'defined' on the basis of the presence of the SNP M222. The > definition has nothing to do with STR testing; ie, it does not depend at all > on testing the markers we construct modals with. Iain correctly points out > that David Wilson first identified a STR profile he thought was > characteristic of NW Ireland that eventually proved to be associated with > R:M222, but I do not believe that Wilson was aware of the SNP at that time > and I do not think he tested for it. > > > > As it happens, since a SNP occurs in a single individual and is passed on > to his descendants in perpetuity, and since he also passes his STR markers > to his descendants and they will only gradually and slowly mutate away from > the ancestral values through the generations, STR profiles of the > descendants of the man who had the SNP will also be similar--very similar in > the early generations, and then of gradually diminishing similarity as > generations go on. > > > > The only way to be pure-d-double-L certain that an individual is in > R:M222 is to test him for the M222 SNP. But any individual who is reasonably > close to the R:M222 37-marker modal is so likely to test positive for M222 > that it may be a waste of money to do the test. > > > > David Ewing > > > > On Thu, Jul 7, 2011 at 6:00 AM, Bill Howard <weh8@verizon.net> wrote: > > There has been considerable discussion both on- and off-line about how > the M222 SNP is defined. > > First, I understand that its early definition depended on the first 12 > markers. > > Next, we have the deep clade test of FTDNA with a proprietary approach we > know little about. > > Next, there are discussions of how the markers agree or disagree with the > modal values of the deep clade test, but only with respect to the first 12 > markers of the FTDNA string. > > > > And now, here's my "take" on the situation. 
> > I received from John McLaughlin a large set of markers that he noted were > in the M222 group. Some had been SNP tested and some had not. > > I did a study of ALL 37 markers (not just the first 12) and I determined > the modal value of each DYS site. > > I then went back and determined for EACH TESTEE the number of times each > of his own particular markers matched the modal of that same marker for the > M222 sample John sent me. > > I then made a graph of the percentage of each testee's marker set that > matched the overall marker set. > > I found that virtually ALL markers in the testee set that John sent had > 73% or more markers that agreed with the set of M222 modals -- not the first > 12, but all 37 of them. > > > > The modal values I found for all 37 markers are the following, in the > sequence given by FTDNA postings: > > 13 25 14 11 11 13 12 12 12 > 13 14 29 17 9 10 11 11 25 15 > 18 30 15 16 16 17 11 11 19 > 23 17 16 18 17 38 39 12 12 > > > > Of the 683 M222s in the group, all matched 73% of that sequence (at least > 27 of the 37 markers). The average was 85.2% and both the median and the > mode was 83.8%. One testee, 26917 (MacKenzie) matched 100% of the modals. > > > > I also found that if you made a testee plot of the number of markers that > matched the M222 modal against their frequency of occurrence for all the 683 > testees, the plot between 26 and 37 markers was bimodal, with two peaks. One > peak was at 31 markers and the other peak was at 33 markers. A statistician > might say that the departures from a Gaussian are not significant and that > there are NOT two peaks, but I think it is arguable. When I do the same > plot using 320 testees which are among a set with a larger number of > SNP-tested testees, the bimodality is more pronounced but still > statistically inconclusive. The two peaks are sharper and appear at the same > place on the histogram. > > > > So, what do I conclude with all this? 
> > First, that we cannot go by just the first 12 markers. We have more at > our disposal to study. > > Second, while we refer to the M222 SNP test of FTDNA, we realize that we > take their results on faith about their criterion of who should be included > in the M222 group. > > Third, my analysis shows that you can safely (?) put a testee into the > M222 group IF 73% or more of his 37 markers agree with the modal values of > all 37 (not 12) markers. That is a practical working criterion for M222 > inclusion in the group. I have given the modals, above, so now anyone can > compare a haplotype with it and make your own conclusion. That criterion > correlates well with FTDNA's M222 SNP-tested group. > > > > Now, we must realize that there are extreme variations in the mutation > rates of the markers and that's why less than 100% of the testees are in the > M222 group. The mutation rates vary by a factor of almost 400 between the > fastest and slowest mutating DYS sites. Why does 26917 MacKenzie have a 100% > match? Well, statistically, out of 683 testees whose markers are mutating > over the time from the M222 progenitor to the present, you would expect one > line not to vary at all, and that line has led to 26917 MacKenzie. In fact, > his haplotype may provide a clue or a means to tease out some of the > mutations that have taken place over time. That's an exercise still to be > done. Now, when you have a set of fast to slow mutating DYS sites, you > should be comparing the DIFFERENCES in marker values along the mutating > lines. I include now a table that shows the percentage of M222 testees that > have mutations at the various points in the haplotype. For example, those > with 454 had a constant ! > value of the modal for 454, and less than 50% of the testees had the modal > for the two CDYs. 
> > > > DYS %Y > > DYS454 100% > > DYS426 99% > > DYS388 99% > > DYS459a 99% > > YCAIIa 98% > > DYS438 98% > > DYS393 98% > > DYS455 98% > > DYS448 96% > > DYS392 95% > > DYS385a 95% > > DYS459b 93% > > DYS19 93% > > DYS437 92% > > DYS464a 90% > > DYS442 90% > > Y-GATA-H4 89% > > DYS385b 88% > > YCAIIb 88% > > DYS389i 88% > > DYS447 87% > > DYS464b 87% > > DYS464c 86% > > DYS464d 85% > > DYS390 85% > > DYS607 85% > > DYS391 83% > > DYS389ii 80% > > DYS439 79% > > DYS570 77% > > DYS458 74% > > DYS449 71% > > DYS460 70% > > DYS456 68% > > DYS576 58% > > CDYb 46% > > CDYa 42% > > > > Now, with the modal values, and with the table just above, you could > analyze the slow moving markers among the haplotypes and see what happens. > The fast moving markers are useful only for small values of RCC, whereas the > slow moving markers will give insight about what was happening to the marker > strings nearer the time of the progenitor - the higher values of RCC. > > > > So, my fourth conclusion is that the sequence of junctions on the > phylogenetic tree, calibrated in terms of RCC values, will probably give > valuable information not only on how the DNA clusters (which later evolve > into surname groups) actually evolved over time but give us valuable > fingerprints that differentiate one cluster from another (and at RCC values > less than 20, the TMRCAs of the progenitor who was at the junction point > that leads to different surnames. A clever programmer might help here! The > data are available (!). > > > > - Bye from Bill Howard > > > > > > > > -- > > Notice: This email is not secure, and is not for use by patients or for > healthcare purposes in general. > > > > R1b1c7 Research and Links: > > http://clanmaclochlainn.com/R1b1c7/ > ------------------------------- > To unsubscribe from the list, please send an email to > DNA-R1B1C7-request@rootsweb.com with the word 'unsubscribe' without the > quotes in the subject and the body of the message >
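The per-marker table quoted above can be used for a rough sanity check of two figures in Bill's message: the 85.2% average match rate, and the remark that one perfect 37-of-37 match (26917 MacKenzie) is about what you would expect among 683 testees. The Python sketch below treats each listed percentage as the probability that a testee carries the modal value at that marker, and it assumes the markers vary independently. That independence is an assumption made here for illustration; the copies of 464 and the 389i/389ii pair, for example, are clearly not independent, and the percentages are rounded. Variable names are hypothetical.

```python
import math

# Percent of the 683 testees carrying the modal value at each marker,
# copied from the table in the quoted message.
PCT_AT_MODAL = {
    "DYS454": 100, "DYS426": 99, "DYS388": 99, "DYS459a": 99,
    "YCAIIa": 98, "DYS438": 98, "DYS393": 98, "DYS455": 98,
    "DYS448": 96, "DYS392": 95, "DYS385a": 95, "DYS459b": 93,
    "DYS19": 93, "DYS437": 92, "DYS464a": 90, "DYS442": 90,
    "Y-GATA-H4": 89, "DYS385b": 88, "YCAIIb": 88, "DYS389i": 88,
    "DYS447": 87, "DYS464b": 87, "DYS464c": 86, "DYS464d": 85,
    "DYS390": 85, "DYS607": 85, "DYS391": 83, "DYS389ii": 80,
    "DYS439": 79, "DYS570": 77, "DYS458": 74, "DYS449": 71,
    "DYS460": 70, "DYS456": 68, "DYS576": 58, "CDYb": 46, "CDYa": 42,
}
N_TESTEES = 683

fractions = [pct / 100 for pct in PCT_AT_MODAL.values()]

# Expected number of modal matches per testee is the sum of the
# per-marker fractions (this part needs no independence assumption).
expected_matches = sum(fractions)
expected_match_pct = 100 * expected_matches / len(fractions)

# Probability that a single testee matches the modal at every marker,
# ASSUMING independent markers (a simplification, see the note above).
prob_all_match = math.prod(fractions)
expected_perfect = N_TESTEES * prob_all_match

if __name__ == "__main__":
    print(f"markers in table: {len(fractions)}")
    print(f"expected matches: {expected_matches:.2f} of 37 "
          f"({expected_match_pct:.1f}%)")
    print(f"expected perfect matches among {N_TESTEES}: {expected_perfect:.2f}")
```

Under these assumptions the table implies about 31.5 matching markers per testee (about 85.1%), close to the 85.2% average Bill reports, and a little under one expected perfect match among 683 testees, which is consistent with his observation of a single one.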

    07/07/2011 02:22:06
    1. Re: [R-M222] How is M222 defined?
    2. Bill Howard
    3. Yes, David. I agree. And correspondence with John McLaughlin has shown that he is in agreement that any individual who is reasonably close to the R:M222 37-marker modal is so likely to test positive for M222 that it may be a waste of money to do the test. My email that got this dialogue started was an attempt on my part to see how likely that might be. What I found was that if 27 or more of the 37 markers in a haplotype agreed with the modal value of the M222 group (which was self-defined by a large group of M222s), then it would be safe to conclude that the haplotype belonged to the M222 group. Granted, it was "guaranteed" by SNP testing, but the agreement between the haplotypes that WERE NOT SNP tested with those that were was so close as to be indistinguishable. Then I gave a set of those 'self defined' DYS modal values in my posting so that others would see what I meant and could do comparisons on their own. Which DYS modal values are more important? Obviously they will be the slow mutating markers for the far-ago progenitors and not the fast ones. The fast mutating markers can mutate back and forth so that they are only useful in comparing haplotypes that have a recent MRCA. In this way we can analyze the differences that define clusters on my phylogenetic tree and see which 'fingerprints' cause the differences among the clusters on the tree. On Jul 7, 2011, at 7:48 PM, David Ewing wrote: > R:M222 is 'defined' on the basis of the presence of the SNP M222. The definition has nothing to do with STR testing; ie, it does not depend at all on testing the markers we construct modals with. Iain correctly points out that David Wilson first identified a STR profile he thought was characteristic of NW Ireland that eventually proved to be associated with R:M222, but I do not believe that Wilson was aware of the SNP at that time and I do not think he tested for it. 
> > As it happens, since a SNP occurs in a single individual and is passed on to his descendants in perpetuity, and since he also passes his STR markers to his descendants and they will only gradually and slowly mutate away from the ancestral values through the generations, STR profiles of the descendants of the man who had the SNP will also be similar--very similar in the early generations, and then of gradually diminishing similarity as generations go on. > > The only way to be pure-d-double-L certain that an individual is in R:M222 is to test him for the M222 SNP. But any individual who is reasonably close to the R:M222 37-marker modal is so likely to test positive for M222 that it may be a waste of money to do the test. > > David Ewing > > On Thu, Jul 7, 2011 at 6:00 AM, Bill Howard <weh8@verizon.net> wrote: > There has been considerable discussion both on- and off-line about how the M222 SNP is defined. > First, I understand that its early definition depended on the first 12 markers. > Next, we have the deep clade test of FTDNA with a proprietary approach we know little about. > Next, there are discussions of how the markers agree or disagree with the modal values of the deep clade test, but only with respect to the first 12 markers of the FTDNA string. > > And now, here's my "take" on the situation. > I received from John McLaughlin a large set of markers that he noted were in the M222 group. Some had been SNP tested and some had not. > I did a study of ALL 37 markers (not just the first 12) and I determined the modal value of each DYS site. > I then went back and determined for EACH TESTEE the number of times each of his own particular markers matched the modal of that same marker for the M222 sample John sent me. > I then made a graph of the percentage of each testee's marker set that matched the overall marker set. 
> I found that virtually ALL markers in the testee set that John sent had 73% or more markers that agreed with the set of M222 modals -- not the first 12, but all 37 of them. > > The modal values I found for all 37 markers are the following, in the sequence given by FTDNA postings: > 13 25 14 11 11 13 12 12 12 13 14 29 17 9 10 11 11 25 15 18 30 15 16 16 17 11 11 19 23 17 16 18 17 38 39 12 12 > > Of the 683 M222s in the group, all matched 73% of that sequence (at least 27 of the 37 markers). The average was 85.2% and both the median and the mode was 83.8%. One testee, 26917 (MacKenzie) matched 100% of the modals. > > I also found that if you made a testee plot of the number of markers that matched the M222 modal against their frequency of occurrence for all the 683 testees, the plot between 26 and 37 markers was bimodal, with two peaks. One peak was at 31 markers and the other peak was at 33 markers. A statistician might say that the departures from a Gaussian are not significant and that there are NOT two peaks, but I think it is arguable. When I do the same plot using 320 testees which are among a set with a larger number of SNP-tested testees, the bimodality is more pronounced but still statistically inconclusive. The two peaks are sharper and appear at the same place on the histogram. > > So, what do I conclude with all this? > First, that we cannot go by just the first 12 markers. We have more at our disposal to study. > Second, while we refer to the M222 SNP test of FTDNA, we realize that we take their results on faith about their criterion of who should be included in the M222 group. > Third, my analysis shows that you can safely (?) put a testee into the M222 group IF 73% or more of his 37 markers agree with the modal values of all 37 (not 12) markers. That is a practical working criterion for M222 inclusion in the group. I have given the modals, above, so now anyone can compare a haplotype with it and make your own conclusion. 
That criterion correlates well with FTDNA's M222 SNP-tested group. > > Now, we must realize that there are extreme variations in the mutation rates of the markers and that's why less than 100% of the testees are in the M222 group. The mutation rates vary by a factor of almost 400 between the fastest and slowest mutating DYS sites. Why does 26917 MacKenzie have a 100% match? Well, statistically, out of 683 testees whose markers are mutating over the time from the M222 progenitor to the present, you would expect one line not to vary at all, and that line has led to 26917 MacKenzie. In fact, his haplotype may provide a clue or a means to tease out some of the mutations that have taken place over time. That's an exercise still to be done. Now, when you have a set of fast to slow mutating DYS sites, you should be comparing the DIFFERENCES in marker values along the mutating lines. I include now a table that shows the percentage of M222 testees that have the modal value at each marker in the haplotype. For example, all of the testees had the modal value for 454, and less than 50% of the testees had the modal for the two CDYs.
>
> DYS marker    % with modal value
> DYS454        100%
> DYS426         99%
> DYS388         99%
> DYS459a        99%
> YCAIIa         98%
> DYS438         98%
> DYS393         98%
> DYS455         98%
> DYS448         96%
> DYS392         95%
> DYS385a        95%
> DYS459b        93%
> DYS19          93%
> DYS437         92%
> DYS464a        90%
> DYS442         90%
> Y-GATA-H4      89%
> DYS385b        88%
> YCAIIb         88%
> DYS389i        88%
> DYS447         87%
> DYS464b        87%
> DYS464c        86%
> DYS464d        85%
> DYS390         85%
> DYS607         85%
> DYS391         83%
> DYS389ii       80%
> DYS439         79%
> DYS570         77%
> DYS458         74%
> DYS449         71%
> DYS460         70%
> DYS456         68%
> DYS576         58%
> CDYb           46%
> CDYa           42%
>
> Now, with the modal values, and with the table just above, you could analyze the slow moving markers among the haplotypes and see what happens.
The fast moving markers are useful only for small values of RCC, whereas the slow moving markers will give insight about what was happening to the marker strings nearer the time of the progenitor - the higher values of RCC. > > So, my fourth conclusion is that the sequence of junctions on the phylogenetic tree, calibrated in terms of RCC values, will probably give valuable information not only on how the DNA clusters (which later evolve into surname groups) actually evolved over time but give us valuable fingerprints that differentiate one cluster from another (and at RCC values less than 20, the TMRCAs of the progenitor who was at the junction point that leads to different surnames. A clever programmer might help here! The data are available (!). > > - Bye from Bill Howard > > > > -- > Notice: This email is not secure, and is not for use by patients or for healthcare purposes in general. >
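The matching procedure described in the quoted posting (take the modal value at each of the 37 markers, count how many of a testee's markers agree with it, and apply the 27-of-37 cut-off) can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet or program actually used: the function names and the made-up haplotype in the usage note are assumptions; only the 37 modal values and the 73% threshold come from the posting.

```python
from collections import Counter

# 37-marker M222 modal values as posted in the thread (FTDNA order).
M222_MODAL = [13, 25, 14, 11, 11, 13, 12, 12, 12, 13, 14, 29,
              17, 9, 10, 11, 11, 25, 15, 18, 30, 15, 16, 16, 17,
              11, 11, 19, 23, 17, 16, 18, 17, 38, 39, 12, 12]

# Working criterion from the posting: at least 27 of 37 markers (about 73%).
THRESHOLD = 27


def modal_haplotype(haplotypes):
    """Return the most common value at each marker position.

    Ties are broken by whichever value is seen first (an arbitrary choice
    made for this sketch).
    """
    return [Counter(column).most_common(1)[0][0] for column in zip(*haplotypes)]


def count_matches(haplotype, modal=M222_MODAL):
    """Count the markers whose value equals the modal value."""
    if len(haplotype) != len(modal):
        raise ValueError("haplotype and modal must have the same length")
    return sum(1 for value, mode in zip(haplotype, modal) if value == mode)


def meets_criterion(haplotype, modal=M222_MODAL, threshold=THRESHOLD):
    """True if the haplotype agrees with the modal at `threshold` or more markers."""
    return count_matches(haplotype, modal) >= threshold


# Hypothetical testee: the modal with the first ten markers shifted by one repeat.
example = [v + 1 if i < 10 else v for i, v in enumerate(M222_MODAL)]
print(count_matches(example), meets_criterion(example))
```

For a real data set one would first call `modal_haplotype` on all the haplotypes in the group and then score each testee against the result, rather than hard-coding the modal as done here.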

    07/07/2011 02:05:56
    1. Re: [R-M222] How is M222 defined?
    2. In a message dated 7/7/2011 6:48:39 P.M. Central Daylight Time, davidewing93@gmail.com writes: R:M222 is 'defined' on the basis of the presence of the SNP M222. The definition has nothing to do with STR testing; ie, it does not depend at all on testing the markers we construct modals with. Iain correctly points out that David Wilson first identified a STR profile he thought was characteristic of NW Ireland that eventually proved to be associated with R:M222, but I do not believe that Wilson was aware of the SNP at that time and I do not think he tested for it. I can only add the SNP had been known at the time in academic circles but not routinely tested for when David Wilson announced his finding of the NW Irish cluster. As I recall it was associated with sterility in some studies, since repudiated. David was the first of the cluster to get tested for the SNP though.

    07/07/2011 02:01:26
    1. Re: [R-M222] Odd Sample
    2. Paul Conroy
    3. This is an example of exactly what I was saying in the other thread - people need to be SNP tested to determine if they are M222+ or not. We can't just rely on STRs alone. My question is, how many other people who tested with FTDNA, and who do not "appear" to be M222+ - based on STRs - are actually M222+? On Thu, Jul 7, 2011 at 7:29 PM, <Lochlan@aol.com> wrote: > By the way, the 12 marker Turner sample is being upgraded to 37 markers. > We'll see how the rest of the markers match up soon. > > > > John

    07/07/2011 01:51:18
    1. Re: [R-M222] How is M222 defined?
    2. Sandy Paterson
    3. This is an interesting one. There are Doherty's that seem closely related, as well as Daugherty's that seem closely related. I need a closer look, but at this stage it looks like two different sources of similar surnames rather than an SNP. Unless, of course it's a very early SNP (as opposed to a recent SNP). -----Original Message----- From: dna-r1b1c7-bounces@rootsweb.com [mailto:dna-r1b1c7-bounces@rootsweb.com] On Behalf Of Sandy Paterson Sent: 07 July 2011 19:09 To: dna-r1b1c7@rootsweb.com Subject: Re: [R-M222] How is M222 defined? I see Doherty 31706 (SNP untested) is a GD of 21 from Daugherty 73633 (who has been tested) over 111 markers. I don't know their haplotypes that well, but I can't see how anyone could regard them as being closely related. I'll have a look at their closest non-Doherty/Daugherty matches over 67 markers. -----Original Message----- From: dna-r1b1c7-bounces@rootsweb.com [mailto:dna-r1b1c7-bounces@rootsweb.com] On Behalf Of Paul Conroy Sent: 07 July 2011 16:03 To: dna-r1b1c7@rootsweb.com Subject: Re: [R-M222] How is M222 defined? Allene, Yes, if these closely related members share the exact same STR values as the tested member, but if they don't then Deep Clade testing is called for. The problem is that the area of Ireland, Britain or France or elsewhere, where M222 first arose, there are going to exist M222+ and M222- individuals, who probably have very similar STR values - so a Non Paternal Event (NPE) in this community would still give similar STR values to biological parent, and could be miscounted as being M222+ What I'm specifically referring to though is the huge Doherty and other projects, where only a handful of participants have been tested for M222+, and the majority are assumed to belong, based solely on similar STR values. 
Cheers, Paul On Thu, Jul 7, 2011 at 10:56 AM, Allene Goforth <agoforth@moscow.com> wrote: > Paul, > > I would say that only ONE member of a close cluster like my five > MacAdam/McAdam lines needs to take the Deep Clade test. I'm in a bit of > a hurry right now, but I know that FTDNA has stated more than once that > it isn't necessary to spend all that money on separate Deep Clade tests > for a group that's obviously related. > > Allene

    07/07/2011 01:42:07
    1. Re: [R-M222] Odd Sample
    2. By the way, the 12 marker Turner sample is being upgraded to 37 markers. We'll see how the rest of the markers match up soon. John

    07/07/2011 01:29:57
    1. Re: [R-M222] Odd Sample
    2. In a message dated 7/7/2011 5:28:21 A.M. Central Daylight Time, iarooster@earthlink.net writes: He's had the deep clade test at FTDNA or elsewhere? The reason I'm asking is that I had someone deep clade tested at DNAH, and they didn't test for M-222. Or at least they didn't at the time of this test several years ago. His SNP results are listed in his personal page at FTDNA which means they did the testing. John

    07/07/2011 01:16:00
    1. Re: [R-M222] How is M222 defined?
    2. Sandy Paterson
    3. I see Doherty 31706 (SNP untested) is a GD of 21 from Daugherty 73633 (who has been tested) over 111 markers. I don't know their haplotypes that well, but I can't see how anyone could regard them as being closely related. I'll have a look at their closest non-Doherty/Daugherty matches over 67 markers. -----Original Message----- From: dna-r1b1c7-bounces@rootsweb.com [mailto:dna-r1b1c7-bounces@rootsweb.com] On Behalf Of Paul Conroy Sent: 07 July 2011 16:03 To: dna-r1b1c7@rootsweb.com Subject: Re: [R-M222] How is M222 defined? Allene, Yes, if these closely related members share the exact same STR values as the tested member, but if they don't then Deep Clade testing is called for. The problem is that the area of Ireland, Britain or France or elsewhere, where M222 first arose, there are going to exist M222+ and M222- individuals, who probably have very similar STR values - so a Non Paternal Event (NPE) in this community would still give similar STR values to biological parent, and could be miscounted as being M222+ What I'm specifically referring to though is the huge Doherty and other projects, where only a handful of participants have been tested for M222+, and the majority are assumed to belong, based solely on similar STR values. Cheers, Paul On Thu, Jul 7, 2011 at 10:56 AM, Allene Goforth <agoforth@moscow.com> wrote: > Paul, > > I would say that only ONE member of a close cluster like my five > MacAdam/McAdam lines needs to take the Deep Clade test. I'm in a bit of > a hurry right now, but I know that FTDNA has stated more than once that > it isn't necessary to spend all that money on separate Deep Clade tests > for a group that's obviously related. 
> > Allene

    07/07/2011 01:08:43
    1. Re: [R-M222] How is M222 defined?
    2. Iain Kennedy
    3. In the interests of giving credit where it is due, the M222 SNP test was first commercially offered by Jim Wilson's Ethnoancestry company in Scotland. I did my M222 test there soon after it was launched and before FTDNA went to market. Here is the original announcement: http://archiver.rootsweb.ancestry.com/th/read/genealogy-dna/2006-03/1141526628 Ethnoancestry still sell the test (although Faux is no longer involved with them). Iain http://www.kennedydna.com > From: weh8@verizon.net > Date: Thu, 7 Jul 2011 08:00:21 -0400 > To: dna-r1b1c7@rootsweb.com > CC: davidewing93@gmail.com; JJLNV@comcast.net; wathey@hprg.com > Subject: [R-M222] How is M222 defined? > > There has been considerable discussion both on- and off-line about how the M222 SNP is defined. > First, I understand that its early definition depended on the first 12 markers. > Next, we have the deep clade test of FTDNA with a proprietary approach we know little about. > Next, there are discussions of how the markers agree or disagree with the modal values of the deep clade test, but only with respect to the first 12 markers of the FTDNA string. > > And now, here's my "take" on the situation. > I received from John McLaughlin a large set of markers that he noted were in the M222 group. Some had been SNP tested and some had not. > I did a study of ALL 37 markers (not just the first 12) and I determined the modal value of each DYS site. > I then went back and determined for EACH TESTEE the number of times each of his own particular markers matched the modal of that same marker for the M222 sample John sent me. > I then made a graph of the percentage of each testee's marker set that matched the overall marker set. > I found that virtually ALL markers in the testee set that John sent had 73% or more markers that agreed with the set of M222 modals -- not the first 12, but all 37 of them. 
> > The modal values I found for all 37 markers are the following, in the sequence given by FTDNA postings: > 13 25 14 11 11 13 12 12 12 13 14 29 17 9 10 11 11 25 15 18 30 15 16 16 17 11 11 19 23 17 16 18 17 38 39 12 12 > > Of the 683 M222s in the group, all matched 73% of that sequence (at least 27 of the 37 markers). The average was 85.2% and both the median and the mode was 83.8%. One testee, 26917 (MacKenzie) matched 100% of the modals. > > I also found that if you made a testee plot of the number of markers that matched the M222 modal against their frequency of occurrence for all the 683 testees, the plot between 26 and 37 markers was bimodal, with two peaks. One peak was at 31 markers and the other peak was at 33 markers. A statistician might say that the departures from a Gaussian are not significant and that there are NOT two peaks, but I think it is arguable. When I do the same plot using 320 testees which are among a set with a larger number of SNP-tested testees, the bimodality is more pronounced but still statistically inconclusive. The two peaks are sharper and appear at the same place on the histogram. > > So, what do I conclude with all this? > First, that we cannot go by just the first 12 markers. We have more at our disposal to study. > Second, while we refer to the M222 SNP test of FTDNA, we realize that we take their results on faith about their criterion of who should be included in the M222 group. > Third, my analysis shows that you can safely (?) put a testee into the M222 group IF 73% or more of his 37 markers agree with the modal values of all 37 (not 12) markers. That is a practical working criterion for M222 inclusion in the group. I have given the modals, above, so now anyone can compare a haplotype with it and make your own conclusion. That criterion correlates well with FTDNA's M222 SNP-tested group. 
> > Now, we must realize that there are extreme variations in the mutation rates of the markers and that's why less than 100% of the testees are in the M222 group. The mutation rates vary by a factor of almost 400 between the fastest and slowest mutating DYS sites. Why does 26917 MacKenzie have a 100% match? Well, statistically, out of 683 testees whose markers are mutating over the time from the M222 progenitor to the present, you would expect one line not to vary at all, and that line has led to 26917 MacKenzie. In fact, his haplotype may provide a clue or a means to tease out some of the mutations that have taken place over time. That's an exercise still to be done. Now, when you have a set of fast to slow mutating DYS sites, you should be comparing the DIFFERENCES in marker values along the mutating lines. I include now a table that shows the percentage of M222 testees that have the modal value at each marker in the haplotype. For example, all of the testees had the modal value for 454, and less than 50% of the testees had the modal for the two CDYs.
>
> DYS marker    % with modal value
> DYS454        100%
> DYS426         99%
> DYS388         99%
> DYS459a        99%
> YCAIIa         98%
> DYS438         98%
> DYS393         98%
> DYS455         98%
> DYS448         96%
> DYS392         95%
> DYS385a        95%
> DYS459b        93%
> DYS19          93%
> DYS437         92%
> DYS464a        90%
> DYS442         90%
> Y-GATA-H4      89%
> DYS385b        88%
> YCAIIb         88%
> DYS389i        88%
> DYS447         87%
> DYS464b        87%
> DYS464c        86%
> DYS464d        85%
> DYS390         85%
> DYS607         85%
> DYS391         83%
> DYS389ii       80%
> DYS439         79%
> DYS570         77%
> DYS458         74%
> DYS449         71%
> DYS460         70%
> DYS456         68%
> DYS576         58%
> CDYb           46%
> CDYa           42%
>
> Now, with the modal values, and with the table just above, you could analyze the slow moving markers among the haplotypes and see what happens. The fast moving markers are useful only for small values of RCC, whereas the slow moving markers will give insight about what was happening to the marker strings nearer the time of the progenitor - the higher values of RCC.
> > So, my fourth conclusion is that the sequence of junctions on the phylogenetic tree, calibrated in terms of RCC values, will probably give valuable information not only on how the DNA clusters (which later evolve into surname groups) actually evolved over time but give us valuable fingerprints that differentiate one cluster from another (and at RCC values less than 20, the TMRCAs of the progenitor who was at the junction point that leads to different surnames. A clever programmer might help here! The data are available (!). > > - Bye from Bill Howard
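The remark in the quoted posting that one would expect about one of the 683 testees to match the modal at all 37 markers can be checked roughly against the posted per-marker percentages. The sketch below multiplies those percentages together, which assumes the markers vary independently of one another; that is a simplification made here and not something stated in the posting (multi-copy markers such as DYS464 and CDY are unlikely to be strictly independent, and the posted percentages are rounded). The variable names are illustrative.

```python
import math

# Percentage of the 683 testees with the modal value at each marker,
# as posted in the thread.
PCT_MODAL = {
    "DYS454": 100, "DYS426": 99, "DYS388": 99, "DYS459a": 99, "YCAIIa": 98,
    "DYS438": 98, "DYS393": 98, "DYS455": 98, "DYS448": 96, "DYS392": 95,
    "DYS385a": 95, "DYS459b": 93, "DYS19": 93, "DYS437": 92, "DYS464a": 90,
    "DYS442": 90, "Y-GATA-H4": 89, "DYS385b": 88, "YCAIIb": 88, "DYS389i": 88,
    "DYS447": 87, "DYS464b": 87, "DYS464c": 86, "DYS464d": 85, "DYS390": 85,
    "DYS607": 85, "DYS391": 83, "DYS389ii": 80, "DYS439": 79, "DYS570": 77,
    "DYS458": 74, "DYS449": 71, "DYS460": 70, "DYS456": 68, "DYS576": 58,
    "CDYb": 46, "CDYa": 42,
}
N_TESTEES = 683

# Probability that a single testee has the modal value at every marker,
# ASSUMING independence between markers (an assumption of this sketch).
p_full_match = math.prod(pct / 100 for pct in PCT_MODAL.values())

# Expected number of testees matching all 37 markers under that assumption.
expected_full_matches = N_TESTEES * p_full_match

# Average per-marker agreement, for comparison with the posted 85.2% average.
mean_match_pct = sum(PCT_MODAL.values()) / len(PCT_MODAL)

print(f"P(37/37 match)        = {p_full_match:.5f}")
print(f"Expected full matches = {expected_full_matches:.2f} of {N_TESTEES}")
print(f"Mean per-marker match = {mean_match_pct:.1f}%")
```

Under these assumptions the expected count comes out a little under one, which is in line with a single 37/37 match (26917 MacKenzie) in a group of this size.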

    07/07/2011 01:01:51
    1. Re: [R-M222] How is M222 defined?
    2. David Ewing
    3. R:M222 is 'defined' on the basis of the presence of the SNP M222. The definition has nothing to do with STR testing; ie, it does not depend at all on testing the markers we construct modals with. Iain correctly points out that David Wilson first identified a STR profile he thought was characteristic of NW Ireland that eventually proved to be associated with R:M222, but I do not believe that Wilson was aware of the SNP at that time and I do not think he tested for it. As it happens, since a SNP occurs in a single individual and is passed on to his descendants in perpetuity, and since he also passes his STR markers to his descendants and they will only gradually and slowly mutate away from the ancestral values through the generations, STR profiles of the descendants of the man who had the SNP will also be similar--very similar in the early generations, and then of gradually diminishing similarity as generations go on. The only way to be pure-d-double-L certain that an individual is in R:M222 is to test him for the M222 SNP. But any individual who is reasonably close to the R:M222 37-marker modal is so likely to test positive for M222 that it may be a waste of money to do the test. David Ewing On Thu, Jul 7, 2011 at 6:00 AM, Bill Howard <weh8@verizon.net> wrote: > There has been considerable discussion both on- and off-line about how the > M222 SNP is defined. > First, I understand that its early definition depended on the first 12 > markers. > Next, we have the deep clade test of FTDNA with a proprietary approach we > know little about. > Next, there are discussions of how the markers agree or disagree with the > modal values of the deep clade test, but only with respect to the first 12 > markers of the FTDNA string. > > And now, here's my "take" on the situation. > I received from John McLaughlin a large set of markers that he noted were > in the M222 group. Some had been SNP tested and some had not. 
> I did a study of ALL 37 markers (not just the first 12) and I determined > the modal value of each DYS site. > I then went back and determined for EACH TESTEE the number of times each of > his own particular markers matched the modal of that same marker for the > M222 sample John sent me. > I then made a graph of the percentage of each testee's marker set that > matched the overall marker set. > I found that virtually ALL markers in the testee set that John sent had 73% > or more markers that agreed with the set of M222 modals -- not the first 12, > but all 37 of them. > > The modal values I found for all 37 markers are the following, in the > sequence given by FTDNA postings: > 13 25 14 11 11 13 12 12 12 13 > 14 29 17 9 10 11 11 25 15 > 18 30 15 16 16 17 11 11 19 23 > 17 16 18 17 38 39 12 12 > > Of the 683 M222s in the group, all matched 73% of that sequence (at least > 27 of the 37 markers). The average was 85.2% and both the median and the > mode was 83.8%. One testee, 26917 (MacKenzie) matched 100% of the modals. > > I also found that if you made a testee plot of the number of markers that > matched the M222 modal against their frequency of occurrence for all the 683 > testees, the plot between 26 and 37 markers was bimodal, with two peaks. One > peak was at 31 markers and the other peak was at 33 markers. A statistician > might say that the departures from a Gaussian are not significant and that > there are NOT two peaks, but I think it is arguable. When I do the same > plot using 320 testees which are among a set with a larger number of > SNP-tested testees, the bimodality is more pronounced but still > statistically inconclusive. The two peaks are sharper and appear at the same > place on the histogram. > > So, what do I conclude with all this? > First, that we cannot go by just the first 12 markers. We have more at our > disposal to study. 
> Second, while we refer to the M222 SNP test of FTDNA, we realize that we > take their results on faith about their criterion of who should be included > in the M222 group. > Third, my analysis shows that you can safely (?) put a testee into the M222 > group IF 73% or more of his 37 markers agree with the modal values of all 37 > (not 12) markers. That is a practical working criterion for M222 inclusion > in the group. I have given the modals, above, so now anyone can compare a > haplotype with it and make your own conclusion. That criterion correlates > well with FTDNA's M222 SNP-tested group. > > Now, we must realize that there are extreme variations in the mutation > rates of the markers and that's why less than 100% of the testees are in the > M222 group. The mutation rates vary by a factor of almost 400 between the > fastest and slowest mutating DYS sites. Why does 26917 MacKenzie have a 100% > match? Well, statistically, out of 683 testees whose markers are mutating > over the time from the M222 progenitor to the present, you would expect one > line not to vary at all, and that line has led to 26917 MacKenzie. In fact, > his haplotype may provide a clue or a means to tease out some of the > mutations that have taken place over time. That's an exercise still to be > done. Now, when you have a set of fast to slow mutating DYS sites, you > should be comparing the DIFFERENCES in marker values along the mutating > lines. I include now a table that shows the percentage of M222 testees that > have mutations at the various points in the haplotype. For example, those > with 454 had a constant value of the modal for 454, and less than 50% of the > testees had the modal for the two CDYs. 
> > DYS %Y > DYS454 100% > DYS426 99% > DYS388 99% > DYS459a 99% > YCAIIa 98% > DYS438 98% > DYS393 98% > DYS455 98% > DYS448 96% > DYS392 95% > DYS385a 95% > DYS459b 93% > DYS19 93% > DYS437 92% > DYS464a 90% > DYS442 90% > Y-GATA-H4 89% > DYS385b 88% > YCAIIb 88% > DYS389i 88% > DYS447 87% > DYS464b 87% > DYS464c 86% > DYS464d 85% > DYS390 85% > DYS607 85% > DYS391 83% > DYS389ii 80% > DYS439 79% > DYS570 77% > DYS458 74% > DYS449 71% > DYS460 70% > DYS456 68% > DYS576 58% > CDYb 46% > CDYa 42% > > Now, with the modal values, and with the table just above, you could > analyze the slow moving markers among the haplotypes and see what happens. > The fast moving markers are useful only for small values of RCC, whereas the > slow moving markers will give insight about what was happening to the marker > strings nearer the time of the progenitor - the higher values of RCC. > > So, my fourth conclusion is that the sequence of junctions on the > phylogenetic tree, calibrated in terms of RCC values, will probably give > valuable information not only on how the DNA clusters (which later evolve > into surname groups) actually evolved over time but give us valuable > fingerprints that differentiate one cluster from another (and at RCC values > less than 20, the TMRCAs of the progenitor who was at the junction point > that leads to different surnames. A clever programmer might help here! The > data are available (!). > > - Bye from Bill Howard -- Notice: This email is not secure, and is not for use by patients or for healthcare purposes in general.

    07/07/2011 11:48:37
    1. Re: [R-M222] Odd Sample
    2. J David Grierson
    3. Interesting, John. More testing is essential from the big-picture POV; he might add quite a lot to the discussion about the age of M222. David On 7/07/2011 11:30 AM, Lochlan@aol.com wrote: > We just had an interesting sample join the M222 project, from DNA Heritage, > surname is Turner from Newcastle which I presume is in England. This is > one of the few samples I've seen as administrator that does not appear to > be M222+ based on STR values yet tested M222+ on the deep clade test. It's > only a 12 marker test (I'll encourage him to get more tested). The values > are: > > 13-24-14-11-11-12-12-12-12-13-13-29 > > He's missing the M222 modal at almost every spot in the first 12 markers > which is highly unusual. > > I doubt much can be said about the surname Turner. MacLysaght calls it > English or Scots, in Scotland, possibly Mac an tuineir. > > > > John
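To see exactly where the Turner sample departs from the M222 modal, the twelve posted values can be lined up against the first twelve modal values that Bill Howard gives elsewhere in this thread. The sketch below assumes the sample is listed in the usual FTDNA 12-marker order (DYS393 first, DYS389ii last); the marker labels are that assumption and are not taken from John's message.

```python
# Standard FTDNA order for the first 12 markers (assumed to apply to the sample).
FIRST_12_MARKERS = ["DYS393", "DYS390", "DYS19", "DYS391", "DYS385a", "DYS385b",
                    "DYS426", "DYS388", "DYS439", "DYS389i", "DYS392", "DYS389ii"]

# First 12 values of the 37-marker M222 modal posted in the thread.
M222_MODAL_12 = [13, 25, 14, 11, 11, 13, 12, 12, 12, 13, 14, 29]

# The Turner sample exactly as posted.
TURNER = [int(v) for v in "13-24-14-11-11-12-12-12-12-13-13-29".split("-")]

# Collect (marker, sample value, modal value) wherever the two disagree.
mismatches = [(name, sample, modal)
              for name, sample, modal in zip(FIRST_12_MARKERS, TURNER, M222_MODAL_12)
              if sample != modal]

for name, sample, modal in mismatches:
    print(f"{name}: sample {sample}, M222 modal {modal}")
print(f"{len(TURNER) - len(mismatches)} of {len(TURNER)} markers match the modal")
```

Under that ordering assumption the differences fall at DYS390, DYS385b and DYS392. DYS390=25 and DYS392=14 are the two values Stephen Forrest mentions elsewhere in the thread as shared between DF23+ M222- men and M222.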

    07/07/2011 11:26:22
    1. Re: [R-M222] How is M222 defined?
    2. Bernard Morgan
    3. There would be valuable information in sub-dividing the related 'North-West Irish' kinship. Presenting such possibles as defining some Dal Cuinn lines as M222+ and others as M222-. > That's exactly what I have advocated for years already - all project members > should be SNP tested! I think the project should be divided into 3 groups: > 1. M222+ > 2. M222- > 3. Unknown - i.e. not tested
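The three-way split quoted above (M222+, M222-, and untested) is easy to apply mechanically once each member's SNP result is recorded. The sketch below is one way to do it; the kit identifiers and the use of `None` for "not tested" are assumptions made for illustration, not real project data.

```python
from typing import Dict, List, Optional


def group_by_snp_status(results: Dict[str, Optional[bool]]) -> Dict[str, List[str]]:
    """Split kits into the three groups proposed in the thread.

    `results` maps a kit identifier to True (tested M222+), False (tested
    M222-) or None (no SNP test on record).
    """
    groups: Dict[str, List[str]] = {"M222+": [], "M222-": [], "Unknown": []}
    for kit, result in results.items():
        if result is None:
            groups["Unknown"].append(kit)
        elif result:
            groups["M222+"].append(kit)
        else:
            groups["M222-"].append(kit)
    return groups


# Hypothetical kits, for illustration only.
sample_results = {"kit-A": True, "kit-B": None, "kit-C": False, "kit-D": None}
groups = group_by_snp_status(sample_results)
print(groups)
```

Following the proposal, only the "M222+" list would then feed into age calculations for M222, with the "Unknown" list kept out until those members are tested.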

    07/07/2011 11:14:31
    1. Re: [R-M222] How is M222 defined?
    2. Paul Conroy
    3. Allene, Yes, if these closely related members share the exact same STR values as the tested member, but if they don't then Deep Clade testing is called for. The problem is that in the area of Ireland, Britain or France or elsewhere, where M222 first arose, there are going to exist M222+ and M222- individuals, who probably have very similar STR values - so a Non Paternal Event (NPE) in this community would still give similar STR values to the biological parent, and could be miscounted as being M222+. What I'm specifically referring to though is the huge Doherty and other projects, where only a handful of participants have been tested for M222+, and the majority are assumed to belong, based solely on similar STR values. Cheers, Paul On Thu, Jul 7, 2011 at 10:56 AM, Allene Goforth <agoforth@moscow.com> wrote: > Paul, > > I would say that only ONE member of a close cluster like my five > MacAdam/McAdam lines needs to take the Deep Clade test. I'm in a bit of > a hurry right now, but I know that FTDNA has stated more than once that > it isn't necessary to spend all that money on separate Deep Clade tests > for a group that's obviously related. > > Allene

    07/07/2011 05:03:17
    1. Re: [R-M222] How is M222 defined?
    2. Paul Conroy
    3. Stephen, That's exactly what I have advocated for years already - all project members should be SNP tested! I think the project should be divided into 3 groups: 1. M222+ 2. M222- 3. Unknown - i.e. not tested For calculations about the age of M222+, only group # 1 should be used. For calculations about the origin of M222-, group # 2 is valuable - and many of them may be DF23. However group # 3, while of interest to some, should NOT be used in M222 calculations! Cheers, Paul On Thu, Jul 7, 2011 at 10:15 AM, Stephen Forrest <stephen.forrest@gmail.com>wrote: > I think some careful use of nomenclature is called for here. There is no > ambiguity about how the M222 SNP is defined... it is a point mutation from > G > to A at a particular position on the Y chromosome (see > http://www.snpedia.com/index.php/Rs20321) and has nothing to do with STRs. > > I realize you know this and what we're really debating is the definition of > the 'North-West Irish' variety: i.e. are men who have the SNP but not the > STR signature, or the STR signature but not the SNP, included? My point is > just that when debating the definition of something there's no point in > calling it by the name of something already well-defined (the M222 SNP). > Call it the "North-West Irish" group if you like and then go on with the > debate. > > Incidentally, there's an ongoing discussion on dna-forums.org somewhat > relevant to this discussion. I mentioned last week that there is a > recently-identified SNP upstream of M222, called DF23. Some STR results > for > DF23+ M222- men have been released and they are DYS392=14 and DYS390=25, > meaning they share a couple markers with their downstream cousins. This > suggests that there are M222- men out there with North-West STR signatures, > and they might try ordering DF23 when it's available from FTDNA. 

    07/07/2011 04:35:08
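A minimal sketch of the three-way split Paul proposes, assuming each project member's M222 result is recorded as True (M222+), False (M222-) or None (not tested). The kit labels and the function name are hypothetical and chosen only for illustration; nothing here comes from an actual project database.

```python
def split_by_m222(results):
    """Group kit IDs by M222 SNP status: positive, negative, or untested."""
    groups = {"M222+": [], "M222-": [], "unknown": []}
    for kit, status in results.items():
        if status is True:
            groups["M222+"].append(kit)
        elif status is False:
            groups["M222-"].append(kit)
        else:
            # None (or anything else) means no SNP result is on record.
            groups["unknown"].append(kit)
    return groups


# Hypothetical project members (kit labels are made up).
project = {"kit-A": True, "kit-B": None, "kit-C": False, "kit-D": True}
groups = split_by_m222(project)

# Only SNP-confirmed members would feed an age estimate for M222+.
age_sample = groups["M222+"]
```

Keeping the untested kits in their own bucket, rather than folding them into either tested group, is the point of the proposal: they stay visible but out of the M222 calculations.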
    1. Re: [R-M222] How is M222 defined?
    2. Stephen Forrest
    3. I think some careful use of nomenclature is called for here. There is no ambiguity about how the M222 SNP is defined... it is a point mutation from G to A at a particular position on the Y chromosome (see http://www.snpedia.com/index.php/Rs20321) and has nothing to do with STRs.

       I realize you know this and what we're really debating is the definition of the 'North-West Irish' variety: i.e. are men who have the SNP but not the STR signature, or the STR signature but not the SNP, included? My point is just that when debating the definition of something there's no point in calling it by the name of something already well-defined (the M222 SNP). Call it the "North-West Irish" group if you like and then go on with the debate.

       Incidentally, there's an ongoing discussion on dna-forums.org somewhat relevant to this discussion. I mentioned last week that there is a recently-identified SNP upstream of M222, called DF23. Some STR results for DF23+ M222- men have been released and they are DYS392=14 and DYS390=25, meaning they share a couple markers with their downstream cousins. This suggests that there are M222- men out there with North-West STR signatures, and they might try ordering DF23 when it's available from FTDNA.

       best,
       Steve

       On 7 July 2011 08:00, Bill Howard <weh8@verizon.net> wrote:
       > There has been considerable discussion both on- and off-line about how the
       > M222 SNP is defined.
       > First, I understand that its early definition depended on the first 12
       > markers.
       > Next, we have the deep clade test of FTDNA with a proprietary approach we
       > know little about.
       > Next, there are discussions of how the markers agree or disagree with the
       > modal values of the deep clade test, but only with respect to the first 12
       > markers of the FTDNA string.
       >
       > And now, here's my "take" on the situation.
       > I received from John McLaughlin a large set of markers that he noted were
       > in the M222 group. Some had been SNP tested and some had not.

    07/07/2011 04:15:42
    1. Re: [R-M222] How is M222 defined?
    2. Allene Goforth
    3. Hi Paul,

       I can see where you are coming from with the Dohertys. I don't place much faith in merely sharing the same surname, and the Mac/McAdam clan as a whole certainly isn't all M222.

       My little band doesn't share the exact same STR values. That's setting the bar very high. Sometimes not even father and son are exact matches. I think Thomas Krahn would tell me not to waste my money on any more Deep Clade tests for that bunch.

       Allene

       On 7/7/2011 8:03 AM, Paul Conroy wrote:
       > Allene,
       >
       > Yes, if these closely related members share the exact same STR values as the
       > tested member, but if they don't then Deep Clade testing is called for.
       >
       > The problem is that in the area of Ireland, Britain, France or elsewhere
       > where M222 first arose, there are going to exist M222+ and M222-
       > individuals who probably have very similar STR values, so a Non Paternal
       > Event (NPE) in this community would still give similar STR values to the
       > biological parent, and could be miscounted as being M222+.
       >
       > What I'm specifically referring to though is the huge Doherty and other
       > projects, where only a handful of participants have been tested for M222+,
       > and the majority are assumed to belong, based solely on similar STR values.
       >
       > Cheers,
       > Paul
       >
       > On Thu, Jul 7, 2011 at 10:56 AM, Allene Goforth <agoforth@moscow.com> wrote:
       >
       >> Paul,
       >>
       >> I would say that only ONE member of a close cluster like my five
       >> MacAdam/McAdam lines needs to take the Deep Clade test. I'm in a bit of
       >> a hurry right now, but I know that FTDNA has stated more than once that
       >> it isn't necessary to spend all that money on separate Deep Clade tests
       >> for a group that's obviously related.
       >> Allene

    07/07/2011 02:27:51
    1. [R-M222] How is M222 defined?
    2. Bill Howard
    3. There has been considerable discussion both on- and off-line about how the M222 SNP is defined.
       First, I understand that its early definition depended on the first 12 markers.
       Next, we have the deep clade test of FTDNA with a proprietary approach we know little about.
       Next, there are discussions of how the markers agree or disagree with the modal values of the deep clade test, but only with respect to the first 12 markers of the FTDNA string.

       And now, here's my "take" on the situation.
       I received from John McLaughlin a large set of markers that he noted were in the M222 group. Some had been SNP tested and some had not.
       I did a study of ALL 37 markers (not just the first 12) and I determined the modal value of each DYS site.
       I then went back and determined for EACH TESTEE the number of his markers that matched the modal of that same marker for the M222 sample John sent me.
       I then made a graph of the percentage of each testee's marker set that matched the overall modal set.
       I found that virtually ALL testees in the set that John sent had 73% or more of their markers in agreement with the set of M222 modals: not just the first 12, but all 37 of them.

       The modal values I found for all 37 markers are the following, in the sequence given by FTDNA postings:

       13 25 14 11 11 13 12 12 12 13
       14 29 17 9 10 11 11 25 15
       18 30 15 16 16 17 11 11 19 23
       17 16 18 17 38 39 12 12

       Of the 683 M222s in the group, all matched at least 73% of that sequence (at least 27 of the 37 markers). The average was 85.2% and both the median and the mode were 83.8% (31 of 37 markers). One testee, 26917 (MacKenzie), matched 100% of the modals.

       I also found that if you plot the number of markers that matched the M222 modal against the frequency of occurrence for all 683 testees, the plot between 26 and 37 markers was bimodal, with two peaks. One peak was at 31 markers and the other peak was at 33 markers.
       A statistician might say that the departures from a Gaussian are not significant and that there are NOT two peaks, but I think it is arguable. When I do the same plot using 320 testees which are among a set with a larger number of SNP-tested testees, the bimodality is more pronounced but still statistically inconclusive. The two peaks are sharper and appear at the same place on the histogram.

       So, what do I conclude with all this?
       First, that we cannot go by just the first 12 markers. We have more at our disposal to study.
       Second, while we refer to the M222 SNP test of FTDNA, we realize that we take their results on faith about their criterion of who should be included in the M222 group.
       Third, my analysis shows that you can safely (?) put a testee into the M222 group IF 73% or more of his 37 markers agree with the modal values of all 37 (not 12) markers. That is a practical working criterion for M222 inclusion in the group. I have given the modals, above, so now anyone can compare a haplotype with it and make your own conclusion. That criterion correlates well with FTDNA's M222 SNP-tested group.

       Now, we must realize that there are extreme variations in the mutation rates of the markers and that's why less than 100% of the testees are in the M222 group. The mutation rates vary by a factor of almost 400 between the fastest and slowest mutating DYS sites. Why does 26917 MacKenzie have a 100% match? Well, statistically, out of 683 testees whose markers are mutating over the time from the M222 progenitor to the present, you would expect one line not to vary at all, and that line has led to 26917 MacKenzie. In fact, his haplotype may provide a clue or a means to tease out some of the mutations that have taken place over time. That's an exercise still to be done. Now, when you have a set of fast to slow mutating DYS sites, you should be comparing the DIFFERENCES in marker values along the mutating lines.
       I include now a table that shows the percentage of M222 testees that have the modal value at each marker in the haplotype. For example, every testee had the modal value at DYS454, and less than 50% of the testees had the modal for the two CDYs.

       Marker       % with modal value
       DYS454       100%
       DYS426       99%
       DYS388       99%
       DYS459a      99%
       YCAIIa       98%
       DYS438       98%
       DYS393       98%
       DYS455       98%
       DYS448       96%
       DYS392       95%
       DYS385a      95%
       DYS459b      93%
       DYS19        93%
       DYS437       92%
       DYS464a      90%
       DYS442       90%
       Y-GATA-H4    89%
       DYS385b      88%
       YCAIIb       88%
       DYS389i      88%
       DYS447       87%
       DYS464b      87%
       DYS464c      86%
       DYS464d      85%
       DYS390       85%
       DYS607       85%
       DYS391       83%
       DYS389ii     80%
       DYS439       79%
       DYS570       77%
       DYS458       74%
       DYS449       71%
       DYS460       70%
       DYS456       68%
       DYS576       58%
       CDYb         46%
       CDYa         42%

       Now, with the modal values, and with the table just above, you could analyze the slow moving markers among the haplotypes and see what happens. The fast moving markers are useful only for small values of RCC, whereas the slow moving markers will give insight about what was happening to the marker strings nearer the time of the progenitor, at the higher values of RCC.

       So, my fourth conclusion is that the sequence of junctions on the phylogenetic tree, calibrated in terms of RCC values, will probably give valuable information not only on how the DNA clusters (which later evolve into surname groups) actually evolved over time but also give us valuable fingerprints that differentiate one cluster from another (and, at RCC values less than 20, the TMRCAs of the progenitor who was at the junction point that leads to different surnames). A clever programmer might help here! The data are available (!).

       - Bye from Bill Howard

    07/07/2011 02:00:21
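The matching procedure Bill Howard describes (take the modal value at each of the 37 markers, count how many of a testee's markers agree with it, apply the 27-of-37 cut-off, and tabulate how often each marker carries the modal value) can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not his actual code: the function names and the toy testee are hypothetical, and only the modal values are taken from the message. The threshold is written as a marker count because 27/37 is 72.97%, which only rounds to 73%.

```python
from collections import Counter

# Modal values for the 37 markers, copied from the message above (FTDNA order).
M222_MODAL = [
    13, 25, 14, 11, 11, 13, 12, 12, 12, 13,
    14, 29, 17, 9, 10, 11, 11, 25, 15,
    18, 30, 15, 16, 16, 17, 11, 11, 19, 23,
    17, 16, 18, 17, 38, 39, 12, 12,
]

MIN_MATCHES = 27  # "at least 27 of the 37 markers" (about 73%)


def modal_haplotype(haplotypes):
    """Return the most common value at each marker position."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*haplotypes)]


def match_count(haplotype, modal=M222_MODAL):
    """Count the markers at which a haplotype equals the modal value."""
    return sum(value == m for value, m in zip(haplotype, modal))


def meets_criterion(haplotype, modal=M222_MODAL, min_matches=MIN_MATCHES):
    """Apply the working criterion: enough markers agree with the modal."""
    return match_count(haplotype, modal) >= min_matches


def modal_share(haplotypes, modal=M222_MODAL):
    """Fraction of haplotypes carrying the modal value at each marker."""
    total = len(haplotypes)
    return [
        sum(h[i] == m for h in haplotypes) / total
        for i, m in enumerate(modal)
    ]


# Hypothetical testee: the modal haplotype with the last ten markers shifted by one.
testee = M222_MODAL[:27] + [v + 1 for v in M222_MODAL[27:]]
print(match_count(testee), meets_criterion(testee))
```

Run over a real set of haplotypes, `modal_share` would give the kind of per-marker percentages listed in the table, and a `Counter` over the `match_count` values would give the histogram in which the two peaks were reported. Note that simple counting treats multi-copy markers such as DYS464a-d position by position, which is an assumption of this sketch.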