RootsWeb.com Mailing Lists
    1. Re: [Y-DNA-projects] Y-DNA-PROJECTS Making matches, crossing the Pond
    2. Diana Gale Matthiesen
    3. I said at the beginning of my last message that these were largely matters of opinion and that it goes without saying that each admin is free to run their project as they choose. We have both given our (differing) opinions, so I'm not certain anything is to be gained by giving them again. This is not a matter of either of us proving the other is right or wrong. I hope we can understand each other better, while still agreeing to disagree.

> From: Debbie Kennett
> Sent: Thursday, January 27, 2011 7:18 PM
>
> Diana
>
> I previously cited on this list the RecLoh in the Aldous DNA project:
>
> http://www.familytreedna.com/public/Aldous/default.aspx?section=yresults
>
> At first glance the two results don't match at all, and presumably you would not count this as a match. Neither man shows up in the other person's list of matches. This is an example where the TiP is particularly useful. The TiP gives the two men a 74.32% chance of sharing a common ancestor within 24 generations. On paper the two men supposedly share a common ancestor in the 1400s. This result is therefore within the bounds of probability, but it is not clear-cut. Interestingly, despite this result being somewhat extreme, it's not too far off James Irvine's 80% TiP probability which he uses as a cut-off in the Irvine project.

Actually, your example proves just the opposite. The reason you are taking a second look at a match with this low a probability (and 74% is *very* low) is that their paper genealogy connects them. If these men had different surnames (did not remotely connect on paper), you would not give their 74% chance of a connection in 24 generations a second look. Personally, I would be withholding judgment, entirely, until both tested 67 markers.

> > I agree that everything gets more difficult as you go further back, but when the paper records run out, my job is finished.
> > It has never been my intention to use DNA testing to do more than support good paper genealogy, debunk bad paper genealogy, and break through brick walls. I have no intention whatsoever of taking my genealogy or my members' genealogies back beyond surname adoption.
>
> I'm not talking about taking genealogies back before the adoption of surnames, I'm talking about deciding whether or not results are related *since* the adoption of surnames. The records often run out or are incomplete for the first five hundred or so years after the adoption of surnames. Are you therefore discounting matches where surnames could potentially be related in the 1100-1600 period where very few people have paper trails?

Yes, I am discounting them. I'm not interested in "potential" relatives that predate paper genealogy. I'm a genealogist, and I'm using DNA testing to support/debunk/advance paper genealogy. There is a point when genealogy can go no further, and I'm not interested in going beyond that point. If my members are interested, they have the right to pursue the question, themselves.

<snip>

> How are you deciding the criteria for a match? You must have a cut-off point somewhere which is effectively a rule of thumb.

Initially, I followed the FTDNA guidelines:

http://www.familytreedna.com/genetic-distance-markers.aspx?testtype=37
http://www.familytreedna.com/genetic-distance-markers.aspx?testtype=67

After years of running six projects, I have enough empirical evidence to have my own guidelines (see last section at bottom of page):

http://dgmweb.net/DNA/y-dna-projects/TMRCA.html

> Mutations occur at random and don't necessarily follow logical patterns. The best we can do is study large bodies of data and see the range of possibilities that might occur and then make decisions based on all the available evidence. Results do not always give a straightforward yes-no answer.
> The RecLoh result cited above in the Aldous project is a case in point.

Your example in the ALDOUS project is extraordinary. I'd want to upgrade to 67 markers and test more than just two family members before I was ready to decide what was going on here. For example, are there any "in-betweeners"? You need to start mapping the appearance of these mutations with a cladogram as I am doing with this CORBIN family:

http://dgmweb.net/DNA/Corbin/NodeChart-JohnCorbin-RichmondCoVA.html

Most cases do give an obvious Yes-No answer, especially for Americans connecting to their immigrants -- and, yes, those are the people I'm largely dealing with. I've never looked at their TiP calculations, before, but I just did, and my "families" (subgroups) are giving probabilities in the neighborhood of 98-100% in 24 generations.

The statistical threshold in science is 95%. You won't get a good journal to even look at your paper, much less put it through the review process, with probabilities less than that. It's a big world, and coincidences do happen.

> > "Genealogical time" is not something you can arbitrarily define in terms of years or generations. It's defined for each family by when the paper records run out, which happens earlier for some families than it does for others.
>
> We obviously share different definitions of genealogical time. I have some lines where the genealogical records can't be traced back before the 1800s. However their DNA matches clearly place them within a specific tree which has been well researched, even if the link in the paper trail record cannot be found. I have no hesitation in adding them to their respective genetic families despite the lack of a paper trail. I regard genealogical time as the time when genealogical records containing surnames start to become available. This varies from one culture to another. For my purposes researching an English surname the records begin in the 1100s.
> The earliest occurrence of the surname Cruwys/Crues dates from 1160.

I didn't say I stopped when the *individual's* paper trail runs out, I said I stop when the *family's* paper trail runs out. If the DNA evidence is strong enough, I will connect people to a family, even if the paper connection remains unknown. The question then becomes the level of acceptance. You are using a statistical probability based on a calculation that treats mutations as happening at a constant rate. I'm using an empirically derived threshold that allows for the wide swing in the random occurrence of mutations. What are those thresholds?

Experience (empirical evidence) has shown that when people are connected within 9 to 12 generations, as are my CORBINs

http://dgmweb.net/DNA/Corbin/CorbinDNA-results-HgI1.html#AS2

my STRAUBs

http://dgmweb.net/DNA/Straub/StraubDNA-results-HgI1-AS5.html#data

and my CARRICOs

http://dgmweb.net/DNA/Carrico/CarricoDNA-results-HgJ2a4b.html#data

their genetic distances range from 0 to 3, in this rank of frequency: 1, 0, 2, 3. I am using this genetic distance in combination with the sharing of signature markers to subgroup members:

http://dgmweb.net/DNA/General/SignatureMarkers.html

> > > Even if a tree cannot be constructed it is usually still possible to get an idea of the distribution of a surname from early tax records.
> >
> > Yes, but at which point you're doing history, not genealogy.
>
> I would regard an investigation into the origins of a surname as a valid genealogical technique. The technique is well described in George Redmonds' book "Surnames and Genealogy: A New Approach".

Yes, investigating the origin of a surname is part of genealogy, and so is family history, but neither has anything to do with the family's genetics. My experience has been that most surnames have multiple origins, so even knowing the origins of all of them doesn't connect someone to them biologically.
There are really two different kinds of studies here: one is the Y-DNA surname project and the other is sometimes called a "one-name study." The latter compiles everything relative to the name. The former is a tool to aid the latter. I'm not doing "one-name studies," I'm running Y-DNA surname projects, which are only concerned with validating/debunking/advancing paper genealogy via genetics. Family history, sociology, anthropology, etc. are all beyond the scope of Y-DNA STR testing.

> As far as English records are concerned there are numerous records available from the 1300s onwards. I am fortunate that with the Cruwys surname the records have been held in one family in the same location for over 900 years and it is possible to construct a genealogical tree with a reasonable degree of confidence back to the 1200s. This won't be possible in most cases, but it's still possible to get a good idea of the frequency and distribution of a surname by looking at early records.

There may be "numerous records" that far back, but I submit most people will find their surname does not get back that far. And, again, "reasonable degree of confidence" is not good enough.

Before I even began my Y-DNA project, I already knew my mother's surname, STRAUB, was German and that most STRAUBs originate in southern Germany or adjacent Switzerland and Alsace-Lorraine. I wanted to know *which* STRAUB family. My STRAUB project has already turned up over 20 different genetic origins of the surname. We have, in fact, "crossed the pond" with my line, to Großgartach, Württemberg, but so far, the paper records appear to run out in the early 1600s, so that's where I stop.

> > Forgive me, but I doubt that very many project admins have plans to publish books on their projects. I certainly don't. Our project web sites constitute "publication," and they have a huge advantage over paper publication in that they can be constantly updated.
>
> It is entirely an individual choice, but papers published in journals or written up in books can be cited by other researchers. Results published on websites can't be cited in the same way except as personal communications and are less reliable because of the lack of third-party review.

There is hardly anything we do as surname project admins that warrants publication in a refereed journal or in hard copy. Because our research is on-going, it makes far more sense to publish on the web where our results can be constantly updated -- and are widely available. The Internet frees us from having to track down hard copy sources, which anyone who has done much hard copy sourcing can only welcome with open arms.

> > I see no fundamental difference in how members should be grouped in small projects or large projects. A match is or is not a match. Large projects are simply a great deal more work -- and I would say, too much for one person to manage well.
>
> The administrator of a large DNA project cannot possibly have time to be intimately involved with the individual genealogies of his or her project members and therefore has to take any submitted pedigrees on trust.

And I submit that is a problem because the project admin should be intimately knowledgeable about the genealogy of their project surname and should be checking the lineages of their members. That is the reason I say projects for common surnames need to be broken up by haplogroup with the larger subgroups needing multiple co-admins, and that is why people shouldn't tackle more than one project, unless the surname is uncommon. IMO, we are not just taking tickets at the door and passively letting the members fend for themselves. We need to be deeply engaged in our projects.

> I agree that in an ideal world large projects should have multiple admins, but willing and qualified volunteers are not always available.
Once someone has taken on a surname, everyone assumes it is "taken." You don't know how many "takers" you would get if you would agree to split up your project by haplogroup. Many people who don't like the idea of being a "co-admin," would jump at the chance to run a subset of the project by themselves.

> Results are not always straightforward and do not always provide a simple yes/no answer. If the results are backed up with reliable genealogical data then there is no problem. The further back in time the more difficult it gets.

The issue here is not what you do when the answer is a clear, yes or no, the issue is what you do with people who do not give a clear, yes or no. My point was that it's better to leave people unassigned than assign them to what turns out to be the wrong group.

> As an example, take a look at my Cruwys group 1 and in particular kit no. 130860 and his relationship to the other men in this group:
>
> http://www.familytreedna.com/public/CruwysDNA/default.aspx?section=yresults
>
> On 37 markers kit no. 130860 doesn't look as though he is related to the rest of the group. At 67 markers it looks more likely.

That is the reason I harp on all R1b's to test 67 markers. In fact, I nag everyone to test 67 markers.

> I think in this case a Cruise line went from Devon to Ireland shortly after the Anglo-Norman invasion of Ireland in 1169. The Devon line can be traced forwards to the present day. The Cruise/Cruys surname is found in very early Irish records but the line of the Irish project member cannot be traced back before the 1800s. The TiP result puts the match within the bounds of probability, and logically this is the only explanation I can find for the closeness of the results, but it is not a cast-iron case or a simple yes-no answer.

I totally agree this is a tough case. There is a possibility they connect within the period of surname adoption, but there is also the possibility the match is coincidental.
#130860 has a GD of 10 from the group's modal haplotype, but if you consider him, the Irish one, and the English ones to be equidistant from a hypothetical common ancestor (an "in-betweener") then the GD becomes a reasonable 5. But I would want to test cousins, especially of #130860, until I had the in-betweeners. For the moment, I would remove #130860 from the group because, IMO, that is what the DNA test results say to do. You can always say in the discussion that they may connect.

> > I see no inherent reason for the match rate to be correlated with the frequency of the surname. The number of descendants per progenitor is probably random. What makes a surname common is almost certainly correlated with the number of times it's been adopted, rather than an increased number of descendants per progenitor. In the case of "displaced populations" (as in the U.S., Australia, etc.), there is likely to be a correlation with how early the progenitor arrived and how many progenitors of that surname arrived.
>
> No one knows the answers to these questions yet.

And my point would be that the answer is not worth knowing.

> Anecdotal evidence suggests that English lines that emigrated to countries such as the USA and Australia multiply at a much greater rate than those lines which stayed behind.

Yes, of course they did. They entered a new land with abundant resources. The population exploded. The good land in Europe was taken. Land passed on to the eldest son because it could no longer be split up between all the sons and still provide a living. Sons of farmers who didn't inherit land had the option of joining the army or the priesthood -- or adopting a trade if their father could afford the apprenticeship. In the new world, people pushed further into the frontier with every generation. Nearly every son married and had a family -- a big family, with the majority of children surviving.
There is no inherent difference in the reproductive rate of Europeans and European-Americans. Any perceived difference is the result of environmental factors, in the same way an introduced species can have its population explode in a new land. If Europe were empty, an introduced human population would explode -- as happened when the European continental glaciers receded.

> > I really can't see many surname project admins publishing their studies. Why? The projects are ongoing, so even if you published, the publication would soon be out of date. The beauty of the web is that your "publication" (your web site) can be current.
>
> It is standard scientific practice to publish results of ongoing research. Our DNA projects are effectively scientific research projects. If our research is to be recognised it needs to be published. I suspect few project admins will ever publish their results but there's no harm in encouraging them to do so.

I'm sorry, but scientific journals do not publish "progress reports." Page costs are too high and the review process too intensive to expend these resources on progress reports. Progress reports are most often published in the annual reports of granting agencies or as the abstracts of scientific meetings. Even if your research is ongoing, you will need to have accomplished (finished) some aspect of the research to get it published in a refereed journal.

As for what we're doing being "science"... Genealogy is fundamentally history. We are using a scientific tool, but as genealogists, we are not "doing" science. A "progress report" from a surname project may get published in a family association newsletter, and a topic applicable to genetic genealogy in general may get published in a journal specific to genetic genealogy, but you are not going to see scientific journals publishing progress reports on our surname projects. The web is a far better medium for surname projects to publish their ever-expanding results.
> > The match rate can vary based on *many* factors, which is the reason "match rate" isn't a statistic worth gathering, IMO. Even if you knew the match rate for every name in every project, what use is knowing it? Just because you can run a statistic on a set of numbers doesn't mean it tells you anything worth knowing.
>
> This is the sort of statistic that is well worth knowing and it is why papers such as James Irvine's are of particular value. If comparative statistics are available new project admins will have a baseline to work from, and will have some idea what to expect as their project grows.

"Match Rate" is not an inherent quality. It is the accumulation of a multitude of independent factors, like the weather. There is no inherent "baseline" that you would expect all projects to approach, any more than there is such a thing as "normal" weather.

One of my projects has a high match rate because a single family dominates the project:

http://www.familytreedna.com/public/carrico/default.aspx?section=yresults

Another project has a high match rate because we've actively sought and funded members to confirm results:

http://www.familytreedna.com/public/corbin/default.aspx?section=yresults

Another project has a relatively low match rate, which would be lower still if one family did not dominate the project:

http://www.familytreedna.com/public/straub/default.aspx?section=yresults

A far more relevant statistic is the number of individuals with no matches. My major prof's doctoral thesis turned up a meaningful sampling statistic: that until your least common taxa (subgroups) are represented by at least three specimens (individuals), the probability is that there are still taxa out there that you have missed. In other words... The number of unmatched individuals is more significant than the percentage of matched ones because the former are the better indicator of sampling depth.

> > I don't see how that mistake could be made.
> > If the paper trail is wrong, the DNA test results will tell you, loud and clear. Probably the greatest strength of DNA testing is its ability to reveal bad paper genealogy. I would certainly never allow paper genealogy to trump DNA evidence when it came to grouping members.
>
> It all depends on how many results you have and how reliable the paper trails are. There are certainly cases of people claiming a higher than normal mutation rate to try and fit DNA results to a dodgy pedigree.

Which is the reason I just said you don't allow paper genealogy to trump DNA test results.

> The men will no doubt all share a common ancestor but it might not be the person that they thought it was.

As I'm certain we all know, it is not possible to pinpoint a specific ancestor based on Y-DNA test results alone -- you could as easily descend from a male relative of the alleged ancestor.

> > I'm afraid that statement is taken out of context. If my meaning wasn't clear, I apologize because, in context, what I said was: Once an American has connected to their immigrant, both on paper and via DNA test results, *then* the major goal is crossing the pond.
>
> That's fair enough, but I'm looking at this from a different perspective. I get someone in the UK who might be interested in DNA testing and when I look at the relevant project website there are links to the xxx surname society of America, discussions about the distribution of the surname in America, lots of baffling abbreviations for American state names and desperate statements about how they hope to make connections across the Pond.

I'm sorry, but none of the six FTDNA projects I admin links to a family association. I am the DNA admin for a family association, but for a surname that is not one of the projects I admin.
As for the "baffling" abbreviations, their meaning can be found on a multitude of web sites via a simple Google search, for example:

http://en.wikipedia.org/wiki/List_of_U.S._state_abbreviations

By the way, I use the Chapman Codes for locations in the British Isles:

http://www.genuki.org.uk/big/Regions/Codes.html

For example:

http://dgmweb.net/FGS/S/ShermanSamuel-PhillipaWard.html

I do give my site visitors credit for being able to find out what the codes mean, even if they don't already know them.

> It's very hard to motivate someone to take a test when the project doesn't even acknowledge the existence of the surname in their own country and they can't see how they can benefit by testing.

I'm sorry, but it's absurd to suggest Americans don't acknowledge the existence of their surnames in Europe. Why else would we be trying to "cross the pond"? And I give European genealogists the same credit I give American genealogists for having the intelligence to see how Y-DNA testing could help them.

> This does not apply to all projects, but it does to quite a few, including some projects for the most common surnames.

I'll be the first to grant that large projects have large problems, but they don't stem from an anti-European bias on the part of the admins.

> One of the very common surnames is split into three different projects and the so-called worldwide project has a list of US states followed at the end by "all points abroad", with no acknowledgement that this surname existed for several hundred years in England before it even arrived in America. It does not exactly encourage anyone from outside the US to take a test with this project.

I agree that listing all 50 states, individually, is absurd, especially when all other nations are lumped as "points abroad." Why not just say, "worldwide"? That covers all bases equally. However, my suggestion would be to complain to the project admin about it, not hold that project against the rest of us.
I do think it's absurd to expect an "acknowledgement" that a surname existed in a particular country before coming to the U.S. Except for Amerinds, *all* surnames in the U.S. existed somewhere else before coming here. This comes under the heading of "it goes without saying..." I am running a Y-DNA surname project, not a "one-name study." Y-DNA testing is a tool, and I'm offering to help genealogists use that tool effectively. It is not my job to supply the history of the surname. I presume genealogists of that surname already know it.

> > I have spent hundreds of dollars subsidizing the tests of Europeans for my projects, and I'm offering hundreds more, so I think it's an unfair criticism of "American" project admins that they don't care or aren't trying to recruit Europeans.
>
> I'm not criticising all projects. Your presentation is good and is appealing to non-US testees but this is sadly not the case for many projects, and is a significant barrier to encouraging more non-Americans to test.

Yes, of course, some projects are run better than others, but a poorly run project is as much of a deterrent to American genealogists as it is to European genealogists.

> > At this early stage in the project, the most burning questions have to do with the relationships between the U.S. immigrants, that is, between the American progenitors and their origins in Europe. DNA testing is ideally suited to answering these questions.
>
> DNA is indeed ideally suited to answering these questions but I would suggest that you do not put this wording on your project websites. This might be a burning question for people with the surname in the US, but it is irrelevant for anyone with the surname in any other country. The project presentation needs to be neutral and not written from the perspective of the people in just one emigrant-receiving country.
The content on my web sites is driven by the questions I'm asked because, speaking selfishly, I get tired of answering the same questions over and over. If there were Europeans asking me questions, I would have more text on my web site relating directly to Europeans. But the fact is nearly all of the text on my web site does apply to everyone in the world, not to Americans only. None of my surname projects is in any way limited geographically. Please read the Goals of my projects:

http://www.familytreedna.com/public/biddle/default.aspx?section=goals
http://www.familytreedna.com/public/carrico/default.aspx?section=goals
http://www.familytreedna.com/public/corbin/default.aspx?section=goals

None of the above even hints that there is an emphasis on Americans in the project. As for the other two:

http://www.familytreedna.com/public/straub/default.aspx?section=goals
http://www.familytreedna.com/public/rasey/default.aspx?section=goals

while they suggest specific American problems, they clearly encompass all bearers of the surname, worldwide. It seems to me you may have a gripe against a particular project, but I feel it's unfair of you to paint all projects with that brush.

Ultimately, it's Europeans who are making the decisions as to whether or not they get tested. FTDNA is on the web, the web is accessible around the world, and I see nothing on the FTDNA web site to suggest that non-Americans are in any way unwelcome. My perceptions are that, by and large, Europeans know where they came from, so don't need to test, or their paper genealogy is of long standing, and they don't want to risk having it undone with a DNA test -- which is a reason some Americans don't get tested, either.

The major phenomenon going on here is the number of Americans newly adopting the hobby, thanks largely to the Internet. As I see it, they especially are the ones embracing Y-DNA testing, and as all of us are "transplants," sooner or later we all gravitate to the question of crossing the pond.
The population of the British Isles is currently estimated at about 65 million, the United States is over 308 million, and Australia is about 22 million. Based on just population, alone, you would expect to find five times as many Americans and Australians being tested as Europeans. And as Europeans don't need to cross the pond, they have much less incentive to test than we do.

So far, every "foreign" individual tested for my projects has had their testing paid for, and in very few cases did they express any interest in the outcome. They were simply doing the project a favor, one I gratefully acknowledge, but the fact is most had no interest in their genealogy, much less in being tested. I don't consider that fact my fault.

Diana
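
The "five times" figure in the closing population paragraph can be checked with a few lines of arithmetic. This is only a rough sketch: it uses the three estimates quoted in the message (in millions), the variable names are illustrative, and, like the message, it treats the British Isles figure as the stand-in for "Europeans".

```python
# Population estimates as quoted in the message, in millions.
british_isles = 65
united_states = 308
australia = 22

# Combined US + Australian population relative to the British Isles.
ratio = (united_states + australia) / british_isles

print(f"US + Australia vs. British Isles: {ratio:.2f} to 1")
```

The ratio comes out at roughly five to one, in line with the figure in the message. It reflects population size only and says nothing about how many people in each country are interested in testing.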

    02/03/2011 01:47:33
    1. Re: [Y-DNA-projects] Y-DNA-PROJECTS Making matches, crossing the Pond
    2. Debbie Kennett
    3. Diana

I agree that we are all free to run our projects in the way that we see best. I think it is always helpful to get perspectives from as many project administrators as possible. I've replied to some of your specific comments below.

> Actually, your example proves just the opposite. The reason you are taking a second look at a match with this low a probability (and 74% is *very* low) is that their paper genealogy connects them. If these men had different surnames (did not remotely connect on paper), you would not give their 74% chance of a connection in 24 generations a second look. Personally, I would be withholding judgment, entirely, until both tested 67 markers.

Regarding this example from the Aldous project, a 74% probability is not very low. Look at it another way - there is only a 26% chance that the two men are not related within this number of generations. Ideally we would want more samples from the same tree and both men should be tested at 67 markers to provide higher resolution. I'm only indirectly involved in the project. The surname makes no difference to the result. The surname is only important because the genealogy research is much easier if you are focusing on just one surname. It would be impossible to make the genealogical connection this far back if the two surnames were different.

> Yes, I am discounting them. I'm not interested in "potential" relatives that predate paper genealogy. I'm a genealogist, and I'm using DNA testing to support/debunk/advance paper genealogy. There is a point when genealogy can go no further, and I'm not interested in going beyond that point. If my members are interested, they have the right to pursue the question, themselves.

But the paper genealogy can in some cases go back to well before 1600. All surname projects will eventually have matches between two different lines, often in completely different countries, where there is no hope of finding the link in the paper records.
Are you saying that you would exclude such results because the paper link cannot be found?

> Initially, I followed the FTDNA guidelines:
> http://www.familytreedna.com/genetic-distance-markers.aspx?testtype=37
> http://www.familytreedna.com/genetic-distance-markers.aspx?testtype=67
> After years of running six projects, I have enough empirical evidence to have my own guidelines (see last section at bottom of page):
> http://dgmweb.net/DNA/y-dna-projects/TMRCA.html

It would appear that what you call empirical evidence is in fact the evidence from the very limited dataset within your own projects, which is not a statistically valid sample size. Mutation rates are random events. The only way that they make sense is to average them out across large samples. This has already been done in a number of father and son studies, although the sample sizes are still too small and further research is needed. These are the best figures we have available to date. FTDNA have never published the underlying mutation rates used for their TiP tool but they do have the benefit of a very large database, and the TiP will provide a much clearer picture of the expected range than the collective results from half a dozen projects.

Interestingly FTDNA do not reveal the studies on which their rule of thumb guidelines are based or the underlying mutation rates used to establish these ballpark figures. They are a very rough and ready guide but need to be used with caution. If I apply these guidelines to my two Aldous men I am told that "the odds greatly favor that you have not shared a common male ancestor with this person within thousands of years". In this scenario the TiP provides a much more reasonable assessment.

I'm not quite sure that you understand probabilities. Probabilities give us a range in which an event might be expected to fall not an average or an absolute. In the example that you cite the FTDNA-TiP does not tell you that your "cousin is more closely related to A than he is to B".
It is simply telling you that these are the ranges and that both results fall within the possible range. The whole point is that mutations occur at random and you cannot predict when they will occur. >Your example in the ALDOUS project is extraordinary. I'd want to >upgrade to 67 markers and test more than just two family members >before I was ready to decide what was going on here. For example, are >there any "in-betweeners"? You need to start mapping the appearance >of these mutations with a cladogram as I am doing with this CORBIN >family: >http://dgmweb.net/DNA/Corbin/NodeChart-JohnCorbin-RichmondCoVA.html Ideally we do need more samples in the Aldous project. The problem too is that English lines tend to be nowhere near as large as American lines and there is consequently a much smaller pool of people available for testing. >Most cases do give an obvious Yes-No answer, especially for Americans >connecting to their immigrants -- and, yes, those are the people I'm >largely dealing with. I've never looked at their TiP calculations, >before, but I just did, and my "families" (subgroups) are giving >probabilities in the neighborhood of 98-100% in 24 generations. We are dealing with different datasets. I have some Americans trying to connect with their emigrant ancestors, but I am mostly looking at how the documented English lines are related, and establishing how many different genetic lineages there are. >The statistical threshold in science is 95%. You won't get a good >journal to even look at your paper, much less put it through the >review process, with probabilities less than that. It's a big world, >and coincidences do happen. What do you mean by statistical threshold? Are you referring to confidence intervals? http://www.stat.yale.edu/Courses/1997-98/101/confint.htm These give an expected range within which something might occur based on the available data. 95% is usually the upper limit quoted. 
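To illustrate the point about sample size and confidence intervals, here is a minimal Python sketch. It assumes a mutation on one marker in one father-son transmission can be treated as a simple yes/no event, and it uses the Wilson score interval, which is one common way of putting a 95% range around an observed proportion. The counts are hypothetical and are not taken from any published study or from FTDNA; the function name is also just for illustration.

```python
import math


def wilson_interval(events, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = events / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)
    ) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts only: the same observed rate (0.4% per transmission)
# seen in a small dataset and in a dataset one hundred times larger.
small = wilson_interval(2, 500)         # e.g. a handful of projects
large = wilson_interval(200, 50_000)    # e.g. a very large database

print("small sample 95%% range: %.4f to %.4f" % small)
print("large sample 95%% range: %.4f to %.4f" % large)
```

Both ranges contain the observed 0.4% rate, but the range from the small sample is many times wider. That is the sense in which figures drawn from a large dataset should predict better than those drawn from half a dozen projects.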
>I didn't say I stopped when the *individual's* paper trail runs out, I >said I stop when the *family's* paper trail runs out. If the DNA >evidence is strong enough, I will connect people to a family, even if >the paper connection remains unknown. The question then becomes the >level of acceptance. You are using a statistical probability based on >a calculation that treats mutations as happening at a constant rate. >I'm using an empirically derived threshold that allows for the wide >swing in the random occurrence of mutations. What are those >thresholds? Probabilities do not specify that mutations occur at a constant rate. They try and make some sense of the random nature of mutations and give us an expected range within which the mutation might occur. Probabilities derived from a large dataset will give better predictions than those derived from a very limited dataset. >Experience (empirical evidence) has shows that when people are >connected within 9 to 12 generations, as are my CORBINs >http://dgmweb.net/DNA/Corbin/CorbinDNA-results-HgI1.html#AS2 >my STRAUBs >http://dgmweb.net/DNA/Straub/StraubDNA-results-HgI1-AS5.html#data >and my CARRICOs >http://dgmweb.net/DNA/Carrico/CarricoDNA-results-HgJ2a4b.html#data >their genetic distances range from 0 to 3, in this rank of frequency: >1, 0, 2, 3. I am using this genetic distance in combination with the >sharing of signature markers to subgroup members: >http://dgmweb.net/DNA/General/SignatureMarkers.html As before, these are the limited results from just a few projects. A wider range of variations will be seen when results are reported from larger numbers of projects. >Yes, investigating the origin of a surname is part of genealogy, and >so is family history, but neither has anything to do with the family's >genetics. My experience has been that most surnames have multiple >origins, so even knowing the origins of all of them doesn't connect >someone to them biologically. 
>There are really two different kinds of studies here: one is the >Y-DNA surname project and the other is sometimes called a "one-name >study." The latter compiles everything relative to the name. The >former is a tool to aid the latter. I'm not doing "one-name studies," >I'm running Y-DNA surname projects, which are only concerned with >validating/debunking/advancing paper genealogy via genetics. Family >history, sociology, anthropology, etc. are all beyond the scope of >Y-DNA STR testing. I'm not quite sure that I follow your reasoning here. A Y-DNA surname project is the study of a surname. I think what you are saying is that you are not using your DNA projects to study surnames and you are effectively doing lineage studies rather than surname studies. Investigating the origin of a surname is a legitimate part of a Y-DNA surname project. DNA results reveal which variants of a surname are related and which are not. >There may be "numerous records" that far back, but I submit most >people will find their surname does not get back that far. And, >again, "reasonable degree of confidence" is not good enough. Before I >even began my Y-DNA project, I already knew my mother's surname, >STRAUB, was German and that most STRAUBs originate in southern Germany >or adjacent Switzerland and Alsace-Lorraine. I wanted to know *which* >STRAUB family. My STRAUB project has already turned up over 20 >different genetic origins of the surname. We have, in fact, "crossed >the pond" with my line, to Großgartach, Württemberg, but so far, the >paper records appear to run out in the early 1600s, so that's where I >stop. Only a minority of people will be able to trace their lines back much before 1600, but there are numerous early English records which testify to the existence of a surname prior to 1600 in different parts of the country. These records can be used to establish the distribution of a surname and to estimate the likely number of lineages. 
>There is hardly anything we do as surname project admins that warrants >publication in a refereed journal or in hard copy. Because our >research is on-going, it makes far more sense to publish on the web >where our results can be constantly updated -- and are widely >available. The Internet frees us from having to track down hard copy >sources, which anyone who has done much hard copy sourcing can only >welcome with open arms. Research can only ever be formally recognised if it is published in a third-party publication. Realistically very few project admins will do this, but there is no harm in trying to encourage people to do so. I know of one admin who is currently writing a book on his DNA project and another one who will eventually write a paper on his results. >And I submit that is a problem because the project admin should be >intimately knowledgeable about the genealogy of their project surname >and should be checking the lineages of their members. That is the >reason I say projects for common surnames need to be broken up by >haplogroup with the larger subgroups needing multiple co-admins, and >that is why people shouldn't tackle more than one project, unless the >surname is uncommon. IMO, we are not just taking tickets at the door >and passively letting the members fend for themselves. We need to be >deeply engaged in our projects. In an ideal world this would happen but the admin of a large project cannot possibly do such work on his or her own account, and unless volunteers are available he or she has to work with the available time and resources. >The issue here is not what you do when the answer is a clear, yes or >no, the issue is what you do with people who do not give a clear, yes >or no. My point was that it's better to leave people unassigned than >assign them to what turns out to be the wrong group. This is entirely a matter of choice. In some cases it makes more sense to group the results together while investigating a theory. 
>I totally agree this is a tough case. There is a possibility they >connect within the period of surname adoption, but there is also the >possibility the match is coincidental. #130860 has a GD of 10 from >the group's modal haplotype, but if you consider him, the Irish one, >and the English ones to be equidistant from a hypothetical common >ancestor (an "in-betweener") then the GD becomes a reasonable 5. But >I would want to test cousins, especially of #130860, until I had the >in-betweeners. For the moment, I would remove #130860 from the group >because, IMO, that is what the DNA test results say to do. You can >always say in the discussion that they may connect. In this case I've found it easier to include the Irish kit 130860 in the group. We are trying to find further Irish Cruises to test but they seem to be few and far between and the only man identified so far has failed to respond to the offer of a free test. This is an example which demonstrates the superiority of the TiP tool over the rule of thumb methodology. If I compare kit nos. 107091 and 130860, which have eight mismatches, the TiP gives me an 84.06% probability that the two men are related within 24 generations (840 years at 35 years per generation or 600 years at 25 years per generation). FTDNA's rules of thumb discount all probability of a match being valid. >Their is no inherent difference in the >reproductive rate of Europeans and European-Americans. Any perceived >difference is the result of environmental factors, in the same way an >introduced species can have its population explode in a new land The reproductive rate could potentially be the same in both countries, but I suspect it is the mortality rate which differs considerably, with mortality rates being much higher in densely populated European cities. >I'm sorry, but scientific journals do not publish "progress reports." >Page costs are too high and the review process to intensive to expend >these resources on progress reports. 
Progress reports are most often >published in the annual reports of granting agencies or as the >abstracts of scientific meetings. Even if your research is ongoing, >you will need to have accomplished (finished) some aspect of the >research to get it published in a refereed journal. Scientists publish case reports or the results from small studies as they go along. >As for what we're doing being "science"... Genealogy is fundamentally >history. We are using a scientific tool, but as genealogists, we are >not "doing" science. A "progress report" from a surname project may >get published in a family association newsletter, and a topic >applicable to genetic genealogy in general may get published in a >journal specific to genetic genealogy, but you are not going to see >scientific journals publishing progress reports on our surname >projects. The web is a far better medium for surname projects to >publish their ever-expanding results. We are doing original research, and it is only ever going to be recognised as such if it is published. >"Match Rate" is not an inherent quality. It is the accumulation of a >multitude of independent factors, like the weather. There is no >inherent "baseline" that you would expect all projects to approach, >any more than there is such a thing as "normal" weather. It is still interesting to get a feel for the match rates in other projects to know what you might expect, just as it's useful to check the weather forecast before venturing outdoors to decide whether or not to take an umbrella. To get any benefit from match rates you need to define some method for establishing the depth of testing (penetration as James Irvine calls it) for a given surname. >A far more relevant statistic is the number of individuals with no >matches. 
My major prof's doctoral thesis turned up a meaningful >sampling statistic: that until your least common taxa (subgroups) are >represented by at least three specimens (individuals), the probability >is that there are still taxa out there that you have missed. In other >words... This is also just as relevant for the match rate. This is where the one-name study approach has a distinct advantage because by studying all references to a surname and reconstructing all the trees you can eventually establish how many different lineages there are and how many living bearers of the surname. >As for the "baffling" abbreviations, their meaning can be found on a >multitude of web sites via a simply Google search, for example: >http://en.wikipedia.org/wiki/List_of_U.S._state_abbreviations >By the way, I use the Chapman Codes for locations in the British >Isles: >http://www.genuki.org.uk/big/Regions/Codes.html >For example: >http://dgmweb.net/FGS/S/ShermanSamuel-PhillipaWard.html >I do give my site visitors credit for being able to find out what the >codes mean, even if they don't already know them. I believe in keeping everything as simple and clear as possible, and making the results as accessible as possible to anyone finding my project website from anywhere in the world. Some readers will be able to decipher American state names and Chapman county codes but many will not. I see no point in putting barriers in the way of people's comprehension. Very few British people would be able to decipher the Chapman county codes without reference to a guide, and they would be unfamiliar to anyone who is not a genealogist. I suspect American state abbreviations are probably familiar to most Americans, but only a few obvious ones such as NY will be familiar to the average European. >I'm sorry, but it's absurd to suggest Americans don't acknowledge the >existence of their surnames in Europe. Why else would we be trying to >"cross the pond"? 
And I give European genealogists the same credit I >give American genealogists for having the intelligence to see how >Y-DNA testing could help them. I'm only describing what I see on some project websites, some of which only mention the surname in America. European genealogists often do understand how Y-DNA testing can help them, but an all-American DNA project is not going to help their research. >I'll be the first to grant that large projects have large problems, >but they don't stem from an anti-European bias on the part of the >admins. There are two problems. The first is an anti-European bias in the presentation of the project. This is usually not intentional. The second problem is the disproportionately large numbers of Americans in large projects which consequently provide little incentive for Europeans to test because they are unlikely to have a match and advance their research. In these cases it does help, as you are doing, to fund DNA tests to help to redress the balance. >I agree that listing all 50 states, individually, is absurd, >especially when all other nations are lumped as "points abroad." Why >not just say, "worldwide"? That covers all bases equally. However, >my suggestion would be to complain to the project admin about it, not >hold that project against the rest of us. >I do think it's absurd to expect an "acknowledgement" that a surname >existed in a particular country before coming to the U.S. Except for >Amerinds, *all* surnames in the U.S. existed somewhere else before >coming here. This comes under the heading of "it goes without >saying..." I agree that projects should state that they are studying the name worldwide but many don't. They state that they are studying the x families of Virginia or the lineage of x who immigrated in 1600. This is all very well if they are only interested in studying the surname in America, but this sort of presentation does not help their case if they are hoping to recruit in other countries. 
>None of my surname projects is in any way limited geographically. >Please read the Goals of my projects: >As for the other two: >http://www.familytreedna.com/public/straub/default.aspx?section=goals >http://www.familytreedna.com/public/rasey/default.aspx?section=goals >while they suggest specific American problems, they clearly encompass >all bearers of the surname, worldwide. In general the presentation of your projects is good and is aimed at a worldwide audience. I would be inclined to remove the US-specific goals from your website, as these goals are restricted to people in just one country and will not be of interest to potential testees in Europe. If you're going to include them make them a little less prominent. >It seems to me you may have a gripe against a particular project, but >I feel it's unfair of you to paint all projects with that brush. >Ultimately, it's Europeans who are making the decisions as to whether >or not they get tested. FTDNA is on the web, the web is accessible >around the world, and I see nothing on the FTDNA web site to suggest >that non-Americans are in any way unwelcome. I'm not painting all projects with the same brush, but I do have contact with lots of Brits who might be interested in taking a DNA test and I find it frustrating that there are so many projects which have so little to offer them. I have a number of people who tested first in my Devon project and only joined the appropriate surname project afterwards. I have some people who won't even join their "designated" surname project, largely because the admin has chosen an inappropriate selection of variants and they believe their surname doesn't belong. My feeling is that Europeans have only really started to embrace the idea of DNA testing in the last couple of years or so. The Genographic Project has helped to bring in many Europeans to the FTDNA database. 
Over 50% of the members of my Devon DNA project reside in the British Isles, and only a small proportion of my project members actually live in America. If people have the money available they will happily take a DNA test so long as they think they will get some return for their money. There will always be people who don't want to test and I would guess that the percentages will be similar for each country. Many people in the British Isles are also interested in finding out more about their origins but in this case the pond they have to cross is the English channel. Debbie Kennett

    02/04/2011 09:38:55