RootsWeb.com Mailing Lists
    1. Re: [DNA] My Raw Data Files - Comparison 23andme vs AncestryDNA
    2. Wjhonson via
    3. Any chance to get the code that you used to do this?

-----Original Message-----
From: David Schroeder via <genealogy-dna@rootsweb.com>
To: genealogy-dna <genealogy-dna@rootsweb.com>
Sent: Mon, Dec 14, 2015 8:50 pm
Subject: Re: [DNA] My Raw Data Files - Comparison 23andme vs AncestryDNA

I was able to 'fix' the no-calls for matching RSIDs on both Ancestry and 23andme when one, or the other, was not a no-call. I fixed 6,632 on 23andme and 6,708 on Ancestry.

Interestingly enough, there were 3,833 that were left as no-calls on both 23andme and AncestryDNA for the same RSIDs. I am wondering if these are the result of particularly difficult locations to test, or perhaps the SNP is rare in my genome? The tests were over two years apart.

I uploaded both fixed raw data files to gedmatch to see how it may affect my 'one-to-many' matches. (Will have to wait on the processing). I ran the Gedmatch File Diagnostic Utility, and the fixed files had significantly reduced my error rates. It seems that most of my errors are in the X, Y or MT Chromosomes.

David

------------------------------

Message: 4
Date: Sun, 13 Dec 2015 03:31:45 -0800
From: Ann Turner <dnacousins@gmail.com>
Subject: Re: [DNA] My Raw Data Files - Comparison 23andme vs AncestryDNA
To: DNA Genealogy Mailing List <genealogy-dna@rootsweb.com>
Message-ID: <CAA-Ub_COJUEcMV4v3aXj4hbEaj6cbFf01AT9yDSBMJwDoyTnsA@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

I've always mentally thought about the "i" SNPs as "internal" catalog numbers, but I'm not positive if I made that up or actually noticed someone from 23andMe used that word :)

As you probably noticed, AncestryDNA doesn't always present alleles in alphabetical order. You will find instances of TC and CT, for example. Illumina's base-calling software has something called a "top strand" and a "bottom strand" (not the same thing as forward/reverse or plus/minus). 23andMe does some post-processing to put alleles in alphabetical order. Anyway, did you also look for TA?

SNPs where the alternative alleles are also complementary base pairs in the double helix ( A <-> T and C <-> G) are tricky to handle. 23andMe may have developed custom probes to identify some of those.

I've also noticed that AncestryDNA and FTDNA do not report any indels (the I and D alleles you asked about).

Tim, this may not be worth the effort to analyze, but I'm curious to know if the "i" variants with rs numbers at FTDNA may be cases where 23andMe put some additional probes on the chip for a particular locus. If you have a list handy, I could explore that a bit.

Ann Turner

On Sat, Dec 12, 2015 at 11:41 PM, Tim Janzen via <genealogy-dna@rootsweb.com> wrote:
> Dear David,
> DD means a deletion and II means an insertion. The "i" SNPs in the 23andMe files are those that don't have rs numbers assigned to them by 23andMe. It is possible that "i" stands for Illumina, but I am not certain about that. It is also possible that it stands for "inserted", possibly because 23andMe inserted these SNPs onto the SNP chip because they were of special interest to 23andMe. Someone at 23andMe would know the answer to this question.
> It is interesting that AncestryDNA files don't have SNPs with the allele values AT. I don't have a definite answer for that. I checked my mom's file for the SNPs that have the allele values AT in 23andMe and found a total of 322 of these SNPs. I then checked for these SNPs in my mom's AncestryDNA file and I couldn't find any of those SNPs in my mom's AncestryDNA file. My suspicion is that Ancestry.com has dropped all SNPs from their dataset with the values AT because they think that the results may be erroneous.
> Sincerely,
> Tim Janzen
>
> -----Original Message-----
> From: genealogy-dna-bounces@rootsweb.com [mailto:genealogy-dna-bounces@rootsweb.com] On Behalf Of David Schroeder via
> Sent: Saturday, December 12, 2015 9:33 PM
> To: genealogy-dna@rootsweb.com
> Subject: [DNA] My Raw Data Files - Comparison 23andme vs AncestryDNA
>
> I have tested at both 23andme (V3) and AncestryDNA. I have written a program to add the raw data file information into a MySQL database, creating separate tables for my 23andme results and my AncestryDNA.
>
> I am trying to understand some things.
>
> I can understand all the A, C, G, T lettering. The single letters represent SNPs on my Y and X chromosomes. I also understand that '--' is a no call. What are 'DD' and 'II'?
>
> I also found that AncestryDNA had no 'AT' SNPs for me, but 23andme had 611:
>
> Can anyone explain why I have no 'AT' SNP pairs in my AncestryDNA raw data file? I verified this by browsing my Ancestry Raw data file. I had every other SNP pair represented.
>
> The final question is about RSIDs. What are the ones that begin with 'i' in my 23andme raw data file? I have 10,709 RSIDs that begin with 'i-----'.
>
> David

-------------------------------
To unsubscribe from the list, please send an email to GENEALOGY-DNA-request@rootsweb.com with the word 'unsubscribe' without the quotes in the subject and the body of the message
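
Ann Turner's two points in the quoted message, allele order (TC versus CT) and allele pairs that are also complementary bases (A/T and C/G), can be illustrated with a minimal Python sketch. The helper names `normalize` and `is_complementary_pair` are hypothetical and are not taken from either company's software; the sketch assumes only two-letter genotype strings like those discussed in the thread.

```python
# Hypothetical helpers for illustration; not any vendor's actual code.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def normalize(genotype):
    """Sort the allele letters so that 'TC' and 'CT' compare equal."""
    return "".join(sorted(genotype))


def is_complementary_pair(genotype):
    """True when the two alleles are each other's complement (A/T or C/G)."""
    if len(genotype) != 2:
        return False
    first, second = genotype
    return COMPLEMENT.get(first) == second


print(normalize("TC"))              # CT
print(is_complementary_pair("TA"))  # True
print(is_complementary_pair("CT"))  # False
```

Normalizing before comparison means a search for 'AT' would also find calls written as 'TA', which is the check Ann suggests.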

    12/15/2015 04:43:36
    1. Re: [DNA] Basic ICW Questions:
    2. Jim Bartlett via
    3. Jim,

The ICW algorithm does not include atDNA. It does not include genealogy. It only matches names between your Match list and someone else's Match list.

Jim - www.segmentology.org

> On Dec 15, 2015, at 9:15 AM, Jim Leahy via <genealogy-dna@rootsweb.com> wrote:
>
> How does FTDNA determine the ICW status?
>
> In order for FTDNA to calculate the ICW status of an individual there must be an algorithm to evaluate the at-DNA results for that individual against some criteria based on the at-DNA results from the base pair. Has anybody been successful in back-engineering or divining the logic used for this comparison?
>
> A related question; what does the ICW "X status" actually tell us?
>
> I have seen a statement implying that it means that these individuals are "blood relatives". That would be great but seems a little too far reaching.
>
> Puzzled!
>
> Jim
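
Read literally, Jim Bartlett's description amounts to intersecting two match lists. A minimal Python sketch under that assumption follows; the function name and the sample names are hypothetical, and this is not FTDNA's actual code.

```python
def in_common_with(my_matches, their_matches):
    """Return the names that appear on both match lists."""
    return set(my_matches) & set(their_matches)


# Made-up match lists for illustration.
mine = {"Alice", "Bob", "Carol"}
theirs = {"Bob", "Carol", "Dave"}

print(sorted(in_common_with(mine, theirs)))  # ['Bob', 'Carol']
```

Nothing in this comparison looks at shared segments, which is why an ICW result on its own says nothing about where, or whether, the three people share the same DNA.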

    12/15/2015 04:00:49
    1. Re: [DNA] My Raw Data Files - Comparison 23andme vs AncestryDNA
    2. David Schroeder via
    3. Thanks Andreas,

I installed Perl and MySQL on my Windows machine. I wrote a couple of Perl scripts to extract raw data and add it to MySQL tables, one for 23andme, the other for AncestryDNA. I ran an 'update' SQL on each one. Next I have a Perl script to extract the modified data from the database and put them back in the original raw data format. I zipped the modified raw data files and uploaded to gedmatch. I am not sure what the total impact of all this will be. I am hoping for more accurate one-to-many.

Table names: anc for AncestryDNA data; 23andme for 23andme data.

Syntax of update SQL:

Updates 23andme changing '--' in PAIR to the value in AncestryDNA (fixed 6632):

    UPDATE 23andme INNER JOIN anc ON 23andme.RS = anc.RS
    SET 23andme.PAIR = anc.PAIR
    WHERE (23andme.RS = anc.RS) AND 23andme.PAIR = '--';

Updates AncestryDNA changing '--' in PAIR to the value in 23andme (fixed 6708):

    UPDATE anc INNER JOIN 23andme ON anc.RS = 23andme.RS
    SET anc.PAIR = 23andme.PAIR
    WHERE (anc.RS = 23andme.RS) AND anc.PAIR = '--';

It is pretty trivial once the hard work of making sure it all works is done. I would be happy to share with anyone who wants this.

David

-----Original Message-----
From: Andreas West [mailto:ahnen@awest.de]
Sent: Tuesday, December 15, 2015 2:07 AM
To: David Schroeder; genealogy-dna@rootsweb.com
Subject: Re: [DNA] My Raw Data Files - Comparison 23andme vs AncestryDNA

That's very interesting and I thought about doing something like that myself (especially with now having tested at all 3 companies over almost 3 years). How did you "clean up" your no-calls, did you manually go through it? Or is that part of the program you wrote?

Great post, David!

Andreas

> On 15 Dec 2015, at 11:47, David Schroeder via <genealogy-dna@rootsweb.com> wrote:
>
> I was able to 'fix' the no-calls for matching RSIDs on both Ancestry and 23andme when one, or the other, was not a no-call. I fixed 6,632 on 23andme and 6,708 on Ancestry.
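
David's two UPDATE statements can be tried out on toy data with Python's built-in `sqlite3` module. This is a sketch under stated assumptions, not his Perl/MySQL code: SQLite does not accept MySQL's `UPDATE ... INNER JOIN` form, so a correlated subquery is swapped in; the table name `me23` stands in for his `23andme` table; and the three sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE me23 (RS TEXT PRIMARY KEY, PAIR TEXT);  -- stand-in for '23andme'
    CREATE TABLE anc  (RS TEXT PRIMARY KEY, PAIR TEXT);
    INSERT INTO me23 VALUES ('rs1', '--'), ('rs2', 'AG'), ('rs3', '--');
    INSERT INTO anc  VALUES ('rs1', 'CC'), ('rs2', '--'), ('rs3', '--');
""")


def fill_no_calls(conn, target, source):
    """Copy PAIR from source into target where target is '--' and source is not.

    Table names are interpolated directly, so pass only trusted, hard-coded names.
    Returns the number of rows changed.
    """
    cur = conn.execute(f"""
        UPDATE {target}
           SET PAIR = (SELECT s.PAIR FROM {source} AS s WHERE s.RS = {target}.RS)
         WHERE PAIR = '--'
           AND EXISTS (SELECT 1 FROM {source} AS s
                        WHERE s.RS = {target}.RS AND s.PAIR <> '--')
    """)
    return cur.rowcount


fixed_me23 = fill_no_calls(conn, "me23", "anc")
fixed_anc = fill_no_calls(conn, "anc", "me23")
still_missing = conn.execute("""
    SELECT COUNT(*) FROM me23 JOIN anc USING (RS)
     WHERE me23.PAIR = '--' AND anc.PAIR = '--'
""").fetchone()[0]

print(fixed_me23, fixed_anc, still_missing)  # 1 1 1
```

The three numbers correspond to the three figures David reports: no-calls fixed in one file, no-calls fixed in the other, and no-calls left in both.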

    12/15/2015 03:23:35
    1. Re: [DNA] Kudos to Gedmatch (was: Are FTDNA's 1cM matches shown in ICW as well?)
    2. Andreas West via
    3. Thanks Ann for clarifying this.

Then FTDNA should seriously consider changing their privacy policy. I mean a VC-backed unicorn like 23andme has a lot more to lose than FTDNA. For sure that was also checked by many lawyers. So 23andme paid the best lawyers already; just use the same statement and you're fine.

It would make FTDNA so much more useful.

Andreas

> On 14 Dec 2015, at 22:15, Ann Turner via <genealogy-dna@rootsweb.com> wrote:
>
> FTDNA has cited privacy concerns over the issue of comparing kits to each other. Your consent agreement covers only people who match you. However, if you use a two-step process, first identifying segments where you match multiple people, and second loading those people into the ICW matrix, I think that would be a pretty reliable substitute. Comments, Jim? One caveat is that everyone must reach that 20 cM threshold. I've definitely seen cases where someone fails that test but actually is ICW with other people in the matrix.
>
> Note that 23andMe's consent for sharing explicitly mentions that the person you're sharing with will be able to see how you compare with other people he's sharing with. If people opt in to Open Sharing, that will expand the pool of possible comparisons.
>
> Ann Turner
>
> On Mon, Dec 14, 2015 at 5:28 AM, Jim Bartlett via <genealogy-dna@rootsweb.com> wrote:
>
>> 23andMe (and GEDmatch) let us compare two Matches to complete the Triangulation process.
>>
>> FTDNA could offer a feature to just confirm which ICW Matches (which they already list) are on the same segment. This would require some additional computing, but not so much as you calculate.
>>
>> Jim - www.segmentology.org
>>
>>> On Dec 14, 2015, at 12:59 AM, Andreas West via <genealogy-dna@rootsweb.com> wrote:
>>>
>>> I think we should all be glad for the work that the guys at Gedmatch have done to allow us to run triangulations against such a large number of matches.
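
Jim Bartlett's suggestion of confirming which ICW matches are on the same segment can be sketched as an interval-overlap check. The `Segment` type, its fields and the sample coordinates are assumptions for illustration. Full triangulation would also require the two matches to match each other on the overlapping region, which this sketch does not test.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: str
    start: int  # start position of the shared segment
    end: int    # end position of the shared segment


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region shared by two segments, or None if they do not overlap."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


# Made-up segments that one tester shares with two different matches.
with_match_1 = Segment("7", 10_000_000, 50_000_000)
with_match_2 = Segment("7", 30_000_000, 80_000_000)

print(overlap(with_match_1, with_match_2))
```

Because the check only runs over people already on the ICW list, it involves far fewer comparisons than testing every pair of customers.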

    12/15/2015 02:28:59
    1. [DNA] Basic ICW Questions:
    2. Jim Leahy via
    3. How does FTDNA determine the ICW status?

In order for FTDNA to calculate the ICW status of an individual there must be an algorithm to evaluate the at-DNA results for that individual against some criteria based on the at-DNA results from the base pair. Has anybody been successful in back-engineering or divining the logic used for this comparison?

A related question; what does the ICW "X status" actually tell us?

I have seen a statement implying that it means that these individuals are "blood relatives". That would be great but seems a little too far reaching.

Puzzled!

Jim

    12/15/2015 02:15:22
    1. Re: [DNA] My Raw Data Files - Comparison 23andme vs AncestryDNA
    2. Ann Turner via
    3. Felix Immanuel has written a utility to perform a similar function. I have not used it myself, but I have noted difficulties with some of his other utilities. I'd be interested in a comparison of your results with his.

http://www.y-str.org/2013/08/dna-error-fix.html

A priori, I would not expect a huge impact on the one-to-many match function. The additional SNPs with data would help increase the tally for the SNP threshold, but any effect would be scattered over the whole genome.

You may already have done this, but I'd recommend designating some of your "duplicate" kits as research. That way you can do your own experiments but not show up multiple times in the list of your matches.

I'm looking forward to hearing your conclusions.

Ann Turner

On Mon, Dec 14, 2015 at 8:47 PM, David Schroeder via <genealogy-dna@rootsweb.com> wrote:

> I was able to 'fix' the no-calls for matching RSIDs on both Ancestry and 23andme when one, or the other, was not a no-call. I fixed 6,632 on 23andme and 6,708 on Ancestry.
>
> Interestingly enough, there were 3,833 that were left as no-calls on both 23andme and AncestryDNA for the same RSIDs. I am wondering if these are the result of particularly difficult locations to test, or perhaps the SNP is rare in my genome? The tests were over two years apart.
>
> I uploaded both fixed raw data files to gedmatch to see how it may affect my 'one-to-many' matches. (Will have to wait on the processing). I ran the Gedmatch File Diagnostic Utility, and the fixed files had significantly reduced my error rates. It seems that most of my errors are in the X, Y or MT Chromosomes.

    12/15/2015 02:14:20
    1. Re: [DNA] My Raw Data Files - Comparison 23andme vs AncestryDNA
    2. David Schroeder via
    3. I was able to 'fix' the no-calls for matching RSIDs on both Ancestry and 23andme when one, or the other, was not a no-call. I fixed 6,632 on 23andme and 6,708 on Ancestry.

Interestingly enough, there were 3,833 that were left as no-calls on both 23andme and AncestryDNA for the same RSIDs. I am wondering if these are the result of particularly difficult locations to test, or perhaps the SNP is rare in my genome? The tests were over two years apart.

I uploaded both fixed raw data files to gedmatch to see how it may affect my 'one-to-many' matches. (Will have to wait on the processing). I ran the Gedmatch File Diagnostic Utility, and the fixed files had significantly reduced my error rates. It seems that most of my errors are in the X, Y or MT Chromosomes.

David

------------------------------

Message: 4
Date: Sun, 13 Dec 2015 03:31:45 -0800
From: Ann Turner <dnacousins@gmail.com>
Subject: Re: [DNA] My Raw Data Files - Comparison 23andme vs AncestryDNA
To: DNA Genealogy Mailing List <genealogy-dna@rootsweb.com>
Message-ID: <CAA-Ub_COJUEcMV4v3aXj4hbEaj6cbFf01AT9yDSBMJwDoyTnsA@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

I've always mentally thought about the "i" SNPs as "internal" catalog numbers, but I'm not positive if I made that up or actually noticed someone from 23andMe used that word :)

As you probably noticed, AncestryDNA doesn't always present alleles in alphabetical order. You will find instances of TC and CT, for example. Illumina's base-calling software has something called a "top strand" and a "bottom strand" (not the same thing as forward/reverse or plus/minus). 23andMe does some post-processing to put alleles in alphabetical order. Anyway, did you also look for TA?

SNPs where the alternative alleles are also complementary base pairs in the double helix ( A <-> T and C <-> G) are tricky to handle. 23andMe may have developed custom probes to identify some of those.
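
The genotype codes discussed in this thread ('--' for a no-call, 'DD' and 'II' for deletions and insertions per Tim Janzen's reply, and IDs beginning with 'i') can be tallied with a few lines of Python. This is a sketch only: the `classify` helper and the sample rows, including the IDs, are made up for illustration and do not come from a real raw data file.

```python
from collections import Counter


def classify(genotype):
    """Label a genotype string as 'no-call', 'indel' or 'snp'."""
    if genotype == "--":
        return "no-call"
    if genotype and set(genotype) <= {"D", "I"}:
        return "indel"
    return "snp"


# Made-up (ID, genotype) rows; a single letter stands for an X, Y or MT call.
rows = [
    ("rs0000001", "AT"),
    ("i0000001", "DD"),
    ("rs0000002", "--"),
    ("i0000002", "II"),
    ("rs0000003", "G"),
]

by_type = Counter(classify(genotype) for _, genotype in rows)
internal_ids = sum(1 for rsid, _ in rows if rsid.startswith("i"))

print(dict(by_type))   # {'snp': 2, 'indel': 2, 'no-call': 1}
print(internal_ids)    # 2
```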

    12/14/2015 03:47:43
    1. [DNA] Kudos to Gedmatch (was: Are FTDNA's 1cM matches shown in ICW as well?)
    2. Andreas West via
    3. I think we should all be glad for the work that the guys at Gedmatch have done to allow us to run triangulations against such a large number of matches.

If triangulation were as easy as some mistakenly understand it (those that believe ICW is triangulation) then more websites would support us in doing so and I'm pretty sure FTDNA would provide us with a proper triangulation tool (again, it would be nice if they would be more clear that their ICW tool isn't triangulation but hey, it's all about marketing and keeping up the impression that you do).

It's by no means a small feature and requires an enormous amount of disc space and computing power to run all possible combinations.

Just to illustrate, with 1,000,000 customers (both 23andme and Ancestry have more now) you have

499,999,500,000

possible combinations to check for.

Given you have to check for atDNA and X-DNA (two different functions/procedures most likely, given they are based on different minimum matching criteria) that means those two companies would currently have to run more than 1 trillion (had to look that name up) checks if they were to automate triangulation (if they start from scratch; obviously they have done a lot of triangulations already). Hence we're only provided with a list of matches and it's up to us to find out which ones triangulate and which do not.

For matching they probably have some optimization along the way but still we're talking about comparing 800k+ SNPs per customer (again with optimization steps in between like described in Ancestry's algorithm) to identify who's matching who and leave it then to the customer to decide when to do triangulation of up to 5 people vs one other person (in the case of 23andme).

I hope that people now understand what a huge task that is and why only very few are undertaking the task of identifying matches (with Gedmatch being the only company run by just a couple of people), let alone run triangulations for you.
Andreas > On 14 Dec 2015, at 03:36, Tim Janzen via <genealogy-dna@rootsweb.com> wrote: > > Dear Andreas, > I agree with you that the ADSA tool documentation statement is misleading. I think it should be rewritten to more accurately state what the ICW data does reflect. > Sincerely, > Tim > > -----Original Message----- > From: Andreas West [mailto:ahnen@awest.de] > Sent: Sunday, December 13, 2015 1:39 AM > To: Tim Janzen; genealogy-dna@rootsweb.com > Subject: Re: [DNA] Are FTDNA's 1cM matches shown in ICW as well? > > Thanks Tim and yes, I have two phased sets for each of my parents at GEDmatch. > > Let us all be thankful for them providing the tools that all those multi-million (or even billion) companies are not giving us. > > Also good to hear that you agree with my points. > > Do I understand your "disagree" with the ADSA tool documentation (the statement I posted) as that you agree with me that it's misleading and can't be done with FTDNA data? > > Andreas > > > > ------------------------------- > To unsubscribe from the list, please send an email to GENEALOGY-DNA-request@rootsweb.com with the word 'unsubscribe' without the quotes in the subject and the body of the message
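
The pair count quoted in this message can be checked with a few lines of Python. This is only a sketch of the arithmetic, not a description of how any testing company actually schedules its comparisons; the function name pairwise_checks and the comparison_types parameter are made up for illustration.

```python
from math import comb


def pairwise_checks(customers: int, comparison_types: int = 1) -> int:
    """Number of one-to-one comparisons among `customers` kits.

    `comparison_types` is a hypothetical multiplier, e.g. 2 when atDNA
    and X-DNA are compared by separate procedures.
    """
    return comb(customers, 2) * comparison_types


if __name__ == "__main__":
    # Pairs among 1,000,000 customers: n * (n - 1) / 2
    print(f"{pairwise_checks(1_000_000):,}")
    # The same pairs checked twice (atDNA and X-DNA)
    print(f"{pairwise_checks(1_000_000, comparison_types=2):,}")
```

Doubling the pair count for atDNA and X-DNA gives 999,999,000,000, so roughly one trillion comparisons per company at that customer count.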

    12/14/2015 05:59:08
    1. Re: [DNA] At what number of matches (at the same loci) are we talking about a pileup?
    2. Andreas West via
    3. Jim, can you elaborate on which of Pike's programs you're using and for what check specifically? Sorry, I'm way behind in reading your excellent blog, no time lately. If we do identify other pileup areas, what are the criteria and process to get them added to the ISOGG wiki? Andreas > On 14 Dec 2015, at 10:32, Jim Bartlett via <genealogy-dna@rootsweb.com> wrote: > > Rebekah, > > I have not. I do have the centromeres noted, the Chr 6 HLA area and the 12 or so areas identified in the ISOGG/wiki, but my pileup areas don't seem to correlate with them. > When I can, I might use Pike's programs to check. > > Thanks for your encouragement - I'm working on Endogamy Part II. > > Jim - www.segmentology.org > >> On Dec 13, 2015, at 9:14 PM, Rebekah Adele Canada via <genealogy-dna@rootsweb.com> wrote: >> >> Jim, >> >> I am loving your segmentology blog more and more. :-) Have you ever >> done checks for high levels of heterozygosity and/or no-calls in the >> places you have pileups? >> >> Note: Though I am employed by the Gene by Gene parent company of >> Family Tree DNA, my opinion above is strictly my own as a community >> member. >> >> --- >> Regards, >> Rebekah A. Canada >> "And they wonder why the maples >> Can't be happy in their shade." Trees (Neil Peart from Rush) >> >> >> On Sun, Dec 13, 2015 at 4:18 PM, Jim Bartlett via >> <genealogy-dna@rootsweb.com> wrote: >>> Barbara >>> >>> I believe all TGs are relevant - each segment of our DNA had to come from some Ancestor. The more Matches in a TG means the larger the family size from the CA and/or the more distant the CA. I think you are correct that some of our CAs are back before an immigrant to America. >>> But there are also "intermediate" cousins that appear in a TG from time to time. In other words, all of the TG Matches aren't necessarily cousins back to the CA, some may be closer cousins. These closer cousins are the ones to look for - the MRCA with them may be a pointer to the distant CA. 
This is one reason why it's important to contact all Matches and share to find the individual MRCAs. >>> >>> Every TG is from some ancestor. >>> >>> Jim - www.segmentology.org

    12/14/2015 05:27:22
    1. Re: [DNA] At what number of matches (at the same loci) are we talking about a pileup?
    2. Andreas West via
    3. Scary (for us to manage all of them as it means doubling our time as well) but good as well! Andreas > On 14 Dec 2015, at 05:57, Jim Bartlett <jim4bartletts@verizon.net> wrote: > > Another way to look at this is that the number of our Matches is about doubling every 14 months. So 14 months from now each TG will have twice as many Matches. Some may look like pileups, but they are just growing TGs as more Matches are reported. > > Jim - www.segmentology.org > >> On Dec 13, 2015, at 5:56 AM, Andreas West via <genealogy-dna@rootsweb.com> wrote: >> >> Hi everyone, >> >> I hope we can find some consensus here and maybe some of you even know what >> number (of matches at a certain locus) is used by AncestryDNA to identify >> pileups. >> >> We're obviously not talking about 1000 here as that would give us 499500 >> 1-to-1 comparisons to run between the 1000 matches. That's the main reason why >> DTC DNA testing companies (and also GEDmatch) are interested in identifying >> pileups: to limit useless calculations (which in the end may still not find a >> single triangulated group, or the common ancestor is too far back anyway; see the Timber >> algorithm used by AncestryDNA to cut matches). >> >> a) I have 97 matches (at the same locus) for one of my kits (on the "X" >> chromosome interestingly, it's a female person), which means 4656 >> combinations. Is that number already a pileup? >> >> How about: >> >> b) 52 matches = 1326 combinations >> >> c) 36 matches = 630 combinations >> >> d) 23 matches = 253 combinations >> >> e) 18 matches = 153 combinations >> >> Where is the line to draw? At a, b, c, d, e or where? >> >> >> What is the largest number of matches that you have in your triangulated >> groups? >> >> We obviously don't want to miss out on a large TG as it also means a lot of >> people can "crowdsource" together and identify the CA much quicker than a >> group of 3 can (usually means also more family trees to compare with). >> >> Thanks for your answers! 
>> >> Andreas (WEST) born BASSO >> >>
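
To put numbers on the two points in this exchange (the combination counts for groups a to e, and Jim's remark that Matches roughly double every 14 months), here is a small Python sketch. The 14-month doubling period is Jim's estimate from the message above; the function names and the assumption of smooth exponential growth are illustrative only.

```python
def pair_count(matches: int) -> int:
    """One-to-one comparisons needed to cross-check every match in a group."""
    return matches * (matches - 1) // 2


def projected_matches(current: int, months: float,
                      doubling_months: float = 14.0) -> float:
    """Projected group size, assuming steady doubling every `doubling_months`."""
    return current * 2 ** (months / doubling_months)


if __name__ == "__main__":
    # Groups a) to e) from the original question
    for size in (97, 52, 36, 23, 18):
        print(f"{size} matches = {pair_count(size)} combinations")

    # What a 97-match group looks like 14 months later under the doubling estimate
    later = round(projected_matches(97, months=14))
    print(f"{later} matches = {pair_count(later)} combinations")
```

Because the number of comparisons grows with the square of the group size, a doubling of Matches roughly quadruples the cross-checking work.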

    12/14/2015 02:20:13
    1. Re: [DNA] At what number of matches (at the same loci) are we talking about a pileup?
    2. Jim Bartlett via
    3. I'm not sure yet - I haven't used them yet. Jim - www.segmentology.org > On Dec 14, 2015, at 12:27 AM, Andreas West <ahnen@awest.de> wrote: > > Jim, can you elaborate which of Pike's program you're using and for what check specifically? > > Sorry, I'm way behind in reading your excellent blog, no time lately. > > If we do identify other pileup areas, what is the criteria and process to get them added to ISOGG wiki? > > Andreas > >> On 14 Dec 2015, at 10:32, Jim Bartlett via <genealogy-dna@rootsweb.com> wrote: >> >> Rebekah, >> >> I have not. I do have the centromeres noted, the Chr 6 HLA area and the 12 or so areas identified in the ISOGG/wiki, but my pileup areas don't seem to correlate with them. >> When I can, I might use Pike's programs to check. >> >> Thanks for your encouragement - I'm working on Endogamy Part II. >> >> Jim - www.segmentology.org >> >>>

    12/14/2015 01:30:23
    1. Re: [DNA] Kudos to Gedmatch (was: Are FTDNA's 1cM matches shown in ICW as well?)
    2. Jim Bartlett via
    3. 23andMe (and GEDmatch) let us compare two Matches to complete the Triangulation process. FTDNA could offer a feature to just confirm which ICW Matches (which they already list) are on the same segment. This would require some additional computing, but not so much as you calculate. Jim - www.segmentology.org

    12/14/2015 01:28:59
    1. [DNA] Thank you FTDNA
    2. Kevincamp via
    3. It appears that FTDNA has made a change so that some SNP testing now displays better on the participants' personal and project pages. Previously, individual SNP testing via Advanced Orders only showed up as a footnote on the Haplotree Page. Now positive SNP results show up on the person's main home page and, more importantly, on their project page. This will go a long way in helping people do analysis of SNP modals. Thank You FTDNA!

    12/14/2015 12:55:10
    1. Re: [DNA] Kudos to Gedmatch (was: Are FTDNA's 1cM matches shown in ICW as well?)
    2. Ann Turner via
    3. FTDNA has cited privacy concerns over the issue of comparing kits to each other. Your consent agreement covers only people who match you. However, if you use a two-step process, first identifying segments where you match multiple people, and second loading those people into the ICW matrix, I think that would be a pretty reliable substitute. Comments, Jim? One caveat is that everyone must reach that 20 cM threshold. I've definitely seen cases where someone fails that test but actually is ICW with other people in the matrix. Note that 23andMe's consent for sharing explicitly mentions that the person you're sharing with will be able to see how you compare with other people he's sharing with. If people opt in to Open Sharing, that will expand the pool of possible comparisons. Ann Turner

    12/14/2015 12:15:44
    1. Re: [DNA] At what number of matches (at the same loci) are we talking about a pileup?
    2. Andreas West via
    3. Thanks Karla, that information is very useful for me (and surprising). I wouldn't have thought that one could have a TG that large. Good that you stopped comparing everyone with everyone; with 35 matches that would be 595 One-to-One comparisons on atDNA and the same number on X-DNA. Hope you will one day identify the common ancestor or at least narrow it down to a location. Thanks again, Andreas > On 13 Dec 2015, at 22:16, Karla Huebner <calypsospots@gmail.com> wrote: > > I have a group of 35 on chromosome 6 starting around 87,000,000 and ending around 99,000,000 in terms of where they match my brother and me. I used to check each one against everyone else in the group, but quit and now just check against several in the group to triangulate. The people who match are from the US, UK, and Australia (so initially I expected this to be on my mother's side), and some have a preponderance of colonial ancestry. > > Once I was able to confirm that this is a segment from my paternal grandmother (testing one of her nieces, who is half Norwegian and half Swedish), I formed the hypothesis that this is an old bit of Finnish DNA. Why? My grandmother's parents were both Norwegian, but one has proved to have substantial Forest Finn ancestry, causing us to match a lot of Finns. Finns went to Delaware a couple hundred years ago, so I think that probably explains the colonial US aspect. > > The other possibility is that my brother's and my having two Norwegian grandmothers, and our relative being half Norwegian and half Swedish, could cause us to match in miscellaneous ways with a lot of people (IBS), but as the matches triangulate well with others on their part of the segment (these people match us around 7-8 cM apiece), I'm inclined to think it is a scrap of IBD old Finnish. 
> > -- > Karla Huebner > calypsospots AT gmail.com

    12/13/2015 05:34:20
    1. Re: [DNA] At what number of matches (at the same loci) are we talking about a pileup?
    2. B Griffiths via
    3. Hello Jim, Thank you - yes, that's my view as well. And, in a few cases, I have been able to find apparent "closer cousins" within a TG (based on the additional DNA they share), although none of them have yet identified their MRCA in order to narrow down the likely ancestral lines for the rest of us in the TG. Finding a genealogical connection is much more difficult than building up TGs! Barbara On 13 December 2015 at 22:18, Jim Bartlett <jim4bartletts@verizon.net> wrote: > Barbara > > I believe all TGs are relevant - each segment of our DNA had to come from some Ancestor. The more Matches in a TG means the larger the family size from the CA and/or the more distant the CA. I think you are correct that some of our CAs are back before an immigrant to America. > But there are also "intermediate" cousins that appear in a TG from time to time. In other words, all of the TG Matches aren't necessarily cousins back to the CA, some may be closer cousins. These closer cousins are the ones to look for - the MRCA with them may be a pointer to the distant CA. This is one reason why it's important to contact all Matches and share to find the individual MRCAs. > > Every TG is from some ancestor. > > Jim - www.segmentology.org > >> On Dec 13, 2015, at 2:35 PM, B Griffiths via <genealogy-dna@rootsweb.com> wrote: >> >> Hello Andreas >> >>> What is the largest number of matches that you have in your triangulated >>> groups? >> >> My largest group is 67 people matching me (and my mother) on >> chromosome 8 within the area between 109000000 - 128000000. >> >> I would have done some cross checking when each of them initially >> shared with me and most of them I have marked as cross matching, >> although not everyone matches everyone else. I have just rechecked >> two of them - one matches 62 of the group (this person matches me from >> 109000000 - 126000000, 16.8cM), the other 50 (they match me from >> 117000000 - 125000000, 10cM). 
>> >> Even the person with the smallest match to me in this area >> (118000000-123000000, 5.3cM) matches 34 of the others, his longest >> match to any of them being 15.5cM, over a segment 103000000 - >> 123000000. >> >> I wish it were true that the large groups of 'people can "crowdsource" >> together and identify the CA much quicker than a group of 3 can ' . >> That had been my hope but, in most of my larger TGs, the majority of >> the people match each other over just one segment and so all seem to >> be equally distant from each other. Whereas the first match I had, >> where we identified a common ancestor, was actually the only match on >> that segment. I suspect (as I am in the UK), that the larger TGs are >> where a distant ancestor emigrated (usually to the US) so long ago >> that they now have many descendants, who just happen to be DNA tested. >> >> As to whether such groups class as "pile ups", I don't know - but if >> you start discounting them as being too distant, what's the point of >> triangulation? Where is the boundary between a "relevant" TG and a >> "population based" TG? >> >> My working principle, for now, is that if people in the group cross >> match, then the group is genuine/relevant. >> Best wishes >> Barbara Griffiths >> >> >> >> >> >> On 13 December 2015 at 10:56, Andreas West via >> <genealogy-dna@rootsweb.com> wrote: >>> Hi everyone, >>> >>> I hope we can find some consensus here and maybe some of you know even what >>> number (of matches at a certain loci) is used by AncestryDNA to identify >>> pileups. >>> >>> We're obviously not talking about 1000 here as that would give us 499500 >>> 1-to-1 comparisons to run between the 1000 matches. 
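
The first step in the kind of cross-checking Barbara describes is seeing whether two matches' segments cover the same stretch of a chromosome. The sketch below uses the chromosome 8 positions quoted in her message; the Segment type and the overlap function are illustrative names only. Overlapping positions by themselves do not establish triangulation: as the thread stresses, the matches must also match each other on that stretch.

```python
from typing import Optional, Tuple

# (start position, end position) on one chromosome, in base pairs
Segment = Tuple[int, int]


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the stretch shared by two segments, or None if they don't overlap."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start < end else None


if __name__ == "__main__":
    # Positions quoted in the message (chromosome 8)
    match_one = (109_000_000, 126_000_000)   # the 16.8cM match
    match_two = (117_000_000, 125_000_000)   # the 10cM match
    smallest = (118_000_000, 123_000_000)    # the 5.3cM match

    print(overlap(match_one, match_two))
    print(overlap(match_two, smallest))
```

A pair of matches whose segments do not overlap at all can be skipped before running a one-to-one comparison, which is one way to cut down the number of checks in a large group.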

    12/13/2015 05:21:27
    1. [DNA] Pile ups
    2. Andreas West via
    3. Hi David, Thank you for sharing your experience. I have to correct one assumption in your post though. Not all people share the same pile ups. The person I talked about in my post is of Asian origin and she has pileups at different loci, whereas she is missing the typical chromosome 2 Western Atlantic Autosomal Haplotype (WAAH) that most of us have. Andreas > On 13 Dec 2015, at 22:30, David Hamill via <genealogy-dna@rootsweb.com> wrote: > > I thought I would share my experience with a pile-up because it was instructive for me. > > I was building a triangulation group and started getting suspicious because too many people matched the segment. I apologize for forgetting the details here. By too many I mean something like 25% of the people on gedmatch when I did 1-1 comparisons with kits I suspected of being matches. I was looking at a relatively short segment (3-4cM?) so all these matches didn’t show up in the standard one-to-many search. I remembered the “pile-up” phenomenon and though I didn’t actually know what it was, it sounded like it might be an explanation of what was going on. I started checking kits totally at random and found a similar high frequency of matches. > > Researching Pile-ups I found that it refers to the phenomenon where the percentage of people that have a particular segment is unreasonable based on any possible relatedness. The plausible explanation is that this genetic combination is favored by natural selection. > > Next I just looked for articles on pile-ups, and sure enough, the area where my too-frequent matches occurred was one that had been identified in several studies as one where these “pile-ups” are found. > > The point here is that pile-ups are not an aspect of one's particular group of matches but occur at the same places for everyone. I don’t know of a central compendium of pile-up areas that have been identified, but it would be nice if there was one. 
> > Maybe one quick and dirty way to see if a segment that appears to match for too many relatives is the result of a pile up would be to see if it occurs in a similar frequency in both our relatives and kits selected at random... That's what tipped me off. > > In terms of the argument that it is caused by natural selection, what makes sense to me is that people for whom this area gets recombined due to crossovers are at a disadvantage (less likely to survive or reproduce) compared to people for whom this region is intact. After all our DNA does have a job to do! > > David > > PS Of course, by just ignoring short segments I would have avoided the issue in my case. I think what happened was I started with a few matches for a segment that included the pile-up area, and started looking for kits with related surnames who I thought might have a short piece of the same segment. And too many did...
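
David's "quick and dirty" check (does the segment match at a similar frequency in suspected relatives and in kits picked at random?) can be written down as a simple comparison of two frequencies. This is only a sketch: the function names are made up, and the min_ratio cut-off of 0.5 is an arbitrary placeholder, not a figure from the thread or from any study.

```python
def match_frequency(matching: int, checked: int) -> float:
    """Share of checked kits that match on the segment of interest."""
    if checked <= 0:
        raise ValueError("checked must be positive")
    return matching / checked


def looks_like_pileup(rel_matching: int, rel_checked: int,
                      rand_matching: int, rand_checked: int,
                      min_ratio: float = 0.5) -> bool:
    """Flag a segment when random kits match about as often as suspected relatives.

    `min_ratio` is a hypothetical cut-off: the random-kit frequency must be
    at least this fraction of the relatives' frequency to raise the flag.
    """
    relatives = match_frequency(rel_matching, rel_checked)
    randoms = match_frequency(rand_matching, rand_checked)
    if relatives == 0:
        return False
    return randoms / relatives >= min_ratio


if __name__ == "__main__":
    # Roughly 25% in both groups, as in David's description
    print(looks_like_pileup(5, 20, 25, 100))
    # Random kits almost never match: more consistent with a genuine TG
    print(looks_like_pileup(5, 20, 1, 100))
```

With small samples the two frequencies are noisy, so a flag from a check like this is a reason to look at published pile-up regions, not a conclusion in itself.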

    12/13/2015 04:36:40
    1. Re: [DNA] Are FTDNA's 1cM matches shown in ICW as well?
    2. Andreas West via
    3. Thanks Roberta, I think this is what Jim also meant. It only shows matching segments down to 1cM for matches that meet both the 7.7cM minimum segment length and the 20cM total. That should be the correct statement. I'm outside and don't have the article in front of me, but as its main statement is "looking at small segments less than 7 or at minimum 5cM is rubbish", it forgets that we only look at those smaller segments in the same way that FTDNA does. At least I do; that means first it must be a TG proven by 3 or more people with matching segments > 5cM (or better 7cM), and then I look at those in the group that don't match some others in the TG. So far I could always prove that the original, bigger segment was chopped into smaller parts for some people but clearly still within the borders of the original TG. Thanks everyone, I really love the wisdom of this mailing list! Andreas > On 13 Dec 2015, at 21:24, Roberta Estes <robertajestes@att.net> wrote: > > Hi Andreas, > > No one appears on your match list that does not match or exceed the minimum > match criteria at Family Tree DNA, which is currently 20cM total and 7cM for > an individual segment. If someone does not match both criteria, they will > not be on your match list and they will not show on the ICW list either. IF > you match at or above that level, you can see your matching cM down to 1cM > to any individual. > > The ICW page is just another way to see your matches. No one will appear on > your ICW list that does not already appear on your match list. > > Roberta Estes > > -----Original Message----- > From: genealogy-dna-bounces@rootsweb.com > [mailto:genealogy-dna-bounces@rootsweb.com] On Behalf Of Andreas West via > Sent: Sunday, December 13, 2015 12:40 AM > To: DNA Genealogy Mailing List > Subject: [DNA] Are FTDNA's 1cM matches shown in ICW as well? > > Hi everyone, > > > was just stumbling across a post on WikiTree forum on DNAGedcom being able > to triangulate data from GEDmatch and FTDNA. 
Leaving GEDmatch out for this > thread (I think the OP refers to the paid tier 1 option which gives access > to the top > 200 triangulated matches and downloading that screen result - please correct > me if there is another way) I want to question that remark about FTDNA. > > My assumption is that ICW (which isn't triangulation, can't repeat that > often enough as there are still people out there who don't understand the > difference or that there is one to start with) means that it includes all > those 1cM matches (and larger) that FTDNA reports as a minimum criteria. > > Can someone confirm this? > > That makes the ICW tool even more worrisome if it's true as it's not even > clear if a triangulated segment of only 1cM (or 2cM for that matter) is > indeed an ancestral segment, triangulation or not (I refer to various > discussions we had here and on other email lists about this - no need to > start that discussion again until we have more proof). > > Andreas (WEST) born BASSO > > My ancestors: [http://www.wikitree.com/genealogy/Basso-Family- > Tree-23](http://www.wikitree.com/genealogy/Basso-Family-Tree-23) > > ------------------------------- > To unsubscribe from the list, please send an email to > GENEALOGY-DNA-request@rootsweb.com with the word 'unsubscribe' without the > quotes in the subject and the body of the message >
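       [Editor's sketch] The rule Roberta and Andreas settle on above can be written down as a two-step filter: a person first has to qualify as a match, and only then are the small segments shown. The Python below is a minimal sketch of that reading, not FTDNA's actual code. The function and constant names are hypothetical, the thread quotes both 7cM and 7.7cM for the segment threshold (7.7 is used here), and summing every reported segment to get the total is an assumption.

```python
from typing import List

# Thresholds as quoted in the thread; names are hypothetical.
MIN_LONGEST_SEGMENT_CM = 7.7   # Roberta cites 7cM, Jim and Andreas 7.7cM
MIN_TOTAL_CM = 20.0
DISPLAY_FLOOR_CM = 1.0


def qualifies_as_match(segments_cm: List[float]) -> bool:
    """True if the longest segment and the total both reach the thresholds.

    Assumption: the total is the plain sum of all reported segments.
    """
    if not segments_cm:
        return False
    return (max(segments_cm) >= MIN_LONGEST_SEGMENT_CM
            and sum(segments_cm) >= MIN_TOTAL_CM)


def displayed_segments(segments_cm: List[float]) -> List[float]:
    """Segments shown for a person: none unless they qualify as a match,
    otherwise everything down to the 1cM display floor."""
    if not qualifies_as_match(segments_cm):
        return []
    return [s for s in segments_cm if s >= DISPLAY_FLOOR_CM]


# A qualifying match: small segments appear only because the match qualified.
shown = displayed_segments([8.2, 6.0, 4.1, 2.5, 0.6])
# No segment reaches 7.7cM, so nothing is shown even though the total is 21.5cM.
hidden = displayed_segments([6.5, 6.0, 5.0, 4.0])
```

       Under this reading, a 1cM or 2cM segment never puts anyone on a match list by itself; it is only displayed for people who already qualified on the larger thresholds.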

    12/13/2015 04:25:16
    1. Re: [DNA] Are FTDNA's 1cM matches shown in ICW as well?
    2. Andreas West via
    3. Jim,

       Thanks for answering my original question. However, the original article I linked stated that FTDNA uses matches down to 1cM. So if not for ICW, what are these matches then used for? Do you have any written evidence from them that ICW (in the matrix tool, I mean) uses the same criteria as for showing up in our match list in the first place?

       Andreas

       > On 13 Dec 2015, at 21:15, Jim Bartlett <jim4bartletts@verizon.net> wrote:
       >
       > Andreas
       >
       > The FTDNA ICW report uses the same criteria for a Match that FTDNA uses for your Match list: 7.7cM segment, 20cM total, etc. So all of your ICW list with Match A meets these criteria for you (everyone on the ICW list is already listed as a Match for you) AND each person on that ICW list meets the same criteria for Match A (is already on the Match list for A). So no one is talking about matching based on 1 or 2cM.
       >
       > Every shared segment over a threshold (7.7cM, etc for FTDNA) has to be somewhere. When such segments for Matches A and B significantly overlap on the same chromosome in your spreadsheet, they are on the same chromosome OR on opposite chromosomes OR one or both are IBS. If they were IBS or on opposite chromosomes they are rarely ICW. Using the 7.7cM threshold, the FTDNA ICW method for triangulation works a very high percentage of the time. Getting a confirmation from either A or B makes it as firm as any other TG.
       >
       > I don't see 1cM segments in this process.
       >
       > Jim - www.segmentology.org
       >
       >> On Dec 13, 2015, at 12:40 AM, Andreas West via <genealogy-dna@rootsweb.com> wrote:
       >>
       >> Hi everyone,
       >>
       >> I just stumbled across a post on the WikiTree forum about DNAGedcom being able to triangulate data from GEDmatch and FTDNA. Leaving GEDmatch out for this thread (I think the OP refers to the paid tier 1 option which gives access to the top 200 triangulated matches and downloading that screen result - please correct me if there is another way) I want to question that remark about FTDNA.
       >>
       >> My assumption is that ICW (which isn't triangulation, can't repeat that often enough as there are still people out there who don't understand the difference or that there is one to start with) means that it includes all those 1cM matches (and larger) that FTDNA reports as a minimum criterion.
       >>
       >> Can someone confirm this?
       >>
       >> That makes the ICW tool even more worrisome if it's true, as it's not even clear if a triangulated segment of only 1cM (or 2cM for that matter) is indeed an ancestral segment, triangulation or not (I refer to various discussions we had here and on other email lists about this - no need to start that discussion again until we have more proof).
       >>
       >> Andreas (WEST) born BASSO
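       [Editor's sketch] Jim's description of ICW amounts to a set intersection: a person is in common with you and Match A only if they are on both match lists. The overlap he mentions is a separate, second check on the segments themselves. The Python below is a minimal sketch of those two ideas with made-up data. All names are hypothetical, positions are assumed to be in base pairs, and nothing here reflects FTDNA's internal implementation.

```python
from typing import Dict, NamedTuple, Set


class Segment(NamedTuple):
    chromosome: str
    start_bp: int
    end_bp: int


def in_common_with(match_lists: Dict[str, Set[str]], me: str, other: str) -> Set[str]:
    """People who appear on both match lists (each list is assumed to have
    already been filtered by the 7.7cM / 20cM criteria)."""
    shared = match_lists.get(me, set()) & match_lists.get(other, set())
    return shared - {me, other}


def overlap_bp(a: Segment, b: Segment) -> int:
    """Length of the overlap between two segments, 0 if there is none
    or if they sit on different chromosomes."""
    if a.chromosome != b.chromosome:
        return 0
    return max(0, min(a.end_bp, b.end_bp) - max(a.start_bp, b.start_bp))


# Made-up example data.
match_lists = {
    "me": {"A", "B", "C"},
    "A": {"me", "B", "D"},
    "B": {"me", "A"},
}
icw = in_common_with(match_lists, "me", "A")

seg_with_a = Segment("7", 10_000_000, 25_000_000)
seg_with_b = Segment("7", 18_000_000, 40_000_000)
shared_bp = overlap_bp(seg_with_a, seg_with_b)
```

       Being ICW and having overlapping segments does not tell you which parental copy of the chromosome each segment sits on. That is why Jim still asks for a confirmation from A or B before treating the group as a firm TG.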

    12/13/2015 04:18:14
    1. Re: [DNA] At what number of matches (at the same loci) are we talking about a pileup?
    2. Jim Bartlett via
    3. Rebekah,

       I have not. I do have the centromeres noted, the Chr 6 HLA area and the 12 or so areas identified in the ISOGG/wiki, but my pileup areas don't seem to correlate with them. When I can, I might use Pike's programs to check. Thanks for your encouragement - I'm working on Endogamy Part II.

       Jim - www.segmentology.org

       > On Dec 13, 2015, at 9:14 PM, Rebekah Adele Canada via <genealogy-dna@rootsweb.com> wrote:
       >
       > Jim,
       >
       > I am loving your segmentology blog more and more. :-) Have you ever done checks for high levels of heterozygosity and/or no-calls in the places you have pileups?
       >
       > Note: Though I am employed by the Gene by Gene parent company of Family Tree DNA, my opinion above is strictly my own as a community member.
       >
       > ---
       > Regards,
       > Rebekah A. Canada
       > "And they wonder why the maples
       > Can't be happy in their shade." Trees (Neil Peart from Rush)
       >
       > On Sun, Dec 13, 2015 at 4:18 PM, Jim Bartlett via <genealogy-dna@rootsweb.com> wrote:
       >> Barbara
       >>
       >> I believe all TGs are relevant - each segment of our DNA had to come from some Ancestor. More Matches in a TG means a larger family size from the CA and/or a more distant CA. I think you are correct that some of our CAs are back before an immigrant to America.
       >> But there are also "intermediate" cousins that appear in a TG from time to time. In other words, all of the TG Matches aren't necessarily cousins back to the CA; some may be closer cousins. These closer cousins are the ones to look for - the MRCA with them may be a pointer to the distant CA. This is one reason why it's important to contact all Matches and share to find the individual MRCAs.
       >>
       >> Every TG is from some ancestor.
       >>
       >> Jim - www.segmentology.org

    12/13/2015 03:32:20