The vast majority of matches for unphased female samples on the
One-to-Many X comparison tool are NOT verifiable using the X One-to-One
comparison tool. Using the statistics at the bottom of the X One-to-Many
tool, it appears that female samples generally end up with around 20,000
such matches. Males, however, end up with only one or two hundred. And,
in my experience, so do phased female kits. This result is of course
illogical, and it points in the direction of a very significant problem
with the X comparison algorithm used in the One-to-Many tool. Whatever the
differences in the algorithms used in these tools, I fail to see any way
that the differences can be useful!

John McCoy
([email protected])

In a message dated 10/8/2017 7:29:16 PM Pacific Standard Time,
[email protected] writes:

I'm wondering if there's some sort of database error here. I'd suggest
sending your findings to [email protected]

Ann Turner

On Sun, Oct 8, 2017 at 5:54 AM, Wesley Johnston <[email protected]> wrote:

> I am looking to see just what a high X-match that has no shared atDNA
> can tell us about shared ancestral lines. Clearly this is pushing back
> very far in time. I am using a female kit that is of pure Czech ancestry
> as far back as we know (roughly the late 1600s and early 1700s). On her
> 6-gen X-inheritance chart, we know at least the name of 9 of the 13
> X-ancestors and know the spouse's surname in all 13. So the challenge of
> finding a matching line is narrowed to Czech or Slavic lines.
>
> But that is not the problem that I am running into on GEDmatch. The
> problem is that different GEDmatch tools are presenting conflicting
> results. The kits come from different test companies, and I am not sure
> whether this is a factor.
>
> To allow anyone wanting to repeat this to do so, I am going to include
> kit numbers. The kit of the focus person is T685302.
>
> Here are the steps that I have done.
>
> 1 - Identify X-DNA matches, sorted by largest X-DNA cM.
> I used the free one-to-many and then sorted descending on largest X-DNA
> cM. Setting aside immediate family members, the top 10 matches range
> from 16 to 21 cM for the largest X-DNA segment, and all but the top one
> show zero total atDNA cM (the top one, M052745, has 5.1).
>
> 2 - Do a one-to-one X-DNA comparison of some of the top 10.
> Clicking on the X in the X-DNA Details column, you would expect to see
> the same information shown in the one-to-one comparison as in the sorted
> one-to-many. And in some cases, the X 1:1 does show about the same (an
> 18.8 largest X-DNA for A916434 on the 1:many shows as 18.7 on the X
> 1:1). But in some cases, the X-DNA 1:1 shows zero shared X-DNA (an 18.3
> largest X-DNA for M131017 on the 1:many shows as no match at all on the
> X 1:1).
> So I tried dropping the SNP threshold with M131017 on the X 1:1 to 100
> SNPs and the minimum cM to 1 (which should not be necessary if there is
> indeed a largest shared X segment of 18.3 cM, as the 1:many list shows).
> And this does generate a list of shared ranges on the X, the largest of
> which is 5.4 cM (162 SNPs) when ranked by cM, or 3.5 cM (295 SNPs) when
> ranked by SNP count. So clearly this is a radical difference from what
> the 1:many list is showing when it reports a largest shared X of 18.3
> cM.
>
> 3 - Create a Tag Group for the highest matches.
> I used the Tier1 1:many tool to create the group, setting the option to
> filter by X (instead of autosomal) with offset 0, limit 500 (the minimum
> allowed) and cM size 3 (the minimum allowed). This gives a top 10 list
> completely different from the one in step 1. Where the free 1:many had
> largest cM of 16 or more for the top 10, the Tier1 1:many showed no one
> (other than close family) with a largest cM of more than 11.2.
>
> So I then tried the Tier1 1:many on autosomal, setting it to the minima
> and sorting on the largest X (trying to re-create the list in step 1).
> But since the Tier1 1:many does not include atDNA matches of zero,
> neither of these methods was able to find the same high largest X but
> low atDNA matches that the free 1:many tool found.
> Nevertheless, I created a tag group of those X-matches (from the atDNA
> version of the Tier1 1:many) who had the highest total X cM (20.0 and
> above, minus immediate family). This tag group has 20 members.
>
> 4 - Run the X-DNA Matrix Comparison for this tag group.
> This was a real shock. Since the tag group had been specifically created
> from a 1:many list of the X-DNA matches of T685302 who had 20.0 or more
> total cM, you would expect to see a number in every cell of T685302's
> row/column. But 8 of the 19 cells are empty. Going back to the Tier1
> 1:many list used to create this tag group, one of these empty cells
> belongs to the match with the second highest total cM (40.8), but the
> largest segment for this match is just 6.4 cM and thus under the 7 cM
> threshold for the matrix, which explains the empty cell.
>
> The bottom line seems to me to be implicit and explicit threshold
> differences among the tools. I really would like to work with the top 10
> from the free 1:many, but even the free X 1:1 conflicts with some of the
> ones shown in that list. And the Tier1 1:many does not even discover
> these matches (apparently because the Tier1, even when choosing the X
> option, still has some implicit atDNA threshold).
> Choosing these low-autosomal / high largest X matches as a target is a
> challenge in the first place. But the GEDmatch tools make it far more
> challenging when they do not allow exploration of those matches.
> Clearly having more ability to twist the knobs on GEDmatch would help.
> But the fact that (step 1 and step 2) there are very significant
> conflicts between one GEDmatch tool showing 18.3 cM of a largest shared
> X and another showing nothing more than 5.4 cM for the same two people
> seems to be more than just finding the right settings of the knobs.

-------------------------------
To unsubscribe from the list, please send an email to
[email protected] with the word 'unsubscribe' without
the quotes in the subject and the body of the message

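The conflict described in steps 1 and 2 of the quoted message can be tabulated with a few lines of Python. This is only a sketch under stated assumptions: the figures are hand-copied from the message above, the `XComparison` record and the `flag_conflicts` helper are hypothetical names made up for illustration, and the 1.0 cM tolerance is an arbitrary choice. Nothing here queries GEDmatch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class XComparison:
    """One match of the focus kit, with both reported values (hypothetical record)."""
    kit: str
    one_to_many_cm: float           # largest X segment shown by the one-to-many list
    one_to_one_cm: Optional[float]  # largest X segment shown by X one-to-one; None = no match


def flag_conflicts(rows: List[XComparison], tolerance_cm: float = 1.0) -> List[str]:
    """Return kit numbers whose two reports differ by more than tolerance_cm.

    tolerance_cm is an assumed value, not a GEDmatch setting.
    """
    conflicts = []
    for row in rows:
        # Treat "no match at all" on the one-to-one as 0 cM confirmed.
        confirmed = row.one_to_one_cm if row.one_to_one_cm is not None else 0.0
        if abs(row.one_to_many_cm - confirmed) > tolerance_cm:
            conflicts.append(row.kit)
    return conflicts


# Values transcribed from steps 1 and 2 (default one-to-one thresholds).
rows = [
    XComparison("A916434", 18.8, 18.7),
    XComparison("M131017", 18.3, None),
]

print(flag_conflicts(rows))  # ['M131017']
```

Keeping both numbers side by side for every kit in the top 10 would make it easy to report the full set of conflicting pairs rather than one example.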
I agree that phasing reduces the number of matches enormously, for the
autosomes as well as the X in females. I did think the length of the
segment reported by Wesley should have been long enough to show up in the
one-to-one.

Ann Turner

On Sun, Oct 8, 2017 at 8:41 PM, <[email protected]> wrote:

> The vast majority of matches for unphased female samples on the
> One-to-Many X comparison tool are NOT verifiable using the X One-to-One
> comparison tool. Using the statistics at the bottom of the X One-to-Many
> tool, it appears that female samples generally end up with around 20,000
> such matches. Males, however, end up with only one or two hundred. And,
> in my experience, so do phased female kits. This result is of course
> illogical, and it points in the direction of a very significant problem
> with the X comparison algorithm used in the One-to-Many tool. Whatever the
> differences in the algorithms used in these tools, I fail to see any way
> that the differences can be useful!
>
> John McCoy
> ([email protected])
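
One plausible reading of why unphased female kits collect so many more X matches than male or phased kits is half-identical matching: with unphased data, two genotypes are usually treated as compatible unless they are opposite homozygotes, so every heterozygous call matches anything. The toy sketch below illustrates that idea only. It is not GEDmatch's actual algorithm, the function names are hypothetical, and the genotypes are invented for illustration.

```python
from typing import List


def half_identical(g1: str, g2: str) -> bool:
    """True if two unphased genotypes (e.g. 'AG', 'GG') share at least one allele."""
    return bool(set(g1) & set(g2))


def longest_run(kit_a: List[str], kit_b: List[str]) -> int:
    """Longest stretch of consecutive SNPs with no opposite-homozygote mismatch."""
    best = run = 0
    for g1, g2 in zip(kit_a, kit_b):
        run = run + 1 if half_identical(g1, g2) else 0
        best = max(best, run)
    return best


# Invented genotypes at five consecutive SNPs.
female_a = ["AG", "CT", "AG", "CT", "AA"]  # unphased, mostly heterozygous
female_b = ["AA", "CC", "GG", "TT", "AG"]
male_c = ["GG", "TT", "AA", "CC", "GG"]    # one X only, so every call looks homozygous

print(longest_run(female_a, female_b))  # 5: heterozygous calls never break the run
print(longest_run(male_c, female_b))    # 1: opposite homozygotes break it repeatedly
```

A phased kit behaves like the male kit here, because each comparison uses a single allele per SNP and a mismatch can end the segment.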