Hello All,

My mails to DNA-newbie seem to be taking 3 or 4 days to get through (or at least, they aren't getting back to me).

I'm re-posting this on genealogy-dna, since it hasn't (yet) made it through to newbie. Anyway, it might be kinder to the intended newbie audience if this discussion could drift over to some more appropriate forum.

Paul Rakow

---------------------------- Original Message ----------------------------
Subject: Re: [DNA-NEWBIE] Anatomy of an IBD segment
From: "Paul Rakow" <paul.rakow@cantab.net>
Date: Sun, October 25, 2015 16:04
To: DNA-NEWBIE@yahoogroups.com
--------------------------------------------------------------------------

There is a classic problem in statistics that can help us out here. In the 19th century, the Prussian army had gathered statistics on how many soldiers in each unit were killed by being kicked by a horse or mule. In most years most units lost no-one; there were units where one soldier died that way, and even a few units that lost two or three men through horse-kicks in a given year. Something called the Poisson distribution did a very good job of predicting the data.

We can use the same method to estimate how common it will be for you to have triangulating groups (people who match you, and match each other, at an overlapping location on your chromosome).

If we look at people who match us on a single segment of about 12 cM, on a 44-chromosome genome of about 7200 cM, then the mule-kick model says we would see our first example of a triangle of mutual matches when we have looked at around 30 or 40 matches.

The number of overlapping groups grows rapidly as the number of people tested grows - if the number of matches you have doubles, the number that fall into groups increases by a factor of about 4. (In this way, the calculation is very close to the "shared birthday" problem that several people have discussed.)

If you look at 100 matches, something like 85 of them will be isolated matches; the other 15 will be in groups (mostly triangles, but it is at about this point that you can start seeing groups of 4).

With 200 matches, the model says about 145 isolated matches, the other 55 in groups.

A model like this has shortcomings; one could improve it by putting in a range of sizes for the single-segment matches. The model does have the big advantage that you can get away from the "almost zero" times "almost infinity" debate.

For me, the message of the calculation is that triangles of matches should start showing up quite commonly once you get past 100 matches - long before your genome is filled up with matching segments.

Paul Rakow
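For anyone who wants to play with the numbers, here is a minimal Python sketch of one way the mule-kick calculation could be set up. The post does not spell out the arithmetic, so the binning is an assumption: the 7200 cM genome is cut into 7200 / 12 = 600 equal slots, each match lands in one slot at random, and a slot holding two or more matches counts as a triangulating group. The names `poisson_pmf` and `summarize` are made up for illustration.

```python
import math

GENOME_CM = 7200.0   # total genome length quoted in the post
SEGMENT_CM = 12.0    # single-segment match length quoted in the post
# Assumption (not stated in the post): treat the genome as equal slots,
# one segment wide, playing the role of the army units.
SLOTS = GENOME_CM / SEGMENT_CM   # 600 slots


def poisson_pmf(k, lam):
    """Poisson probability of exactly k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def summarize(n_matches):
    """Expected counts for n_matches segments dropped at random on the slots."""
    lam = n_matches / SLOTS               # mean number of matches per slot
    p0 = poisson_pmf(0, lam)
    p1 = poisson_pmf(1, lam)
    p2 = poisson_pmf(2, lam)
    isolated = SLOTS * p1                 # slots holding exactly one match
    return {
        "isolated": isolated,
        "in_groups": n_matches - isolated,        # matches sharing a slot
        "groups": SLOTS * (1 - p0 - p1),          # slots with 2+ matches
        "groups_of_4": SLOTS * (1 - p0 - p1 - p2),  # you plus 3+ matches
    }


for n in (35, 100, 200):
    s = summarize(n)
    print(n, {name: round(value, 2) for name, value in s.items()})
```

Under these assumptions the sketch gives about one group at 35 matches, roughly 85 isolated matches out of 100, and roughly 143 isolated out of 200, which is close to the figures quoted above. For small numbers of matches the expected number of groups is approximately n^2 / 1200, which is where the "doubling gives a factor of about 4" behaviour comes from.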
Thanks, Paul,

This would explain what we are actually seeing. As the matches continue to pour in, we cannot say there is no more room at a segment for more IBD segments to triangulate, just because the math says the probability is very low.

Jim - www.segmentology.org

> On Oct 27, 2015, at 8:13 AM, Paul Rakow via <genealogy-dna@rootsweb.com> wrote:
> [...]