RootsWeb.com Mailing Lists
    1. Re: Richard III DNA Investigation
    2. taf
    3. On Sunday, August 27, 2017 at 10:38:10 AM UTC-7, Stewart Baldwin wrote:
       > On 8/26/2017 4:03 PM, taf wrote:
       > > In fact, the smallest divisible segments are probably in the 10s to 100s of thousands of bases, which is what puts a definite limit on the number of generations over which autosomal DNA is likely to be informative without a huge amount of luck (for every ancestor from 1600 from whom you have a detectable preserved block, you have many more ancestors from whom you inherit no DNA whatsoever). These blocks pass intact for an incredibly long time: the block that includes the gene determining the most common form of blue eyes is about 150,000 bp long and appears to have passed largely intact for more than 12,000 years. This presents a problem on two sides: relatively close relatives may not share the block at all. If two people do share the block, it shows they are related, but perhaps too distantly to be genealogically relevant.
       >
       > Doesn't the "centimorgan" (cM) measurement take this partially into account? If I understand the method correctly, one cM will sometimes contain a huge number of base pairs, and sometimes a relatively small number (if it is contained in a "hot spot"). So, wouldn't a long segment that passed on intact (or essentially intact) over thousands of years have a centimorgan measurement close to zero? Do I have this right?

       A centimorgan is very much a term of traditional, pre-genomic genetics, so to a degree we are using rotary-phone terminology to describe smartphones, but there are, on average, with very large error bars, about 750,000 bp per cM in humans. If there was a hotspot, this would produce a smaller number of bp per centimorgan in the immediate proximity, while a region devoid of crossing-over would be part of a larger-than-average sized centimorgan (it would take more total sequence to generate a 1% chance of crossing over, since you would have this long stretch with no crossing over whatsoever).

       > With regard to Richard's Y-DNA, have STR-tests been done for enough markers that one could do a global search to see what surnames pop up among his closest matches? I know that the noise to information ratio can be too large if not enough markers have been tested, but it seems like it would be worth a shot.

       They did a 23-marker analysis, which is pretty superficial, and gave them just one step beyond the basal haplogroup. G2 arose well before 7000 years ago, at which time it is found in Spain, France and Germany, in burials associated with the first agriculturalist populations that spread from Anatolia to largely displace the native hunter-gatherers. It was already highly divergent by that time, so it probably is quite ancient. The 23-marker testing done on Richard does not allow his subclade to be determined, so a surname analysis on this level would be largely uninformative. To say that Alans included Gs so Richard's came from the Alans is completely unsupportable in light of this history.

       For their specific results, see Supplementary Table 2, on p. 8 of their Data Supplement:
       https://images.nature.com/original/nature-assets/ncomms/2014/141202/ncomms6631/extref/ncomms6631-s1.pdf

       They have an ongoing project to do whole genome sequencing, from which further markers may be determined that would allow the subclade to be determined, but as far as I know, this information is not public yet, and until it is, such a surname study can only serve to exclude those who are not G2, but will not provide genealogically-relevant information.

       taf
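
To make the bp-versus-cM point concrete, here is a minimal Python sketch. It assumes a made-up, piecewise-constant recombination map (the `RECOMB_MAP` values and the function name `genetic_length_cm` are illustrative only, not measured human rates): the ordinary region is set to about 1.3 cM per Mb, which roughly corresponds to the 750,000 bp per cM average mentioned above. Genetic length is then just physical length multiplied by the local rate, summed over regions.

```python
# Hypothetical recombination map: (start_bp, end_bp, rate in cM per Mb).
# All numbers are invented for illustration; they are not real human data.
RECOMB_MAP = [
    (0, 1_000_000, 1.3),            # ordinary region, about 750,000 bp per cM
    (1_000_000, 1_010_000, 50.0),   # narrow hotspot
    (1_010_000, 2_000_000, 0.05),   # cold region with almost no crossing over
]

def genetic_length_cm(start_bp, end_bp, recomb_map=RECOMB_MAP):
    """Genetic length in cM of the physical interval [start_bp, end_bp)."""
    total = 0.0
    for seg_start, seg_end, rate_cm_per_mb in recomb_map:
        # Number of bases of the query interval that fall inside this map segment.
        overlap = min(end_bp, seg_end) - max(start_bp, seg_start)
        if overlap > 0:
            total += overlap / 1_000_000 * rate_cm_per_mb
    return total

cold_block = genetic_length_cm(1_200_000, 1_350_000)  # 150,000 bp in the cold region
hot_block = genetic_length_cm(1_000_000, 1_010_000)   # 10,000 bp spanning the hotspot

print(f"150,000 bp cold block: {cold_block:.4f} cM")
print(f"10,000 bp hotspot:     {hot_block:.4f} cM")
```

Under these assumed rates the 150,000 bp block in the cold region measures far less than 1 cM, while a hotspot fifteen times shorter measures much more, which is the sense in which a long-preserved block would have a centimorgan measurement close to zero.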

    08/27/2017 06:54:18
    1. Re: Richard III DNA Investigation
    2. Darrel Hockley
    3. That would explain why no one else living today shows up as a Y-DNA match to Richard III in anything I have read about the matter. I had wondered about that in the light there seems to be a public push to find living relatives on his maternal side. Darrel Hockley

    08/27/2017 02:02:29
    1. Re: Richard III DNA Investigation
    2. taf
    3. On Sunday, August 27, 2017 at 12:54:20 PM UTC-7, taf wrote:
       > They have an ongoing project to do whole genome sequencing, from which further markers may be determined that would allow the subclade to be determined,

       For those holding out hope, let me add that I said 'may' intentionally. The next-generation sequencing approaches most commonly used would provide useful SNP data (places where a single base is different), but are abysmal at determining the number of repeats in STRs (and are prone to guess, and when they do, guess wrong). This has to do with the way the information is generated: it collects very short stretches of sequence and uses a computer to line them up. Since, by its nature, an STR has the same sequence repeated again and again, the computer doing the alignment doesn't know which set to align with.

       As an example, let's say you generated sequences that could be characterized as:

       ABCDEFG
       DEFGHIJ

       This is simple to align as ABCDEFGHIJ. However, if you have a repeat:

       ABCDEEEEEEE
       EEEEEEEFGHI

       there is no way to tell how many of the repeats you have, or which E aligns with which:

       ABCDEEE....EEEFGHI

       The only way you can count what is in between is if you have a single sequencing read that spans all the way from one side to the other: DEEEEEEEEEFG. However, most genealogically-informative STR regions are longer than the read lengths typically generated by the sequencing reaction, so you will never get this information. Unless it has specifically been programmed not to do this, the computer doing the compiling may simply align one stretch of EEEEE with another and give you a sequence that has a definite number even though the raw data was ambiguous. Thus, I wouldn't hold out hope that whole-genome sequencing will resolve this question, and one should be very careful in accepting reported values unless these issues are specifically evaluated.

       The STR analysis that is typically performed is PCR-based, not sequencing-based, and hence does not suffer from these limitations, but one has to separately evaluate each STR, hence the scaled costs for progressively more 'markers', each additional marker being another test that must be performed. There is nothing to stop them doing a 100-marker test on Richard's DNA - if it was preserved well enough to do a 23-marker test, it should be good enough to do any number. It is just a question of whether the research group would view this as a priority or not. (And one may be able to sway this decision on their part with a sizable financial contribution to their research program, but it is going to cost you a lot more than simply having FTDNA test a cheek swab.)

       taf
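
The alignment ambiguity described above can be reproduced with a small Python sketch. Everything here is a toy assumption: letters stand in for bases as in the ABCDE example, reads are error-free, read depth is ignored, and the helper names (`make_genome`, `reads`, `repeat_counts_from_spanning_reads`) are invented for illustration. It is not how any real assembler works, only a demonstration of why short reads cannot distinguish repeat counts.

```python
def make_genome(repeat_count):
    """Toy sequence: unique flanks around a run of the repeat unit 'E'."""
    return "ABCD" + "E" * repeat_count + "FGHI"

def reads(genome, read_len):
    """Set of every error-free read of length read_len (depth is ignored)."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}

def repeat_counts_from_spanning_reads(read_set):
    """Repeat counts seen in reads that contain both flanks (D ... F)."""
    return {r.count("E") for r in read_set if "D" in r and "F" in r}

short_len, long_len = 6, 16

# Two 'individuals' whose STR differs: 10 repeats versus 14 repeats.
short_a = reads(make_genome(10), short_len)
short_b = reads(make_genome(14), short_len)
long_a = reads(make_genome(10), long_len)
long_b = reads(make_genome(14), long_len)

short_reads_identical = short_a == short_b   # the two genomes look the same
long_reads_identical = long_a == long_b      # spanning reads tell them apart

print("short reads identical:", short_reads_identical)
print("counts from short reads:", repeat_counts_from_spanning_reads(short_a))
print("counts from long reads:", repeat_counts_from_spanning_reads(long_a),
      repeat_counts_from_spanning_reads(long_b))
```

With 6-letter reads the two genomes yield exactly the same set of reads, so no read reaches from D to F and no count can be recovered. With 16-letter reads at least one read spans the whole repeat in each genome, and the counts can be read off directly.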

    08/28/2017 04:32:29
    1. Re: Richard III DNA Investigation
    2. Andrew Lancaster
    3. On Monday, August 28, 2017 at 7:32:31 PM UTC+2, taf wrote:
       > For those holding out hope, let me add that I said 'may' intentionally. The next-generation sequencing approaches most commonly used would provide useful SNP data (places where a single base is different), but are abysmal at determining the number of repeats in STRs (and are prone to guess, and when they do, guess wrong).

       But surely you do not need STRs if you have full sequencing? Even between a parent and a child there are likely to be some novel SNP mutations.

       For those who do not realize the difference, I will try to give a summary for our forum here. I imagine others might be able to suggest improvements on my definitions:

       1. The current autosomal DNA tests only test a large number of places where very common mutations are known to distinguish whole populations of people. On this basis all kinds of analysis are made to work out whether a whole series of specific values at mutations known to be physically near each other have been inherited together. ...So you try to make a tree of these blocks. But the blocks do not play nice, and keep splitting up differently.

       2. The older, cheaper STR tests looked for recent mutations, but deliberately looked at places where mutations are known to happen so often that they also often go backwards, or happen in many different people in the same generation. Nevertheless, you are often able to make crude family tree guesses, including recent and ancient generations, depending on how many recruits you have.

       3. Full sequencing would go way beyond both of the above, because it can let us zoom in on new and uncommon mutations happening in one place on a chromosome (Y or autosomal). Because these mutations are uncommon, real family trees will be able to be made.

       ...In fact this is exactly what has been done with Y DNA SNP mutations, except that in the past most tests have not been full sequencing, and have involved small numbers of people (compared to the number of genealogists doing genetic genealogy). This means a real male line family tree of all men really does exist, with very little room for doubt, but it mainly involves ancient branching. Most generations are missing. Most men living today have to share their branch end with lots of other men until more complete sequencing gives more data to define the more recent branching.
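
As a rough illustration of men "sharing their branch end", the following Python sketch groups tested men by the set of derived Y-SNPs each carries. The names (`man_a`, `S1`, and so on) and the data are entirely hypothetical, and real Y-tree building handles missing calls and recurrent mutations that this toy ignores.

```python
from collections import defaultdict

# Hypothetical derived-SNP calls per tested man; all names and markers are invented.
derived = {
    "man_a": {"S1", "S2", "S3"},
    "man_b": {"S1", "S2", "S3"},
    "man_c": {"S1", "S2"},
    "man_d": {"S1", "S4"},
}

def shared_branch(x, y):
    """SNPs both men carry, i.e. the part of the tree they have in common."""
    return derived[x] & derived[y]

# Men with identical SNP sets cannot yet be separated: they share a branch end.
groups = defaultdict(list)
for man, snps in derived.items():
    groups[frozenset(snps)].append(man)
unresolved = sorted(sorted(g) for g in groups.values() if len(g) > 1)

print("a and d share:", sorted(shared_branch("man_a", "man_d")))
print("a and c share:", sorted(shared_branch("man_a", "man_c")))
print("unresolved branch ends:", unresolved)
```

In this toy data `man_a` and `man_b` stay together on one branch end until a newly found SNP separates them, which is the situation described above for most men living today.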

    08/28/2017 09:18:07
    1. Re: Richard III DNA Investigation
    2. taf
    3. On Monday, August 28, 2017 at 3:18:08 PM UTC-7, Andrew Lancaster wrote:
       > But surely you do not need STRs if you have full sequencing? Even between a parent and a child there are likely to be some novel SNP mutations.

       On a purely technical level, this is correct, but having the Y genome will only be useful for genealogy by comparison to other people. It is still well in the future when enough people will have sequenced their entire genomes for genealogical purposes to provide a pool large enough that you are likely to find a 'match' with Richard III based on Y-genome sequence, whereas there may well already be enough people tested to find an STR match.

       taf

    08/28/2017 10:39:21