My whole-genome sequence shows me to be heterozygous for the T to A SNP at location 12854090 of chromosome 1, known in dbSNP as rs79698223. This SNP purports to change a codon from TTG (coding for leucine) to TAG (a stop codon) in exon 3 of the PRAMEF1 gene.

This would be a potentially serious mutation, in that the new stop codon would disrupt synthesis of the PRAMEF1 protein. However, looking at my pile-up reads, I see that this apparent SNP is accompanied by 4 other SNPs in the same immediate vicinity, and a given read has either all of them or none of them. Also, the ExAC database shows that almost everyone is heterozygous for this SNP, which would not be expected for a genuine variant.

Running BLAST on the short sequence containing the 5 SNPs against the NCBI database shows that it is a perfect match for positions 13030016 to 13030089 of the Build 38 human sequence. This would correspond approximately to 13062773-13062846 of the Build 37 human sequence. The problem is that this part of the Build 37 reference sequence shows only N's.

Thus my whole-genome data, as well as the ExAC data, which are being compared with the Build 37 sequence instead of Build 38, cannot be mapped to the proper location, since Build 37 does not show this region. The reads are instead being mapped to the closest matching region that Build 37 does show, which is incorrect. rs79698223 is therefore, as I see it, no longer "operable".

Obed
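To make the "almost everyone is heterozygous" argument concrete, here is a minimal Python sketch of the Hardy-Weinberg expectation. It assumes random mating and a simple two-allele site, which is a textbook simplification and not something taken from ExAC itself; the function names `hardy_weinberg` and `max_het_fraction` are made up for illustration. Under those assumptions the heterozygous fraction 2pq can never exceed one half, so a site where nearly every sample is called heterozygous looks more like reads from two similar regions piling up on one location than like a real SNP.

```python
def hardy_weinberg(p):
    """Expected (hom_ref, het, hom_alt) genotype fractions.

    p is the frequency of the alternate allele; random mating and a
    two-allele site are assumed (illustrative simplification).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)


def max_het_fraction(steps=1000):
    """Largest expected heterozygous fraction over a grid of allele frequencies."""
    return max(hardy_weinberg(i / steps)[1] for i in range(steps + 1))


if __name__ == "__main__":
    for p in (0.01, 0.1, 0.5, 0.9):
        hom_ref, het, hom_alt = hardy_weinberg(p)
        print(f"alt allele frequency {p:.2f}: expected het fraction {het:.3f}")
    print("largest het fraction on the grid:", max_het_fraction())
```

The ceiling is reached at an allele frequency of 0.5; any observed heterozygous fraction well above one half is a hint that something other than ordinary inheritance is producing the calls.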
Follow-up to my message of Tue, Oct 13, 2015 at 11:25 AM:

I see that SNP rs78738981, at Build 37 location 12919574 (Build 38 location 12859719) of chr 1, is also artifactual, for the same reason as rs79698223. Both of these purported T to A mutations should really be mapped to Build 38 location 13030073 of chr 1, where A is the reference allele in an intergenic region. I wonder how long it will take dbSNP to realize this.
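For anyone who wants to check the coordinate arithmetic, here is a small Python sketch that uses only the numbers quoted in this thread. The constant and function names are hypothetical, and the Build 37 interval is the approximate correspondence given above, not the output of a liftover tool.

```python
# Coordinates quoted in this thread (chr 1, inclusive intervals).
B38_MATCH = (13030016, 13030089)    # BLAST match on Build 38
B37_APPROX = (13062773, 13062846)   # approximate Build 37 equivalent (all N's)
B38_TRUE_SITE = 13030073            # where the T to A calls should map


def span_length(interval):
    """Number of bases in an inclusive (start, end) interval."""
    return interval[1] - interval[0] + 1


def offsets(src, dst):
    """Shift from src to dst at the start and at the end of the interval."""
    return (dst[0] - src[0], dst[1] - src[1])


def contains(interval, pos):
    """True if pos lies inside the inclusive interval."""
    return interval[0] <= pos <= interval[1]


if __name__ == "__main__":
    start_shift, end_shift = offsets(B38_MATCH, B37_APPROX)
    print("match length:", span_length(B38_MATCH))
    print("Build 38 to Build 37 shift:", start_shift, end_shift)
    print("site inside match:", contains(B38_MATCH, B38_TRUE_SITE))
    # Applying the same shift gives a rough Build 37 position for the site.
    print("approx Build 37 position:", B38_TRUE_SITE + start_shift)
```

The two intervals have the same length and a constant shift, and position 13030073 falls inside the BLAST match, which is consistent with both rs numbers pointing at the same underlying base.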