RootsWeb.com Mailing Lists
Total: 2/2
    1. Re: [DNA] Full Genomes Corporation tests
    2. Ann Turner via
    3. Does the final column in the VCF file start with three characters like these?

       0/0 homozygous for the REF allele
       0/1 heterozygous
       1/1 homozygous for the ALT allele

       Ann Turner

       On Thu, Jan 7, 2016 at 4:12 AM, Iain Kennedy via <genealogy-dna@rootsweb.com> wrote:

       > I am playing around with what can be done with this data, I was
       > particularly interested to see how far I could get mocking up an
       > AncestryDNA type file from the data. However I think 2x is too low to get
       > far except as a proof of concept. I have loaded the dbSNP VCF into mySQL
       > but it doesn't seem to follow the normal VCF standard which is a shame (it
       > isn't using '.' in the ALT column when the calls were all ancestral). I
       > also had a go using mpileup and varscan overnight but this requires 8x
       > reads minimum:
       >
       > C:\genealogy\software>samtools mpileup -f human_g1k_v37.fasta GWK3W.bam | java -jar VarScan.v2.3.7.jar pileup2snp > gwk3w_vs.vcf
       > [mpileup] 1 samples in 1 input files
       > <mpileup> Set max per-file depth to 8000
       > Warning: No p-value threshold provided, so p-values will not be calculated
       > Min coverage: 8
       > Min reads2: 2
       > Min var freq: 0.01
       > Min avg qual: 15
       > P-value thresh: 0.99
       > Input stream not ready, waiting for 5 seconds...
       > Reading input from STDIN
       > 2349372654 bases in pileup file
       > 8008795 met minimum coverage of 8x
       > 2905740 SNPs predicted
       >
       > I haven't loaded it into the db yet.
       >
       > The dbSNP VCF I refer to includes ancestral and derived calls and mine has
       > 107M rows.
       >
       > Iain
       >
       > -------------------------------
       > To unsubscribe from the list, please send an email to
       > GENEALOGY-DNA-request@rootsweb.com with the word 'unsubscribe' without
       > the quotes in the subject and the body of the message
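The three genotype codes in the message above can be read mechanically. Here is a minimal Python sketch, assuming the sample column puts GT first (as the VCF specification requires when GT is present) and separates subfields with colons. The function name `classify_genotype` is made up for illustration and is not part of any tool mentioned in the thread.

```python
def classify_genotype(sample_field: str) -> str:
    """Classify the GT value at the start of a VCF sample column.

    Assumes a diploid call such as "0/1" or "1|1", optionally followed
    by colon-separated subfields (e.g. "0/1:12,3:15").
    """
    gt = sample_field.split(":", 1)[0]
    # Treat phased ("|") and unphased ("/") separators the same way.
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "no call"
    if len(set(alleles)) > 1:
        return "heterozygous"
    if alleles[0] == "0":
        return "homozygous REF"
    return "homozygous ALT"


if __name__ == "__main__":
    for field in ("0/0", "0/1:12,3:15", "1/1", "./."):
        print(field, "->", classify_genotype(field))
```

A call like 1/2 (two different ALT alleles) is reported here as heterozygous, which may or may not be what you want at 2x coverage.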

    01/06/2016 10:15:35
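The VarScan summary quoted in the message above gives a sense of how little of a 2x genome clears the 8x threshold. The arithmetic below uses only the three counts reported in that output. The variable names are illustrative.

```python
# Counts copied from the VarScan pileup2snp summary in the thread.
bases_in_pileup = 2_349_372_654
met_min_coverage = 8_008_795
snps_predicted = 2_905_740

# Share of pileup positions that reached the 8x minimum coverage.
covered_fraction = met_min_coverage / bases_in_pileup
# Share of those covered positions that VarScan predicted as SNPs.
snp_fraction = snps_predicted / met_min_coverage

print(f"{covered_fraction:.2%} of pileup positions reached 8x")
print(f"{snp_fraction:.1%} of those positions were predicted as SNPs")
```

That works out to roughly a third of a percent of positions reaching 8x, which fits the remark that 2x data is only good for a proof of concept with these settings.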
    1. Re: [DNA] Full Genomes Corporation tests
    2. Iain Kennedy via
    3. Yes, plus the INFO string has fields summing ancestral and derived reads. I was just puzzled about the ALT field as the VCF standard states it should have a "." if not used, i.e. no derived call. The BigY VCFs were like that.

       Iain

       From: dnacousins@gmail.com
       Date: Thu, 7 Jan 2016 05:15:35 -0500
       Subject: Re: [DNA] Full Genomes Corporation tests
       To: ikennedy_msdn2@hotmail.com; genealogy-dna@rootsweb.com

       > Does the final column in the VCF file start with three characters like these?
       >
       > 0/0 homozygous for the REF allele
       > 0/1 heterozygous
       > 1/1 homozygous for the ALT allele
       >
       > Ann Turner
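To find the records discussed here, one could scan for rows called 0/0 whose ALT column still names an allele instead of ".". This is a rough sketch, assuming a single-sample, tab-delimited VCF with the standard fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT) and the sample in the tenth column. `ref_calls_with_alt` is a hypothetical helper, not something from samtools, VarScan, or the vendor's pipeline.

```python
from typing import Iterable, Iterator, Tuple


def ref_calls_with_alt(vcf_lines: Iterable[str]) -> Iterator[Tuple[str, int, str]]:
    """Yield (chrom, pos, alt) for records genotyped 0/0 whose ALT is not '.'."""
    for line in vcf_lines:
        # Skip meta-information lines, the header line, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no sample column to inspect
        chrom, pos, alt, sample = fields[0], fields[1], fields[4], fields[9]
        # GT is assumed to be the first colon-separated subfield.
        gt = sample.split(":", 1)[0].replace("|", "/")
        if gt == "0/0" and alt != ".":
            yield chrom, int(pos), alt


if __name__ == "__main__":
    # Made-up example rows, not taken from any real file.
    demo = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
        "1\t1000\trs1\tA\tG\t.\t.\t.\tGT\t0/0",
        "1\t2000\trs2\tC\t.\t.\t.\t.\tGT\t0/0",
        "1\t3000\trs3\tT\tC\t.\t.\t.\tGT\t0/1",
    ]
    for hit in ref_calls_with_alt(demo):
        print(hit)
```

With a file on disk, passing the open file object (`open(path)`) in place of the list should work the same way, since the function only iterates over lines.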

    01/07/2016 04:31:40