A question about using GenomeAnalysisTK to deal with ancient DNA data

Hi Team,
I want to call variants for some ancient samples using HaplotypeCaller.
My commands are as follows:
java -jar GenomeAnalysisTK.jar -R myref.fa -T HaplotypeCaller -ERC BP_RESOLUTION -I 1.bam -L site.list -o 1.genotype.g.vcf.gz
java -jar GenomeAnalysisTK.jar -R myref.fa -T HaplotypeCaller -ERC BP_RESOLUTION -I 2.bam -L site.list -o 2.genotype.g.vcf.gz
java -jar GenomeAnalysisTK.jar -R myref.fa -T HaplotypeCaller -ERC BP_RESOLUTION -I 3.bam -L site.list -o 3.genotype.g.vcf.gz
java -jar GenomeAnalysisTK.jar -R myref.fa -T CombineGVCFs --variant 1.genotype.g.vcf.gz --variant 2.genotype.g.vcf.gz --variant 3.genotype.g.vcf.gz -o merge.g.vcf.gz
java -jar GenomeAnalysisTK.jar -R myref.fa -T GenotypeGVCFs --variant merge.g.vcf.gz --includeNonVariantSites -stand_call_conf 30 -stand_emit_conf 30 -o merge.vcf.gz
In 3.genotype.g.vcf.gz, one site is:
1 72710 . G C,<NON_REF> 2.99 . DP=1;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;RAW_MQ=3600.00 GT:AD:DP:GQ:PL:SB 1/1:0,1,0:1:3:25,3,0,25,3,25:0,0,1,0
In merge.g.vcf.gz, the site is:
1 72710 . G C,<NON_REF> . . DP=1;ExcessHet=3.01;RAW_MQ=3600.00 GT:AD:DP:GQ:PL:SB ./.:0,0,0:0:0:0,0,0,0,0,0 ./.:0,0,0:0:0:0,0,0,0,0,0 ./.:0,1,0:1:3:25,3,0,25,3,25:0,0,1,0
But in merge.vcf.gz, the site is:
1 72710 . G C . . AC=0;AF=0.00;AN=2;DP=1;ExcessHet=3.01;MQ=60.00 GT:AD:DP:RGQ ./.:0,0:0:0 ./.:0,0:0:0 0/0:0,1:1:3
Why does this low-QUAL site change from 1/1 to 0/0 in the final VCF instead of being removed?
What impact does this have on my results?
I have very few variant sites, so every site is important.
Could you suggest how to fix this problem?
Thanks!
best,
KuroKami

Answers

  • Geraldine_VdAuwera (Cambridge, MA) Member, Administrator, Broadie, admin
    You simply do not have enough coverage at this site to make useable calls. Here the program is still emitting the site because of the no-calls in two samples (by default we want to see where the program was uncertain) but in any case you cannot do anything with this kind of site. You can filter it out in the next step.
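
    The filtering step mentioned above could be sketched as follows. This is only an illustration, not the GATK implementation: the function name `mask_low_depth` and the depth threshold of 5 are assumptions made for the example, and the record is split on whitespace only because the line pasted in this thread lost its tabs. Within GATK itself, tools such as VariantFiltration or SelectVariants are the usual place to do this.

```python
MIN_DP = 5  # assumed threshold for illustration; not a value from this thread


def mask_low_depth(vcf_line, min_dp=MIN_DP):
    """Return the VCF record with low-depth genotypes set to no-call (./.).

    Hypothetical helper: it reads the per-sample DP field named in the
    FORMAT column and rewrites GT for any sample below min_dp.
    """
    # Real VCF files are tab-delimited; whitespace splitting is used here
    # only because the example line was pasted with spaces.
    fields = vcf_line.rstrip("\n").split()
    fmt_keys = fields[8].split(":")
    gt_idx = fmt_keys.index("GT")
    dp_idx = fmt_keys.index("DP")
    for i in range(9, len(fields)):
        values = fields[i].split(":")
        dp = values[dp_idx] if dp_idx < len(values) else "."
        if dp == "." or int(dp) < min_dp:
            values[gt_idx] = "./."
            fields[i] = ":".join(values)
    return "\t".join(fields)


# The merge.vcf.gz record from the question.
record = (
    "1 72710 . G C . . AC=0;AF=0.00;AN=2;DP=1;ExcessHet=3.01;MQ=60.00 "
    "GT:AD:DP:RGQ ./.:0,0:0:0 ./.:0,0:0:0 0/0:0,1:1:3"
)
masked = mask_low_depth(record)
print(masked.split("\t")[9:])
```

    With this threshold, the third sample's 0/0 call (supported by a single read) becomes a no-call, so all three samples are uncalled at the site and it can be dropped from downstream analysis.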