All ALT fields are the same after GenomicsDBImport and GenotypeGVCFs

bdemaree (San Francisco, CA) Member
Hi GATK team,

I'm running GATK to perform targeted single-cell genotyping, where each sample in the output VCF is a single cell (10-20k samples total per VCF). I've noticed something strange in the final VCF: for all cells genotyped as WT for a given variant, all ALT allele depths are the same.

A snippet of the VCF is provided below. In the first sample, for example, all ALTs have a depth of 6. Furthermore, the DP is always the sum of the REF depth and the first ALT depth (suggesting all other ALT depths should probably be 0). At the second site (chr1:115256518), there is a HET call with the correct depths listed, so this seems to be an issue only with the 0/0 calls.

chr1 115256516 . A G,T,*,C 32261.26 . AC=29,8,1,2;AF=1.566e-03,4.319e-04,5.399e-05,1.080e-04;AN=18522;BaseQRankSum=0.282;DP=4450248;ExcessHet=3.1936;FS=0.000;InbreedingCoeff=0.1634;MLEAC=28,8,1,2;MLEAF=1.512e-03,4.319e-04,5.399e-05,1.080e-04;MQ=41.96;MQRankSum=0.00;QD=2.19;ReadPosRankSum=0.00;SOR=0.291 GT:AD:DP:GQ:PL 0/0:717,6,6,6,6:723:99:0,120,1800,120,1800,1800,120,1800,1800,1800,120,1800,1800,1800,1800 0/0:841,8,8,8,8:849:99:0,120,1800,120,1800,1800,120,1800,1800,1800,120,1800,1800,1800,1800 0/0:292,1,1,1,1:293:99:0,120,1800,120,1800,1800,120,1800,1800,1800,120,1800,1800,1800,1800 0/0:1034,9,9,9,9:1043:99:0,120,1800,120,1800,1800,120,1800,1800,1800,120,1800,1800,1800,1800 0/0:134,0,0,0,0:134:99:0,120,1800,120,1800,1800,120,1800,1800,1800,120,1800,1800,1800,1800 0/0:130,0,0,0,0:130:99:0,120,1800,120,1800,1800,120,1800,1800,1800,120,1800,1800,1800,1800
chr1 115256518 . T C,A 33446.22 . AC=19,11;AF=1.026e-03,5.939e-04;AN=18522;BaseQRankSum=-1.180e+00;DP=4451040;ExcessHet=3.1125;FS=0.000;InbreedingCoeff=0.1986;MLEAC=19,11;MLEAF=1.026e-03,5.939e-04;MQ=41.96;MQRankSum=0.066;QD=2.08;ReadPosRankSum=0.023;SOR=0.021 GT:AD:DP:GQ:PL 0/0:722,1,1:723:99:0,120,1800,120,1800,1800 0/0:839,10,10:849:99:0,120,1800,120,1800,1800 ... 0/1:692,85,1:778:99:965,0,26646,3041,26933,30335 0/0:658,4,4:662:99:0,120,1800,120,1800,1800
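To make the pattern easier to spot across many samples, here is a minimal Python sketch that flags it in a single sample column. The function names are hypothetical and not part of GATK, and the check assumes the FORMAT string is GT:AD:DP:GQ:PL as in the snippet above. It only detects the symptom (a 0/0 call with identical non-zero ALT depths and DP equal to REF plus one ALT depth); it says nothing about the cause.

```python
def parse_sample(field, fmt="GT:AD:DP:GQ:PL"):
    """Split one VCF sample column into a dict keyed by FORMAT tag."""
    return dict(zip(fmt.split(":"), field.split(":")))


def has_repeated_alt_depths(field, fmt="GT:AD:DP:GQ:PL"):
    """True for a hom-ref call whose ALT depths are all equal and non-zero
    and whose DP equals the REF depth plus a single ALT depth."""
    sample = parse_sample(field, fmt)
    if sample["GT"] not in ("0/0", "0|0"):
        return False
    depths = [int(x) for x in sample["AD"].split(",")]
    ref_depth, alt_depths = depths[0], depths[1:]
    # Need at least two ALT alleles, and a non-zero depth, to see the pattern.
    if len(alt_depths) < 2 or alt_depths[0] == 0:
        return False
    all_equal = len(set(alt_depths)) == 1
    dp_matches = int(sample["DP"]) == ref_depth + alt_depths[0]
    return all_equal and dp_matches


# Sample columns copied from the two records above.
hom_ref_repeated = "0/0:717,6,6,6,6:723:99:0,120,1800,120,1800,1800,120,1800,1800,1800,120,1800,1800,1800,1800"
hom_ref_zero = "0/0:134,0,0,0,0:134:99:0,120,1800,120,1800,1800,120,1800,1800,1800,120,1800,1800,1800,1800"
het_call = "0/1:692,85,1:778:99:965,0,26646,3041,26933,30335"

for field in (hom_ref_repeated, hom_ref_zero, het_call):
    print(field.split(":")[1], has_repeated_alt_depths(field))
```

For the first sample, 717 + 6 = 723 matches DP, so it is flagged; the all-zero sample and the HET call are not.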

In terms of the pipeline, I'm using HaplotypeCaller, GenomicsDBImport, and GenotypeGVCFs, all from the same GATK version (I've observed the same behavior in another version as well). Strangely, using CombineGVCFs (which is very slow and requires iterative merging) does not produce the repeated ALT depths.

Here are the exact commands used for each of the three programs (I'm copying from my Python script so the string formatting is there):

'gatk HaplotypeCaller -R %s -I %s -O %s -L %s ' \
'--emit-ref-confidence BP_RESOLUTION ' \
'--verbosity ERROR ' \
'--native-pair-hmm-threads 1 ' \
'--max-alternate-alleles 2 ' \
'--max-reads-per-alignment-start 0 '

'gatk --java-options "-Xmx4g" GenomicsDBImport ' \
'--genomicsdb-workspace-path %s ' \
'--batch-size 50 ' \
'--reader-threads 2 ' \
'--validate-sample-name-map true ' \
'-L %s ' \
'--sample-name-map %s'

'gatk --java-options "-Xmx4g" GenotypeGVCFs ' \
'-V %s ' \
'-R %s ' \
'-L %s ' \
'-D %s ' \
'-O %s '
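
For anyone trying to reproduce this, here is a rough sketch of how the three command strings might be filled in and tokenized from Python. All file names are hypothetical placeholders, and the order of the values is my assumption based on the flags each %s follows. The gendb:// prefix on the -V argument is how GenotypeGVCFs is pointed at a GenomicsDB workspace.

```python
import shlex

# Command templates copied from the post; %s slots are filled positionally.
HAPLOTYPE_CALLER = (
    'gatk HaplotypeCaller -R %s -I %s -O %s -L %s '
    '--emit-ref-confidence BP_RESOLUTION '
    '--verbosity ERROR '
    '--native-pair-hmm-threads 1 '
    '--max-alternate-alleles 2 '
    '--max-reads-per-alignment-start 0'
)
GENOMICSDB_IMPORT = (
    'gatk --java-options "-Xmx4g" GenomicsDBImport '
    '--genomicsdb-workspace-path %s '
    '--batch-size 50 '
    '--reader-threads 2 '
    '--validate-sample-name-map true '
    '-L %s '
    '--sample-name-map %s'
)
GENOTYPE_GVCFS = (
    'gatk --java-options "-Xmx4g" GenotypeGVCFs '
    '-V %s -R %s -L %s -D %s -O %s'
)

# Hypothetical file names, for illustration only.
ref, bam, gvcf = 'ref.fasta', 'cell_0001.bam', 'cell_0001.g.vcf'
intervals, workspace = 'targets.bed', 'genomicsdb_workspace'
sample_map, dbsnp, out_vcf = 'sample_map.tsv', 'dbsnp.vcf', 'genotyped.vcf'

commands = [
    HAPLOTYPE_CALLER % (ref, bam, gvcf, intervals),
    GENOMICSDB_IMPORT % (workspace, intervals, sample_map),
    # GenotypeGVCFs reads the workspace through a gendb:// URI.
    GENOTYPE_GVCFS % ('gendb://' + workspace, ref, intervals, dbsnp, out_vcf),
]

# Tokenize each command; an argv list like this could then be passed to
# subprocess.run(argv, check=True) to launch the tool.
argvs = [shlex.split(cmd) for cmd in commands]
for argv in argvs:
    print(argv[:4])
```

Splitting with shlex keeps the quoted "-Xmx4g" as a single argument, so the commands can be run without going through a shell.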

I couldn't find any similar issues on the forum. I am fairly sure it's an issue with the GenomicsDB, given I have no issues when using CombineGVCFs instead (but, I could be wrong). Any ideas on what might be going on? Thanks!

