Accurate ref/alt read counts for DNPs

mrooney Cambridge, MA Member


I am using HaplotypeCaller in "genotype_given_alleles" mode to obtain REF and ALT read counts for candidate variants (using the AD field). This seems to work fine for SNPs and indels; however, DNPs (e.g. REF=CC, ALT=AT) are always assigned a variant read count of zero (e.g. "GT:AD:DP:GQ:PL 0/0:331,0:331:99:0,1072,2147483647"). When I look at the HC-generated BAM in a viewer, the variant reads are clearly present in abundance, so the reported read counts seem to be wrong.

Is this expected behavior? If not, could you recommend steps or checks to figure this out?

I have attached my HC parameters, a list of some DNPs that were missed, and a screenshot of the first variant.
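For reference, here is a minimal sketch of how the REF and ALT counts are read from the AD field of a record like the one quoted above. It assumes the standard VCF layout, where the FORMAT column lists colon-separated keys and the sample column carries the matching values; the helper names `parse_sample` and `allele_depths` are made up for illustration and are not part of GATK.

```python
def parse_sample(format_str, sample_str):
    """Pair each FORMAT key with its value from the sample column."""
    keys = format_str.split(":")
    values = sample_str.split(":")
    return dict(zip(keys, values))


def allele_depths(format_str, sample_str):
    """Return the AD values as a list [ref, alt1, ...], or None if AD is absent."""
    fields = parse_sample(format_str, sample_str)
    ad = fields.get("AD")
    if ad is None or ad == ".":
        return None
    # A "." entry means the depth for that allele is missing.
    return [None if x == "." else int(x) for x in ad.split(",")]


# The record quoted in the post: 331 reads assigned to REF, 0 to ALT.
depths = allele_depths("GT:AD:DP:GQ:PL", "0/0:331,0:331:99:0,1072,2147483647")
ref_count, alt_count = depths[0], depths[1]
print(ref_count, alt_count)
```

With the quoted record this gives a REF count of 331 and an ALT count of 0, which is the value that disagrees with what the viewer shows.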




  • mrooney Cambridge, MA Member
    I forgot to mention that I am using GATK 3.5.
  • mrooney Cambridge, MA Member

    And here is the command line:

    java -Xmx18000M -jar /opt/GenomeAnalysisTK_3.5-0-g36282e4.jar \
      --analysis_type HaplotypeCaller \
      --out HC.vcf \
      -bamout output_HC.bam \
      --bamWriterType ALL_POSSIBLE_HAPLOTYPES \
      --standard_min_confidence_threshold_for_emitting 20 \
      --standard_min_confidence_threshold_for_calling 20 \
      --reference_sequence human.fasta \
      --input_file input.bam \
      --dontUseSoftClippedBases \
      --dontTrimActiveRegions \
      --intervals alleles.vcf \
      --interval_padding 500 \
      --genotyping_mode GENOTYPE_GIVEN_ALLELES \
      --gatk_key gatk.key \
      --forceActive \
      --disableOptimizations \
      --dbsnp sbsnp.vcf \
      --alleles alleles.vcf

  • mrooney Cambridge, MA Member

    I came across a post indicating that HaplotypeCaller cannot call DNPs (rather it would call two SNPs, which could be phased with other tools). Should I take this to mean that HaplotypeCaller also cannot handle DNPs in genotype_given_alleles mode? If so, does this imply that HaplotypeCaller will have issues with other complex variants (e.g. REF=CC,ALT=A) that cannot be represented as a simple indel or SNP?

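To make the "two SNPs" representation mentioned in the last comment concrete, here is a small sketch that splits an equal-length substitution such as REF=CC, ALT=AT into per-base SNP records. It only illustrates the representation and says nothing about what HaplotypeCaller does internally; the function name `decompose_mnp` and the chromosome and position in the example are hypothetical.

```python
def decompose_mnp(chrom, pos, ref, alt):
    """Split an equal-length substitution into one SNP per differing base.

    Returns a list of (chrom, pos, ref_base, alt_base) tuples, with
    positions in the same coordinate system as the input.
    """
    if len(ref) != len(alt):
        # E.g. REF=CC, ALT=A is a complex variant, not a simple MNP,
        # and cannot be split base by base like this.
        raise ValueError("REF and ALT must have the same length")
    return [
        (chrom, pos + i, r, a)
        for i, (r, a) in enumerate(zip(ref, alt))
        if r != a  # skip bases that are unchanged
    ]


# Hypothetical coordinates; the alleles are the DNP from the question.
for record in decompose_mnp("chr1", 100, "CC", "AT"):
    print(record)
```

A complex variant such as REF=CC, ALT=A has alleles of different lengths, so this sketch raises an error for it rather than guessing at a decomposition.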