Missing depth (dp) after HaplotypeCaller

HoppmannHoppmann FreiburgMember
edited May 2015 in Ask the GATK team

Hi,

currently I'm running GATK version 3.3.0 and run in the problem, that quite often I don't get a value for DP after the HaplotypeCaller. There actually is nothing (just a dot) in the VCF file.

chr11 57569573 . A ACC 1640.8 VQSRTrancheINDEL99.90to100.00 . GT:AD:DP:GQ:PGT:PID:PL 0/1:57,16:.:99:0|1:57569569_CAGG_C:480,0,7262
chr11 57569575 . A AT 1686.05 VQSRTrancheINDEL99.90to100.00 . GT:AD:DP:GQ:PGT:PID:PL 0/1:60,16:.:99:0|1:57569569_CAGG_C:492,0,7092

What woks is to annotate afterwards using the VariantAnnotator. But in the documentation it states, that the VariantAnnotator defines the depth by the total amount of reads at that specific position. In contrast the DP after HaplotypeCaller should be the number of informative reads for that position. I can find that difference when I check for the position in GEMINI prior and after Annotation with Variant annotator.

Patient 1 (after reannotation using VariantAnnotator)
Position depth variant
57563990 75 C/C
57569568 81 CAGG/C
57569572 78 A/ACC
57569574 81 A/AT

Patient 2 (after reannotation using VariantAnnotator)
Postition depth variant
57563990 73 C/T
57569568 104 CAGG/CAGG
57569572 106 A/A
57569574 106 A/A

After HaplotypeCaller
Postion patient1 patient2 ...(other patients)
57563990 48 -1 -1 -1 -1 -1 C/C C/T C/T C/T C/T C/T
57569568 -1 66 52 -1 56 60 CAGG/C CAGG/CAGG CAGG/CAGG CAGG/C CAGG/CAGG CAGG/CAGG
57569572 -1 101 88 -1 101 77 A/ACC A/A A/A A/ACC A/A A/A
57569574 -1 97 87 -1 94 78 A/AT A/A A/A A/AT A/A A/A
57576040 -1 -1 -1 -1 -1 8 ./. ./. ./. ./. ./. C/C
57578937 -1 -1 -1 -1 -1 -1 ./. ./. ./. ./. ./. TTG/T
57578941 -1 -1 -1 -1 -1 -1 ./. ./. ./. ./. ./. C/CTG
57578942 -1 -1 -1 -1 -1 -1 ./. ./. ./. ./. ./. C/G

E.g. patient 2 has in the position 57569568 66 informative reads, but 104 reads in total. To check how valid a Variant is, I think it is way more interesting to know the number of informative reads.

In this specific case we checked for those mutation (57569572 and 57569574) via sanger and didn't find anything. If we would know that only 2 or 3 reads would be informative we might have gone to another variant instead.

Am I doing something wrong, or is there a bug in the software?

One example how I call the HaplotypeCaller.

java -Xmx10g -jar /data/ngs/bin/GATK/3.3-0/GenomeAnalysisTK.jar -l INFO -et NO_ET \
-T HaplotypeCaller \
-R /data/ngs/programs/bundle_2.8/ucsc.hg19.fasta \
-D /data/ngs/programs/bundle_2.8/dbsnp_138.hg19.vcf \
-L /data/ngs/exome_targets/SureSelect_V4_S03723314_Regions.bed \
-ERC GVCF \
-variant_index_type LINEAR \
-variant_index_parameter 128000 \
-stand_call_conf 30 \
-stand_emit_conf 30 \
-A Coverage -A AlleleBalance \
-A QualByDepth -A HaplotypeScore \
-nct 4 \
-I out/06-BQSR/21_383_1.bam \
-o out/08-variant-calling/21_383_1/21_383_1-raw.vcf \
-log out/08-variant-calling/21_383_1/21_383_1-raw_calling.log

Thanks in advance,

Anselm

Answers

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @Hoppmann
    Hi Anselm,

    Can you confirm you are looking at the final vcf and not at the intermediate GVCFs? What is in your input bam file out/06-BQSR/21_383_1.bam?

    Thanks,
    Sheila

  • HoppmannHoppmann FreiburgMember

    Hi,

    thank you for the quick response. We had a holliday yesterday. Else I would have responded earlier too. After the command I've allready posted I also run the GenotypeGVCFs as stated in the best practise.

    java -Xmx10g -jar /data/ngs/bin/GATK/3.3-0/GenomeAnalysisTK.jar -l INFO -K /data/ngs/bin/GATK/3.3-0/yong.li_uniklinik-freiburg.de.key -et NO_ET \
    -T GenotypeGVCFs \
    -R /data/ngs/programs/bundle_2.8/ucsc.hg19.fasta \
    -D /data/ngs/programs/bundle_2.8/dbsnp_138.hg19.vcf \
    -stand_call_conf 30 \
    -stand_emit_conf 30 \
    -nt 12 \
    -V out/08-variant-calling/21_383_1/21_383_1-raw.vcf \
    -V out/08-variant-calling/21_383_2/21_383_2-raw.vcf \
    -V out/08-variant-calling/21_383_3/21_383_3-raw.vcf \
    -V out/08-variant-calling/21_385_1/21_385_1-raw.vcf \
    -V out/08-variant-calling/21_385_2/21_385_2-raw.vcf \
    -V out/08-variant-calling/21_385_3/21_385_3-raw.vcf \
    -V out/08-variant-calling/21_428_1/21_428_1-raw.vcf \
    -V out/08-variant-calling/21_428_2/21_428_2-raw.vcf \
    -V out/08-variant-calling/21_428_3/21_428_3-raw.vcf \
    -V out/08-variant-calling/21_428_4/21_428_4-raw.vcf \
    -V out/08-variant-calling/21_428_6/21_428_6-raw.vcf \
    -V out/08-variant-calling/229_1/229_1-raw.vcf \
    -V out/08-variant-calling/229_2/229_2-raw.vcf \
    -V out/08-variant-calling/229_3/229_3-raw.vcf \
    -V out/08-variant-calling/229_4/229_4-raw.vcf \
    -V out/08-variant-calling/30_11_1/30_11_1-raw.vcf \
    -o out/08-variant-calling/combined/combined_raw.vcf \
    -log out/08-variant-calling/GenotypeGVCF.log

    Following I do a VQSR step followed by extraction of the families and/or individuals. Those are the vcf-files I was refering to.

    Right. My input bam-file ist the "out/06-BQSR/21_383_1.bam". It is the output of a prior BQRS run.

    Attatched you can find a log-file of my script in the log file, I stored all commands executed by the script,

    Best wishes,

    Anselm

  • Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie admin

    @Hoppmann Can you please do a test run on one of those sites (where you don't get DP annotated), without specifying any annotations? Just letting the tool apply its internal defaults. I think we've seen a case recently when specifying a particular annotation disabled some others and I'd like to check whether this might be the same bug.

    It would also be useful to try running the test command without multithreading (remove -nct), and also using the latest version (3.4) to check whether the problem still occurs after the update.

    Finally, as an FYI, note that you don't need to specify the confidence thresholds when running HC in -ERC GVCF mode as they will be ignored (values set to 0 internally). When running the GVCF workflow, those thresholds are only used by GenotypeGVCFs.

  • HoppmannHoppmann FreiburgMember
    edited May 2015

    Hi,

    thanks for the respnse. I've testet your suggestions with excluding specific annotaions. And it seems to work.

    chr11 57569572 GT:AD:DP:GQ:PGT:PID:PL 0/1:57,16:73:99:0|1:57569569_CAGG_C:480,0,7262 0/1:102,20:122:99:0|1:57569569_CAGG_C:489,0,12320
    chr11 57569574 GT:AD:DP:GQ:PGT:PID:PL 0/1:60,16:76:99:0|1:57569569_CAGG_C:492,0,7092 0/1:101,20:121:99:0|1:57569569_CAGG_C:495,0,12227

    Those are the extracts of the two patients without data from above for two positions.

    I'm running my script again from that point including all downstream to see if there are problems with upcoming steps like VEP or GEMINI. I'll let you know what happens. I'm also going to do a run just excluding "-nct" and let you know if anything happens there.

    Anselm

  • Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie admin

    Cool, glad to hear it. Can you tell me exactly which annotation exclusion did the trick? We'll want to investigate this buggy behavior in more detail.

  • HoppmannHoppmann FreiburgMember

    Hi,

    I run a few tests by just choosing one of the annotations and excluding the rest to see what happens. And it seems that only when I exclude all it works correctly. Following the output from gemini:

    gene chrom start gt_depths.21_428_1-AlleleBalance gt_depths.21_428_1-HaplotypeScore gt_depths.21_428_1-QualByDepth gt_depths.21_428_1-all gt_depths.21_428_1-coverage
    CTNND1 chr11 57569568 -1 -1 -1 77 -1
    CTNND1 chr11 57569572 -1 -1 -1 73 -1
    CTNND1 chr11 57569574 -1 -1 -1 76 -1

    the name states which option I had chosen (all means all excluded). Attatched you can find the commands I used to run the program.

    I also tried what happens when I exclude -nct. That dosen't seem to change anything.

    I hope this helps to find the bug. If you like another test tell me.

    Thanks again,
    Anselm

    Issue · Github
    by Geraldine_VdAuwera

    Issue Number
    1003
    State
    closed
    Last Updated
    Closed By
    vdauwera
  • HoppmannHoppmann FreiburgMember

    Attatchment didn't work previously. But now.

  • Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie admin

    Thanks @Hoppmann, this will help is fix it.

  • Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie admin
    edited June 2015

    Hi @Hoppmann, it seems that this has already been reported, albeit with other requested annotations. You can monitor the status of this issue in the tracker here: https://www.broadinstitute.org/gatk/guide/issue-tracker

Sign In or Register to comment.