A potential bug report

dongmei, Mass General Hospital, Member

Dear GATK team,

I recently received a vcf file for an exome sequencing project from Broad. I found that some samples were called as HET even though all of the depth (DP) came from the alternate allele. For example:

On rs1129808 (ref:A; alt:C) , I found 0/1:0,4:4:99:333,0,631 for sample 001

On chr21:33733468 (ref:A; alt:AAAAGAAAGAAAGAAAGAAAG,AAAAGAAAGAAAGAAAG,AAAAG,AAAAGAAAGAAAG,AAAAGAAAG,AAAGAAAGAAAGAAAG) , I found 0/1:0,10,0,0,0,0,0:10:5:367,0,5,400,35,434,400,35,434,434,400,35,434,434,434,400,35,434,434,434,434,400,35,434,434,434,434,434 for sample 002

On chr3:195506451 (ref:AGGGGTGGCGTGACCTGTGGATACTGAGGAAGTGTCGGTGACAGGAAGG; alt:A), I found 0/1:0,54:54:99:4354,0,309 for sample 003

This kind of problem affects both SNVs and indels. Is this the result of a bug, or is there something I missed? I can provide more information if needed. Thanks a lot!

Dongmei
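
The pattern described above (a 0/1 genotype whose first AD value is 0) can be screened for with a short script. This is only a sketch, assuming the sample columns follow the GT:AD:DP:GQ:PL layout that the examples appear to use; the class and function names are hypothetical. It also relies on the VCF specification's ordering of PL values, in which diploid genotype j/k (j <= k) sits at index k*(k+1)/2 + j.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SampleCall:
    gt: str        # genotype, e.g. "0/1"
    ad: List[int]  # allele depths, reference allele first
    dp: int        # total depth
    gq: int        # genotype quality
    pl: List[int]  # phred-scaled genotype likelihoods


def parse_sample(field: str) -> SampleCall:
    """Parse one sample column; assumes the FORMAT is GT:AD:DP:GQ:PL."""
    gt, ad, dp, gq, pl = field.split(":")
    return SampleCall(
        gt=gt,
        ad=[int(x) for x in ad.split(",")],
        dp=int(dp),
        gq=int(gq),
        pl=[int(x) for x in pl.split(",")],
    )


def pl_index(j: int, k: int) -> int:
    """Index of diploid genotype j/k (j <= k) in the PL list (VCF spec ordering)."""
    return k * (k + 1) // 2 + j


def is_het_without_ref_reads(call: SampleCall) -> bool:
    """True for a call that includes the reference allele but reports AD[ref] == 0."""
    alleles = call.gt.replace("|", "/").split("/")
    if "." in alleles:
        return False  # no-call, nothing to check
    return "0" in alleles and len(set(alleles)) > 1 and call.ad[0] == 0


if __name__ == "__main__":
    examples = {
        "001": "0/1:0,4:4:99:333,0,631",
        "003": "0/1:0,54:54:99:4354,0,309",
    }
    for name, field in examples.items():
        call = parse_sample(field)
        print(
            name,
            is_het_without_ref_reads(call),
            call.pl[pl_index(0, 0)],  # PL for 0/0
            call.pl[pl_index(0, 1)],  # PL for 0/1
            call.pl[pl_index(1, 1)],  # PL for 1/1
        )
```

For both samples the PL of 0 falls on 0/1, so the likelihoods themselves favour a heterozygous call even though AD reports no reference reads.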

Answers

  • Sheila, Broad Institute, Member, Broadie, Moderator, admin

    @dongmei‌

    Hi Dongmei,

    Are you able to provide us with a snippet of the bam file that is causing the error?

    If so, instructions on how to do so are here: http://gatkforums.broadinstitute.org/discussion/1894/how-do-i-submit-a-detailed-bug-report

    -Sheila

  • dongmei, Mass General Hospital, Member

    Hi Sheila,

    Thank you for your quick response! Attached please find the snippets of the bam files for the three cases reported in my previous post. This is the first time I have created such snippets, so please let me know if this is what you need.

    Many thanks,

    Dongmei

  • Sheila, Broad Institute, Member, Broadie, Moderator, admin

    @dongmei‌

    Hi Dongmei,

    There is a formatting issue with the files you sent. They are not recognized as valid sam or bam files.

    Do you know what version of GATK was used? This seems like a bug that has been fixed recently. Do you know which variant caller was used?

    Can you post the command line that was used?

    Thanks,
    Sheila

  • dongmei, Mass General Hospital, Member

    Hi Sheila,

    The vcf file was generated a month ago (09/04/2014). snippet_chr6.bam was created from an old bam file generated on 09/27/2013, snippet_chr21.bam was created from a bam file generated on 05/08/2014, and snippet_chr3.bam was created from a bam file generated on 05/10/2014.

    I'm not sure which caller or what command line was used. Kathleen Tibbetts and Charlotte Tolonen did the work for us, and I'm waiting for the information from them.

    At which step do you think the problem occurred: generating the bam files or creating the vcf file? By the way, I don't know if it's relevant, but the vcf file is a joint-called result on samples from 3 different platforms.

    Thanks,

    Dongmei

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie, admin

    Hi @dongmei,

    We think this might be due to a bug that affected an earlier version of GATK. Once you know the exact version that was used, we may be able to confirm. If so, the good news is that it's not a very serious problem; the variant call itself is good, it's just the sample genotype that is problematic. I think it should be possible to fix these without re-calling everything by running RegenotypeVariants, but I have to check, since this is not the primary purpose of that tool.

    Can you just confirm the version once you have it, and post an IGV screenshot of the region to confirm that the reported allele depths are correct?

  • dongmei, Mass General Hospital, Member

    Hi Geraldine,

    The exact version is The Genome Analysis Toolkit (GATK) v3.1-144-g00f68a3, Compiled 2014/05/14 13:10:51.

    Attached is the IGV screenshot for MH0153493_006_006 on chr3:195506451 (ref:AGGGGTGGCGTGACCTGTGGATACTGAGGAAGTGTCGGTGACAGGAAGG; alt:A), where I found 0/1:0,54:54:99:4354,0,309 for sample MH0153493_006_006, and for TSC059_03_012_012 on chr21:33733468 (ref:A; alt:AAAAGAAAGAAAGAAAGAAAG,AAAAGAAAGAAAGAAAG,AAAAG,AAAAGAAAGAAAG,AAAAGAAAG,AAAGAAAGAAAGAAAG), where I found 0/1:0,10,0,0,0,0,0:10:5:367,0,5,400,35,434,400,35,434,434,400,35,434,434,434,400,35,434,434,434,434,400,35,434,434,434,434,434 for sample TSC059_03_012_012.

    Thanks,

    Dongmei

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie, admin

    I'm not seeing your chr3 variant but that's probably because it is only found after HC's realignment step. You can see the results of that by adding -bamout to your command line when you call the variants.

    But I can see your insertion in chr21, which definitely looks more heterozygous than homozygous-variant.

    So it looks like the genotype calls are correct but the reported AD values are not. We have had a few bugs in 3.1 and 3.2 that caused inaccurate AD values, which could explain this. The bugs have all been fixed in the development version.

    Depending on your needs you can either proceed with these calls as they are, or you could do a round of re-calling on the affected sites with the latest nightly build to get the correct AD values. You can do this by running HaplotypeCaller on the bam files, using the -L argument to pass in a vcf with the sites. Be sure to use the -bamout argument so you can visualize any changes made by HC's realignment process.

    Let us know if you need any help with this.
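
The re-calling step suggested above can be sketched as a small helper that assembles the command line. This is a hedged illustration, not a tested pipeline: it assumes a GATK 3.x-style invocation (`java -jar ... -T HaplotypeCaller` with `-R`, `-I` and `-o`), and every file name below is hypothetical. Only `-L` and `-bamout` come from the advice in this thread; check the remaining arguments against the documentation for the build you actually run.

```python
import shlex
from typing import List


def haplotypecaller_recall_args(
    gatk_jar: str,
    reference: str,
    bam: str,
    sites_vcf: str,
    out_vcf: str,
    bamout: str,
) -> List[str]:
    """Build an argument list for re-calling only the affected sites.

    Assumes GATK 3.x-style arguments; all paths are supplied by the caller.
    """
    return [
        "java", "-jar", gatk_jar,
        "-T", "HaplotypeCaller",
        "-R", reference,    # reference FASTA
        "-I", bam,          # input bam for the sample
        "-L", sites_vcf,    # restrict calling to the sites listed in this vcf
        "-o", out_vcf,      # re-called variants
        "-bamout", bamout,  # reads as realigned by HaplotypeCaller, for viewing in IGV
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    args = haplotypecaller_recall_args(
        gatk_jar="GenomeAnalysisTK.jar",
        reference="reference.fasta",
        bam="sample.bam",
        sites_vcf="affected_sites.vcf",
        out_vcf="recalled.vcf",
        bamout="recalled.bamout.bam",
    )
    # Print the command for review instead of running it.
    print(" ".join(shlex.quote(a) for a in args))
```

Printing the command rather than executing it makes it easy to inspect before submitting it to a cluster or running it by hand.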

  • dongmei, Mass General Hospital, Member

    Thanks a lot for your suggestion!
