
GATK4 Mutect2 variants IDs not shown

Hi everyone!

I am a beginner with GATK, so please bear with me. Also, I am sorry that this duplicates my post on Biostars; I believe this forum is the more appropriate place.

I am trying to do my variant calling with the new GATK4 Mutect2 (not the GATK3 MuTect2), using the af-only-gnomad.hg38.vcf.gz from the GATK bundle as --germline-resource. The command is pretty much the same as in the tutorial: https://software.broadinstitute.org/gatk/documentation/article?id=11136, only adapted for use with WES.

My problem is that, although the variant calling completes successfully, none of the variants have ID information. When I compared the output with a MuTect2 result, several variants there did have IDs; that run used --dbsnp, which is not available in Mutect2.
Basically, every ID is a "."

I checked the uncompressed gnomad.hg38.vcf.gz manually and the rsXXXXXX IDs are there.

Any ideas on why the variant calling is not collecting the ID info from the germline resource to populate the ID field? Thank you!



Best Answers


  • AdelaideR Member admin

    Hi @daianagan

    Have you tried running ValidateVariants on the gnomAD VCF file? I have a feeling there might be an issue with that file, even though it looks okay at first glance.

    Also, if you could post the command, that would be helpful.

  • daianagan Member
    edited March 2019

    Hi @AdelaideR, I am sorry for the belated response.
    I was able to run ValidateVariants on the gnomAD file with no errors. This is the gnomAD file downloaded from the GATK bundle, together with its index.

    The command I used:

    gatk Mutect2 \
      -bamout tumor_sample.bqsr.mutect.bam \
      -O tumor_sample.bqsr.somatic.vcf \
      --af-of-alleles-not-in-resource 4e-06 \
      --germline-resource af-only-gnomad.hg38.vcf \
      -I normal_sample.bqsr.bam \
      -I tumor_sample.bqsr.bam \
      -normal normal_sample \
      -R Homo_sapiens_assembly38.fasta \
      -tumor tumor_sample \
      -L \

    Thank you!

  • bhanuGandham Cambridge MA Member, Administrator, Broadie, Moderator admin

    Hi @daianagan

    Would you please post a few records of the variants you are talking about?

  • daianagan Member
    edited March 2019

    Hello @bhanuGandham,
    There are no specific variants, as none of them have IDs (all have . in the ID field), but here are two examples:

    Variant in position CHR1:1232836 has an ID, rs776909065.

    This variant corresponds to rs780138157

    Hope it helps,

    PS: I am also using the GATK Homo_sapiens_assembly38.fasta as reference from the bundle.

  • daianagan Member

    Thank you for the response @bhanuGandham !

    Just to clarify, I did not mean to do an annotation with the germline resource, but to be able to achieve the same result as I would by using --dbsnp in MuTect2 from GATK3 (that does not exist for Mutect2 from GATK4), which is populating the ID field with the rs for the known variants...

    As the germline resource was the only source of known variants, my guess was that Mutect2 would use it to populate the ID field, although somatic variants not present in gnomAD would not get an ID.

    I understand now from what you say that --dbsnp would be needed, and that this is not supported in Mutect2 for GATK4, am I right?

    Do you know whether there are plans to add this functionality in the near future? I saw it was available in the beta version, but not in the current one. I believe it is very useful and I (and probably several others) would really appreciate it!
    Thank you for your help!
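
The behaviour asked for above (filling the ID field of the calls with the rs IDs of known variants) can be approximated after calling. The following is only a minimal sketch of the matching logic, not anything Mutect2 does itself. It assumes both VCFs are uncompressed, and it matches records on exact CHROM, POS, REF and ALT, so multi-allelic records that are split differently in the two files will not match. The function name `copy_ids` is hypothetical. Dedicated tools such as `bcftools annotate` (with `-a` and `-c ID`) are likely a better choice for real data.

```shell
# Hypothetical helper: copy_ids RESOURCE.vcf CALLS.vcf
# Prints CALLS.vcf to stdout with missing IDs filled in from RESOURCE.vcf.
copy_ids() {
  # $1 = resource VCF (uncompressed), $2 = call set VCF (uncompressed)
  awk '
    BEGIN { FS = OFS = "\t" }
    # Header lines: skip in the resource, pass through from the call set.
    /^#/ { if (FNR != NR) print; next }
    # First file (resource): remember the ID by CHROM, POS, REF, ALT.
    FNR == NR { if ($3 != ".") ids[$1, $2, $4, $5] = $3; next }
    # Second file (calls): fill the ID only when it is missing and a match exists.
    { if ($3 == "." && ($1, $2, $4, $5) in ids) $3 = ids[$1, $2, $4, $5]; print }
  ' "$1" "$2"
}
```

With the file names from the command earlier in the thread, usage might look like `copy_ids af-only-gnomad.hg38.vcf tumor_sample.bqsr.somatic.vcf > tumor_sample.with_ids.vcf` (the output name is made up). Calls that are absent from the resource keep "." as their ID.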
