
GATK4 VariantsToTable unable to properly assign ANN field to multi-allelic variants

The input vcf record was:

chr19 9115341 rs2217657 C G,A 29868.6 PASS AC=70,1;AF=0.368,5.263e-03;AN=190;BaseQRankSum=-3.238e+00;DB;DP=3117;ExcessHet=9.1637;FS=1.757;InbreedingCoeff=-0.1235;MLEAC=69,1;MLEAF=0.363,5.263e-03;MQ=59.95;MQRankSum=0.00;PG=3,0,3,26,26,51;POSITIVE_TRAIN_SITE;QD=15.16;ReadPosRankSum=-7.980e-01;SOR=0.616;VQSLOD=9.05;culprit=MQRankSum;ANN=A|missense_variant|MODERATE|OR7G1|ENSG00000161807|transcript|ENST00000541538.1|protein_coding|1/1|c.423G>T|p.Trp141Cys|423/936|423/936|141/311||,G|missense_variant|MODERATE|OR7G1|ENSG00000161807|transcript|ENST00000541538.1|protein_coding|1/1|c.423G>C|p.Trp141Cys|423/936|423/936|141/311|| ...

The following command was run:
java -jar gatk-package-4.1.0.0-local.jar VariantsToTable --variant chr19.genotypeRefined.ann.recode.vcf --split-multi-allelic -F CHROM -F POS -F REF -F ALT -F ID -F TYPE -F TRANSITION -F FILTER -F HET -F HOM-REF -F HOM-VAR -F VAR -F ANN -F LOF -F NMD -GF GT -GF GQ --output ch19.genotypeRefined.ann.recode.table

The output table contains two entries associated with the above variant:

chr19 9115341 C G rs2217657 SNP -1 PASS 51 34 10 61 A|missense_variant|MODERATE|OR7G1|ENSG00000161807|transcript|ENST00000541538.1|protein_coding|1/1|c.423G>T|p.Trp141Cys|423/936|423/936|141/311|| ...

chr19 9115341 C A rs2217657 SNP -1 PASS 51 34 10 61 G|missense_variant|MODERATE|OR7G1|ENSG00000161807|transcript|ENST00000541538.1|protein_coding|1/1|c.423G>C|p.Trp141Cys|423/936|423/936|141/311|| ...

Look at the ANN field. The annotations for C>G and C>A have been swapped.

Furthermore, there are entries where the ANN field has not been split, but simply copied to all alleles. For example,

chr19 3752876 A G rs8102086 SNP -1 PASS 50 14 31 81 C|missense_variant|MODERATE|APBA3|ENSG00000011132|transcript|ENST00000316757.3|protein_coding|7/11|c.1126T>G|p.Cys376Gly|1327/2075|1126/1728|376/575||,G|missense_variant|MODERATE|APBA3|ENSG00000011132|transcript|ENST00000316757.3|protein_coding|7/11|c.1126T>C|p.Cys376Arg|1327/2075|1126/1728|376/575||...,G|non_coding_transcript_exon_variant|MODIFIER|APBA3|ENSG00000011132|transcript|ENST00000592826.1|retained_intron|3/4|n.400T>C||||||...

chr19 3752876 A C rs8102086 SNP -1 PASS 50 14 31 81 C|missense_variant|MODERATE|APBA3|ENSG00000011132|transcript|ENST00000316757.3|protein_coding|7/11|c.1126T>G|p.Cys376Gly|1327/2075|1126/1728|376/575||,G|missense_variant|MODERATE|APBA3|ENSG00000011132|transcript|ENST00000316757.3|protein_coding|7/11|c.1126T>C|p.Cys376Arg|1327/2075|1126/1728|376/575||...,G|non_coding_transcript_exon_variant|MODIFIER|APBA3|ENSG00000011132|transcript|ENST00000592826.1|retained_intron|3/4|n.400T>C||||||...

Is there an issue with the command or am I misinterpreting the observation?
Thanks
Srikant

Answers

  • SkyWarrior (Turkey, Member ✭✭✭)
    Accepted Answer

    Can you try splitting multiallelics before the annotation step? This will probably solve your problem.

  • srikant_verma (India, Member)

    Thanks @SkyWarrior for your kind reply. However, I am not sure your suggestion will work: if I split the multiallelic sites before annotation, I will end up with a tab-separated table, which I won't be able to feed to SnpEff for annotation. Is there a way to convert such a table to a VCF?
    Anyway, do you think there is a bug/limitation in the VariantsToTable tool that is creating the problem, or did I miss something obvious while running it?

  • SkyWarrior (Turkey, Member ✭✭✭)
    Accepted Answer

    Splitting multiallelics won't produce a table. You can use GATK LeftAlignAndTrimVariants, bcftools norm or vt decompose for this. Any of these tools will produce a VCF file that has all multiallelics split into biallelic SNPs or indels.
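
For readers unfamiliar with the term, the following is a minimal Python sketch of what "splitting multiallelics into biallelic records" means conceptually. It is not how any of the tools named above are implemented. The helper `split_multiallelic` and the constant `PER_ALLELE_KEYS` are hypothetical names introduced for illustration, and the sketch assumes that `AC`, `AF`, `MLEAC` and `MLEAF` carry one value per ALT allele. Real tools also rewrite the genotype columns and may trim or left-align alleles, which is not attempted here.

```python
# Hypothetical helper for illustration only; not part of GATK, bcftools or vt.
# Assumption: these INFO keys hold one value per ALT allele (Number=A).
PER_ALLELE_KEYS = ("AC", "AF", "MLEAC", "MLEAF")


def split_multiallelic(chrom, pos, ref, alts, info, per_allele_keys=PER_ALLELE_KEYS):
    """Return one (chrom, pos, ref, alt, info) tuple per ALT allele.

    Per-allele INFO values are distributed by ALT index; all other
    INFO values are copied unchanged to every output record.
    """
    records = []
    for i, alt in enumerate(alts):
        split_info = {}
        for key, value in info.items():
            split_info[key] = value[i] if key in per_allele_keys else value
        records.append((chrom, pos, ref, alt, split_info))
    return records


# A subset of the INFO values from the record quoted in the question.
biallelic = split_multiallelic(
    "chr19", 9115341, "C", ["G", "A"],
    {"AC": [70, 1], "AF": [0.368, 5.263e-03], "AN": 190},
)
for rec in biallelic:
    print(rec)
```

Once every record carries a single ALT allele, an annotator run afterwards only ever writes annotations for that one allele, so there is nothing left to distribute across rows when the table is built.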

  • srikant_verma (India, Member)

    Thanks a lot @SkyWarrior! I used GATK's LeftAlignAndTrimVariants to split the multi-allelic sites into bi-allelic sites, followed by annotation using SnpEff. Your suggestion solved the issue. Thanks!!
    However, it would be great if GATK's VariantsToTable could handle this. Maybe it already does and I am just not aware of the parameters.
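
As a possible workaround for tables that have already been generated, the ANN entries can be matched to ALT alleles by content rather than by position, because the first `|`-delimited sub-field of each entry in the record above is the allele it describes. The sketch below is only an illustration: `ann_by_allele` is a hypothetical helper, not a GATK or SnpEff function, and it assumes that commas appear only between ANN entries.

```python
from collections import defaultdict


def ann_by_allele(ann_value):
    """Group comma-separated ANN entries by their first '|' sub-field (the allele)."""
    grouped = defaultdict(list)
    for entry in ann_value.split(","):
        allele = entry.split("|", 1)[0]
        grouped[allele].append(entry)
    return dict(grouped)


# ANN value copied from the chr19:9115341 record in the question.
ann = (
    "A|missense_variant|MODERATE|OR7G1|ENSG00000161807|transcript|"
    "ENST00000541538.1|protein_coding|1/1|c.423G>T|p.Trp141Cys|"
    "423/936|423/936|141/311||,"
    "G|missense_variant|MODERATE|OR7G1|ENSG00000161807|transcript|"
    "ENST00000541538.1|protein_coding|1/1|c.423G>C|p.Trp141Cys|"
    "423/936|423/936|141/311||"
)

grouped = ann_by_allele(ann)
for alt in ["G", "A"]:  # ALT order as written in the VCF record
    print(alt, grouped.get(alt, []))
```

With this lookup, the C>G row receives the `c.423G>C` entry and the C>A row receives the `c.423G>T` entry, regardless of the order in which the entries were written.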

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    @SkyWarrior Thank you for your input; as always, it is a great resource for our GATK community!

    @srikant_verma I am glad the suggestion helped you solve the issue. You are right that GATK's VariantsToTable should be able to handle this. Let me check with the dev team and get back to you.
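
While this is being checked, one reading that would be consistent with both observations in the question is that a list-valued field is handed out by position when its length happens to equal the number of ALT alleles, and is copied whole otherwise. This is an assumption inferred from the output shown in this thread, not taken from the GATK documentation or source code. The function `split_by_index` below is a hypothetical model of that behaviour, and the ANN entries are abbreviated to the allele and the HGVS.c sub-field for readability.

```python
def split_by_index(values, n_alt):
    """Hypothetical model, not GATK code.

    If the list has exactly one value per ALT allele, give row i the i-th
    value; otherwise copy the whole comma-joined list to every row.
    """
    if len(values) == n_alt:
        return [values[i] for i in range(n_alt)]
    return [",".join(values)] * n_alt


# Case 1: ALT = G,A but the ANN entries are listed as A first, then G.
alts = ["G", "A"]
ann_two = ["A|c.423G>T", "G|c.423G>C"]  # abbreviated entries
rows_two = dict(zip(alts, split_by_index(ann_two, len(alts))))
print(rows_two)  # the G row gets the entry that describes allele A

# Case 2: ALT = G,C with three ANN entries, so the list length does not match.
alts2 = ["G", "C"]
ann_three = ["C|c.1126T>G", "G|c.1126T>C", "G|n.400T>C"]  # abbreviated entries
rows_three = dict(zip(alts2, split_by_index(ann_three, len(alts2))))
print(rows_three)  # both rows get the full, unsplit list
```

Under this model the swap in the first example comes from the annotator ordering entries differently from the ALT column, and the unsplit field in the second example comes from having more ANN entries than ALT alleles.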

  • srikant_verma (India, Member)

    Thanks @bhanuGandham for your reply!
    The ANN fields have been added by running SnpEff. I am not sure if I could have run the SnpEff command differently to avoid the first issue.
    For the second case, please find the attachment for your reference.

  • srikant_verma (India, Member)

    Thanks @bhanuGandham for your inputs!
