VariantsToTable returns multi-allelic AD and PL as [[[email protected]]

Hi Team,

When asking VariantsToTable to return `-GF AD` and `-GF PL`, these fields are represented in the outfile as '[[[email protected]]', '[[[email protected]]', etc. This appears to be limited to multi-allelic variants, and happens with versions 3.3-0, 3.3-3, 3.1-1, and 2.7-1.

Is this expected behaviour, and if yes, is there a way to 'decode' [[[email protected]]?

Many thanks in advance, and thanks for the great tools!
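
Editor's note: for anyone who needs to find out which rows of an existing table are affected, a minimal sketch follows. It assumes the table is tab-delimited and that per-sample columns are named `SAMPLE.FIELD` (as in the output posted further down this thread). The function name `find_malformed_cells` is hypothetical and not part of GATK. It simply flags any AD or PL cell that is neither `NA` nor a comma-separated list of integers.

```python
import csv
import io
import re

# A well-formed AD or PL cell is a comma-separated list of non-negative integers.
INT_LIST = re.compile(r"^\d+(,\d+)*$")


def find_malformed_cells(table_text, fields=("AD", "PL")):
    """Return (row_number, column_name, value) for every suspicious cell.

    Hypothetical helper: assumes a tab-delimited table whose per-sample
    columns are named SAMPLE.FIELD, e.g. AA0243Z.AD.
    """
    reader = csv.DictReader(
        io.StringIO(table_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    problems = []
    for row_number, row in enumerate(reader, start=1):
        for column, value in row.items():
            # Skip overflow cells and site-level columns such as CHROM or DP.
            if column is None or "." not in column:
                continue
            # Only check the requested genotype fields.
            if column.rsplit(".", 1)[1] not in fields:
                continue
            # NA is the normal missing-value marker; anything else must be integers.
            if value != "NA" and not INT_LIST.match(value or ""):
                problems.append((row_number, column, value))
    return problems
```

Row numbers count data rows only, so the header line is not included. To check a file, read it first, e.g. `find_malformed_cells(open("t.tab").read())`.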

Answers

  • KlausNZ Member ✭✭

    Oops, please replace 3.3-3 with 3.2-2.

  • Sheila (Broad Institute), Member, Broadie admin

    @KlausNZ‌

    Hi,

    I just tried this, and I did not get the error you are getting. Can you please post your command line?

    Thanks,
    Sheila

  • Sheila (Broad Institute), Member, Broadie admin

    @KlausNZ‌

    Hi again,

    I tested on 2 variant alleles. Does this happen only when there are more than 2 variant alleles?

    -Sheila

  • KlausNZ Member ✭✭

    Hi Sheila,
    Sorry for the delay - I have been paring back the rather long command and testing it on VCFs with fewer samples. The problem still occurs, but I noticed that it only happens with the --splitMultiAllelic option (unfortunately, the counts for split alleles are exactly what I'm after). The commands are (with/without --splitMultiAllelic):
    -R $GATKREF \
    -T VariantsToTable \
    --splitMultiAllelic \
    --allowMissingData \
    --variant t.vcf \
    --out t.tab \
    -F CHROM \
    -F POS \
    -F REF \
    -F ALT \
    -F FILTER \
    -F QUAL \
    -GF AD \
    -GF PL \
    -GF DP \
    -GF GQ \
    -GF GT \
    -F AC \
    -F AF \
    -F AN \
    -F DB \
    -F DP \
    -F DS \
    -F FS
    The vcf records are (I cut out the FILTER field here to aid visibility)
    #CHROM POS ID REF ALT QUAL FILTER FORMAT AA0243Z
    1 1267325 rs200330269 G GC 1746.26 PASS GT:AD:DP:GQ:PL 0/0:38,0:38:93:0,93,1395
    1 1684347 . CCCT CCCTCCT,C,CCCTCCTCCT 106993.14 PASS GT:AD:DP:GQ:PL 0/0:33,0,0,0:33:99:0,99,923,99,923,923,99,923,923,923
    1 1850627 . CAGCGGCAGG C 5063.71 VQSRTrancheINDEL99.00to99.90 GT:AD:DP:GQ:PL ./.:0,0:0

    Output with --splitMultiAllelic:
    CHROM POS REF ALT FILTER QUAL AC AF AN DB DP DS FS AA0243Z.AD AA0243Z.PL AA0243Z.DP AA0243Z.GQ AA0243Z.GT
    1 1267325 G GC PASS 1746.26 6 0.026 234 true 1656 NA 3.811 38,0 0,93,1395 38 93 G/G
    1 1684347 CCCT CCCTCCT PASS 106993.14 93 0.381 244 NA 5639 NA 0.539 [[[email protected]] [[[email protected]] 33 99 CCCT/CCCT
    1 1684347 CCCT C PASS 106993.14 2 8.197e-03 244 NA 5639 NA 0.539 [[[email protected]] [[[email protected]] 33 99 CCCT/CCCT
    1 1684347 CCCT CCCTCCTCCT PASS 106993.14 4 0.016 244 NA 5639 NA 0.539 [[[email protected]] [[[email protected]] 33 99 CCCT/CCCT

    Output without --splitMultiAllelic:
    CHROM POS REF ALT FILTER QUAL AC AF AN DB DP DS FS AA0243Z.AD AA0243Z.PL AA0243Z.DP AA0243Z.GQ AA0243Z.GT
    1 1267325 G GC PASS 1746.26 6 0.026 234 true 1656 NA 3.811 38,0 0,93,1395 38 93 G/G
    1 1684347 CCCT CCCTCCT,C,CCCTCCTCCT PASS 106993.14 93,2,4 0.381,8.197e-03,0.016 244 NA 5639 NA 0.539 33,0,0,0 0,99,923,99,923,923,99,923,923,923 33 99 CCCT/CCCT

    I'll be happy to upload files for testing (the VCF was produced with the GVCF pipeline and SelectVariants), and no worries if this is what -SMA does ;-)
    Many thanks again!
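
Editor's note: since the output without --splitMultiAllelic is intact, one possible stopgap is to split the AD and PL values per ALT allele after the fact. The sketch below is a workaround under stated assumptions, not a description of what GATK does internally: it assumes diploid genotypes, the standard VCF ordering of PL values (genotype j/k with j <= k sits at index k(k+1)/2 + j), and that the reference-versus-one-ALT subset is what is wanted. The function name `split_row_per_alt` is hypothetical, and the subset PL values are copied as they are, without renormalizing.

```python
def split_row_per_alt(alt, ad, pl):
    """Split comma-separated ALT, AD and PL strings into one entry per ALT allele.

    Returns a list of (alt_allele, "refAD,altAD", "PL00,PL0i,PLii") tuples.
    Hypothetical helper: assumes a diploid sample and VCF genotype ordering.
    """
    alts = alt.split(",")
    ad_values = ad.split(",")
    pl_values = pl.split(",")

    n_alleles = len(alts) + 1                        # REF plus every ALT
    expected_pl = n_alleles * (n_alleles + 1) // 2   # number of diploid genotypes
    if len(ad_values) != n_alleles or len(pl_values) != expected_pl:
        raise ValueError("AD/PL lengths do not match a diploid genotype call")

    rows = []
    for i, allele in enumerate(alts, start=1):
        het = i * (i + 1) // 2   # index of genotype 0/i
        hom = het + i            # index of genotype i/i
        rows.append((
            allele,
            ",".join([ad_values[0], ad_values[i]]),
            ",".join([pl_values[0], pl_values[het], pl_values[hom]]),
        ))
    return rows
```

Applied to the multi-allelic record above, `split_row_per_alt("CCCTCCT,C,CCCTCCTCCT", "33,0,0,0", "0,99,923,99,923,923,99,923,923,923")` gives an AD of `33,0` and a PL of `0,99,923` for each of the three ALT alleles.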

  • KlausNZ Member ✭✭

    Dear Sheila, it also happens with variants that have only two ALT alleles.
    And of course it was the INFO field that I cut from the vcf records in my previous reply - sorry!

  • KlausNZ Member ✭✭

    Hi Team, I made three attempts to leave (a single) follow-up comment on Sunday (maybe Sat your time). It seemed to require approval, but doesn't show up, so I wonder whether it was lost?

  • KlausNZ Member ✭✭

    I thought I'd read about this before, just took me a long time to find ebank's comment on durtschi's post from 2012: http://gatkforums.broadinstitute.org/discussion/comment/1210/#Comment_1210. durtschi's command didn't include --SMA, so I wonder whether the old fix wasn't tested with that option (if --SMA existed in GATK2.2 at all).

  • KlausNZ Member ✭✭

    Thanks Geraldine!

  • Geraldine_VdAuwera (Cambridge, MA), Member, Administrator, Broadie admin

    durtschi's command didn't include --SMA, so I wonder whether the old fix wasn't tested with that option (if --SMA existed in GATK2.2 at all)

    Yep this sounds very likely. @Sheila‌ is going to put in a bug ticket for this and hopefully we can get this fixed.

  • Sheila (Broad Institute), Member, Broadie admin

    @KlausNZ‌

    Hi,

    I have just put in a bug report and will let you know when it is fixed.

    -Sheila

  • KlausNZ Member ✭✭

    Thanks Sheila!

  • Sheila (Broad Institute), Member, Broadie admin

    @KlausNZ‌

    Hi,

    This issue has been fixed today. It will be available in tonight's nightly build: https://www.broadinstitute.org/gatk/nightly

    -Sheila
