VariantsToTable returns multi-allelic AD and PL as [[[email protected]]

Hi Team,

When asking VariantsToTable to return `-GF AD` and `-GF PL`, these fields are represented in the outfile as '[[[email protected]]', '[[[email protected]]', etc. This appears to be limited to multi-allelic variants, and happens with versions 3.3-0, 3.3-3, 3.1-1, and 2.7-1.

Is this expected behaviour, and if yes, is there a way to 'decode' [[[email protected]]?

Many thanks in advance, and thanks for the great tools!

Answers

  • KlausNZ, Member ✭✭

    Oops, please replace 3.3-3 with 3.2-2.

  • Sheila, Broad Institute, Member, Broadie ✭✭✭✭✭

    @KlausNZ‌

    Hi,

    I just tried this, and I did not get the error you are getting. Can you please post your command line?

    Thanks,
    Sheila

  • Sheila, Broad Institute, Member, Broadie ✭✭✭✭✭

    @KlausNZ‌

    Hi again,

    I tested on 2 variant alleles. Does this happen only when there are more than 2 variant alleles?

    -Sheila

  • KlausNZ, Member ✭✭

    Hi Sheila,
    Sorry for the delay - I have been paring back the rather long command and testing it on VCFs with fewer samples. The problem still occurs, but I noticed that it only happens with the --splitMultiAllelic option (unfortunately, the counts for split alleles are exactly what I'm after). The commands are (with/without --splitMultiAllelic):
    -R $GATKREF \
    -T VariantsToTable \
    --splitMultiAllelic \
    --allowMissingData \
    --variant t.vcf \
    --out t.tab \
    -F CHROM -F POS -F REF -F ALT -F FILTER -F QUAL \
    -GF AD -GF PL -GF DP -GF GQ -GF GT \
    -F AC -F AF -F AN -F DB -F DP -F DS -F FS
    The vcf records are (I cut out the FILTER field here to aid visibility)
    #CHROM POS ID REF ALT QUAL FILTER FORMAT AA0243Z
    1 1267325 rs200330269 G GC 1746.26 PASS GT:AD:DP:GQ:PL 0/0:38,0:38:93:0,93,1395
    1 1684347 . CCCT CCCTCCT,C,CCCTCCTCCT 106993.14 PASS GT:AD:DP:GQ:PL 0/0:33,0,0,0:33:99:0,99,923,99,923,923,99,923,923,923
    1 1850627 . CAGCGGCAGG C 5063.71 VQSRTrancheINDEL99.00to99.90 GT:AD:DP:GQ:PL ./.:0,0:0

    Output with --splitMultiAllelic:
    CHROM  POS      REF   ALT         FILTER  QUAL       AC  AF         AN   DB    DP    DS  FS     AA0243Z.AD            AA0243Z.PL            AA0243Z.DP  AA0243Z.GQ  AA0243Z.GT
    1      1267325  G     GC          PASS    1746.26    6   0.026      234  true  1656  NA  3.811  38,0                  0,93,1395             38          93          G/G
    1      1684347  CCCT  CCCTCCT     PASS    106993.14  93  0.381      244  NA    5639  NA  0.539  [[[email protected]]  [[[email protected]]  33          99          CCCT/CCCT
    1      1684347  CCCT  C           PASS    106993.14  2   8.197e-03  244  NA    5639  NA  0.539  [[[email protected]]  [[[email protected]]  33          99          CCCT/CCCT
    1      1684347  CCCT  CCCTCCTCCT  PASS    106993.14  4   0.016      244  NA    5639  NA  0.539  [[[email protected]]  [[[email protected]]  33          99          CCCT/CCCT

    Output without --splitMultiAllelic:
    CHROM  POS      REF   ALT                   FILTER  QUAL       AC      AF                     AN   DB    DP    DS  FS     AA0243Z.AD  AA0243Z.PL                          AA0243Z.DP  AA0243Z.GQ  AA0243Z.GT
    1      1267325  G     GC                    PASS    1746.26    6       0.026                  234  true  1656  NA  3.811  38,0        0,93,1395                           38          93          G/G
    1      1684347  CCCT  CCCTCCT,C,CCCTCCTCCT  PASS    106993.14  93,2,4  0.381,8.197e-03,0.016  244  NA    5639  NA  0.539  33,0,0,0    0,99,923,99,923,923,99,923,923,923  33          99          CCCT/CCCT

    I'll be happy to upload files for testing (the VCF was produced with the gVCF pipeline and SelectVariants), and no worries if this is simply what -SMA does ;-)
    Many thanks again!
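As a sanity check on the multi-allelic record at 1:1684347 quoted above, the field sizes can be verified against the VCF convention for diploid samples: AD carries one value per allele (REF plus each ALT), and PL carries one value per possible genotype. The sketch below is only an illustration; the helper names `parse_sample` and `expected_lengths` are made up for this example and are not part of GATK.

```python
def parse_sample(format_field, sample_field):
    """Pair FORMAT keys with one sample's values, e.g. 'GT:AD' with '0/0:33,0'."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    # VCF allows trailing fields to be dropped (as in the ./. record),
    # so zip() simply stops at the shorter list.
    return dict(zip(keys, values))


def expected_lengths(n_alt):
    """Expected number of AD and PL values for n_alt ALT alleles (diploid only)."""
    n_alleles = n_alt + 1                          # REF plus every ALT allele
    n_genotypes = n_alleles * (n_alleles + 1) // 2  # unordered allele pairs
    return n_alleles, n_genotypes


alt = "CCCTCCT,C,CCCTCCTCCT"
sample = parse_sample(
    "GT:AD:DP:GQ:PL",
    "0/0:33,0,0,0:33:99:0,99,923,99,923,923,99,923,923,923",
)
n_ad, n_pl = expected_lengths(len(alt.split(",")))
ad = sample["AD"].split(",")
pl = sample["PL"].split(",")
print(len(ad), n_ad, len(pl), n_pl)  # 4 4 10 10
```

With three ALT alleles the record is expected to hold 4 AD values and 10 PL values, which matches the input VCF, so the data going into VariantsToTable look well-formed.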

  • KlausNZ, Member ✭✭

    Dear Sheila, it also happens with variants that have only two ALT alleles.
    And of course it was the INFO field that I cut from the vcf records in my previous reply - sorry!

  • KlausNZ, Member ✭✭

    Hi Team, I made three attempts to leave (a single) follow-up comment on Sunday (maybe Sat your time). It seemed to require approval, but doesn't show up, so I wonder whether it was lost?

  • KlausNZ, Member ✭✭

    I thought I'd read about this before, just took me a long time to find ebank's comment on durtschi's post from 2012: http://gatkforums.broadinstitute.org/discussion/comment/1210/#Comment_1210. durtschi's command didn't include --SMA, so I wonder whether the old fix wasn't tested with that option (if --SMA existed in GATK2.2 at all).

  • KlausNZ, Member ✭✭

    Thanks Geraldine!

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie admin

    > durtschi's command didn't include --SMA, so I wonder whether the old fix wasn't tested with that option (if --SMA existed in GATK2.2 at all)

    Yep this sounds very likely. @Sheila‌ is going to put in a bug ticket for this and hopefully we can get this fixed.

  • Sheila, Broad Institute, Member, Broadie ✭✭✭✭✭

    @KlausNZ‌

    Hi,

    I have just put in a bug report and will let you know when it is fixed.

    -Sheila

  • KlausNZ, Member ✭✭

    Thanks Sheila!

  • Sheila, Broad Institute, Member, Broadie ✭✭✭✭✭

    @KlausNZ‌

    Hi,

    This issue has been fixed today. It will be available in tonight's nightly build: https://www.broadinstitute.org/gatk/nightly

    -Sheila
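
For anyone who cannot move to a build containing the fix, one possible stopgap is to run VariantsToTable without --splitMultiAllelic and split the AD and PL columns afterwards. The sketch below assumes diploid samples and the standard VCF genotype ordering for PL, where genotype j/k (j <= k) sits at index k*(k+1)/2 + j. The functions `pl_indices` and `split_row` are hypothetical helpers written for this example, and their output is not necessarily what the fixed VariantsToTable prints.

```python
def pl_indices(alt_index):
    """PL positions of genotypes 0/0, 0/i and i/i for the i-th ALT allele (1-based).

    Relies on the VCF ordering of diploid genotypes:
    index(j/k) = k*(k+1)/2 + j, with j <= k.
    """
    k = alt_index
    base = k * (k + 1) // 2
    return [0, base, base + k]


def split_row(alt, ad, pl):
    """Yield (alt_allele, ad, pl) per ALT allele from comma-separated strings."""
    alts = alt.split(",")
    ad_values = ad.split(",")
    pl_values = pl.split(",")
    for i, allele in enumerate(alts, start=1):
        allele_ad = [ad_values[0], ad_values[i]]            # REF depth, then this ALT
        allele_pl = [pl_values[p] for p in pl_indices(i)]   # 0/0, 0/i, i/i
        yield allele, ",".join(allele_ad), ",".join(allele_pl)


# Values taken from the un-split row at 1:1684347 posted earlier in this thread.
for row in split_row(
    "CCCTCCT,C,CCCTCCTCCT",
    "33,0,0,0",
    "0,99,923,99,923,923,99,923,923,923",
):
    print("\t".join(row))
```

For the record in this thread, each of the three ALT alleles comes out with AD `33,0` and PL `0,99,923`. Note the limits of this approach: the subsetted PL values are not renormalized, reads supporting the other ALT alleles are dropped from AD, and missing values (`NA` or `.`) are not handled.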
