
Bug: Haplotype Caller Odd VCF files

aeonsim Member Posts: 68 ✭✭✭
edited April 2014 in Ask the GATK team

Hi, I've just been having a go with the new HaplotypeCaller method and I'm getting some odd or malformed lines in the VCF file.

For example, the FORMAT line declares 4 fields for each sample record, but some samples have only 2 fields.

Two examples are shown here:

See Block 1
GT:DP:GQ:PL 1/1:.:3:32,3,0 ./.:0

Or where the format block declares 5 fields and we get 3 instead:

GT:AD:DP:GQ:PL  0/0:1,1,0,0,0,0,0,0,0:2:3:0,3,24,3,24,24        ./.:.:0 0/0:1,0,0,0,0,0,0,0,0:1:1:0,1,40,3,41,43        ./.:.:3

Full blocks, Block 1

chr1    3489714 .       G       A       9667.23 .       AC=154;AF=1.00;AN=154;DP=0;FS=0.000;InbreedingCoeff=-0.0391;MLEAC=154;MLEAF=1.00;MQ=0.00;MQ0=0;EFF=INTRAGENIC(MODIFIER|||||TIAM1||CODING|||1),INTRON(MODIFIER||||649|TIAM1|protein_coding|CODING|ENSBTAT00000064124|1|1);CSQ=A|ENSBTAG00000017839|ENSBTAT00000064124|Transcript|intron_variant||||||||1/5||1|TIAM1|HGNC|      GT:DP:GQ:PL     1/1:.:3:32,3,0  1/1:.:18:207,18,0     1/1:.:6:78,6,0  1/1:.:12:140,12,0       1/1:.:9:101,9,0 ./.:0   1/1:.:9:97,9,0  1/1:.:9:96,9,0  1/1:.:21:244,21,0       1/1:.:12:138,12,0       1/1:.:12:124,12,0       1/1:.:9:105,9,0 1/1:.:15:164,15,0     1/1:.:15:153,15,0       1/1:.:27:265,27,0       1/1:.:12:125,12,0       1/1:.:18:214,18,0       ./.:0   1/1:.:9:108,9,0 1/1:.:15:169,15,0       ./.:0   1/1:.:6:76,6,0  1/1:.:6:66,6,0  1/1:.:12:140,12,0     1/1:.:3:28,3,0  1/1:.:3:10,3,0  1/1:.:12:128,12,0       ./.:0   1/1:.:18:181,18,0       1/1:.:9:98,9,0  1/1:.:15:161,15,0       1/1:.:15:185,15,0       1/1:.:12:133,12,0       1/1:.:15:175,15,0     1/1:.:18:178,18,0       1/1:.:12:133,12,0       1/1:.:9:105,9,0 1/1:.:12:141,12,0       1/1:.:15:166,15,0       1/1:.:9:108,9,0 1/1:.:15:160,15,0       1/1:.:27:267,27,0       1/1:.:21:218,21,0    1/1:.:9:107,9,0  1/1:.:3:28,3,0  1/1:.:9:80,9,0  1/1:.:6:46,6,0  ./.:0   1/1:.:6:61,6,0  1/1:.:21:241,21,0       1/1:.:15:161,15,0       1/1:.:6:82,6,0  1/1:.:12:143,12,0       1/1:.:9:109,9,0 1/1:.:21:249,21,0     1/1:.:6:40,6,0  1/1:.:9:94,9,0  1/1:.:15:185,15,0       1/1:.:12:129,12,0       1/1:.:12:132,12,0       ./.:0   1/1:.:21:207,21,0       1/1:.:12:136,12,0       1/1:.:12:109,12,0       1/1:.:18:192,18,0     ./.:0   1/1:.:9:68,9,0  1/1:.:12:138,12,0       1/1:.:6:73,6,0  1/1:.:9:105,9,0 1/1:.:9:98,9,0  1/1:.:6:65,6,0  ./.:0   1/1:.:6:65,6,0  ./.:0   1/1:.:6:58,6,0  1/1:.:12:131,12,0       ./.:0   ./.:01/1:.:3:38,3,0   1/1:.:3:37,3,0  1/1:.:21:227,21,0       1/1:.:12:131,12,0       1/1:.:6:66,6,0  1/1:.:9:100,9,0 1/1:.:21:209,21,0       
1/1:.:6:63,6,0  1/1:.:6:69,6,0

Block 2

chr1    55248   .       ACCC    A,CCCC  179.69  .       AC=13,6;AF=0.100,0.046;AN=130;BaseQRankSum=0.736;ClippingRankSum=0.736;DP=347;FS=0.000;InbreedingCoeff=0.2231;MLEAC=10,4;MLEAF=0.077,0.031;MQ=53.55;MQ0=0;MQRankSum=0.736;QD=4.99;ReadPosRankSum=0.736;EFF=INTERGENIC(MODIFIER||||||||||1),INTERGENIC(MODIFIER||||||||||2);CSQ=-||||intergenic_variant|||||||||||||     GT:AD:DP:GQ:PL  0/0:1,1,0,0,0,0,0,0,0:2:3:0,3,24,3,24,24        ./.:.:0 0/0:1,0,0,0,0,0,0,0,0:1:1:0,1,40,3,41,43        0/0:1,0,0,0,0,0,0,0,0:1:2:0,2,44,3,45,46        0/0:.:5:0:0,0,103,0,103,103     0/0:3,0,1,0,0,0,0,0,0:4:9:0,9,81,9,81,81        0/0:0,0,1,0,0,0,0,0,0:1:1:0,1,2,1,2,2   ./.:.:1 0/0:.:3:0:0,0,41,0,41,41        0/0:1,0,0,0,0,0,0,0,0:1:9:0,9,73,9,73,73        0/1:1,0,0,1,0,0,0,0,0:2:22:28,0,73,28,22,46     0/0:1,0,0,0,0,0,0,0,0:1:4:0,4,25,4,25,25        0/0:2,0,0,0,0,0,0,0,0:2:21:0,21,273,21,273,273  0/1:5,0,0,1,0,0,0,0,0:6:11:11,0,235,26,158,175  ./.:0,0,0,0,1,0,0,0,0:1 0/0:.:4:0:0,0,46,0,46,46        ./.:.:3 ./.:.:2 0/1:1,0,1,1,0,0,0,0,0:3:21:28,0,44,31,21,52     0/0:.:4:0:0,0,77,0,77,77        0/0:0,0,1,0,0,0,0,0,0:1:1:0,1,2,1,2,2   0/0:.:2:6:0,6,51,6,51,51        0/0:.:6:2:0,2,151,2,151,151     0/0:2,0,1,0,0,0,0,0,0:3:7:0,7,74,7,74,74        ./.:.:0 0/0:2,0,0,0,0,0,0,0,0:2:5:0,7,59,5,37,35        1/2:0,0,0,1,0,0,0,0,0:1:1:27,1,40,26,0,25       0/0:2,0,0,0,0,0,0,0,0:2:9:0,9,83,9,83,83        ./.:.:0 2/2:0,0,0,0,0,3,0,0,0:3:9:59,59,59,9,9,0        ./.:.:16        0/0:2,0,0,0,0,0,0,0,0:2:7:0,7,59,7,59,59        0/0:2,0,0,0,0,0,0,0,0:2:9:0,9,83,9,83,83        0/0:2,0,0,0,0,0,0,0,0:2:11:0,11,68,11,68,68     0/2:6,0,0,0,0,2,0,0,0:8:18:18,39,384,0,346,340  ./.:.:1 0/1:3,0,0,1,0,0,0,0,0:4:25:25,0,96,34,100,134   0/0:2,0,0,0,0,0,0,0,0:2:12:0,12,105,12,105,105  0/1:1,0,0,1,0,0,0,0,0:2:17:17,0,94,21,26,44     0/0:.:2:6:0,6,64,6,64,64        0/0:1,0,0,0,0,0,0,0,0:1:2:0,2,24,3,25,27        0/0:.:2:6:0,6,48,6,48,48        0/0:.:2:6:0,6,63,6,63,63        0/0:0,0,1,0,0,0,0,0,0:1:0:0,0,1,0,1,1   ./.:.:0 
0/0:1,0,0,0,0,0,0,0,0:1:1:0,1,6,3,8,9   0/0:1,0,0,0,0,0,0,0,0:1:15:0,15,124,15,124,124  ./.:.:0 0/0:1,0,0,0,0,0,0,0,0:1:4:0,4,19,4,19,19        0/1:2,0,0,1,0,0,0,0,0:3:28:28,0,66,34,70,104    0/0:.:4:0:0,0,18,0,18,18        0/2:2,0,0,0,0,4,0,0,0:6:57:68,74,143,0,69,57    ./.:.:0 0/0:4,0,0,0,0,0,0,0,1:5:15:0,15,109,15,109,109  0/0:.:10:0:0,0,182,0,182,182    0/0:.:1:3:0,3,31,3,31,31        0/0:0,0,1,0,0,0,0,0,0:1:1:0,1,2,1,2,2   0/0:2,0,0,0,0,0,0,0,0:2:5:0,5,76,6,78,79        0/0:1,1,0,0,0,0,0,0,0:2:4:0,4,28,4,28,28        ./.:.:1 1/1:0,0,0,2,0,0,0,0,0:2:6:67,6,0,67,6,67        ./.:.:1 0/0:.:2:0:0,0,14,0,14,14        ./.:.:0 ./.:.:1 0/0:.:5:1:0,1,120,1,120,120     0/0:.:4:0:0,0,8,0,8,8   0/0:.:6:0:0,0,94,0,94,94        0/1:1,0,0,2,0,0,0,0,0:3:1:27,0,1,29,6,35        0/0:.:2:0:0,0,3,0,3,3   ./.:.:6 ./.:.:0 ./.:.:0 0/2:3,0,0,0,0,2,0,0,0:5:27:46,27,89,0,62,81     ./.:.:0 0/0:.:1:0:0,0,6,0,6,6   0/0:.:9:0:0,0,86,0,86,86        ./.:.:1 ./.:.:0 1/1:.:.:0:1,1,0,1,0,0   0/0:0,0,0,0,0,0,0,0,0:0:3:0,3,26,3,26,26        0/1:4,0,0,2,0,0,0,0,0:6:49:49,0,93,61,100,160   0/0:.:3:0:0,0,14,0,14,14        0/0:.:8:0:0,0,60,0,60,60        0/0:2,1,0,0,0,0,0,0,0:3:6:0,6,37,6,37,37        ./.:.:2 0/0:.:9:0:0,0,81,0,81,81        0/0:0,0,0,0,0,0,0,0,0:0:1:0,1,2,1,2,2

Any idea what the issue is?

Answers

  • aeonsim Member Posts: 68 ✭✭✭
    edited April 2014

    Uhm, the text in this post isn't showing for some reason? When I go to edit the above post I see all the text, but I can't see it in the main thread view.

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,163 admin

    Some formatting weirdness, I think the Markdown interpreter didn't like the characters you used to delimit blocks. Fixed now.

    Geraldine Van der Auwera, PhD

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,163 admin

    Two questions: what version was this, and what were the individual steps/command lines that produced this?

    Geraldine Van der Auwera, PhD

  • aeonsim Member Posts: 68 ✭✭✭

    Strange, I was getting nginx bad gateway error messages when trying to edit the post. Still, now that it's showing, it can at least be read.

  • aeonsim Member Posts: 68 ✭✭✭

    GATK 3.1-1 was used for the HaplotypeCaller stages, though it was run on older BAM files that had been prepared with 2.7.4. I'll track down the exact commands in a couple of minutes.

  • aeonsim Member Posts: 68 ✭✭✭

    @pdexheimer said:
    The VCF spec allows trailing fields in the format block to be omitted if they have no data...

    It does? That's plain nasty, so this is working as intended?

    I take it we are to assume that if the format line is: GT:AD:DP:GQ:PL and the data line is ./.:.:2 then that is GT:AD:DP

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,163 admin

    Aaand @pdexheimer jumps in with the answer :)

    The VCF spec allows trailing fields in the format block to be omitted if they have no data...

    I wish it didn't because it's confusing (and I always forget this, which is embarrassing).

    Geraldine Van der Auwera, PhD

  • pdexheimer Member, Dev Posts: 543 ✭✭✭✭

    @aeonsim said:
    I take it we are to assume that if the format line is: GT:AD:DP:GQ:PL and the data line is ./.:.:2 then that is GT:AD:DP

    Yep. Order is preserved, and GT is special because it must always be present

    @Geraldine_VdAuwera said:
    I wish it didn't because it's confusing

    Agreed. Makes parsing (by hand or machine) more complex, and I have a hard time finding any real benefits. It saves a couple of bytes, I suppose
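A minimal sketch of the rule confirmed above, assuming Python and a hypothetical helper named `parse_sample` (it is not part of GATK or any library): split the FORMAT string and the sample string on `:`, then treat any omitted trailing fields as missing (`.`).

```python
def parse_sample(format_keys, sample_field):
    """Map FORMAT keys to one sample's values.

    Hypothetical helper for illustration. Trailing fields that the
    writer dropped are filled in with "." (the VCF missing value).
    """
    keys = format_keys.split(":")
    values = sample_field.split(":")
    if len(values) > len(keys):
        raise ValueError("sample has more fields than FORMAT declares")
    # Order is preserved, so only the tail can be absent; pad it.
    values += ["."] * (len(keys) - len(values))
    return dict(zip(keys, values))


# The case discussed in the thread: 5 declared fields, 3 present.
record = parse_sample("GT:AD:DP:GQ:PL", "./.:.:2")
print(record)
# {'GT': './.', 'AD': '.', 'DP': '2', 'GQ': '.', 'PL': '.'}
```

Padding up front means the rest of the parser can keep assuming one value per FORMAT key, which may limit how much existing code needs to change.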

  • aeonsim Member Posts: 68 ✭✭✭

    @pdexheimer said:

    What a pain. Handling that is going to take a bit of refactoring in my code; it was already bad enough supporting multiple variant callers...

    Next time they update the VCF spec please suggest to them that they remove that bit.

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,163 admin

    It saves a couple of bytes, I suppose

    *pained laughter* Sure. Talk about penny wise and pound foolish...

    Geraldine Van der Auwera, PhD

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,163 admin

    Next time they update the VCF spec please suggest to them that they remove that bit.

    I'll see if the team is receptive to not applying that particular latitude of the spec's in future versions. There's no good reason we have to omit those fields; I would prefer we emit user-friendly VCFs.

    Geraldine Van der Auwera, PhD
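For anyone who would rather work with fully populated records in the meantime, one possible approach is to pad every sample column out to the FORMAT width before further processing. The function below is a hypothetical sketch in Python (`pad_sample_columns` is an illustrative name, not a GATK or Picard tool), and the example line is abbreviated from Block 2 with a shortened INFO column and only two samples.

```python
def pad_sample_columns(vcf_line):
    """Return a VCF data line with every sample padded to the FORMAT width.

    Illustrative only. Header lines and lines without sample columns
    are returned unchanged.
    """
    line = vcf_line.rstrip("\n")
    cols = line.split("\t")
    if line.startswith("#") or len(cols) < 10:
        return line
    n_keys = len(cols[8].split(":"))  # column 9 is FORMAT
    for i in range(9, len(cols)):
        parts = cols[i].split(":")
        # Append "." for each trailing field the writer omitted.
        parts += ["."] * (n_keys - len(parts))
        cols[i] = ":".join(parts)
    return "\t".join(cols)


# Abbreviated from Block 2 (INFO shortened to DP=347, two samples kept).
line = (
    "chr1\t55248\t.\tACCC\tA,CCCC\t179.69\t.\tDP=347\t"
    "GT:AD:DP:GQ:PL\t./.:.:0\t0/0:.:5:0:0,0,103,0,103,103"
)
print(pad_sample_columns(line).split("\t")[9])
# ./.:.:0:.:.
```

Samples that already carry all declared fields pass through untouched, so the function can be applied to every data line.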
