Bug: No PLs Produced in CombineGVCFs Leads to Error in GenotypeGVCFs
I've run into the following bug while running GenotypeGVCFs:
##### ERROR MESSAGE: cannot merge genotypes from samples without PLs; sample <ID redacted> does not have likelihoods at position 1:1115551
The input file in question is a gVCF produced by merging a large number of smaller gVCFs using CombineGVCFs (all tasks were run using version 3.1). What's happening is that position 1115551 has no record of its own in that particular sample, since it falls inside the deletion called at 1115550 (AC to A):
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	<Sample_ID>
1	1115550	.	AC	A,<NON_REF>	118.73	.	BaseQRankSum=-0.377;DP=15;MLEAC=1,0;MLEAF=0.500,0.00;MQ=60.72;MQ0=0;MQRankSum=-1.093;ReadPosRankSum=-0.811	GT:AD:DP:GQ:PL:SB	0/1:7,6,0:13:99:156,0,188,177,207,384:4,3,3,3
1	1115552	.	C	<NON_REF>	.	.	END=1115552	GT:DP:GQ:MIN_DP:PL	0/0:15:0:15:0,0,31
But when the sample is combined with other samples, that position gets filled in with a bare "0/0" genotype, without any PLs (or any of the other fields, including AD, DP, GQ, etc.), which causes GenotypeGVCFs to fail.
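For anyone who wants to check whether their own combined gVCF is affected before running GenotypeGVCFs, here is a rough Python sketch that scans uncompressed VCF text for called genotypes with no PL value. It is not part of GATK; the function name `samples_missing_pl` is made up for illustration, and it assumes plain tab-delimited VCF lines.

```python
def samples_missing_pl(vcf_lines):
    """Yield (chrom, pos, sample_name) for called genotypes that carry no PL.

    Hypothetical helper, not a GATK API. Expects an iterable of
    tab-delimited VCF text lines (header lines included).
    """
    sample_names = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            sample_names = fields[9:]  # sample columns start after FORMAT
            continue
        if len(fields) < 10:
            continue  # record without genotype columns
        chrom, pos = fields[0], int(fields[1])
        fmt_keys = fields[8].split(":")
        for name, sample_field in zip(sample_names, fields[9:]):
            # Trailing FORMAT values may be dropped, so zip stops early.
            values = dict(zip(fmt_keys, sample_field.split(":")))
            gt = values.get("GT", ".")
            called = any(a != "." for a in gt.replace("|", "/").split("/"))
            has_pl = values.get("PL", ".") not in (".", "")
            if called and not has_pl:
                yield chrom, pos, name
```

Passing an open file handle (for example `open("combined.g.vcf")`, a placeholder path) and printing the results should list every position and sample that would trigger the "samples without PLs" error described above.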
I can imagine there might be other scenarios that will result in a "0/0" genotype field, so perhaps the easiest way to fix this would be to make sure that any "0/0" actually gets output as "./.:.:.:.:.".
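To make the suggestion concrete, here is a minimal Python sketch of that rewrite applied to a single record line as text. It only illustrates the proposed output and is not how CombineGVCFs works internally; the function name `blank_genotypes_without_pl` is hypothetical, and it assumes tab-delimited VCF text.

```python
def blank_genotypes_without_pl(record_line):
    """Return the record with every PL-less genotype replaced by a no-call.

    Hypothetical post-processing helper, not a GATK API. Header lines and
    records whose FORMAT has no PL key are returned unchanged.
    """
    line = record_line.rstrip("\n")
    fields = line.split("\t")
    if line.startswith("#") or len(fields) < 10:
        return line
    fmt_keys = fields[8].split(":")
    if "PL" not in fmt_keys:
        return line  # nothing to compare against; leave the record alone
    # Build a no-call entry with one placeholder per FORMAT key,
    # e.g. GT:AD:DP:GQ:PL -> ./.:.:.:.:.
    no_call = ":".join("./." if key == "GT" else "." for key in fmt_keys)
    for i in range(9, len(fields)):
        values = dict(zip(fmt_keys, fields[i].split(":")))
        if values.get("PL", ".") in (".", ""):
            fields[i] = no_call
    return "\t".join(fields)
```

With a FORMAT of GT:AD:DP:GQ:PL, a bare "0/0" sample column comes out as "./.:.:.:.:." while samples that already have PLs are left as they are.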