Question about CombineGVCFs results

Hello all,
I have a quick question about the output of CombineGVCFs when creating large background files. Prior to combining the files, a region of the .gvcf file from HaplotypeCaller (HC) looks like this:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SRR070473
13 19408518 . A <NON_REF> . . END=19409266 GT:DP:GQ:MIN_DP:PL 0/0:0:0:0:0,0,0
13 19409267 . T <NON_REF> . . END=19409323 GT:DP:GQ:MIN_DP:PL 0/0:4:12:2:0,6,60
13 19409324 . C <NON_REF> . . END=19409400 GT:DP:GQ:MIN_DP:PL 0/0:15:42:8:0,24,299
Notice that the first record for sample SRR070473 has 0/0:0:0:0:0,0,0 recorded. After combining the files in batches of 200, the same position is recorded as a no-call (./.):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SRR070473 SRR070477 SRR070505 SRR070516 SRR070517 SRR070772 SRR070779 SRR070796 SRR0
13 19408518 . A <NON_REF> . . END=19408519 GT:DP:GQ:MIN_DP:PL ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./.
The issue is that when I then run GenotypeGVCFs on the file containing the no-call notation, I get an error similar to this:
##### ERROR MESSAGE: cannot merge genotypes from samples without PLs; sample ERR031932 does not have likelihoods at position 1:10929
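To see which samples trigger this error before running GenotypeGVCFs, one option is to scan the combined file for genotypes that carry no PL value. The following is a minimal Python sketch, not part of GATK; the function name samples_missing_pl is hypothetical, and the demo records are illustrative lines modelled on the excerpts above rather than real output.

```python
def samples_missing_pl(vcf_lines):
    """Yield (chrom, pos, sample) for every genotype that has no PL value."""
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information lines and blank lines
        # Real VCFs are tab-separated; fall back to whitespace for pasted text.
        fields = line.split("\t") if "\t" in line else line.split()
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names start at column 10
            continue
        chrom, pos, fmt = fields[0], fields[1], fields[8].split(":")
        for sample, call in zip(samples, fields[9:]):
            values = dict(zip(fmt, call.split(":")))
            # A bare "./." has no PL entry at all; "." means missing.
            if values.get("PL", ".") == ".":
                yield chrom, int(pos), sample


# Illustrative records only (modelled on the thread, not real data).
demo = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SRR070473 SRR070477",
    "13 19408518 . A <NON_REF> . . END=19408519 GT:DP:GQ:MIN_DP:PL ./. 0/0:4:12:2:0,6,60",
]
for chrom, pos, sample in samples_missing_pl(demo):
    print(f"{sample} has no PL at {chrom}:{pos}")
```

On a real file, the lines could be passed in directly from an open file handle.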
Best Answer
Geraldine_VdAuwera Cambridge, MA admin
I see. Before running CombineGVCFs on these files, you need to first concatenate the chromosome pieces that belong to the same sample, e.g. with CatVariants. Then you can use CombineGVCFs to combine across samples.
Answers
Hi @srynearson1,
This is a glitch that happens when invariant blocks of different sizes are combined. I believe this has been fixed in the dev version, so you should be able to bypass this by using a recent nightly version to combine GVCFs. Sorry for the inconvenience.
I downloaded the nightly version last night and the problem persists. Are there any other ideas or methods (i.e. how to combine) to overcome this issue?
The issue persists when you run CombineGVCFs with the latest nightly?
How many samples do you have in total?
I have ~3025 files broken up into regions of 25 (i.e. chr regions), which equals about 125 runs, with files looking like:
chr16_region_ERR047816.raw.snps.indels.gvcf
chr21_region_SRR099963.raw.snps.indels.gvcf
chr6_region_SRR233094.raw.snps.indels.gvcf
chrY_region_SRR765989.raw.snps.indels.gvcf
chr16_region_SRR070473.raw.snps.indels.gvcf
I see. Before running CombineGVCFs on these files, you need to first concatenate the chromosome pieces that belong to the same sample, e.g. with CatVariants. Then you can use CombineGVCFs to combine across samples.
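As a rough illustration of the first step, the per-region pieces can be grouped by sample ID before any concatenation. The Python sketch below assumes the naming scheme <region>_<sample>.raw.snps.indels.gvcf seen in the file names above. The helper names, reference.fasta, the jar location and the .cat.vcf output name are all hypothetical. The CatVariants invocation follows the GATK 3.x command style (-cp with the tool class, -R, -V, -out); the class package and the accepted file extensions have varied between releases, so please check them against the documentation for your version.

```python
import re
from collections import defaultdict

# Assumed naming scheme, taken from the file names quoted in the thread:
#   <region>_<sample>.raw.snps.indels.gvcf
NAME = re.compile(r"^(?P<region>.+)_(?P<sample>[^_.]+)\.raw\.snps\.indels\.gvcf$")


def group_by_sample(paths):
    """Map each sample ID to the list of its per-region GVCF pieces."""
    groups = defaultdict(list)
    for path in paths:
        match = NAME.match(path)
        if match is None:
            raise ValueError(f"unexpected file name: {path}")
        groups[match.group("sample")].append(path)
    return dict(groups)


def catvariants_command(sample, pieces, reference="reference.fasta"):
    """Build one CatVariants call (GATK 3.x style; verify for your version)."""
    cmd = ["java", "-cp", "GenomeAnalysisTK.jar",
           "org.broadinstitute.gatk.tools.CatVariants", "-R", reference]
    for piece in pieces:  # pieces should be listed in reference order
        cmd += ["-V", piece]
    return cmd + ["-out", f"{sample}.cat.vcf"]  # hypothetical output name


files = [
    "chr16_region_SRR070473.raw.snps.indels.gvcf",
    "chr21_region_SRR070473.raw.snps.indels.gvcf",  # hypothetical second piece
    "chr16_region_ERR047816.raw.snps.indels.gvcf",
]
for sample, pieces in sorted(group_by_sample(files).items()):
    print(" ".join(catvariants_command(sample, pieces)))
```

The sketch only prints the commands, so the grouping can be checked by eye before anything is run.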
Thanks @Geraldine_VdAuwera, that did the trick. However, I was able to use CombineGVCFs to combine each region and then move to a master merge.
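For readers who want to try the per-region route, here is a minimal Python sketch of splitting one region's files into batches and building a CombineGVCFs call per batch. It is not the poster's actual script. The helper names, file names, reference.fasta and the batch size of 2 used in the demo are hypothetical (the question mentions batches of 200), and the -T CombineGVCFs, -R, --variant and -o arguments follow the GATK 3.x command style, which should be checked against your version.

```python
def batches(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def combine_command(inputs, output, reference="reference.fasta"):
    """Build one CombineGVCFs call (GATK 3.x style; verify for your version)."""
    cmd = ["java", "-jar", "GenomeAnalysisTK.jar",
           "-T", "CombineGVCFs", "-R", reference]
    for path in inputs:
        cmd += ["--variant", path]
    return cmd + ["-o", output]


# Hypothetical file names for a single region.
region_files = [f"chr16_region_sample{i}.raw.snps.indels.gvcf" for i in range(5)]
for n, batch in enumerate(batches(region_files, 2)):
    print(" ".join(combine_command(batch, f"chr16_region.batch{n}.gvcf")))
```

The per-batch outputs for a region would then be inputs to a later merge step, whose exact form depends on the workflow.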