Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Question about CombineGVCFs results
I have a quick question about the results of an CombineGVCFs file while creating large background files. Prior to combining the files a region of the .gvcf file from HC looks like this.
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SRR070473 13 19408518 . A <NON_REF> . . END=19409266 GT:DP:GQ:MIN_DP:PL 0/0:0:0:0:0,0,0 13 19409267 . T <NON_REF> . . END=19409323 GT:DP:GQ:MIN_DP:PL 0/0:4:12:2:0,6,60 13 19409324 . C <NON_REF> . . END=19409400 GT:DP:GQ:MIN_DP:PL 0/0:15:42:8:0,24,299
If you noticed the first individual looks like this SRR070473, with a 0/0:0:0:0:0,0,0 recorded, but after combining the file in batches of 200, the same information will be recorded as no call.
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SRR070473 SRR070477 SRR070505 SRR070516 SRR070517 SRR070772 SRR070779 SRR070796 SRR0 13 19408518 . A <NON_REF> . . END=19408519 GT:DP:GQ:MIN_DP:PL ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./.
The issue I'm seeing is when you attempt to use GenotypeGVCFs I get the following an error similar to this if using the file containing no call notation.
##### ERROR MESSAGE: cannot merge genotypes from samples without PLs; sample ERR031932 does not have likelihoods at position 1:10929