Multisample VCF missing positions
In our group we're trying to decide whether to use single-sample or multi-sample variant calling. We're using GATK 2.7.2 in combination with Queue 2.7.2. While comparing the results of the two approaches, I found that a whole section of chromosome X was missing from the multi-sample VCF, although it was present in the single-sample VCFs. Looking back at the .vcf.out file (the output from Queue), I found that the job covering most of chromosome X ended abruptly, without the usual MicroScheduler summary at the bottom. What worries me is that I got no warning about the incomplete VCF chunk, and that the final VCF file was created without any errors.
A related issue is that multiple single positions are also missing from the multi-sample VCF file, although they are present in the single-sample VCFs. I'm hoping you can shed some light on this matter. Of course, I'm happy to supply more information if needed.
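
A comparison of this kind can be sketched as follows. This is only a minimal illustration, not the exact script used: the function names and file paths are hypothetical, and it compares positions by chromosome and coordinate only, ignoring alleles and genotypes.

```python
import gzip
from collections import Counter


def read_positions(path):
    """Return the set of (chrom, pos) pairs found in a VCF file.

    Handles plain-text and gzip-compressed files (assumed to end in .gz).
    """
    opener = gzip.open if path.endswith(".gz") else open
    positions = set()
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos = line.split("\t", 2)[:2]
            positions.add((chrom, int(pos)))
    return positions


def missing_from_multisample(single_paths, multi_path):
    """List positions present in any single-sample VCF but absent from the multi-sample VCF."""
    multi = read_positions(multi_path)
    single = set()
    for path in single_paths:
        single |= read_positions(path)
    return sorted(single - multi)


def missing_per_chromosome(missing):
    """Count missing positions per chromosome to make large gaps stand out."""
    return Counter(chrom for chrom, _ in missing)


if __name__ == "__main__":
    # Hypothetical file names; replace with the real VCF paths.
    missing = missing_from_multisample(
        ["sample1.vcf", "sample2.vcf"], "multisample.vcf"
    )
    for chrom, count in sorted(missing_per_chromosome(missing).items()):
        print(chrom, count)
```

A large count on one chromosome points to a dropped chunk like the one on chromosome X, while small scattered counts correspond to the isolated missing positions.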