Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
GenotypeGVCFs WARN Track variant doesn't have a sequence dictionary built in
Hi Team, I'm getting `WARN 21:19:30,478 IndexDictionaryUtils - Track variant doesn't have a sequence dictionary built in, skipping dictionary validation` when processing gzipped g.vcf files produced by HaplotypeCaller (via -o foo.g.vcf.gz, as suggested by @Geraldine_VdAuwera in blog post 3893) with GenotypeGVCFs. This results in dramatic increases in run time (makes sense if GenotypeGVCFs un-compresses the files), and memory requirements (why ??) for GenotypeGVCFs compared to processing the gvcf for same bam files if HC outfiles are unzipped. Most batches that previously completed with 4x8GB RAM now produce `java.lang.OutOfMemoryError: Java heap space` errors even with 4X64GB! Could you please advise whether this warning is expected behaviour? If yes, what exactly is missing (can't see much difference in unzipped vs gzipped vcf headers), and can this be added somehow?