Exception Running CombineGVCFs

I am running CombineGVCFs on single-sample GVCF files produced by HaplotypeCaller and get the following exception:

"java.lang.IllegalStateException: Key END found in VariantContext field INFO at NC_002971.4:6584 but this key isn't defined in the VCFHeader. We require all VCFs to have complete VCF headers by default."

The full trace is at the end of this post.

This occurred on 4.0.0.0; I updated to 4.0.2.1 and got the same result.

It appears that CombineGVCFs does not like more than one character in the REF field, i.e. a record that represents a deletion.
I have 4 VCF files, and they all fail on the first deletion.
If I manually edit a file to remove that deletion, it fails on the next deletion.
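
For anyone who wants to reproduce the pattern described above, here is a minimal sketch that lists the records where REF is longer than an ALT allele, which is how I am identifying deletions. It is plain Python with no VCF library; the function name `find_deletions` is made up for this post, and it assumes the first five whitespace-separated columns are CHROM, POS, ID, REF and ALT.

```python
def find_deletions(vcf_lines):
    """Return (chrom, pos, ref, alt) for records where REF is longer than an ALT allele.

    Assumption: columns are separated by tabs or spaces and the first five
    are CHROM, POS, ID, REF, ALT. Symbolic alleles such as <NON_REF> are skipped.
    """
    hits = []
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 5:
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        for allele in alt.split(","):
            # Ignore symbolic, spanning-deletion and missing alleles.
            if allele.startswith("<") or allele in ("*", "."):
                continue
            if len(ref) > len(allele):
                hits.append((chrom, int(pos), ref, allele))
                break  # report each record once
    return hits
```

Calling it as `find_deletions(open(path))` on one of the input files should give the positions at which I expect the tool to stop, in file order.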

ValidateVariants does not object to the file.

There is no END key anywhere in the file.
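
The claim about END can be checked with a short script. The sketch below is only an illustration: it collects the INFO IDs declared in `##INFO=<ID=...>` header lines and reports any key used in a record's INFO column that the header does not declare. The function names are hypothetical, and no VCF library is used. Note that in the trace the exception is raised from `VCFWriter.add`, that is, while a record is being written, which would be consistent with the END key not being present in the input at all.

```python
import re


def declared_info_ids(vcf_lines):
    """Collect the IDs from ##INFO=<ID=...> header lines."""
    ids = set()
    for line in vcf_lines:
        match = re.match(r"##INFO=<ID=([^,>]+)", line)
        if match:
            ids.add(match.group(1))
    return ids


def undeclared_info_keys(vcf_lines):
    """Return (chrom, pos, key) for INFO keys used in records but not declared in the header."""
    vcf_lines = list(vcf_lines)  # allow two passes over a file object
    declared = declared_info_ids(vcf_lines)
    problems = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            # Assumption: fall back to any whitespace for space-aligned extracts.
            fields = line.split()
        if len(fields) < 8:
            continue
        info = fields[7]
        if info == ".":
            continue
        for entry in info.split(";"):
            if not entry:  # tolerate a trailing ';'
                continue
            key = entry.split("=", 1)[0]
            if key not in declared:
                problems.append((fields[0], int(fields[1]), key))
    return problems
```

An empty result from `undeclared_info_keys(open(path))` would mean every INFO key in the input records is declared in that file's header, END included.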

An extract of the VCF and the trace are below.

Thanks

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  2686_DSTL_8
NC_002971.4     1896    .       C       T       2193    .       AC=1;AF=1.00;AN=1;DP=61
NC_002971.4     2019    .       T       C       2226    .       AC=1;AF=1.00;AN=1;DP=60
NC_002971.4     3912    .       A       ACAGAG  3059.97 .       AC=1;AF=1.00;AN=1;DP=65
NC_002971.4     3915    .       G       GC      2885.97 .       AC=1;AF=1.00;AN=1;DP=64
NC_002971.4     3917    .       A       AC      2930.97 .       AC=1;AF=1.00;AN=1;DP=64
NC_002971.4     3920    .       T       C       2984.97 .       AC=1;AF=1.00;AN=1;DP=65
NC_002971.4     3940    .       A       C       2984    .       AC=1;AF=1.00;AN=1;DP=68
NC_002971.4     5423    .       C       T       2342    .       AC=1;AF=1.00;AN=1;DP=65
NC_002971.4     6584    .       AC      A       5057.97 .       AC=1;AF=1.00;AN=1;DP=12
NC_002971.4     6660    .       GT      G       3692.97 .       AC=1;AF=1.00;AN=1;DP=11
NC_002971.4     7087    .       A       G       319     .       AC=1;AF=1.00;AN=1;DP=8;
NC_002971.4     7712    .       G       C       307     .       AC=1;AF=1.00;AN=1;DP=8;
NC_002971.4     7974    .       C       T       1115    .       AC=1;AF=1.00;AN=1;DP=28
NC_002971.4     8056    .       G       GC      1126.97 .       AC=1;AF=1.00;AN=1;DP=26
NC_002971.4     8066    .       A       G       1500    .       AC=1;AF=1.00;AN=1;DP=33
NC_002971.4     8072    .       GGGAAAACA       G       1580.97 .       AC=1;AF=1.00;AN
NC_002971.4     8082    .       T       G       1590    .       AC=1;AF=1.00;AN=1;DP=36
$ gatk-launch CombineGVCFs     --java-options '-Djava.io.tmpdir=/database'     --reference=/bioinformatics/references.2018/GCF_000007765.2/GCF_000007765.2_ASM776v2_genomic.fna     --output=34_haplotype_caller/combined_samples.vcf     --variant 34_haplotype_caller/2686_DSTL_8.genotypes.vcf
Using GATK jar /users/pao207/miniconda2/envs/sequencing/share/gatk4-4.0.2.1-0/gatk-package-4.0.2.1-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -Djava.io.tmpdir=/database -jar /users/pao207/miniconda2/envs/sequencing/share/gatk4-4.0.2.1-0/gatk-package-4.0.2.1-local.jar CombineGVCFs --reference=/bioinformatics/references.2018/GCF_000007765.2/GCF_000007765.2_ASM776v2_genomic.fna --output=34_haplotype_caller/combined_samples.vcf --variant 34_haplotype_caller/2686_DSTL_8.genotypes.vcf
16:42:35.207 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/users/pao207/miniconda2/envs/sequencing/share/gatk4-4.0.2.1-0/gatk-package-4.0.2.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
16:42:35.394 INFO  CombineGVCFs - ------------------------------------------------------------
16:42:35.394 INFO  CombineGVCFs - The Genome Analysis Toolkit (GATK) v4.0.2.1
16:42:35.395 INFO  CombineGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
16:42:35.395 INFO  CombineGVCFs - Executing as [email protected] on Linux v2.6.32-358.2.1.el6.x86_64 amd64
16:42:35.395 INFO  CombineGVCFs - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_121-b15
16:42:35.395 INFO  CombineGVCFs - Start Date/Time: March 27, 2018 4:42:35 PM BST
16:42:35.395 INFO  CombineGVCFs - ------------------------------------------------------------
16:42:35.395 INFO  CombineGVCFs - ------------------------------------------------------------
16:42:35.395 INFO  CombineGVCFs - HTSJDK Version: 2.14.3
16:42:35.395 INFO  CombineGVCFs - Picard Version: 2.17.2
16:42:35.396 INFO  CombineGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 1
16:42:35.396 INFO  CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
16:42:35.396 INFO  CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
16:42:35.396 INFO  CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
16:42:35.396 INFO  CombineGVCFs - Deflater: IntelDeflater
16:42:35.396 INFO  CombineGVCFs - Inflater: IntelInflater
16:42:35.396 INFO  CombineGVCFs - GCS max retries/reopens: 20
16:42:35.396 INFO  CombineGVCFs - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
16:42:35.396 INFO  CombineGVCFs - Initializing engine
16:42:35.950 INFO  FeatureManager - Using codec VCFCodec to read file file:///bioinformatics/sequencing/Projects/26/2686/34_haplotype_caller/2686_DSTL_8.genotypes.vcf
16:42:35.975 INFO  CombineGVCFs - Done initializing engine
16:42:36.639 INFO  ProgressMeter - Starting traversal
16:42:36.640 INFO  ProgressMeter -        Current Locus  Elapsed Minutes    Variants Processed  Variants/Minute
16:42:36.684 INFO  CombineGVCFs - Shutting down engine
[March 27, 2018 4:42:36 PM BST] org.broadinstitute.hellbender.tools.walkers.CombineGVCFs done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=1775239168
java.lang.IllegalStateException: Key END found in VariantContext field INFO at NC_002971.4:6584 but this key isn't defined in the VCFHeader.  We require all VCFs to have complete VCF headers by default.
        at htsjdk.variant.vcf.VCFEncoder.fieldIsMissingFromHeaderError(VCFEncoder.java:173)
        at htsjdk.variant.vcf.VCFEncoder.encode(VCFEncoder.java:112)
        at htsjdk.variant.variantcontext.writer.VCFWriter.add(VCFWriter.java:224)
        at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.endPreviousStates(CombineGVCFs.java:345)
        at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.createIntermediateVariants(CombineGVCFs.java:189)
        at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.apply(CombineGVCFs.java:134)
        at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.apply(MultiVariantWalkerGroupedOnStart.java:73)
        at org.broadinstitute.hellbender.engine.VariantWalkerBase.lambda$traverse$0(VariantWalkerBase.java:110)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
        at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
        at java.util.Iterator.forEachRemaining(Iterator.java:116)
        at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
        at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
        at org.broadinstitute.hellbender.engine.VariantWalkerBase.traverse(VariantWalkerBase.java:108)
        at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.traverse(MultiVariantWalkerGroupedOnStart.java:118)
        at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:893)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:135)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:180)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:199)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:159)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:202)
        at org.broadinstitute.hellbender.Main.main(Main.java:288)
