Notice:
If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We appreciate your help!

Test-drive the GATK tools and Best Practices pipelines on Terra


Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.

M2 error with canine germline resource and variants_for_contamination files

kmegqkmegq BroadMember, Broadie

Dear GATK team,

I am running the Mutect2 pipeline on canine tumor samples in Terra, using WDL version 2.5 and GATK version 4.1.2.0. I was able to run the pipeline successfully without entering a germline resource file or VCF of common variants for contamination, however, when I did add these files in, I got the following error:

java.lang.IndexOutOfBoundsException: Index: 5, Size: 5
    at java.util.ArrayList.rangeCheck(ArrayList.java:657)
    at java.util.ArrayList.get(ArrayList.java:433)
    at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticGenotypingEngine.lambda$getGermlineAltAlleleFrequencies$31(SomaticGenotypingEngine.java:350)
    at java.util.stream.ReferencePipeline$6$1.accept(ReferencePipeline.java:244)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:545)
    at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
    at java.util.stream.DoublePipeline.toArray(DoublePipeline.java:506)
    at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticGenotypingEngine.getGermlineAltAlleleFrequencies(SomaticGenotypingEngine.java:352)
    at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticGenotypingEngine.getNegativeLogPopulationAFAnnotation(SomaticGenotypingEngine.java:335)
    at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticGenotypingEngine.callMutations(SomaticGenotypingEngine.java:141)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.callRegion(Mutect2Engine.java:250)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2.apply(Mutect2.java:324)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:308)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:281)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1039)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
    at org.broadinstitute.hellbender.Main.main(Main.java:291)
Using GATK jar /root/gatk.jar defined in environment variable GATK_LOCAL_JAR
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx3000m -jar /root/gatk.jar Mutect2 -R gs://fc-0b0cb3ce-e2cb-4aef-a8b2-08e60d78e87c/Canis_lupus_familiaris_assembly3.fasta -I gs://fc-8268e82b-ed61-4e04-a8c9-a95a05c0952e/bda6f5ba-8928-45bf-a6b0-9fe67d8dd9a4/PreProcessingForVariantDiscovery_GATK4/cccdda67-56e1-4363-aa6c-46ce53ef8afd/call-GatherBamFiles/attempt-2/Abrams_cell.bam -tumor Abrams_1 --germline-resource gs://fc-0b0cb3ce-e2cb-4aef-a8b2-08e60d78e87c/canid_wgs_ref.1.0.no_samples.vcf.gz -pon gs://fc-afa03a31-404c-4a93-9f6a-31b673db5c69/b92f3c35-5813-455b-94dc-3de3b54f5f98/Mutect2_Panel/c9f21d8a-384e-4d17-a6f8-79a502698827/call-MergeVCFs/1-Mutect2_PON_2019-07-25T22-08-49.vcf -L gs://fc-afa03a31-404c-4a93-9f6a-31b673db5c69/f2138b33-3918-4f8a-9b87-1823a0084ac3/Mutect2/c4844164-ecad-4878-9e5d-cd134a7fb40d/call-SplitIntervals/glob-0fc990c5ca95eebc97c4c204e3e303e1/0000-scattered.interval_list -O output.vcf --f1r2-tar-gz f1r2.tar.gz --af-of-alleles-not-in-resource 0.0007 --downsampling-stride 20 --max-reads-per-alignment-start 6 --max-suspicious-reads-per-alignment-start 6`

The germline resource is a VCF of approximately 80 million SNPs and indels (including multi allelic sites) called from a large number of canine WGS. It is formatted as a VCF with no sample information:

chr1    240     .       TG      T       464.40  PASS    AC=4;AF=0.011;AN=332;BaseQRankSum=0.674;ClippingRankSum=0;DP=14798;ExcessHet=0.0026;FS=5.63;InbreedingCoeff=-0.005;MLEAC=14;MLEAF=0.017;MQ=7.49;MQRankSum=-0.967;QD=22.11;ReadPosRankSum=0.967;SOR=3.18

The VCF for variants for contamination is a subset of this VCF, with only biallelic SNPs with AF between 0.01 and 0.2. Initially, it was formatted the same as the above file.

As part of debugging, I tried removing everything from the INFO field of the variants for contamination file, except allele frequency, and I tried using that simplified VCF both for the germline resource and the variants for contamination file. This seemed to fix the index out of bounds error, but the job then failed at the filtering step, with the following error:

java.lang.IllegalArgumentException: log10p: Log10-probability must be 0 or less
    at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:724)
    at org.broadinstitute.hellbender.utils.MathUtils.log10BinomialProbability(MathUtils.java:934)
    at org.broadinstitute.hellbender.utils.MathUtils.binomialProbability(MathUtils.java:927)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ContaminationFilter.calculateErrorProbability(ContaminationFilter.java:56)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2VariantFilter.errorProbability(Mutect2VariantFilter.java:15)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.lambda$new$1(ErrorProbabilities.java:19)
    at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1321)
    at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.<init>(ErrorProbabilities.java:19)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.accumulateData(Mutect2FilteringEngine.java:141)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.nthPassApply(FilterMutectCalls.java:146)
    at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverse$0(MultiplePassVariantWalker.java:40)
    at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverseVariants$1(MultiplePassVariantWalker.java:77)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
    at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverseVariants(MultiplePassVariantWalker.java:75)
    at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:40)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
    at org.broadinstitute.hellbender.Main.main(Main.java:291)
Using GATK jar /root/gatk.jar defined in environment variable GATK_LOCAL_JAR
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx6500m -jar /root/gatk.jar FilterMutectCalls -V gs://fc-afa03a31-404c-4a93-9f6a-31b673db5c69/0bbb4e0e-7293-4ce5-b81f-d722fcec561a/Mutect2/223610c8-ec63-4439-b339-9503ceb80828/call-MergeVCFs/Abrams_cell-unfiltered.vcf -R gs://fc-0b0cb3ce-e2cb-4aef-a8b2-08e60d78e87c/Canis_lupus_familiaris_assembly3.fasta -O Abrams_cell-filtered.vcf --contamination-table /cromwell_root/fc-afa03a31-404c-4a93-9f6a-31b673db5c69/0bbb4e0e-7293-4ce5-b81f-d722fcec561a/Mutect2/223610c8-ec63-4439-b339-9503ceb80828/call-CalculateContamination/contamination.table --tumor-segmentation /cromwell_root/fc-afa03a31-404c-4a93-9f6a-31b673db5c69/0bbb4e0e-7293-4ce5-b81f-d722fcec561a/Mutect2/223610c8-ec63-4439-b339-9503ceb80828/call-CalculateContamination/segments.table --ob-priors /cromwell_root/fc-afa03a31-404c-4a93-9f6a-31b673db5c69/0bbb4e0e-7293-4ce5-b81f-d722fcec561a/Mutect2/223610c8-ec63-4439-b339-9503ceb80828/call-LearnReadOrientationModel/artifact-priors.tar.gz -stats /cromwell_root/fc-afa03a31-404c-4a93-9f6a-31b673db5c69/0bbb4e0e-7293-4ce5-b81f-d722fcec561a/Mutect2/223610c8-ec63-4439-b339-9503ceb80828/call-MergeStats/merged.stats --filtering-stats filtering.stats --min-median-read-position 10

Both of these tests were run on an interval that included a single chromosome (approximately 24Mb).

Thank you for your help!

Best,
Kate

Issue · Github
by SChaluvadi

Issue Number
6098
State
open
Last Updated
Assignee
Array

Answers

Sign In or Register to comment.