ArrayIndexOutOfBoundsException Mutect2

[Sorry for the repost - I can't seem to be able to edit my original question - asked yesterday]

I'm trying to run Mutect2 (GATK v3.7-0-gcfedb67, installed via conda) on a tumour-normal pair with a panel of normals.

As a first step (following https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_cancer_m2_MuTect2.php)
I'm calling each normal sample individually, and then I want to create a PON to run against my tumour sample.

However, when I run:

gatk -T MuTect2 -R <genome.fa> \
    -I:tumor <normal1.bam> \
    --artifact_detection_mode \
    -o <normal1_normal.vcf>

I get the error:

INFO  15:49:51,810 HelpFormatter - --------------------------------------------------------------------------------
INFO  15:49:51,813 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.7-0-gcfedb67, Compiled 2016/12/12 11:21:18
INFO  15:49:51,814 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
INFO  15:49:51,814 HelpFormatter - For support and documentation go to https://software.broadinstitute.org/gatk
INFO  15:49:51,815 HelpFormatter - [Tue Jul 04 15:49:51 CEST 2017] Executing on Linux 3.2.0-0.bpo.4-amd64 amd64
INFO  15:49:51,815 HelpFormatter - OpenJDK 64-Bit Server VM 1.8.0_92-b15
INFO  15:49:51,819 HelpFormatter - Program Args: -T MuTect2 -R /data/kdi_prod/project_result/948/01.00/Analysis/Genomes/Dmel_6/gatk/dmel6.12.fa -I:tumor /data/kdi_prod/project_result/948/01.00/Analysis/Bwa/HUM/HUM-3.tagged.filt.SC.RG.bam --artifact_detection_mode -o /data/kdi_prod/project_result/948/01.00/Analysis/Analysis/Mutect2/HUM/HUM-3.normals.vcf
INFO  15:49:51,840 HelpFormatter - Executing as nriddifo@bi-calc00 on Linux 3.2.0-0.bpo.4-amd64 amd64; OpenJDK 64-Bit Server VM 1.8.0_92-b15.
INFO  15:49:51,842 HelpFormatter - Date/Time: 2017/07/04 15:49:51
INFO  15:49:51,843 HelpFormatter - --------------------------------------------------------------------------------
INFO  15:49:51,843 HelpFormatter - --------------------------------------------------------------------------------
INFO  15:49:51,891 GenomeAnalysisEngine - Strictness is SILENT
INFO  15:49:53,215 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO  15:49:53,223 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO  15:49:53,387 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.16
INFO  15:49:54,329 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO  15:49:55,204 GenomeAnalysisEngine - Done preparing for traversal
INFO  15:49:55,205 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO  15:49:55,207 ProgressMeter -                 |      processed |    time |         per 1M |           |   total | remaining
INFO  15:49:55,208 ProgressMeter -        Location | active regions | elapsed | active regions | completed | runtime |   runtime
INFO  15:49:55,620 MuTect2 - Using global mismapping rate of 45 => -4.5 in log10 likelihood units
Using AVX accelerated implementation of PairHMM
INFO  15:49:56,535 VectorLoglessPairHMM - libVectorLoglessPairHMM unpacked successfully from GATK jar file
INFO  15:49:56,535 VectorLoglessPairHMM - Using vectorized implementation of PairHMM
##### ERROR --
##### ERROR stack trace
java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at org.broadinstitute.gatk.utils.clipping.ClippingOp.hardClip(ClippingOp.java:380)
    at org.broadinstitute.gatk.utils.clipping.ClippingOp.apply(ClippingOp.java:117)
    at org.broadinstitute.gatk.utils.clipping.ReadClipper.clipRead(ReadClipper.java:157)
    at org.broadinstitute.gatk.utils.clipping.ReadClipper.hardClipSoftClippedBases(ReadClipper.java:334)
    at org.broadinstitute.gatk.utils.clipping.ReadClipper.hardClipSoftClippedBases(ReadClipper.java:337)
    at org.broadinstitute.gatk.tools.walkers.cancer.m2.MuTect2.finalizeActiveRegion(MuTect2.java:994)
    at org.broadinstitute.gatk.tools.walkers.cancer.m2.MuTect2.assembleReads(MuTect2.java:951)
    at org.broadinstitute.gatk.tools.walkers.cancer.m2.MuTect2.map(MuTect2.java:581)
    at org.broadinstitute.gatk.tools.walkers.cancer.m2.MuTect2.map(MuTect2.java:171)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:709)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:705)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:274)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78)
    at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:98)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:316)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:123)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:108)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 3.7-0-gcfedb67):
##### ERROR
##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions https://software.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Code exception (see stack trace for error itself)
##### ERROR ------------------------------------------------------------------------------------------

There seem to be a fair few problems in other tools that generate this error - but I haven't seen anything specific to Mutect2.

Any advice would be greatly appreciated!

Answers

  • shlee, Cambridge (Member, Broadie, Moderator)

    Hi @nriddiford,

    Does your BAM validate with ValidateSamFile?

  • nriddiford, Paris (Member)

    Hi Shlee,

    Actually, the original file I was using did not validate with ValidateSamFile (Mate not found for paired read), but I have since tried with BAMs that pass ValidateSamFile, and get the same error as above:

    INFO  15:37:09,306 HelpFormatter - --------------------------------------------------------------------------------
    INFO  15:37:09,312 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.7-0-gcfedb67, Compiled 2016/12/12 11:21:18
    INFO  15:37:09,313 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
    INFO  15:37:09,313 HelpFormatter - For support and documentation go to https://software.broadinstitute.org/gatk
    INFO  15:37:09,314 HelpFormatter - [Wed Jul 12 15:37:09 CEST 2017] Executing on Linux 3.2.0-0.bpo.4-amd64 amd64
    INFO  15:37:09,315 HelpFormatter - OpenJDK 64-Bit Server VM 1.8.0_92-b15
    INFO  15:37:09,324 HelpFormatter - Program Args: -T MuTect2 -R /data/kdi_prod/project_result/948/01.00/Analysis/Genomes/Dmel_6/gatk/dmel6.12.fa -I:tumor HUM-1.tagged.RG.bam --artifact_detection_mode -o HUM-1.normals.vcf
    INFO  15:37:09,347 HelpFormatter - Executing as nriddifo@bi-calc00 on Linux 3.2.0-0.bpo.4-amd64 amd64; OpenJDK 64-Bit Server VM 1.8.0_92-b15.
    INFO  15:37:09,348 HelpFormatter - Date/Time: 2017/07/12 15:37:09
    INFO  15:37:09,348 HelpFormatter - --------------------------------------------------------------------------------
    INFO  15:37:09,349 HelpFormatter - --------------------------------------------------------------------------------
    INFO  15:37:09,383 GenomeAnalysisEngine - Strictness is SILENT
    INFO  15:37:10,303 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO  15:37:10,312 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO  15:37:10,462 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.15
    INFO  15:37:11,348 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
    INFO  15:37:12,039 GenomeAnalysisEngine - Done preparing for traversal
    INFO  15:37:12,040 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO  15:37:12,040 ProgressMeter -                 |      processed |    time |         per 1M |           |   total | remaining
    INFO  15:37:12,041 ProgressMeter -        Location | active regions | elapsed | active regions | completed | runtime |   runtime
    INFO  15:37:12,375 MuTect2 - Using global mismapping rate of 45 => -4.5 in log10 likelihood units
    ##### ERROR --
    ##### ERROR stack trace
    java.lang.ArrayIndexOutOfBoundsException
        at java.lang.System.arraycopy(Native Method)
        at org.broadinstitute.gatk.utils.clipping.ClippingOp.hardClip(ClippingOp.java:380)
        at org.broadinstitute.gatk.utils.clipping.ClippingOp.apply(ClippingOp.java:117)
        at org.broadinstitute.gatk.utils.clipping.ReadClipper.clipRead(ReadClipper.java:157)
        at org.broadinstitute.gatk.utils.clipping.ReadClipper.hardClipSoftClippedBases(ReadClipper.java:334)
        at org.broadinstitute.gatk.utils.clipping.ReadClipper.hardClipSoftClippedBases(ReadClipper.java:337)
        at org.broadinstitute.gatk.tools.walkers.cancer.m2.MuTect2.finalizeActiveRegion(MuTect2.java:994)
        at org.broadinstitute.gatk.tools.walkers.cancer.m2.MuTect2.assembleReads(MuTect2.java:951)
        at org.broadinstitute.gatk.tools.walkers.cancer.m2.MuTect2.map(MuTect2.java:581)
        at org.broadinstitute.gatk.tools.walkers.cancer.m2.MuTect2.map(MuTect2.java:171)
        at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:709)
        at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:705)
        at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
        at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
        at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:274)
        at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78)
        at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:98)
        at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:316)
        at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:123)
        at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
        at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
        at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:108)
    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR A GATK RUNTIME ERROR has occurred (version 3.7-0-gcfedb67):
    ##### ERROR
    ##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ##### ERROR Visit our website and forum for extensive documentation and answers to
    ##### ERROR commonly asked questions https://software.broadinstitute.org/gatk
    ##### ERROR
    ##### ERROR MESSAGE: Code exception (see stack trace for error itself)
    ##### ERROR ------------------------------------------------------------------------------------------
    
  • shlee, Cambridge (Member, Broadie, Moderator)

    @nriddiford,

    The error

    java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at org.broadinstitute.gatk.utils.clipping.ClippingOp.hardClip(ClippingOp.java:380)

    sounds like either (i) there is a mismatch between the CIGAR string and the read bases or (ii) we have a bug in the code.

    Would you mind prepping a bug report for us with the BAM that validates? Instructions are in Article#1894. We can look into the error better on our side if we can recapitulate it using our debugger.

    I have to mention to you that the new GATK4 Mutect2 has cleaner code and it's possible you may not encounter this error with GATK4-Mutect2. Remember that both GATK3 and GATK4's versions are in BETA. GATK4 syntax is different and Mutect2 functionalities have been split into two different tools--Mutect2 and FilterMutectCalls. There are some other differences that you should consider. I have just updated the workshop Mutect2 presentation (to be presented in the UK workshop going on now) and so you can start from there if you are interested. Let me attach a PDF of the slides for you.
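
Possibility (i) above, a mismatch between the CIGAR string and the read bases, can be checked directly on the SAM text. The following is a minimal sketch, not part of GATK or Picard: the function names are hypothetical, and it assumes tab-separated SAM records such as those printed by `samtools view`. It counts the bases that each CIGAR claims to consume from the read and compares that with the lengths of the SEQ and QUAL fields.

```python
import re

# CIGAR operations that consume read (query) bases, per the SAM specification.
QUERY_CONSUMING = set("MIS=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar):
    """Return the number of read bases implied by a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in QUERY_CONSUMING)


def find_mismatches(sam_lines):
    """Yield (read_name, cigar_len, seq_len, qual_len) for inconsistent records."""
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        name, cigar, seq, qual = fields[0], fields[5], fields[9], fields[10]
        if cigar == "*" or seq == "*":  # nothing to compare
            continue
        expected = cigar_query_length(cigar)
        qual_len = len(seq) if qual == "*" else len(qual)
        if expected != len(seq) or qual_len != len(seq):
            yield name, expected, len(seq), qual_len


# Hypothetical records for illustration only: the second one claims
# 15 read bases in its CIGAR (5S10M) but carries only 12.
demo = [
    "r1\t0\tchr2L\t100\t60\t5S10M\t*\t0\t0\t" + "A" * 15 + "\t" + "I" * 15,
    "r2\t0\tchr2L\t100\t60\t5S10M\t*\t0\t0\t" + "A" * 12 + "\t" + "I" * 12,
]
for hit in find_mismatches(demo):
    print(hit)
```

Any record reported this way would be a candidate for the bug report, since hard-clipping soft-clipped bases relies on the CIGAR and the base array agreeing in length.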

  • nriddiford, Paris (Member)
    Accepted Answer

    Hi Shlee,

    After a bit of digging around I've found that the BAM files were the problem. I'm adding custom tags to the reads during alignment, and apparently these don't get on well with Mutect2. I've realigned with bwa and the run completes.
    Thanks for the help - you can mark this issue as closed.

    Nick

  • shlee, Cambridge (Member, Broadie, Moderator)

    Glad you worked that out @nriddiford. I should think our tools would ignore custom tags so long as the records meet SAM/BAM format specifications. To help improve our tools, would you mind posting some example records that cause the error? Much appreciated.
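
One way to narrow down which records to post is to tally the locally defined tags in the BAM. The sketch below is an assumption-laden illustration rather than a GATK utility: the function names are hypothetical, and it relies on the SAM specification's rule that tags starting with X, Y or Z, or containing a lowercase letter, are reserved for local use. Whether such tags actually trigger the exception is the poster's report and is not confirmed in this thread.

```python
from collections import Counter


def is_local_tag(tag):
    """True if a two-character tag falls in the SAM local-use namespace."""
    return tag[0] in "XYZ" or any(ch.islower() for ch in tag)


def count_local_tags(sam_lines):
    """Count occurrences of local-use tags across SAM alignment records."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        for field in fields[11:]:  # optional fields follow the 11 mandatory ones
            tag = field.split(":", 1)[0]
            if len(tag) == 2 and is_local_tag(tag):
                counts[tag] += 1
    return counts


# Hypothetical record for illustration only; "bc" stands in for a custom tag.
demo = [
    "r1\t0\tchr2L\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"
    "\tNM:i:0\tXS:i:0\tbc:Z:ACGT\tRG:Z:HUM-1",
]
print(count_local_tags(demo))
```

Note that aligners such as bwa also emit tags in this namespace (for example XS), so a tag appearing in the tally is only a starting point for comparing the failing BAM against the realigned one.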
