ASEReadCounter Java error

heskett, Portland, Oregon, USA (Member)

I'm running ASEReadCounter on an RNA-seq BAM that has been through duplicate marking, read-group addition, and SplitNCigarReads. The Java errors below give the user nothing to go on:

14:24:03.421 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/groups/Spellmandata/heskett/packages/share/gatk4-4.0.11.0-0/gatk-package-4.0.11.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
14:24:05.114 INFO  ASEReadCounter - ------------------------------------------------------------
14:24:05.114 INFO  ASEReadCounter - The Genome Analysis Toolkit (GATK) v4.0.11.0
14:24:05.114 INFO  ASEReadCounter - For support and documentation go to https://software.broadinstitute.org/gatk/
14:24:05.115 INFO  ASEReadCounter - Executing as [email protected] on Linux v3.10.0-862.14.4.el7.x86_64 amd64
14:24:05.115 INFO  ASEReadCounter - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
14:24:05.116 INFO  ASEReadCounter - Start Date/Time: March 8, 2019 2:24:03 PM PST
14:24:05.116 INFO  ASEReadCounter - ------------------------------------------------------------
14:24:05.116 INFO  ASEReadCounter - ------------------------------------------------------------
14:24:05.117 INFO  ASEReadCounter - HTSJDK Version: 2.16.1
14:24:05.117 INFO  ASEReadCounter - Picard Version: 2.18.13
14:24:05.117 INFO  ASEReadCounter - HTSJDK Defaults.COMPRESSION_LEVEL : 2
14:24:05.118 INFO  ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
14:24:05.118 INFO  ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
14:24:05.118 INFO  ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
14:24:05.118 INFO  ASEReadCounter - Deflater: IntelDeflater
14:24:05.118 INFO  ASEReadCounter - Inflater: IntelInflater
14:24:05.119 INFO  ASEReadCounter - GCS max retries/reopens: 20
14:24:05.119 INFO  ASEReadCounter - Requester pays: disabled
14:24:05.119 INFO  ASEReadCounter - Initializing engine
14:24:05.581 INFO  FeatureManager - Using codec VCFCodec to read file file:///home/groups/Spellmandata/heskett/replication.rnaseq/scripts/../platinum.genome/NA12878.nochr.vcf
14:24:05.604 INFO  ASEReadCounter - Done initializing engine
contig  position    variantID   refAllele   altAllele   refCount    altCount    totalCount  lowMAPQDepth    lowBaseQDepth   rawDepth    otherBases  improperPairs
14:24:05.604 INFO  ProgressMeter - Starting traversal
14:24:05.604 INFO  ProgressMeter -        Current Locus  Elapsed Minutes        Loci Processed      Loci/Minute
14:24:05.638 INFO  ASEReadCounter - Shutting down engine
[March 8, 2019 2:24:05 PM PST] org.broadinstitute.hellbender.tools.walkers.rnaseq.ASEReadCounter done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=1859649536
java.lang.ArrayIndexOutOfBoundsException: 0
    at org.broadinstitute.hellbender.engine.ReferenceContext.getBase(ReferenceContext.java:396)
    at org.broadinstitute.hellbender.tools.walkers.rnaseq.ASEReadCounter.apply(ASEReadCounter.java:183)
    at org.broadinstitute.hellbender.engine.LocusWalker.lambda$traverse$0(LocusWalker.java:176)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at org.broadinstitute.hellbender.engine.LocusWalker.traverse(LocusWalker.java:174)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)
Using GATK jar /home/groups/Spellmandata/heskett/packages/share/gatk4-4.0.11.0-0/gatk-package-4.0.11.0-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx20G -jar /home/groups/Spellmandata/heskett/packages/share/gatk4-4.0.11.0-0/gatk-package-4.0.11.0-local.jar ASEReadCounter -I ../alignments/gm12878.rep2Aligned.out.rg.sorted.markdup.bam --variant ../platinum.genome/NA12878.nochr.vcf
srun: error: exanode-3-1: task 0: Exited with exit code 3

Answers

  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)

    Hi @heskett

    1) It looks like you have posted questions in 5 other threads. Please try to post in one thread at a time; it will help us get to your questions faster.
    2) Would you please try running this with the latest GATK 4.1 version? Most of the bugs have been resolved in the latest version. If the error persists, please post the exact command you are using and the entire error log.

  • heskett, Portland, Oregon, USA (Member)
    edited March 11

    OK, updated to 4.1. Here is the whole command and error message.

    The dbSNP VCF has multiple SNPs at single positions, listed as separate rows. GATK currently can't handle this, and SelectVariants restricted to biallelic sites doesn't help because the variants are on different rows.

    Secondly, I wrote a script to remove all duplicate sites, but ASEReadCounter still fails. The records below cause the failure message "A USER ERROR has occurred: More then one variant context at position: 1:10231":

    1 10230 rs775928745 AC A . . ASP;GENEINFO=DDX11L1:100287102;R5;RS=775928745;RSPOS=10231;SAO=0;SSR=0;VC=DIV;VP=0x050000020005000002000200;WGT=1;dbSNPBuildID=144
    1 10231 rs200279319 C A . . ASP;GENEINFO=DDX11L1:100287102;R5;RS=200279319;RSPOS=10231;SAO=0;SSR=0;VC=SNV;VP=0x050000020005000002000100;WGT=1;dbSNPBuildID=137
    1 10234 rs145599635 C T . . ASP;GENEINFO=DDX11L1:100287102;R5;RS=145599635;RSPOS=10234;SAO=0;SSR=0;VC=SNV;VP=0x050000020005000002000100;WGT=1;dbSNPBuildID=134

    ```
    srun --mem 20000 gatk ASEReadCounter -I ../alignments/gm12878.rep1Aligned.out.rg.sorted.markdup.splitn.bam -R /home/groups/Spellmandata/heskett/refs/hg38.10x.nochr.fa -V /home/groups/Spellmandata/heskett/refs/dbsnp.146.hg38.nochr.sorted.biallelic.vcf
    15:54:01.641 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/groups/Spellmandata/heskett/packages/share/gatk4-4.1.0.0-0/gatk-package-4.1.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    15:54:03.349 INFO ASEReadCounter - ------------------------------------------------------------
    15:54:03.349 INFO ASEReadCounter - The Genome Analysis Toolkit (GATK) v4.1.0.0
    15:54:03.350 INFO ASEReadCounter - For support and documentation go to https://software.broadinstitute.org/gatk/
    15:54:03.350 INFO ASEReadCounter - Executing as [email protected] on Linux v3.10.0-862.14.4.el7.x86_64 amd64
    15:54:03.350 INFO ASEReadCounter - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
    15:54:03.350 INFO ASEReadCounter - Start Date/Time: March 11, 2019 3:54:01 PM PDT
    15:54:03.350 INFO ASEReadCounter - ------------------------------------------------------------
    15:54:03.350 INFO ASEReadCounter - ------------------------------------------------------------
    15:54:03.351 INFO ASEReadCounter - HTSJDK Version: 2.18.2
    15:54:03.351 INFO ASEReadCounter - Picard Version: 2.18.25
    15:54:03.351 INFO ASEReadCounter - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    15:54:03.351 INFO ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    15:54:03.351 INFO ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    15:54:03.351 INFO ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    15:54:03.351 INFO ASEReadCounter - Deflater: IntelDeflater
    15:54:03.351 INFO ASEReadCounter - Inflater: IntelInflater
    15:54:03.351 INFO ASEReadCounter - GCS max retries/reopens: 20
    15:54:03.351 INFO ASEReadCounter - Requester pays: disabled
    15:54:03.351 INFO ASEReadCounter - Initializing engine
    15:54:03.818 INFO FeatureManager - Using codec VCFCodec to read file file:///home/groups/Spellmandata/heskett/refs/dbsnp.146.hg38.nochr.sorted.biallelic.vcf
    15:54:04.776 WARN IndexUtils - Feature file "/home/groups/Spellmandata/heskett/refs/dbsnp.146.hg38.nochr.sorted.biallelic.vcf" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
    15:54:05.627 INFO ASEReadCounter - Done initializing engine
    contig position variantID refAllele altAllele refCount altCount totalCount lowMAPQDepth lowBaseQDepth rawDepth otherBases improperPairs
    15:54:05.628 INFO ProgressMeter - Starting traversal
    15:54:05.628 INFO ProgressMeter - Current Locus Elapsed Minutes Loci Processed Loci/Minute
    15:54:05.708 WARN ASEReadCounter - Ignoring site: variant is not het at postion: 1:10055
    15:54:05.713 INFO ASEReadCounter - Shutting down engine
    [March 11, 2019 3:54:05 PM PDT] org.broadinstitute.hellbender.tools.walkers.rnaseq.ASEReadCounter done. Elapsed time: 0.07 minutes.
    Runtime.totalMemory()=1987575808


    A USER ERROR has occurred: More then one variant context at position: 1:10177


    Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
    Using GATK jar /home/groups/Spellmandata/heskett/packages/share/gatk4-4.1.0.0-0/gatk-package-4.1.0.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/groups/Spellmandata/heskett/packages/share/gatk4-4.1.0.0-0/gatk-package-4.1.0.0-local.jar ASEReadCounter -I ../alignments/gm12878.rep1Aligned.out.rg.sorted.markdup.splitn.bam -R /home/groups/Spellmandata/heskett/refs/hg38.10x.nochr.fa -V /home/groups/Spellmandata/heskett/refs/dbsnp.146.hg38.nochr.sorted.biallelic.vcf
    srun: error: exanode-2-44: task 0: Exited with exit code 2
    ```

  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)

    Hi @heskett

    Could you please try using CombineVariants, as recommended in this thread for the same error: https://gatkforums.broadinstitute.org/gatk/discussion/9399/about-asereadcounter

  • heskett, Portland, Oregon, USA (Member)

    CombineVariants is not available in GATK 4, and I'm only working with one VCF file. The example below shows that variants at different sites still give the error.

    1 10230 rs775928745 AC A . . ASP;GENEINFO=DDX11L1:100287102;R5;RS=775928745;RSPOS=10231;SAO=0;SSR=0;VC=DIV;VP=0x050000020005000002000200;WGT=1;dbSNPBuildID=144
    1 10231 rs200279319 C A . . ASP;GENEINFO=DDX11L1:100287102;R5;RS=200279319;RSPOS=10231;SAO=0;SSR=0;VC=SNV;VP=0x050000020005000002000100;WGT=1;dbSNPBuildID=137
    1 10234 rs145599635 C T . . ASP;GENEINFO=DDX11L1:100287102;R5;RS=145599635;RSPOS=10234;SAO=0;SSR=0;VC=SNV;VP=0x050000020005000002000100;WGT=1;dbSNPBuildID=134

    @bhanuGandham said:
    Hi @heskett

    Could you please try using CombineVariants, as recommended in this thread for the same error: https://gatkforums.broadinstitute.org/gatk/discussion/9399/about-asereadcounter
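
The three records quoted in this post sit at different POS values, yet the first is a deletion whose REF allele (`AC`) covers both 10230 and 10231. One possible reading, which this thread does not confirm, is that the tool counts every record whose reference span touches a locus, not only records that share a POS. The sketch below checks a coordinate-sorted VCF for such span overlaps. The function names are hypothetical and the INFO column is abbreviated for the example.

```python
from typing import Iterable, List, Tuple


def parse_span(line: str) -> Tuple[str, int, int, str]:
    """Return (contig, start, end, id) for one VCF data line.

    The end coordinate is POS + len(REF) - 1, so a deletion spans more
    than one reference base.
    """
    fields = line.split(None, 4)
    contig, pos, var_id, ref = fields[0], int(fields[1]), fields[2], fields[3]
    return contig, pos, pos + len(ref) - 1, var_id


def find_overlaps(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """List ID pairs whose reference spans overlap.

    Assumes the input is sorted by contig and position.
    """
    pairs: List[Tuple[str, str]] = []
    active: List[Tuple[str, int, int, str]] = []  # earlier records still in reach
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        contig, start, end, var_id = parse_span(line)
        # Keep only earlier records on this contig that extend to `start`.
        active = [r for r in active if r[0] == contig and r[2] >= start]
        pairs.extend((r[3], var_id) for r in active)
        active.append((contig, start, end, var_id))
    return pairs


# The three records from the post, with INFO shortened for readability.
records = [
    "1 10230 rs775928745 AC A . . VC=DIV",
    "1 10231 rs200279319 C A . . VC=SNV",
    "1 10234 rs145599635 C T . . VC=SNV",
]
print(find_overlaps(records))
```

On these records the check pairs rs775928745 with rs200279319, which would be consistent with an error reported at 1:10231 even though no two rows share a POS.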

  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)

    Hi @heskett

    CombineVariants has not been ported to GATK4 yet, but you could use it from GATK3.

  • heskett, Portland, Oregon, USA (Member)

    CombineVariants is for combining variants from multiple sources. I have one source (the dbSNP database) that has multiple variants at the same position, and this causes GATK4 errors with ASEReadCounter.

  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)

    @heskett

    Would you please post the exact command you used, the version, and the entire error log?

  • heskett, Portland, Oregon, USA (Member)

    OK, I will post this again; it is in my post above. Here is the entire command and the full printout from GATK. If there is a way to get a longer error message, it isn't clear to me.

    gatk ASEReadCounter -I ../alignments/gm12878.rep1Aligned.out.final.bam -R /home/groups/Spellmandata/heskett/refs/hg38.10x.nochr.fa -V /home/groups/Spellmandata/heskett/refs/dbsnp.146.hg38.nochr.sorted.biallelic.vcf

    `12:01:23.271 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/groups/Spellmandata/heskett/packages/share/gatk4-4.1.0.0-0/gatk-package-4.1.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    12:01:24.978 INFO ASEReadCounter - ------------------------------------------------------------
    12:01:24.978 INFO ASEReadCounter - The Genome Analysis Toolkit (GATK) v4.1.0.0
    12:01:24.979 INFO ASEReadCounter - For support and documentation go to https://software.broadinstitute.org/gatk/
    12:01:24.979 INFO ASEReadCounter - Executing as [email protected] on Linux v3.10.0-862.14.4.el7.x86_64 amd64
    12:01:24.979 INFO ASEReadCounter - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
    12:01:24.979 INFO ASEReadCounter - Start Date/Time: March 25, 2019 12:01:23 PM PDT
    12:01:24.979 INFO ASEReadCounter - ------------------------------------------------------------
    12:01:24.979 INFO ASEReadCounter - ------------------------------------------------------------
    12:01:24.980 INFO ASEReadCounter - HTSJDK Version: 2.18.2
    12:01:24.980 INFO ASEReadCounter - Picard Version: 2.18.25

    12:01:24.980 INFO ASEReadCounter - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    12:01:24.980 INFO ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    12:01:24.980 INFO ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    12:01:24.980 INFO ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    12:01:24.980 INFO ASEReadCounter - Deflater: IntelDeflater
    12:01:24.980 INFO ASEReadCounter - Inflater: IntelInflater
    12:01:24.980 INFO ASEReadCounter - GCS max retries/reopens: 20
    12:01:24.980 INFO ASEReadCounter - Requester pays: disabled
    12:01:24.980 INFO ASEReadCounter - Initializing engine
    12:01:25.404 INFO FeatureManager - Using codec VCFCodec to read file file:///home/groups/Spellmandata/heskett/refs/dbsnp.146.hg38.nochr.sorted.biallelic.vcf
    12:01:26.192 WARN IndexUtils - Feature file "/home/groups/Spellmandata/heskett/refs/dbsnp.146.hg38.nochr.sorted.biallelic.vcf" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
    12:01:27.027 INFO ASEReadCounter - Done initializing engine
    12:01:27.027 INFO ProgressMeter - Starting traversal
    12:01:27.027 INFO ProgressMeter - Current Locus Elapsed Minutes Loci Processed Loci/Minute
    12:01:27.098 WARN ASEReadCounter - Ignoring site: variant is not het at postion: 1:10055
    12:01:27.103 INFO ASEReadCounter - Shutting down engine
    [March 25, 2019 12:01:27 PM PDT] org.broadinstitute.hellbender.tools.walkers.rnaseq.ASEReadCounter done. Elapsed time: 0.06 minutes.
    Runtime.totalMemory()=2152726528


    A USER ERROR has occurred: More then one variant context at position: 1:10177


    Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
    Using GATK jar /home/groups/Spellmandata/heskett/packages/share/gatk4-4.1.0.0-0/gatk-package-4.1.0.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/groups/Spellmandata/heskett/packages/share/gatk4-4.1.0.0-0/gatk-package-4.1.0.0-local.jar ASEReadCounter -I ../alignments/gm12878.rep1Aligned.out.final.bam -R /home/groups/Spellmandata/heskett/refs/hg38.10x.nochr.fa -V /home/groups/Spellmandata/heskett/refs/dbsnp.146.hg38.nochr.sorted.biallelic.vcf
    srun: error: exanode-3-4: task 0: Exited with exit code 2`

    As posted above, here are the records in the VCF file at position 1:10177. I have already used SelectVariants restricted to biallelic sites, and it does not remove these. I also tried including only unique positions, and it gave me the same error.

    I greatly appreciate your help.

    1 10177 rs201752861 A C . . ASP;GENEINFO=DDX11L1:100287102;R5;RS=201752861;RSPOS=10177;SAO=0;SSR=0;VC=SNV;VP=0x050000020005000002000100;WGT=1;dbSNPBuildID=137
    1 10177 rs367896724 A AC . . ASP;CAF=0.5747,0.4253;COMMON=1;G5;G5A;GENEINFO=DDX11L1:100287102;KGPhase3;R5;RS=367896724;RSPOS=10177;SAO=0;SSR=0;VC=DIV;VLD;VP=0x050000020005170026000200;WGT=1;dbSNPBuildID=138
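
Filtering on unique POS values alone would keep one of the two rows at 1:10177 and would not notice an indel that reaches into a neighbouring SNP. A stricter pre-filter, sketched below as one possible workaround and not a confirmed fix, keeps only single-base substitutions whose reference span is not shared with any other record. The function name is hypothetical, the input is assumed to be coordinate-sorted, and the example rows abbreviate the INFO column.

```python
from typing import Iterable, List


def keep_isolated_snvs(lines: Iterable[str]) -> List[str]:
    """Keep header lines plus SNV records that overlap no other record."""
    header: List[str] = []
    data: List[str] = []
    for line in lines:
        if line.startswith("#"):
            header.append(line)
        elif line.strip():
            data.append(line)

    # (contig, start, end, is_snv) per record; end = POS + len(REF) - 1.
    spans = []
    for line in data:
        contig, pos, _id, ref, alt = line.split(None, 5)[:5]
        start = int(pos)
        is_snv = len(ref) == 1 and alt in ("A", "C", "G", "T")
        spans.append((contig, start, start + len(ref) - 1, is_snv))

    # Mark every record that shares at least one reference base with another.
    clashing = set()
    active: List[int] = []  # indices of earlier records still in reach
    for i, (contig, start, _end, _snv) in enumerate(spans):
        active = [j for j in active
                  if spans[j][0] == contig and spans[j][2] >= start]
        if active:
            clashing.add(i)
            clashing.update(active)
        active.append(i)

    kept = [line for i, line in enumerate(data)
            if i not in clashing and spans[i][3]]
    return header + kept


# Rows from this thread, INFO shortened for readability.
example = [
    "##fileformat=VCFv4.2",
    "1\t10177\trs201752861\tA\tC\t.\t.\tVC=SNV",
    "1\t10177\trs367896724\tA\tAC\t.\t.\tVC=DIV",
    "1\t10230\trs775928745\tAC\tA\t.\t.\tVC=DIV",
    "1\t10231\trs200279319\tC\tA\t.\t.\tVC=SNV",
    "1\t10234\trs145599635\tC\tT\t.\t.\tVC=SNV",
]
for row in keep_isolated_snvs(example):
    print(row)
```

This drops both members of every clashing pair, which is conservative and discards sites that may be real. It also does nothing about the "variant is not het" warning in the log above, which suggests the tool may additionally expect a heterozygous genotype at each site.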
