ASEReadCounter error

Hi, I am having a problem similar to the one in this thread.
I am running ASEReadCounter on RNA-seq data and I get this error:

  • gatk ASEReadCounter -I /mnt/beegfs/Steph_WKDIR/1XXXXXXX_Single.bam -V filtered_Phased_1831.vcf.gz -R /mnt/XXXXXXs/Genomes/genome_hg19/hg19.fa -O ASE_1831.csv
    14:56:58.684 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/XXXXXXX/tools/gatk-4.1.3.0/gatk-package-4.1.3.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Oct 25, 2019 2:57:00 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    14:57:00.552 INFO ASEReadCounter - ------------------------------------------------------------
    14:57:00.553 INFO ASEReadCounter - The Genome Analysis Toolkit (GATK) v4.1.3.0
    14:57:00.554 INFO ASEReadCounter - For support and documentation go to https://software.broadinstitute.org/gatk/
    14:57:00.555 INFO ASEReadCounter - Executing as [email protected] on Linux v3.10.0-514.el7.x86_64 amd64
    14:57:00.556 INFO ASEReadCounter - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_102-b14
    14:57:00.557 INFO ASEReadCounter - Start Date/Time: October 25, 2019 2:56:58 PM PDT
    14:57:00.558 INFO ASEReadCounter - ------------------------------------------------------------
    14:57:00.559 INFO ASEReadCounter - ------------------------------------------------------------
    14:57:00.560 INFO ASEReadCounter - HTSJDK Version: 2.20.1
    14:57:00.561 INFO ASEReadCounter - Picard Version: 2.20.5
    14:57:00.562 INFO ASEReadCounter - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    14:57:00.563 INFO ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    14:57:00.563 INFO ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    14:57:00.564 INFO ASEReadCounter - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    14:57:00.565 INFO ASEReadCounter - Deflater: IntelDeflater
    14:57:00.566 INFO ASEReadCounter - Inflater: IntelInflater
    14:57:00.570 INFO ASEReadCounter - GCS max retries/reopens: 20
    14:57:00.571 INFO ASEReadCounter - Requester pays: disabled
    14:57:00.571 INFO ASEReadCounter - Initializing engine
    WARNING: BAM index file /mnt/XXXXE_Single.bai is older than BAM /mnt/XXXXX_WKDIR/1831_CD4_NAIVE_Single.bam
    14:57:01.356 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/XXXXX/filtered_Phased_1831.vcf.gz
    14:57:01.521 INFO ASEReadCounter - Done initializing engine
    14:57:01.523 INFO ProgressMeter - Starting traversal
    14:57:01.524 INFO ProgressMeter - Current Locus Elapsed Minutes Loci Processed Loci/Minute
    14:57:06.958 INFO ASEReadCounter - Shutting down engine
    [October 25, 2019 2:57:06 PM PDT] org.broadinstitute.hellbender.tools.walkers.rnaseq.ASEReadCounter done. Elapsed time: 0.14 minutes.
    Runtime.totalMemory()=3250061312

A USER ERROR has occurred: More then one variant context at position: chr1:11125729


Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Using GATK jar /......and so on. Please do not pay attention to the paths :)

1- I am sure that my VCF does not have duplicate variants. I used SelectVariants to remove multiallelic variants and awk to remove variants duplicated by location. For example, this is my output for this location:
$ zcat myvcf.vcf.gz |grep 11125729
chr1 11125729 rs2039841:11125729:T:C T C . PASS . GT 0|1

2- I get ASE output up to this position, so that might rule out a formatting problem?
In fact, this error first came up at a different position
(chr1 9355278 rs4080311:9355278:T:A T A . PASS . GT 0|1),
which I removed before rerunning on the new VCF. This led to more output, but the run then stopped at this locus. I have whole human genomes, so it is not practical to remove loci manually.
Please help. Any input is welcome.
Thanks

Best Answers

  • Steph_UC
    edited October 28 Accepted Answer

    Thanks for your feedback bhanuGandham.
    Update: here is an overview of the solution.
    Removing duplicates by position worked, but the error persists because the SNP at this location is overlapped by an indel: the deletion at chr1:11125727 (REF TTTAG) spans position 11125729.
    gatk SelectVariants -R XXX/hg19/hg19.fa -V XXX_hg19.recode.vcf.gz -sn Mysample -L chr1:11125729 -O XXX/chr1_11125729.vcf

    CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Mysample
    chr1 11125727 rs138698709:11125727:TTTAG:T TTTAG T . PASS AC=1;AF=0.500;AN=2 GT 0|1
    chr1 11125729 rs2039841:11125729:T:C T C . PASS AC=1;AF=0.500;AN=2 GT 0|1

    I am now running SelectVariants with the -select-type SNP option on the whole dataset before ASEReadCounter, and will update if that works.
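
    For anyone who wants to list such sites before running ASEReadCounter, here is a minimal Python sketch. It flags every record that starts inside the reference span of an earlier record, which is the situation shown above, and it also catches exact duplicate positions. The function name `find_overlapped_positions` is made up for this example. The sketch assumes the VCF is sorted by chromosome and position, and that this kind of overlap is what triggers the error, which is inferred from this thread rather than confirmed against the GATK source.

```python
from typing import Iterable, List, Tuple


def find_overlapped_positions(vcf_lines: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Return (chrom, pos, id) for records that start inside the REF span
    of an earlier record on the same chromosome.

    Hypothetical helper; assumes the input is sorted by chromosome and position.
    """
    hits = []
    span_chrom = None
    span_end = 0  # last reference base covered by any earlier record on span_chrom
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.split()
        chrom, pos, vid, ref = fields[0], int(fields[1]), fields[2], fields[3]
        if chrom != span_chrom:
            span_chrom, span_end = chrom, 0  # new chromosome: reset the span
        if pos <= span_end:
            hits.append((chrom, pos, vid))
        # A record covers reference bases pos .. pos + len(REF) - 1
        span_end = max(span_end, pos + len(ref) - 1)
    return hits


# The two records from the SelectVariants output above
records = [
    "chr1 11125727 rs138698709:11125727:TTTAG:T TTTAG T . PASS AC=1;AF=0.500;AN=2 GT 0|1",
    "chr1 11125729 rs2039841:11125729:T:C T C . PASS AC=1;AF=0.500;AN=2 GT 0|1",
]
print(find_overlapped_positions(records))
# prints [('chr1', 11125729, 'rs2039841:11125729:T:C')]
```

    For a real file, the lines could come from `gzip.open("myvcf.vcf.gz", "rt")` instead of the in-memory list. Restricting the VCF to SNPs, as described above, removes the indel and with it the overlap reported here.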
