GATKException: Somehow the requested coordinate is not covered by the read.

JoanGibert, Member
edited September 2018 in Ask the GATK team


I am analysing Ion Torrent targeted panel data. I followed the Best Practices pipeline (https://software.broadinstitute.org/gatk/documentation/article?id=11136), but I get this error when I run Mutect2 on these files.

12:19:15.515 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/soft/EB_repo/bio/sequence/programs/foss/2016b/GATK/!/com/intel/gkl/native/libgkl_compression.so
12:19:17.186 INFO  Mutect2 - ------------------------------------------------------------
12:19:17.187 INFO  Mutect2 - The Genome Analysis Toolkit (GATK) v4.0.4.0
12:19:17.188 INFO  Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
12:19:17.189 INFO  Mutect2 - Executing as [email protected] on Linux v3.10.0-693.21.1.el7.x86_64 amd64
12:19:17.192 INFO  Mutect2 - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_92-b14
12:19:17.194 INFO  Mutect2 - Start Date/Time: August 30, 2018 12:19:15 PM CEST
12:19:17.194 INFO  Mutect2 - ------------------------------------------------------------
12:19:17.194 INFO  Mutect2 - ------------------------------------------------------------
12:19:17.195 INFO  Mutect2 - HTSJDK Version: 2.14.3
12:19:17.196 INFO  Mutect2 - Picard Version: 2.18.2
12:19:17.196 INFO  Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:19:17.197 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:19:17.197 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:19:17.198 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:19:17.198 INFO  Mutect2 - Deflater: IntelDeflater
12:19:17.199 INFO  Mutect2 - Inflater: IntelInflater
12:19:17.199 INFO  Mutect2 - GCS max retries/reopens: 20
12:19:17.199 INFO  Mutect2 - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
12:19:17.200 INFO  Mutect2 - Initializing engine
12:19:19.295 INFO  FeatureManager - Using codec VCFCodec to read file file:///users/genomics/jgibert/Varis/af-only-gnomad.raw.sites.h19.vcf.gz
12:19:20.028 INFO  FeatureManager - Using codec BEDCodec to read file file:///users/genomics/jgibert/Varis/Oncomine_TML.20170222.designed.bed
12:19:21.127 INFO  IntervalArgumentCollection - Processing 1656280 bp from intervals
12:19:21.281 INFO  Mutect2 - Done initializing engine
12:19:22.708 INFO  NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/soft/EB_repo/bio/sequence/programs/foss/2016b/GATK/!/com/intel/gkl/native/libgkl_utils.so
12:19:22.928 INFO  PairHMM - OpenMP multi-threaded AVX-accelerated native PairHMM implementation is not supported
12:19:22.929 WARN  PairHMM - ***WARNING: Machine does not have the AVX instruction set support needed for the accelerated AVX PairHmm. Falling back to the MUCH slower LOGLESS_CACHING implementation!
12:19:23.836 INFO  ProgressMeter - Starting traversal
12:19:23.843 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Regions Processed   Regions/Minute
12:19:47.129 INFO  ProgressMeter -         chr1:6529491              0.4                    20             51.5
12:19:56.279 INFO  PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 17.687161069000002
12:19:56.279 INFO  SmithWatermanAligner - Total compute time in java Smith-Waterman : 4.22 sec
12:19:56.864 INFO  Mutect2 - Shutting down engine
[August 30, 2018 12:19:56 PM CEST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.69 minutes.
org.broadinstitute.hellbender.exceptions.GATKException: Somehow the requested coordinate is not covered by the read. Alignment 6530911 | 55M1I36M
    at org.broadinstitute.hellbender.utils.read.ReadUtils.getReadCoordinateForReferenceCoordinate(ReadUtils.java:815)
    at org.broadinstitute.hellbender.utils.read.ReadUtils.getReadCoordinateForReferenceCoordinate(ReadUtils.java:691)
    at org.broadinstitute.hellbender.utils.read.ReadUtils.getReadCoordinateForReferenceCoordinateUpToEndOfRead(ReadUtils.java:649)
    at org.broadinstitute.hellbender.tools.walkers.annotator.BaseQualityRankSumTest.getReadBaseQuality(BaseQualityRankSumTest.java:43)
    at org.broadinstitute.hellbender.tools.walkers.annotator.BaseQualityRankSumTest.getElementForRead(BaseQualityRankSumTest.java:38)
    at org.broadinstitute.hellbender.tools.walkers.annotator.RankSumTest.getElementForRead(RankSumTest.java:108)
    at org.broadinstitute.hellbender.tools.walkers.annotator.RankSumTest.fillQualsFromLikelihood(RankSumTest.java:86)
    at org.broadinstitute.hellbender.tools.walkers.annotator.RankSumTest.annotate(RankSumTest.java:44)
    at org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:377)
    at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticGenotypingEngine.callMutations(SomaticGenotypingEngine.java:167)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.callRegion(Mutect2Engine.java:182)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2.apply(Mutect2.java:183)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:295)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:271)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:892)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:134)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:179)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:198)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)
Using GATK jar /soft/EB_repo/bio/sequence/programs/foss/2016b/GATK/
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /soft/EB_repo/bio/sequence/programs/foss/2016b/GATK/ Mutect2 --af-of-alleles-not-in-resource 0.0000025 --germline-resource /users/genomics/jgibert/Varis/af-only-gnomad.raw.sites.h19.vcf.gz -A BaseQualityRankSumTest -A ClippingRankSumTest -A MappingQualityRankSumTest -A QualByDepth -A RMSMappingQuality -A ReadPosRankSumTest -A FisherStrand -A StrandOddsRatio -A InbreedingCoeff -I IonXpress_001.bam -R /users/genomics/jgibert/hg19_IOT/hg19_IOT.fasta -tumor IonXpress_001 -L /users/genomics/jgibert/Varis/Oncomine_TML.20170222.designed.bed -bamout /users/genomics/jgibert/data/TML_Pedro_samples/GATK/IonXpress_001_Mutect.bam -O /users/genomics/jgibert/data/TML_Pedro_samples/GATK/IonXpress_001_Mutect.vcf.gz

With other files I get a different error: GATKException: Reference coordinate corresponds to a non-existent base in the read.

I tried the latest version of GATK, but the error persists. Any idea what's happening?

Best Answer


  • shlee, Cambridge (Member, Broadie) ✭✭✭✭✭

    Hi @JoanGibert,

    For starters, you should know our workflows do not support Ion Torrent data. Is there something different or special about your data compared with traditional Ion Torrent data? Or are you performing QC diagnostics on your data?

    You can run ValidateSamFile on your BAMs to diagnose the issue, which I suspect relates to mismatches in expectations for CIGAR, read coordinates and/or reference coordinates. These mismatches in expectations arise from using tool-chains other than those recommended by our Best Practices. For example, we recommend alignment with BWA MEM.

  • Hi @shlee,

    Thanks for the quick answer. I am aware that GATK does not support IOT data; this is just an exploratory analysis. I have actually run other analyses on IOT targeted panel data that gave good results.

    That's why I'm surprised by this error. This is the first time IOT data has thrown it. I followed the Best Practices and used BWA MEM for mapping. Moreover, ValidateSamFile does not report any errors, just a warning that the .bai is older than the BAM, which is due to cluster timings:

    java -jar $EBROOTPICARD/picard.jar ValidateSamFile I=IonXpress_001_snaut.bam
    10:28:08.631 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/soft/EB_repo/bio/sequence/programs/noarch/picard/2.18.12-Java-1.8.0_92/picard.jar!/com/intel/gkl/native/libgkl_compression.so
    [Thu Sep 27 10:28:08 CEST 2018] Executing as [email protected] on Linux 3.10.0-693.5.2.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.18.12-SNAPSHOT
    WARNING: BAM index file /users/genomics/jgibert/data/TML_Pedro_samples/TML_fastq/IonXpress_001_snaut.bai is older than BAM /users/genomics/jgibert/data/TML_Pedro_samples/TML_fastq/IonXpress_001_snaut.bam
    No errors found
    [Thu Sep 27 10:30:03 CEST 2018] picard.sam.ValidateSamFile done. Elapsed time: 1.92 minutes.


  • shlee, Cambridge (Member, Broadie) ✭✭✭✭✭



    org.broadinstitute.hellbender.exceptions.GATKException: Somehow the requested coordinate is not covered by the read. Alignment 6530911 | 55M1I36M

    Would you mind posting the record with alignment start 6530911 and CIGAR string 55M1I36M? What we want to confirm is that the length outlined by the CIGAR string matches the read alignment. The error message implies that these do not match.
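
As a rough illustration of what such a coordinate lookup involves, the Python sketch below walks a CIGAR string and maps a reference coordinate to a 0-based offset in the read. It is not GATK's ReadUtils code: the function name read_offset and its return convention (None when the coordinate falls in a deletion or outside the alignment) are assumptions made for illustration. The only rule it relies on is the SAM specification's table of which CIGAR operators consume read bases and which consume reference bases.

```python
import re

# One CIGAR element: a count followed by an operator letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def read_offset(start, cigar, ref_pos):
    """Return the 0-based read offset aligned to ref_pos, or None.

    start is the 1-based alignment start (the POS column of the record).
    None means the coordinate lies in a deletion or outside the alignment.
    """
    ref = start   # reference coordinate at the start of the current element
    read = 0      # read offset at the start of the current element
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in "M=X":            # consumes both read and reference
            if ref <= ref_pos < ref + n:
                return read + (ref_pos - ref)
            ref += n
            read += n
        elif op in "DN":           # consumes reference only
            if ref <= ref_pos < ref + n:
                return None
            ref += n
        elif op in "IS":           # consumes read only
            read += n
        # H and P consume neither
    return None


first = read_offset(6530911, "55M1I36M", 6530911)
after_insertion = read_offset(6530911, "55M1I36M", 6530966)
past_end = read_offset(6530911, "55M1I36M", 6531002)
print(first, after_insertion, past_end)
```

For the alignment in the error message, the first aligned base maps to offset 0, the first reference base after the 1 bp insertion maps to offset 56, and a coordinate one past the last aligned reference base is not covered, so the lookup returns None.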

    Also, can you tell us something about the pre-processing of your data? For example, I see the label snaut on your BAM which is a label I've used in the past to indicate processing through SortSam and SetNmAndUqTags/SetNmMdAndUqTags. Did you by chance clip (soft or hard) your reads at any point during your pre-processing?

  • JoanGibert, Member
    edited September 2018

    Hi @shlee,

    Here is the alignment record that you requested:

    3E9Q2:00531:02051   1024    chr1    6530911 60  55M1I36M    *   0   0   TCAGCCTCTGGCATTGTGGGTGCTTCTCCGCCCACTGCGGTGGGGGAGTGGGGGCGGGGCTCAGGGCAGGCCCCGCCCCACCCGGCCCCGTC    7<<<<7;9==9=<<7<;>=7=<<<7;880778*5999<=8<<>>?1;<=????3>>[email protected]?>><<;6;<=7<<<2;;;;4;=?7?0888*8;:    MD:Z:13C77  PG:Z:MarkDuplicates RG:Z:IonXpress_001  NM:i:2  UQ:i:27 AS:i:79 XS:i:20

    Regarding the preprocessing, I followed the pipeline posted here: https://software.broadinstitute.org/gatk/documentation/article?id=8017

    In brief: BWA mapping, RevertSam, AddOrReplaceReadGroups, MergeBamAlignment, MarkDuplicates, SortSam, SetNmAndUqTags, BaseRecalibrator, ApplyBQSR and Mutect2

    In the MergeBamAlignment step I specified CLIP_ADAPTERS=false

    Sorry in advance if I misunderstood something in the pipeline and thanks for your help :)

  • shlee, Cambridge (Member, Broadie) ✭✭✭✭✭

    Hi @JoanGibert,

    Nothing is amiss with the read metadata. Looks like you are working with single-ended reads. Bases and base qualities each add up to 92, which your cigar string elements also add up to. The read aligns to the middle of human chromosome 1, away from edges. Finally, this read is marked as a duplicate.
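
The arithmetic behind that check can be reproduced with a short Python sketch. It assumes only the SAM specification's rules: M, I, S, = and X consume read bases, M, D, N, = and X consume reference bases, and FLAG bit 0x400 (1024) marks a duplicate. The helper name cigar_lengths is hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
READ_OPS = set("MIS=X")   # operators that consume read bases
REF_OPS = set("MDN=X")    # operators that consume reference bases
FLAG_DUPLICATE = 0x400    # SAM FLAG bit for PCR or optical duplicates


def cigar_lengths(cigar):
    """Return (read bases, reference bases) spanned by a CIGAR string."""
    read_len = ref_len = 0
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in READ_OPS:
            read_len += n
        if op in REF_OPS:
            ref_len += n
    return read_len, ref_len


# Values taken from the record posted in this thread.
flag, start, cigar = 1024, 6530911, "55M1I36M"

read_len, ref_len = cigar_lengths(cigar)
end = start + ref_len - 1                  # last aligned reference position
is_duplicate = bool(flag & FLAG_DUPLICATE)
print(read_len, ref_len, end, is_duplicate)
```

The CIGAR accounts for 92 read bases (55 + 1 + 36) but only 91 reference bases, because the inserted base has no reference counterpart, so the alignment spans chr1:6530911-6531001.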

    Given some of your other files give the error:

    Reference coordinate corresponds to a non-existent base in the read.

    And you are using a custom reference in your Mutect2 command:

    -R /users/genomics/jgibert/hg19_IOT/hg19_IOT.fasta

    Is there any possibility you are using a reference that is different from the reference you used in BWA mapping? You can check your @SQ header lines in the BAM against the reference dictionary to confirm.
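
One way to make that comparison less error-prone than reading the lines by eye is sketched below in Python. It assumes you have already dumped the BAM header and the reference .dict file to text (both use tab-separated SAM header lines); the function names parse_sq and sq_mismatches are hypothetical, and the "reference dictionary" in the example is deliberately altered to show what a mismatch looks like. The sketch compares names, lengths and MD5s only, not contig order.

```python
def parse_sq(header_text):
    """Map sequence name -> (length, MD5 or None) from SAM-style header text."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type is TAG:VALUE; values may contain ':'.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
        contigs[fields["SN"]] = (int(fields["LN"]), fields.get("M5"))
    return contigs


def sq_mismatches(bam_header, ref_dict):
    """List (name, bam_entry, ref_entry) for every contig that differs."""
    bam, ref = parse_sq(bam_header), parse_sq(ref_dict)
    return [
        (name, bam.get(name), ref.get(name))
        for name in sorted(set(bam) | set(ref))
        if bam.get(name) != ref.get(name)
    ]


# Two @SQ lines in the style of a BAM header.
bam_header = (
    "@HD\tVN:1.5\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:249250621\tM5:1b22b98cdeb4a9304cb5d48026a85128\n"
    "@SQ\tSN:chrM\tLN:16569\tM5:c68f52674c9fb33aef52dcf399755519\n"
)
# Hypothetical dictionary with a different chrM length and no MD5.
altered_dict = (
    "@HD\tVN:1.5\n"
    "@SQ\tSN:chr1\tLN:249250621\tM5:1b22b98cdeb4a9304cb5d48026a85128\n"
    "@SQ\tSN:chrM\tLN:16571\n"
)

same = sq_mismatches(bam_header, bam_header)
different = sq_mismatches(bam_header, altered_dict)
print(same)
print(different)
```

An empty list means every contig name, length and MD5 agrees; any entry in the list points at a contig worth investigating.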

    Another option that comes to mind is to see if Mutect2 still gives the error for a pre-processed BAM that hasn't undergone MergeBamAlignment. Are all of your reads single ended or do you also have paired end reads?

  • Hi @shlee,

    Here is the header of the bam file:

    @HD VN:1.5 SO:coordinate
    @SQ SN:chr1 LN:249250621 M5:1b22b98cdeb4a9304cb5d48026a85128 UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr2 LN:243199373 M5:a0d9851da00400dec1098a9255ac712e UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr3 LN:198022430 M5:fdfd811849cc2fadebc929bb925902e5 UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr4 LN:191154276 M5:23dccd106897542ad87d2765d28a19a1 UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr5 LN:180915260 M5:0740173db9ffd264d728f32784845cd7 UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr6 LN:171115067 M5:1d3a93a248d92a729ee764823acbbc6b UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr7 LN:159138663 M5:618366e953d6aaad97dbe4777c29375e UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr8 LN:146364022 M5:96f514a9929e410c6651697bded59aec UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr9 LN:141213431 M5:3e273117f15e0a400f01055d9f393768 UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr10 LN:135534747 M5:988c28e000e84c26d552359af1ea2e1d UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr11 LN:135006516 M5:98c59049a2df285c76ffb1c6db8f8b96 UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr12 LN:133851895 M5:51851ac0e1a115847ad36449b0015864 UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr13 LN:115169878 M5:283f8d7892baa81b510a015719ca7b0b UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr14 LN:107349540 M5:98f3cae32b2a2e9524bc19813927542e UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr15 LN:102531392 M5:e5645a794a8238215b2cd77acb95a078 UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr16 LN:90354753 M5:fc9b1a7b42b97a864f56b348b06095e6 UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr17 LN:81195210 M5:351f64d4f4f9ddd45b35336ad97aa6de UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr18 LN:78077248 M5:b15d4b2d29dde9d3e4f93d1d0f2cbc9c UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr19 LN:59128983 M5:1aacd71f30db8e561810913e0b72636d UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr20 LN:63025520 M5:0dec9660ec1efaaf33281c0d5ea2560f UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr21 LN:48129895 M5:2979a6085bfe28e3ad6f552f361ed74d UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chr22 LN:51304566 M5:a718acaa6135fdca8357d5bfe94211dd UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chrX LN:155270560 M5:7e0e2e580297b7764e31dbc80c2540dd UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chrY LN:59373566 M5:1fa3474750af0948bdf97d5a0ee52e51 UR:file:/home/jgibert/hg19_IOT.fasta
    @SQ SN:chrM LN:16569 M5:c68f52674c9fb33aef52dcf399755519 UR:file:/home/jgibert/hg19_IOT.fasta
    @RG ID:IonXpress_001 LB:LIB PL:ILLUMINA SM:IonXpress_001 PU:JG
    @PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:bwa mem /users/genomics/jgibert/hg19_IOT/hg19_IOT.fasta IonXpress_001.fastq
    @PG ID:MarkDuplicates VN:2.2.4(920e3247c340720b009f2398c1b93cce132c9bed_1461793281) CL:picard.sam.markduplicates.MarkDuplicates INPUT=[/users/genomics/jgibert/data/TML_Pedro_samples/TML_fastq/IonXpress_001_m.bam] OUTPUT=/users/genomics/jgibert/data/TML_Pedro_samples/TML_fastq/IonXpress_001_MD.bam METRICS_FILE=/users/genomics/jgibert/data/TML_Pedro_samples/TML_fastq/IonXpress_001_MD.bam.txt ASSUME_SORT_ORDER=queryname OPTICAL_DUPLICATE_PIXEL_DISTANCE=2500 MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag REMOVE_DUPLICATES=false ASSUME_SORTED=false DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX= VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json PN:MarkDuplicates PP:bwa

    As you can see in the @SQ lines, the reference genome is the same for both.

    All my files are single-end data. Should I try to run Mutect2 without doing the MergeBamAlignment step?

  • shlee, Cambridge (Member, Broadie) ✭✭✭✭✭


    Hmm, it's not clear what is going on. There were some regressions in the Mutect2 code a month ago, so either try to use a different version of GATK or try a sans-MergeBamAlignment preprocessed BAM. Sorry I cannot be more helpful.

  • What I tried so far:

    1) GATK v4.0.4.0 and GATK v4.0.8.1
    2) Mapping with bwa mem and tmap (IOT aligner)

    I just performed the alignment and converted the output to a .bam file with samtools view. Then I ran AddOrReplaceReadGroups, samtools sort and samtools index.

    Running Mutect2 throws exactly the same error; only the reported position differs between bwa mem and tmap.

    Any other idea? :'(

  • shlee, Cambridge (Member, Broadie) ✭✭✭✭✭

    Hi @JoanGibert,

    Ok, in this case, it will be best for us to take a look at your data. Do you mind sending us a snippet of your data that recapitulates the error? You can upload to our bug report FTP site using directions in https://software.broadinstitute.org/gatk/guide/article?id=1894.

  • Hi @shlee,

    I uploaded a zip file named Joan_GIbert_data with all the information you need to reproduce the error. I uploaded the fastq file, which I aligned to the reference genome (hg19_IOT.fasta) with bwa mem v0.7.17.
    As I told you before, I followed the preprocessing posted here (https://software.broadinstitute.org/gatk/documentation/article?id=8017). I also uploaded the af-only-gnomad.raw.sites.h19.vcf.gz in another .zip file (Joan_Gibert_af-only-gnomad.raw.sites.h19.vcf.zip).

    I got some errors during the upload. Please tell me whether you can find the uploaded files.

    I hope this is enough to reproduce the error.

  • shlee, Cambridge (Member, Broadie) ✭✭✭✭✭
    edited October 2018

    Hi @JoanGibert,

    I can see your uploads to our FTP server. However, the permissions on the files (-rwx------, view using ls -ltrh) do not enable me to access them. Can you run chmod on them to enable reading by group users, e.g. with chmod 644 file (gives -rw-r--r--), before uploading? Also, I just want to confirm that you are including the reference you are using, as it seems specialized. Thanks.
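
For readers unfamiliar with the permission strings quoted above, this small Python sketch shows how the octal modes map to what ls prints. It assumes a POSIX filesystem and works on a throwaway temporary file; the helper name mode_string is hypothetical.

```python
import os
import stat
import tempfile


def mode_string(path):
    """Return the ls-style permission string for a file."""
    return stat.filemode(os.stat(path).st_mode)


# Create a scratch file so nothing real is modified.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    path = handle.name

os.chmod(path, 0o700)        # owner-only access, as on the uploaded files
before = mode_string(path)

os.chmod(path, 0o644)        # owner read/write, group and others read
after = mode_string(path)

os.remove(path)
print(before, after)
```

Mode 700 gives the owner full access and everyone else none, which is why the uploaded files could not be read; 644 adds read permission for group and other users.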

  • Hi @shlee,

    I uploaded a zip file named Joan_GIbert_data with all the information you need to reproduce the error. I uploaded the fastq file, which I aligned to the reference genome (hg19_IOT.fasta) with bwa mem v0.7.17.
    As I told you before, I followed the preprocessing posted here (https://software.broadinstitute.org/gatk/documentation/article?id=8017). I also uploaded the af-only-gnomad.raw.sites.h19.vcf.gz in another .zip file (Joan_Gibert_af-only-gnomad.raw.sites.h19.vcf.zip).

    I got some errors during the upload. Please tell me whether you can find the uploaded files.

    I hope this is enough to reproduce the error.

  • shlee, Cambridge (Member, Broadie) ✭✭✭✭✭

    Sorry @JoanGibert, but it appears the permissions are identical to before, which disallow our access. We are having IT on our side change this for us so we can look into your data.

  • Ok thanks. Please, let me know if you need anything else.

  • Hi @shlee, any updates on this? :smile:


  • shlee, Cambridge (Member, Broadie) ✭✭✭✭✭

    Hi @JoanGibert, thanks for your patience. I will be getting to your data next week.

  • vivekruhela, Member

    I am trying to get somatic calls from tumor and normal samples using Mutect2, as described in the tutorial mentioned above, but I get the error: Reference coordinate corresponds to a non-existent base in the read. I generated tumor.bam and normal.bam with the same pipeline [same versions of BWA + Picard + samtools]. I do not understand what the error means. I have attached the error message here [check line 42 of the attached file]. Kindly advise. Thanks.

  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)

    Hi @vivekruhela

    Please verify the following:
    1) Does the reference build that you used to align the reads match the reference build used in Mutect2?
    2) Please validate your BAM file using this tool: https://software.broadinstitute.org/gatk/documentation/tooldocs/current/picard_sam_ValidateSamFile.php
    3) Please post the GATK version and the exact commands you used for preprocessing.