GATKException: Somehow the requested coordinate is not covered by the read.
Hi!
I am analysing Ion Torrent targeted panel data. I followed the Best Practices pipeline (https://software.broadinstitute.org/gatk/documentation/article?id=11136), but I get this error when I run Mutect2 on these files.
12:19:15.515 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/soft/EB_repo/bio/sequence/programs/foss/2016b/GATK/4.0.4.0/gatk-package-4.0.4.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
12:19:17.186 INFO Mutect2 - ------------------------------------------------------------
12:19:17.187 INFO Mutect2 - The Genome Analysis Toolkit (GATK) v4.0.4.0
12:19:17.188 INFO Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
12:19:17.189 INFO Mutect2 - Executing as [email protected] on Linux v3.10.0-693.21.1.el7.x86_64 amd64
12:19:17.192 INFO Mutect2 - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_92-b14
12:19:17.194 INFO Mutect2 - Start Date/Time: August 30, 2018 12:19:15 PM CEST
12:19:17.194 INFO Mutect2 - ------------------------------------------------------------
12:19:17.194 INFO Mutect2 - ------------------------------------------------------------
12:19:17.195 INFO Mutect2 - HTSJDK Version: 2.14.3
12:19:17.196 INFO Mutect2 - Picard Version: 2.18.2
12:19:17.196 INFO Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:19:17.197 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:19:17.197 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:19:17.198 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:19:17.198 INFO Mutect2 - Deflater: IntelDeflater
12:19:17.199 INFO Mutect2 - Inflater: IntelInflater
12:19:17.199 INFO Mutect2 - GCS max retries/reopens: 20
12:19:17.199 INFO Mutect2 - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
12:19:17.200 INFO Mutect2 - Initializing engine
12:19:19.295 INFO FeatureManager - Using codec VCFCodec to read file file:///users/genomics/jgibert/Varis/af-only-gnomad.raw.sites.h19.vcf.gz
12:19:20.028 INFO FeatureManager - Using codec BEDCodec to read file file:///users/genomics/jgibert/Varis/Oncomine_TML.20170222.designed.bed
12:19:21.127 INFO IntervalArgumentCollection - Processing 1656280 bp from intervals
12:19:21.281 INFO Mutect2 - Done initializing engine
12:19:22.708 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/soft/EB_repo/bio/sequence/programs/foss/2016b/GATK/4.0.4.0/gatk-package-4.0.4.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
12:19:22.928 INFO PairHMM - OpenMP multi-threaded AVX-accelerated native PairHMM implementation is not supported
12:19:22.929 WARN PairHMM - ***WARNING: Machine does not have the AVX instruction set support needed for the accelerated AVX PairHmm. Falling back to the MUCH slower LOGLESS_CACHING implementation!
12:19:23.836 INFO ProgressMeter - Starting traversal
12:19:23.843 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
12:19:47.129 INFO ProgressMeter - chr1:6529491 0.4 20 51.5
12:19:56.279 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 17.687161069000002
12:19:56.279 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 4.22 sec
12:19:56.864 INFO Mutect2 - Shutting down engine
[August 30, 2018 12:19:56 PM CEST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.69 minutes.
Runtime.totalMemory()=1211629568
org.broadinstitute.hellbender.exceptions.GATKException: Somehow the requested coordinate is not covered by the read. Alignment 6530911 | 55M1I36M
at org.broadinstitute.hellbender.utils.read.ReadUtils.getReadCoordinateForReferenceCoordinate(ReadUtils.java:815)
at org.broadinstitute.hellbender.utils.read.ReadUtils.getReadCoordinateForReferenceCoordinate(ReadUtils.java:691)
at org.broadinstitute.hellbender.utils.read.ReadUtils.getReadCoordinateForReferenceCoordinateUpToEndOfRead(ReadUtils.java:649)
at org.broadinstitute.hellbender.tools.walkers.annotator.BaseQualityRankSumTest.getReadBaseQuality(BaseQualityRankSumTest.java:43)
at org.broadinstitute.hellbender.tools.walkers.annotator.BaseQualityRankSumTest.getElementForRead(BaseQualityRankSumTest.java:38)
at org.broadinstitute.hellbender.tools.walkers.annotator.RankSumTest.getElementForRead(RankSumTest.java:108)
at org.broadinstitute.hellbender.tools.walkers.annotator.RankSumTest.fillQualsFromLikelihood(RankSumTest.java:86)
at org.broadinstitute.hellbender.tools.walkers.annotator.RankSumTest.annotate(RankSumTest.java:44)
at org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:377)
at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticGenotypingEngine.callMutations(SomaticGenotypingEngine.java:167)
at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.callRegion(Mutect2Engine.java:182)
at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2.apply(Mutect2.java:183)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:295)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:271)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:892)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:134)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:179)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:198)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Using GATK jar /soft/EB_repo/bio/sequence/programs/foss/2016b/GATK/4.0.4.0/gatk-package-4.0.4.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /soft/EB_repo/bio/sequence/programs/foss/2016b/GATK/4.0.4.0/gatk-package-4.0.4.0-local.jar Mutect2 --af-of-alleles-not-in-resource 0.0000025 --germline-resource /users/genomics/jgibert/Varis/af-only-gnomad.raw.sites.h19.vcf.gz -A BaseQualityRankSumTest -A ClippingRankSumTest -A MappingQualityRankSumTest -A QualByDepth -A RMSMappingQuality -A ReadPosRankSumTest -A FisherStrand -A StrandOddsRatio -A InbreedingCoeff -I IonXpress_001.bam -R /users/genomics/jgibert/hg19_IOT/hg19_IOT.fasta -tumor IonXpress_001 -L /users/genomics/jgibert/Varis/Oncomine_TML.20170222.designed.bed -bamout /users/genomics/jgibert/data/TML_Pedro_samples/GATK/IonXpress_001_Mutect.bam -O /users/genomics/jgibert/data/TML_Pedro_samples/GATK/IonXpress_001_Mutect.vcf.gz
In other files I also get a different error: GATKException: Reference coordinate corresponds to a non-existent base in the read.
I tried the latest version of GATK, but the error persists. Any idea what's happening?
Thanks
Answers
Hi @JoanGibert,
For starters, you should know our workflows do not support Ion Torrent data. Is there something different or special about your data compared with traditional Ion Torrent data? Or are you performing QC diagnostics on your data?
You can run ValidateSamFile on your BAMs to diagnose the issue, which I suspect relates to mismatches in expectations for CIGAR, read coordinates and/or reference coordinates. These mismatches in expectations arise from using tool-chains other than those recommended by our Best Practices. For example, we recommend alignment with BWA MEM.
Hi @shlee,
Thanks for the quick answer. I am aware that GATK does not support IOT data; however, this is just an exploratory analysis. In fact, I have performed other analyses with IOT targeted panel data that gave me good results.
That's why I'm rather surprised by this error. This is the first time that IOT data has thrown it. I followed the Best Practices and used BWA MEM for mapping. Moreover, ValidateSamFile does not show any errors, just a warning about the .bai due to cluster timings:
java -jar $EBROOTPICARD/picard.jar ValidateSamFile I=IonXpress_001_snaut.bam
10:28:08.631 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/soft/EB_repo/bio/sequence/programs/noarch/picard/2.18.12-Java-1.8.0_92/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Thu Sep 27 10:28:08 CEST 2018] ValidateSamFile INPUT=50166_IonXpress_001_snaut.bam MODE=VERBOSE MAX_OUTPUT=100 IGNORE_WARNINGS=false VALIDATE_INDEX=true INDEX_VALIDATION_STRINGENCY=EXHAUSTIVE IS_BISULFITE_SEQUENCED=false MAX_OPEN_TEMP_FILES=8000 SKIP_MATE_VALIDATION=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Thu Sep 27 10:28:08 CEST 2018] Executing as [email protected] on Linux 3.10.0-693.5.2.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.18.12-SNAPSHOT
WARNING: BAM index file /users/genomics/jgibert/data/TML_Pedro_samples/TML_fastq/IonXpress_001_snaut.bai is older than BAM /users/genomics/jgibert/data/TML_Pedro_samples/TML_fastq/IonXpress_001_snaut.bam
No errors found
[Thu Sep 27 10:30:03 CEST 2018] picard.sam.ValidateSamFile done. Elapsed time: 1.92 minutes.
Runtime.totalMemory()=393740288
Best
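As an aside on the index warning in the output above: as far as I understand it, that warning reflects the .bai having an older modification time than the .bam. The following is only a minimal sketch for checking this by hand; the function name and the example paths are hypothetical and are not part of Picard or GATK.

```python
import os

def index_is_older(bam_path, index_path):
    """Return True if the index file was last modified before the BAM file."""
    return os.path.getmtime(index_path) < os.path.getmtime(bam_path)

# Hypothetical paths, for illustration only; skipped if the files are absent.
bam, bai = "IonXpress_001_snaut.bam", "IonXpress_001_snaut.bai"
if os.path.exists(bam) and os.path.exists(bai):
    print("index older than BAM:", index_is_older(bam, bai))
```

If the check returns True, regenerating the index (for example with samtools index) should make the timestamps consistent again.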
@JoanGibert,
Given:
Would you mind posting the record with alignment start 6530911 and CIGAR string 55M1I36M? What we want to confirm is that the length outlined by the CIGAR string matches the read alignment. The error message implies that these do not match.
Also, can you tell us something about the pre-processing of your data? For example, I see the label snaut on your BAM, which is a label I've used in the past to indicate processing through SortSam and SetNmAndUqTags/SetNmMdAndUqTags. Did you by chance clip (soft or hard) your reads at any point during your pre-processing?
Hi @shlee,
Here is the alignment record that you requested:
Regarding the preprocessing, I followed the pipeline posted here: https://software.broadinstitute.org/gatk/documentation/article?id=8017
In brief: BWA mapping, RevertSam, AddOrReplaceReadGroups, MergeBamAlignment, MarkDuplicates, SortSam, SetNmAndUqTags, BaseRecalibrator, ApplyBQSR and Mutect2
In the MergeBamAlignment step I specified CLIP_ADAPTERS=false
Sorry in advance if I misunderstood something in the pipeline and thanks for your help
Hi @JoanGibert,
Nothing is amiss with the read metadata. Looks like you are working with single-ended reads. Bases and base qualities each add up to 92, which your cigar string elements also add up to. The read aligns to the middle of human chromosome 1, away from edges. Finally, this read is marked as a duplicate.
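The length check described above can be sketched in a few lines of Python. This is only an illustration, not GATK code: the function name is hypothetical, and the sets of operators that consume read bases and reference bases follow the SAM specification.

```python
import re

# CIGAR operators that consume read (query) bases and reference bases,
# as defined in the SAM specification.
QUERY_OPS = set("MIS=X")
REF_OPS = set("MDN=X")

def cigar_lengths(cigar):
    """Return (query_length, reference_length) implied by a CIGAR string."""
    query_len = ref_len = 0
    for count, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        n = int(count)
        if op in QUERY_OPS:
            query_len += n
        if op in REF_OPS:
            ref_len += n
    return query_len, ref_len

# Values taken from the error message: Alignment 6530911 | 55M1I36M
query_len, ref_len = cigar_lengths("55M1I36M")
start = 6530911
end = start + ref_len - 1  # last reference position covered (1-based, inclusive)
print(query_len, ref_len, end)  # 92 91 6531001
```

The read length of 92 matches the number of bases and base qualities in the record, while the read spans 91 reference positions because the 1-base insertion does not consume the reference.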
Given some of your other files give the error:
And you are using a custom reference in your Mutect2 command:
Is there any possibility you are using a reference that is different from the reference you used in BWA mapping? You can check your @SQ header lines in the BAM against the reference dictionary to confirm.
Another option that comes to mind is to see if Mutect2 still gives the error for a pre-processed BAM that hasn't undergone MergeBamAlignment. Are all of your reads single ended or do you also have paired end reads?
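One way to do the @SQ comparison is sketched below. It assumes you have the BAM header as text (for example from samtools view -H) and the contents of the reference .dict file, which uses the same @SQ line format. The function names are hypothetical, and the mismatching chrM length in the example dictionary is made up purely to show what a difference looks like.

```python
def parse_sq_lines(header_text):
    """Map sequence name -> (length, MD5) from the @SQ lines of SAM header text."""
    contigs = {}
    for line in header_text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "@SQ":
            continue
        # Each field is TAG:VALUE; split once so values such as UR:file:/... survive.
        tags = dict(f.split(":", 1) for f in fields[1:])
        contigs[tags["SN"]] = (int(tags["LN"]), tags.get("M5"))
    return contigs

def compare_dictionaries(bam_header, ref_dict):
    """Return contig names that differ in length or MD5, or exist in only one input."""
    bam, ref = parse_sq_lines(bam_header), parse_sq_lines(ref_dict)
    return sorted(name for name in bam.keys() | ref.keys()
                  if bam.get(name) != ref.get(name))

bam_header = """@HD VN:1.5 SO:coordinate
@SQ SN:chr1 LN:249250621 M5:1b22b98cdeb4a9304cb5d48026a85128
@SQ SN:chrM LN:16569 M5:c68f52674c9fb33aef52dcf399755519"""

# Hypothetical dictionary with a different chrM, for illustration only.
ref_dict = """@HD VN:1.5
@SQ SN:chr1 LN:249250621 M5:1b22b98cdeb4a9304cb5d48026a85128
@SQ SN:chrM LN:16571 M5:00000000000000000000000000000000"""

print(compare_dictionaries(bam_header, ref_dict))  # ['chrM']
```

An empty list means the contig names, lengths and MD5 values agree between the BAM and the reference dictionary.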
Hi @shlee,
Here is the header of the bam file:
@HD VN:1.5 SO:coordinate
@SQ SN:chr1 LN:249250621 M5:1b22b98cdeb4a9304cb5d48026a85128 UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr2 LN:243199373 M5:a0d9851da00400dec1098a9255ac712e UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr3 LN:198022430 M5:fdfd811849cc2fadebc929bb925902e5 UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr4 LN:191154276 M5:23dccd106897542ad87d2765d28a19a1 UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr5 LN:180915260 M5:0740173db9ffd264d728f32784845cd7 UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr6 LN:171115067 M5:1d3a93a248d92a729ee764823acbbc6b UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr7 LN:159138663 M5:618366e953d6aaad97dbe4777c29375e UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr8 LN:146364022 M5:96f514a9929e410c6651697bded59aec UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr9 LN:141213431 M5:3e273117f15e0a400f01055d9f393768 UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr10 LN:135534747 M5:988c28e000e84c26d552359af1ea2e1d UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr11 LN:135006516 M5:98c59049a2df285c76ffb1c6db8f8b96 UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr12 LN:133851895 M5:51851ac0e1a115847ad36449b0015864 UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr13 LN:115169878 M5:283f8d7892baa81b510a015719ca7b0b UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr14 LN:107349540 M5:98f3cae32b2a2e9524bc19813927542e UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr15 LN:102531392 M5:e5645a794a8238215b2cd77acb95a078 UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr16 LN:90354753 M5:fc9b1a7b42b97a864f56b348b06095e6 UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr17 LN:81195210 M5:351f64d4f4f9ddd45b35336ad97aa6de UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr18 LN:78077248 M5:b15d4b2d29dde9d3e4f93d1d0f2cbc9c UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr19 LN:59128983 M5:1aacd71f30db8e561810913e0b72636d UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr20 LN:63025520 M5:0dec9660ec1efaaf33281c0d5ea2560f UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr21 LN:48129895 M5:2979a6085bfe28e3ad6f552f361ed74d UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chr22 LN:51304566 M5:a718acaa6135fdca8357d5bfe94211dd UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chrX LN:155270560 M5:7e0e2e580297b7764e31dbc80c2540dd UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chrY LN:59373566 M5:1fa3474750af0948bdf97d5a0ee52e51 UR:file:/home/jgibert/hg19_IOT.fasta
@SQ SN:chrM LN:16569 M5:c68f52674c9fb33aef52dcf399755519 UR:file:/home/jgibert/hg19_IOT.fasta
@RG ID:IonXpress_001 LB:LIB PL:ILLUMINA SM:IonXpress_001 PU:JG
@PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:bwa mem /users/genomics/jgibert/hg19_IOT/hg19_IOT.fasta IonXpress_001.fastq
@PG ID:MarkDuplicates VN:2.2.4(920e3247c340720b009f2398c1b93cce132c9bed_1461793281) CL:picard.sam.markduplicates.MarkDuplicates INPUT=[/users/genomics/jgibert/data/TML_Pedro_samples/TML_fastq/IonXpress_001_m.bam] OUTPUT=/users/genomics/jgibert/data/TML_Pedro_samples/TML_fastq/IonXpress_001_MD.bam METRICS_FILE=/users/genomics/jgibert/data/TML_Pedro_samples/TML_fastq/IonXpress_001_MD.bam.txt ASSUME_SORT_ORDER=queryname OPTICAL_DUPLICATE_PIXEL_DISTANCE=2500 MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag REMOVE_DUPLICATES=false ASSUME_SORTED=false DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX= VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json PN:MarkDuplicates PP:bwa
As you can see in the @SQ lines, the reference genome is the same for both.
All my files are single-end data. Should I try to run Mutect2 without doing the MergeBamAlignment?
@JoanGibert,
Hmm, it's not clear what is going on. There were some regressions in the Mutect2 code a month ago, so either try to use a different version of GATK or try a sans-MergeBamAlignment preprocessed BAM. Sorry I cannot be more helpful.
What I tried so far:
1) GATK v4.0.4.0 and GATK v4.0.8.1
2) Mapping with bwa mem and tmap (IOT aligner)
I just performed the alignment and produced the .bam file with samtools view. Then I ran AddOrReplaceReadGroups, samtools sort and samtools index.
Running Mutect2 throws exactly the same error; only the reported position differs between bwa mem and tmap.
Any other idea?
Hi @JoanGibert,
Ok, in this case, it will be best for us to take a look at your data. Do you mind sending us a snippet of your data that recapitulates the error? You can upload to our bug report FTP site using directions in https://software.broadinstitute.org/gatk/guide/article?id=1894.
Hi @shlee,
I uploaded a zip file named Joan_GIbert_data with all the information that you need to reproduce the error. I uploaded the fastq file, which I aligned to the reference genome (hg19_IOT.fasta) with bwa mem v0.7.17.
As I told you before, I followed the preprocessing posted here (https://software.broadinstitute.org/gatk/documentation/article?id=8017). I also uploaded the af-only-gnomad.raw.sites.h19.vcf.gz in another .zip file (Joan_Gibert_af-only-gnomad.raw.sites.h19.vcf.zip).
I got some errors during the upload. Please tell me whether you can find the uploaded files.
I hope this is enough to reproduce the error.
Hi @JoanGibert,
I can see your uploads to our FTP server. However, the permissions on the files (-rwx------, view using ls -ltrh) do not enable me to access them. Can you run chmod on them to enable reading by group users, e.g. with chmod 644 file (gives -rw-r--r--), before uploading? Also, I just want to confirm that you are including the reference you are using, as it seems specialized. Thanks.
Hi @shlee,
I uploaded a zip file named Joan_GIbert_data with all the information that you need to reproduce the error. I uploaded the fastq file, which I aligned to the reference genome (hg19_IOT.fasta) with bwa mem v0.7.17.
As I told you before, I followed the preprocessing posted here (https://software.broadinstitute.org/gatk/documentation/article?id=8017). I also uploaded the af-only-gnomad.raw.sites.h19.vcf.gz in another .zip file (Joan_Gibert_af-only-gnomad.raw.sites.h19.vcf.zip).
I got some errors during the upload. Please tell me whether you can find the uploaded files.
I hope this is enough to reproduce the error.
Sorry @JoanGibert, but it appears the permissions are identical to before, which disallow our access. We are having IT on our side change this for us so we can look into your data.
Ok thanks. Please, let me know if you need anything else.
Hi @shlee, any updates on this?
Thanks!
Hi @JoanGibert, thanks for your patience. I will be getting to your data next week.
Hi @JoanGibert,
I have looked into your data and Mutect2 runs successfully without error, so it's unclear what is going on on your side. I used bwa mem v0.7.16a-r1181 and GATK v4.0.10.0. Here are the steps I took to test and perhaps it can help you debug.
1. Align
You are using v0.7.17. I used bwa mem v0.7.16a-r1181 because this is what I have on my system.

2. Add read group information
3. Coordinate sort
4. Short variant detection with M2
...
I can confirm there are Mutect2 variant calls in the VCF.
I am trying to get somatic calls from tumor and normal samples using Mutect2, as described in the tutorial mentioned above. However, I get the error: Reference coordinate corresponds to a non-existent base in the read. I generated tumor.bam and normal.bam using the same pipeline [same versions of BWA + Picard + samtools]. I do not understand what the error means. I have attached the error message here [check line No. 42 in the attached file]. Kindly suggest. Thanks.
Hi @vivekruhela
Please verify the following:
1) Does the reference build that you used to align the reads match the reference build used in Mutect2?
2) Please validate your BAM file using this tool: https://software.broadinstitute.org/gatk/documentation/tooldocs/current/picard_sam_ValidateSamFile.php
3) Please post the GATK versions and the exact commands you used for preprocessing.