ApplyRecalibration error: Unable to create iterator for rod named input

Dear GATK Team,

I'm trying to perform variant calling on targeted sequencing data for a cohort of 24 patients, following the pipeline described in the Best Practices. I generated an individual g.vcf file for each patient and used GenotypeGVCFs to obtain a multisample VCF for the complete cohort, but I got stuck at the VQSR step. ApplyRecalibration throws the following error when I run it on the complete VCF file:

command:

java -Xmx10g -jar GenomeAnalysisTK-3.6/GenomeAnalysisTK.jar \
-T ApplyRecalibration  \
-R ref/Homo_sapiens_assembly38.fasta \
-input joint_genotyping_20171029__raw_variants.vcf  \
-mode SNP --ts_filter_level 99.5 \
-recalFile joint_genotyping_20171029__recalibrate_SNP.recal \
-tranchesFile joint_genotyping_20171029__recalibrate_SNP.tranches \
-o joint_genotyping_20171029__recalibrated_snps_raw_indels.vcf \
-nt 24 \
--log_to_file joint_genotyping_20171029__ApplyRecalibration_SNPs.log

error:

INFO  11:30:18,346 GenomeAnalysisEngine - Strictness is SILENT
INFO  11:30:20,077 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO  11:30:21,180 MicroScheduler - Running the GATK in parallel mode with 24 total threads, 1 CPU thread(s) for each of 24 data thread(s), of 80 processors available on this machine
INFO  11:30:22,832 GenomeAnalysisEngine - Preparing for traversal
INFO  11:30:22,841 GenomeAnalysisEngine - Done preparing for traversal
INFO  11:30:22,842 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO  11:30:22,842 ProgressMeter -                 | processed |    time |    per 1M |           |   total | remaining
INFO  11:30:22,843 ProgressMeter -        Location |     sites | elapsed |     sites | completed | runtime |   runtime
INFO  11:30:22,878 ApplyRecalibration - Read tranche Tranche ts=90.00 minVQSLod=4.3250 known=(2422193 @ 2.0861) novel=(26191 @ 1.5524) truthSites(1710335 accessible, 1539301 called), name=VQSRTrancheSNP0.00to90.00]
INFO  11:30:22,879 ApplyRecalibration - Read tranche Tranche ts=99.00 minVQSLod=0.1196 known=(2869865 @ 2.0658) novel=(58005 @ 1.4888) truthSites(1710335 accessible, 1693231 called), name=VQSRTrancheSNP90.00to99.00]
INFO  11:30:22,879 ApplyRecalibration - Read tranche Tranche ts=99.90 minVQSLod=-1.0002 known=(2946811 @ 2.0655) novel=(58752 @ 1.4813) truthSites(1710335 accessible, 1708624 called), name=VQSRTrancheSNP99.00to99.90]
INFO  11:30:22,880 ApplyRecalibration - Read tranche Tranche ts=100.00 minVQSLod=-10390.3514 known=(2976941 @ 2.0588) novel=(72865 @ 1.3876) truthSites(1710335 accessible, 1710335 called), name=VQSRTrancheSNP99.90to100.00]
INFO  11:30:22,937 ApplyRecalibration - Keeping all variants in tranche Tranche ts=99.90 minVQSLod=-1.0002 known=(2946811 @ 2.0655) novel=(58752 @ 1.4813) truthSites(1710335 accessible, 1708624 called), name=VQSRTrancheSNP99.00to99.90]
INFO  11:30:52,855 ProgressMeter -  chr1:104968486    134509.0    30.0 s       3.7 m        3.3%    15.3 m      14.8 m
INFO  11:31:22,860 ProgressMeter -   chr2:61972603    378735.0    60.0 s       2.6 m        9.7%    10.3 m       9.3 m
INFO  11:31:52,863 ProgressMeter -  chr2:217912454    560624.0    90.0 s       2.7 m       14.5%    10.3 m       8.8 m
##### ERROR --
##### ERROR stack trace
org.broadinstitute.gatk.utils.exceptions.ReviewedGATKException: Unable to create iterator for rod named input
        at org.broadinstitute.gatk.engine.datasources.rmd.ReferenceOrderedQueryDataPool.createIteratorFromResource(ReferenceOrderedDataSource.java:248)
        at org.broadinstitute.gatk.engine.datasources.rmd.ReferenceOrderedQueryDataPool.createIteratorFromResource(ReferenceOrderedDataSource.java:185)
        at org.broadinstitute.gatk.engine.datasources.rmd.ResourcePool.iterator(ResourcePool.java:93)
        at org.broadinstitute.gatk.engine.datasources.rmd.ReferenceOrderedDataSource.seek(ReferenceOrderedDataSource.java:168)
        at org.broadinstitute.gatk.engine.datasources.providers.RodLocusView.<init>(RodLocusView.java:82)
        at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.getLocusView(TraverseLociNano.java:129)
        at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:80)
        at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
        at org.broadinstitute.gatk.engine.executive.ShardTraverser.call(ShardTraverser.java:98)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: htsjdk.samtools.util.RuntimeIOException: java.io.IOException: Bad file descriptor
        at htsjdk.tribble.readers.SynchronousLineReader.readLine(SynchronousLineReader.java:53)
        at htsjdk.tribble.readers.LineIteratorImpl.advance(LineIteratorImpl.java:24)
        at htsjdk.tribble.readers.LineIteratorImpl.advance(LineIteratorImpl.java:11)
        at htsjdk.samtools.util.AbstractIterator.hasNext(AbstractIterator.java:44)
        at htsjdk.tribble.AsciiFeatureCodec.isDone(AsciiFeatureCodec.java:48)
        at htsjdk.tribble.AsciiFeatureCodec.isDone(AsciiFeatureCodec.java:36)
        at htsjdk.tribble.TribbleIndexedFeatureReader$QueryIterator.readNextRecord(TribbleIndexedFeatureReader.java:469)
        at htsjdk.tribble.TribbleIndexedFeatureReader$QueryIterator.<init>(TribbleIndexedFeatureReader.java:412)
        at htsjdk.tribble.TribbleIndexedFeatureReader.query(TribbleIndexedFeatureReader.java:261)
        at org.broadinstitute.gatk.utils.refdata.tracks.RMDTrack.query(RMDTrack.java:119)
        at org.broadinstitute.gatk.engine.datasources.rmd.ReferenceOrderedQueryDataPool.createIteratorFromResource(ReferenceOrderedDataSource.java:241)
        ... 12 more
Caused by: java.io.IOException: Bad file descriptor
        at java.io.RandomAccessFile.readBytes(Native Method)
        at java.io.RandomAccessFile.read(RandomAccessFile.java:377)
        at htsjdk.samtools.seekablestream.SeekableFileStream.read(SeekableFileStream.java:80)
        at htsjdk.tribble.TribbleIndexedFeatureReader$BlockStreamWrapper.read(TribbleIndexedFeatureReader.java:562)
        at java.io.InputStream.read(InputStream.java:101)
        at htsjdk.tribble.readers.PositionalBufferedStream.fill(PositionalBufferedStream.java:127)
        at htsjdk.tribble.readers.PositionalBufferedStream.read(PositionalBufferedStream.java:79)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
        at java.io.InputStreamReader.read(InputStreamReader.java:184)
        at htsjdk.tribble.readers.LongLineBufferedReader.fill(LongLineBufferedReader.java:140)
        at htsjdk.tribble.readers.LongLineBufferedReader.readLine(LongLineBufferedReader.java:298)
        at htsjdk.tribble.readers.LongLineBufferedReader.readLine(LongLineBufferedReader.java:354)
        at htsjdk.tribble.readers.SynchronousLineReader.readLine(SynchronousLineReader.java:51)
        ... 22 more
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 3.6-0-g89b7209):
##### ERROR
##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions https://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Unable to create iterator for rod named input
##### ERROR ------------------------------------------------------------------------------------------

I have checked my input VCF with the ValidateVariants tool and it passes. The vcftools vcf-validator complains about "*" alleles (e.g. "chr1:54815459 .. Could not parse the allele(s) [*]"), but I don't think that is the problem.
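To put a number on the "*" allele warnings, something like the following could be used. This is only a minimal sketch: the function name `count_star_alleles` and the demo records are made up for illustration, and it assumes an uncompressed, tab-delimited VCF with ALT in the fifth column. In VCF 4.2, "*" in ALT marks an allele that is missing because of an upstream deletion, so a validator that predates that convention may reject it even though the file is valid.

```python
def count_star_alleles(vcf_lines):
    """Return (total_records, records_with_star_alt) for an iterable of VCF lines."""
    total = 0
    with_star = 0
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        alt_alleles = fields[4].split(",")  # ALT is the fifth VCF column
        total += 1
        if "*" in alt_alleles:
            with_star += 1
    return total, with_star


# Hypothetical demo records, not taken from the real cohort file.
demo = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\t.\t.\n",
    "chr1\t200\t.\tC\tT,*\t50\t.\t.\n",
    "chr1\t300\t.\tG\t*\t50\t.\t.\n",
]
print(count_star_alleles(demo))
```

On a real file the same function can be fed an open file handle, e.g. `with open(path) as fh: count_star_alleles(fh)`.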

I tried running ApplyRecalibration on subsets of my input VCF and found that it works on the first 25% of records and on the second 25% of records, but fails when I submit 50% of records.
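For reference, the subsetting experiment can be sketched roughly as below. This is an illustration under assumptions, not the exact procedure I used: `slice_vcf_records` is a hypothetical helper, fractions are taken over record counts (not genomic positions), and the input is assumed to be an uncompressed VCF small enough to hold in memory.

```python
def slice_vcf_records(vcf_lines, start_frac, end_frac):
    """Keep all header lines plus the records in [start_frac, end_frac) of the body."""
    lines = list(vcf_lines)
    header = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]
    lo = int(len(records) * start_frac)
    hi = int(len(records) * end_frac)
    return header + records[lo:hi]


# Hypothetical demo input: 2 header lines and 8 records.
demo_header = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
]
demo_records = [f"chr1\t{pos}\t.\tA\tG\t50\t.\t.\n" for pos in range(100, 900, 100)]
demo_vcf = demo_header + demo_records

first_quarter = slice_vcf_records(demo_vcf, 0.0, 0.25)
second_quarter = slice_vcf_records(demo_vcf, 0.25, 0.5)
first_half = slice_vcf_records(demo_vcf, 0.0, 0.5)
print(len(first_quarter), len(second_quarter), len(first_half))
```

Each slice keeps the full header so it remains a readable VCF on its own; writing a slice out is just `open(out_path, "w").writelines(slice)`.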

I would be very grateful for your help.

PS. I know that 24 samples is fewer than the recommended number for VQSR on non-whole-genome sequencing data, but I wanted to check how it works, and I don't think it should cause this error.
