I/O error loading or writing tribble index file for VCF file

manomathew, Marseille, Member
edited June 2014 in Ask the GATK team

I am performing an analysis on mm9 (mouse).
I downloaded a VCF file from UCSC.
I removed the chr*_random and chrM entries.
I used vcftools v4.0.
I prepared two VCF files, one with indels only and one with SNPs only (DBINDEL, DBSNP).
Could you please suggest what needs to be done in this case?
Command used:
java -Xmx30g -XX:+UseGCOverheadLimit -jar GenomeAnalysisTK.3.1-1-g07a4bf8.jar --log_to_file gatk.realigned.fixed.log --performanceLog gatk.realigned.fixed.perflog --keep_program_records -T BaseRecalibrator -R REFERENCE -I INPUTBAMNAME.realigned.fixed.bam -o OUTPUT_table --knownSites DBINDEL --knownSites DBSNP --deletions_default_quality 45 --insertions_default_quality 45 --low_quality_tail 2 --solid_nocall_strategy LEAVE_READ_UNRECALIBRATED --solid_recal_mode SET_Q_ZERO --lowMemoryMode --no_standard_covs --bqsrBAQGapOpenPenalty 30.0

Sometimes I get this error:
INFO 12:42:13,747 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.1-1-g07a4bf8, Compiled 2014/03/18 06:09:21
INFO 12:42:13,748 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 12:42:13,748 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 12:42:13,752 HelpFormatter - Program Args: --log_to_file /media/NAS1/PFlab/Mano/AllBAM/GATK/QualityScore/H3K4me1_run22_F3_5.sorted.marked.gatk.realigned.fixed.log --performanceLog /media/NAS1/PFlab/Mano/AllBAM/GATK/QualityScore/H3K4me1_run22_F3_5.sorted.marked.gatk.realigned.fixed.perflog --keep_program_records -T BaseRecalibrator -R /media/NAS1/PFlab/Mano/Genome/reference.fa -I /media/NAS1/PFlab/Mano/AllBAM/GATK/H3K4me1_run22_F3_5.sorted.marked.intervals.realigned.fixed.bam -o /media/NAS1/PFlab/Mano/AllBAM/GATK/QualityScore/H3K4me1_run22_F3_5.sorted.marked.intervals.realigned.after_recal_data.table --knownSites /media/NAS1/PFlab/Mano/Fwithoutchrsorted.vcf --knownSites /media/NAS1/PFlab/Mano/New.SNPs_only_withoutCHR_sort.vcf --deletions_default_quality 45 --insertions_default_quality 45 --low_quality_tail 2 --solid_nocall_strategy LEAVE_READ_UNRECALIBRATED --solid_recal_mode SET_Q_ZERO --lowMemoryMode --no_standard_covs --bqsrBAQGapOpenPenalty 30.0
INFO 12:42:13,756 HelpFormatter - Executing as mano@balrog01 on Linux 3.2.0-27-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_51-b13.
INFO 12:42:13,756 HelpFormatter - Date/Time: 2014/06/03 12:42:13
INFO 12:42:13,756 HelpFormatter - --------------------------------------------------------------------------------
INFO 12:42:13,757 HelpFormatter - --------------------------------------------------------------------------------
INFO 12:42:14,556 GenomeAnalysisEngine - Strictness is SILENT
INFO 12:42:14,634 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 12:42:14,643 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 12:42:14,687 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.04
INFO 12:42:15,979 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at org.broad.tribble.index.IndexFactory.loadIndex(IndexFactory.java:171)
    at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.loadFromDisk(RMDTrackBuilder.java:324)
    at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.attemptToLockAndLoadIndexFromDisk(RMDTrackBuilder.java:308)
    at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.loadIndex(RMDTrackBuilder.java:267)
    at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.getFeatureSource(RMDTrackBuilder.java:213)
    at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.createInstanceOfTrack(RMDTrackBuilder.java:140)
    at org.broadinstitute.sting.gatk.datasources.rmd.ReferenceOrderedQueryDataPool.<init>(ReferenceOrderedDataSource.java:208)
    at org.broadinstitute.sting.gatk.datasources.rmd.ReferenceOrderedDataSource.<init>(ReferenceOrderedDataSource.java:88)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.getReferenceOrderedDataSources(GenomeAnalysisEngine.java:973)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.initializeDataSources(GenomeAnalysisEngine.java:767)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:284)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:107)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.broad.tribble.index.IndexFactory.loadIndex(IndexFactory.java:167)
    ... 14 more
Caused by: java.io.EOFException
    at org.broad.tribble.util.LittleEndianInputStream.readFully(LittleEndianInputStream.java:134)
    at org.broad.tribble.util.LittleEndianInputStream.readLong(LittleEndianInputStream.java:76)
    at org.broad.tribble.index.linear.LinearIndex$ChrIndex.read(LinearIndex.java:268)
    at org.broad.tribble.index.AbstractIndex.read(AbstractIndex.java:359)
    at org.broad.tribble.index.linear.LinearIndex.<init>(LinearIndex.java:98)
    ... 19 more
ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.1-1-g07a4bf8):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: java.lang.reflect.InvocationTargetException
ERROR ------------------------------------------------------------------------------------------

Current error:
INFO 13:49:09,583 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.1-1-g07a4bf8, Compiled 2014/03/18 06:09:21
INFO 13:49:09,583 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 13:49:09,583 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 13:49:09,588 HelpFormatter - Program Args: --log_to_file /home/mano/media/NAS1/PFlab/Mano/AllBAM/GATK/QualityScore/H3K4me1_run22_F3_5.sorted.marked.gatk.realigned.fixed.log --performanceLog /home/mano/media/NAS1/PFlab/Mano/AllBAM/GATK/QualityScore/H3K4me1_run22_F3_5.sorted.marked.gatk.realigned.fixed.perflog --keep_program_records -T BaseRecalibrator -R /home/mano/media/NAS1/PFlab/Mano/Genome/reference.fa -I /home/mano/media/NAS1/PFlab/Mano/AllBAM/GATK/H3K4me1_run22_F3_5.sorted.marked.intervals.realigned.fixed.bam -o /home/mano/media/NAS1/PFlab/Mano/AllBAM/GATK/QualityScore/H3K4me1_run22_F3_5.sorted.marked.intervals.realigned.after_recal_data.table --knownSites /home/mano/media/NAS1/PFlab/Mano/Fwithoutchrsorted.vcf --knownSites /home/mano/media/NAS1/PFlab/Mano/SNPs_only_withoutCHR_sort.vcf --deletions_default_quality 45 --insertions_default_quality 45 --low_quality_tail 2 --solid_nocall_strategy LEAVE_READ_UNRECALIBRATED --solid_recal_mode SET_Q_ZERO --lowMemoryMode --no_standard_covs --bqsrBAQGapOpenPenalty 30.0
INFO 13:49:09,591 HelpFormatter - Executing as mano@balrog04 on Linux 3.2.0-26-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_51-b13.
INFO 13:49:09,592 HelpFormatter - Date/Time: 2014/06/02 13:49:09
INFO 13:49:09,592 HelpFormatter - --------------------------------------------------------------------------------
INFO 13:49:09,592 HelpFormatter - --------------------------------------------------------------------------------
INFO 13:49:10,461 GenomeAnalysisEngine - Strictness is SILENT
INFO 13:49:10,538 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 13:49:10,547 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 13:49:10,574 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.03
INFO 13:50:17,644 RMDTrackBuilder - Writing Tribble index to disk for file /home/mano/media/NAS1/PFlab/Mano/SNPs_only_withoutCHR_sort.vcf.idx
INFO 13:50:29,885 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 3.1-1-g07a4bf8):
ERROR
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR
ERROR MESSAGE: I/O error loading or writing tribble index file for /home/mano/media/NAS1/PFlab/Mano/SNPs_only_withoutCHR_sort.vcf
ERROR ------------------------------------------------------------------------------------------

Please note that the .idx file is created, but it is only kilobytes in size.
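
The second log shows GATK writing `SNPs_only_withoutCHR_sort.vcf.idx` just before the I/O error, and the first stack trace ends in a `java.io.EOFException` while an index is being read, so a leftover partial `.idx` file is one plausible suspect. Below is a minimal Python sketch for inspecting and clearing that index before a rerun. The helper names are hypothetical, and the only assumption taken from the log is that the index sits next to the VCF as `<vcf>.idx`. It does not diagnose NAS file locking or permission problems beyond a basic writability check.

```python
import os
from pathlib import Path


def index_path(vcf_path):
    """Return the index location seen in the log: the VCF path plus '.idx'."""
    return Path(str(vcf_path) + ".idx")


def inspect_index(vcf_path):
    """Collect basic facts about a VCF and its neighbouring .idx file."""
    vcf = Path(vcf_path)
    idx = index_path(vcf)
    idx_exists = idx.is_file()
    return {
        "vcf_exists": vcf.is_file(),
        # The index is written into the VCF's directory, so it must be writable.
        "dir_writable": os.access(vcf.parent, os.W_OK),
        "idx_exists": idx_exists,
        "idx_bytes": idx.stat().st_size if idx_exists else None,
    }


def remove_index(vcf_path):
    """Delete the .idx file if present; return True if something was removed."""
    idx = index_path(vcf_path)
    if idx.is_file():
        idx.unlink()
        return True
    return False
```

Calling `inspect_index` on each `--knownSites` file and then `remove_index` on any file with a suspicious index would let the next run attempt to write a fresh index, as the second log shows it doing.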

Answers

  • manomathew, Marseille, Member

    It is consistent. I am using Ubuntu 12.04.

  • Sheila, Broad Institute, Member, Broadie, Moderator

    @manomathew‌

    Hello,

    Another question. Does this happen with any file or just this one in particular?

    Thanks,
    Sheila

  • manomathew, Marseille, Member

    I have 6 files in total, of which 1 worked fine and produced the table normally.
    The difference for the remaining 5 files is that in the initial step I found that the contigs were 1,2,3,4...,X,Y, not chr1,chr2,chr3.
    I removed the chr prefix using sed 's/^chr//g' to create this VCF file. Please note that I removed the chr*_random and chrM lines.
    I did the same random removal for the 1 file that worked fine. But after I removed chr, I get this error. When I process the file in vcftools, it is processed without any errors.

  • manomathew, Marseille, Member

    I did the same thing, but it didn't work. I looked at all the other suggestions from your teammates, but they didn't work either, so I decided to post this question.

  • Sheila, Broad Institute, Member, Broadie, Moderator

    @manomathew‌

    Hi,

    There is also a possibility that you have messed up your vcf file when you tried to edit it directly. Notice that in the stack trace, it says "java.io.EOFException". This is typically caused by a truncated file. It may be that whatever you used to generate the file didn't complete successfully.

    -Sheila

  • manomathew, Marseille, Member

    Hi,
    First, thank you for your suggestions. I checked the VCF file using vcftools v4.0. It recoded the file without any errors. I watched while it was processing the file and, to the best of my knowledge, it was created properly. If you wish, I could share a link to all the necessary files so you can test whether it works at your end. Or can you suggest how I should prepare a VCF file for mm9 without chr, chrM and random?

  • Sheila, Broad Institute, Member, Broadie, Moderator

    @manomathew‌

    Hi,

    Unfortunately, we do not support non-human organisms to that level. One thing you can try is to post this question to SeqAnswers.com.

    -Sheila
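
The truncation hypothesis raised in the answers above (the `java.io.EOFException`) could be checked with a small script before rerunning GATK. The sketch below is an illustration in Python under stated assumptions, not a GATK or vcftools feature: `check_vcf` and `expected_contigs` are hypothetical names, the input is assumed to be an uncompressed, tab-separated VCF, and the checks (a `#CHROM` line, at least eight columns per record, a numeric POS, a trailing newline, each contig in one contiguous block and present in the reference) are generic structural checks rather than the validation GATK itself performs.

```python
def check_vcf(path, expected_contigs=None):
    """Return a list of human-readable problems found in a plain-text VCF."""
    problems = []
    order = []               # contig names in order of first appearance
    current = None           # contig of the previous record
    has_column_line = False  # True once the '#CHROM' line has been read
    last_line = ""

    with open(path, "r") as handle:
        for number, line in enumerate(handle, start=1):
            last_line = line
            if line.startswith("##") or not line.strip():
                continue  # meta-information or blank line
            if line.startswith("#CHROM"):
                has_column_line = True
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 8:
                problems.append(
                    f"line {number}: only {len(fields)} tab-separated fields")
                continue
            chrom, pos = fields[0], fields[1]
            if not pos.isdigit():
                problems.append(f"line {number}: POS is not a number: {pos!r}")
            if chrom != current:
                if chrom in order:
                    problems.append(
                        f"line {number}: contig {chrom} appears in more than one block")
                else:
                    order.append(chrom)
                current = chrom

    if not has_column_line:
        problems.append("no #CHROM column line found")
    if last_line and not last_line.endswith("\n"):
        problems.append("last line has no trailing newline (possible truncation)")
    if expected_contigs is not None:
        for chrom in order:
            if chrom not in expected_contigs:
                problems.append(f"contig {chrom} is not among the expected contigs")
    return problems
```

Passing the contig names of the reference as `expected_contigs` (for example 1, 2, ..., X, Y once the `chr` prefix has been stripped) would also flag records that still carry a `chr`-prefixed, `_random`, or `chrM` contig name.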
