VCF index file corruption when using gatk-4.0.0.0

I noticed that the newest GATK release (4.0.0.0) is now available, so I downloaded it for my work. However, at the BQSR step it fails with the message "Index file is corrupted", even though the .idx files were created by the IndexFeatureFile tool of the same gatk-4.0.0.0 install (I generated the indices this way because GATK4 did not generate them automatically for me). The reference and the VCF files are all hg38 and were all downloaded from the Broad's FTP server.
So my question is: is there a problem with the index generation step, or is this a bug, in which case would you recommend going back to GATK 3.x?
The commands follow.
Index generation (the same was done for the GoldStandardIndel VCF):
'''
$ ~/Library/gatk-4.0.0.0/gatk IndexFeatureFile -F dbsnp_146.hg38.vcf
'''
The beginning and end of this step's output:
'''
Using GATK jar /public/home/wangych/Library/gatk-4.0.0.0/gatk-package-4.0.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -jar /public/home/wangych/Library/gatk-4.0.0.0/gatk-package-4.0.0.0-local.jar IndexFeatureFile -F dbsnp_146.hg38.vcf
02:01:11.115 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/public/home/wangych/Library/gatk-4.0.0.0/gatk-package-4.0.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
02:01:11.669 INFO IndexFeatureFile - ------------------------------------------------------------
02:01:11.669 INFO IndexFeatureFile - The Genome Analysis Toolkit (GATK) v4.0.0.0
02:01:11.669 INFO IndexFeatureFile - For support and documentation go to https://software.broadinstitute.org/gatk/
02:01:11.670 INFO IndexFeatureFile - Executing as wangych@HPC-login on Linux v2.6.32-696.3.2.el6.x86_64 amd64
02:01:11.670 INFO IndexFeatureFile - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_151-b12
02:01:11.670 INFO IndexFeatureFile - Start Date/Time: January 23, 2018 2:01:10 AM CST
02:01:11.670 INFO IndexFeatureFile - ------------------------------------------------------------
02:01:11.670 INFO IndexFeatureFile - ------------------------------------------------------------
02:01:11.670 INFO IndexFeatureFile - HTSJDK Version: 2.13.2
02:01:11.670 INFO IndexFeatureFile - Picard Version: 2.17.2
02:01:11.671 INFO IndexFeatureFile - HTSJDK Defaults.COMPRESSION_LEVEL : 1
02:01:11.671 INFO IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
02:01:11.671 INFO IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
02:01:11.671 INFO IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
02:01:11.671 INFO IndexFeatureFile - Deflater: IntelDeflater
02:01:11.671 INFO IndexFeatureFile - Inflater: IntelInflater
02:01:11.671 INFO IndexFeatureFile - GCS max retries/reopens: 20
02:01:11.671 INFO IndexFeatureFile - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
02:01:11.671 INFO IndexFeatureFile - Initializing engine
02:01:11.671 INFO IndexFeatureFile - Done initializing engine
02:01:12.156 INFO FeatureManager - Using codec VCFCodec to read file file:///public/home/wangych/Files/SingleCell/Samples/Database/broad_data_hg19/dbsnp_146.hg38.vcf
02:01:12.201 INFO ProgressMeter - Starting traversal
02:01:12.202 INFO ProgressMeter - Current Locus Elapsed Minutes Records Processed Records/Minute
02:01:22.238 INFO ProgressMeter - chr1:41338702 0.2 2288000 13725254.9
......
02:11:32.357 INFO ProgressMeter - chrX:102818927 10.3 146978000 14220603.9
02:11:40.441 INFO ProgressMeter - chrM:15257 10.5 149125207 14245567.0
02:11:40.444 INFO ProgressMeter - Traversal complete. Processed 149125207 total records in 10.5 minutes.
02:11:41.708 INFO IndexFeatureFile - Successfully wrote index to /public/home/wangych/Files/SingleCell/Samples/Database/broad_data_hg19/dbsnp_146.hg38.vcf.idx
02:11:41.708 INFO IndexFeatureFile - Shutting down engine
[January 23, 2018 2:11:41 AM CST] org.broadinstitute.hellbender.tools.IndexFeatureFile done. Elapsed time: 10.51 minutes.
Runtime.totalMemory()=1856503808
Tool returned:
/public/home/wangych/Files/SingleCell/Samples/Database/broad_data_hg19/dbsnp_146.hg38.vcf.idx
'''
The command for generating the recalibration table:
'''
~/Library/gatk-4.0.0.0/gatk BaseRecalibrator -R ../../Database/broad_data_hg19/Homo_sapiens_assembly38.fasta -I SRR2973272_MD.bam --known-sites ../../Database/broad_data_hg19/dbsnp_146.hg38.vcf --known-sites ../../Database/broad_data_hg19/Mills_and_1000G_gold_standard.indels.hg38.vcf -O SRR2973272_recal.table
'''
And this is what it returned:
'''
Using GATK jar /public/home/wangych/Library/gatk-4.0.0.0/gatk-package-4.0.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -jar /public/home/wangych/Library/gatk-4.0.0.0/gatk-package-4.0.0.0-local.jar BaseRecalibrator -R ../../Database/broad_data_hg19/Homo_sapiens_assembly38.fasta -I SRR2973272_MD.bam --known-sites ../../Database/broad_data_hg19/dbsnp_146.hg38.vcf --known-sites ../../Database/broad_data_hg19/Mills_and_1000G_gold_standard.indels.hg38.vcf -O SRR2973272_recal.table
[wangych@HPC-login SNP_calling]$
[wangych@HPC-login SNP_calling]$
[wangych@HPC-login SNP_calling]$ more BQSR_1.e85543
Using GATK jar /public/home/wangych/Library/gatk-4.0.0.0/gatk-package-4.0.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -jar /public/home/wangych/Library/gatk-4.0.0.0/gatk-package-4.0.0.0-local.jar BaseRecalibrator -R ../../Database/broad_data_hg19/Homo_sapiens_assembly38.fasta -I SRR2973272_MD.bam --known-sites ../../Database/broad_data_hg19/dbsnp_146.hg38.vcf --known-sites ../../Database/broad_data_hg19/Mills_and_1000G_gold_standard.indels.hg38.vcf -O SRR2973272_recal.table
[January 23, 2018 2:12:42 AM CST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.46 minutes.
Runtime.totalMemory()=1755840512
htsjdk.tribble.TribbleException$CorruptedIndexFile: Index file is corrupted, for input source: file:///public/home/wangych/Files/SingleCell/Samples/BC01/SNP_calling/../../Database/broad_data_hg19/dbsnp_146.hg38.vcf.idx
at htsjdk.tribble.index.IndexFactory.loadIndex(IndexFactory.java:188)
at htsjdk.tribble.TribbleIndexedFeatureReader.loadIndex(TribbleIndexedFeatureReader.java:163)
at htsjdk.tribble.TribbleIndexedFeatureReader.hasIndex(TribbleIndexedFeatureReader.java:227)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:251)
at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:202)
at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:182)
at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:153)
at org.broadinstitute.hellbender.engine.ReadWalker.initializeFeatures(ReadWalker.java:73)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:558)
at org.broadinstitute.hellbender.engine.ReadWalker.onStartup(ReadWalker.java:55)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:134)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:179)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:198)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:152)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:195)
at org.broadinstitute.hellbender.Main.main(Main.java:275)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at htsjdk.tribble.index.IndexFactory.loadIndex(IndexFactory.java:181)
... 15 more
Caused by: java.io.EOFException
at htsjdk.tribble.util.LittleEndianInputStream.readInt(LittleEndianInputStream.java:66)
at htsjdk.tribble.index.interval.IntervalTreeIndex$ChrIndex.read(IntervalTreeIndex.java:220)
at htsjdk.tribble.index.AbstractIndex.read(AbstractIndex.java:404)
at htsjdk.tribble.index.interval.IntervalTreeIndex.<init>(IntervalTreeIndex.java:53)
... 20 more
'''
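The innermost cause in the trace is a java.io.EOFException raised while reading the index, so one possibility (an assumption, not something the log confirms) is that the .idx file on disk is truncated, empty, or older than the VCF it belongs to. A minimal sketch of a sanity check that could be run before BaseRecalibrator is below. `check_index` is a hypothetical helper written for this post, not a GATK command, and it only catches a missing, empty, or stale index; it cannot prove that an index is valid.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GATK): report whether <vcf>.idx looks usable.
# Prints one of: missing-or-empty, stale, ok. Returns non-zero unless "ok".
check_index() {
    local vcf="$1"
    local idx="${vcf}.idx"

    # -s is true only if the file exists and has a size greater than zero.
    if [ ! -s "$idx" ]; then
        echo "missing-or-empty"
        return 1
    fi

    # An index older than its VCF was probably built from a different copy.
    if [ "$idx" -ot "$vcf" ]; then
        echo "stale"
        return 1
    fi

    echo "ok"
}

# Example call (file name taken from the post):
#   check_index dbsnp_146.hg38.vcf
```

If the check reports anything other than "ok", deleting the .idx file and re-running IndexFeatureFile would be the obvious next step. If it reports "ok" and the error persists, the problem is more likely in how the index is written or read than in a stale file.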
Thanks for your attention!
