HaplotypeCaller fails on a large chromosome (>536 Mb)

Hi,
I tried to run HaplotypeCaller in GVCF mode. My reference genome is over 5 Gb in size. My command and the resulting error are below:

Using GATK jar /source/gatk-4.0.6.0/gatk-package-4.0.6.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -XX:+UseSerialGC -Xmx100g -jar /source/gatk-4.0.6.0/gatk-package-4.0.6.0-local.jar HaplotypeCaller -R /data/Pseudomolecule_v3.fasta -L /IntervalFiles/0003-scattered.intervals -I WGS_FTNO.cram -O result/0003-scattered.vcf.gz -mbq 20 --native-pair-hmm-threads 4 -ERC GVCF --verbosity ERROR
[August 1, 2018 11:32:11 AM CEST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2076049408
htsjdk.samtools.SAMException: Exception creating BAM index for slice slice: seqID 1, start 536834320, span 457789, records 259850.
at htsjdk.samtools.CRAMBAIIndexer.processSingleReferenceSlice(CRAMBAIIndexer.java:194)
at htsjdk.samtools.cram.CRAIIndex.openCraiFileAsBaiStream(CRAIIndex.java:180)
at htsjdk.samtools.SamIndexes.asBaiSeekableStreamOrNull(SamIndexes.java:78)
at htsjdk.samtools.CRAMFileReader.initWithStreams(CRAMFileReader.java:228)
at htsjdk.samtools.CRAMFileReader.<init>(CRAMFileReader.java:219)
at htsjdk.samtools.SamReaderFactory$SamReaderFactoryImpl.open(SamReaderFactory.java:422)
at htsjdk.samtools.SamReaderFactory.open(SamReaderFactory.java:105)
at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:227)
at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:162)
at org.broadinstitute.hellbender.engine.GATKTool.initializeReads(GATKTool.java:387)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:636)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.onStartup(AssemblyRegionWalker.java:156)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:133)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:180)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:199)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 32770
at htsjdk.samtools.CRAMBAIIndexer$BAMIndexBuilder.processSingleReferenceSlice(CRAMBAIIndexer.java:354)
at htsjdk.samtools.CRAMBAIIndexer$BAMIndexBuilder.access$100(CRAMBAIIndexer.java:227)
at htsjdk.samtools.CRAMBAIIndexer.processSingleReferenceSlice(CRAMBAIIndexer.java:192)
... 17 more

Can GATK4 handle a single large chromosome? Is there a workaround?
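
For context, the stack trace shows the CRAM index being converted to a BAI stream (openCraiFileAsBaiStream), and the BAI format described in the SAM specification can only address coordinates up to 2^29 bases (512 Mbp). The small Python sketch below simply checks the slice coordinates from the error message against that limit. It is an illustration of the arithmetic, not a diagnosis confirmed by the GATK team; the function name and constants are my own, and reading the 32770 in the exception as a 16 kb linear-index window number is an assumption.

```python
# Maximum coordinate the BAI binning scheme can address (2^29 bases, per the SAM spec).
BAI_MAX_COORD = 1 << 29
# Width of one BAI linear-index window (2^14 = 16,384 bases).
BAI_LINEAR_WINDOW = 1 << 14


def slice_exceeds_bai_limit(start: int, span: int) -> bool:
    """Return True if a slice covering [start, start + span) reaches past the BAI limit."""
    return start + span > BAI_MAX_COORD


# Values copied from the SAMException message above.
start, span = 536_834_320, 457_789
slice_end = start + span

print("BAI coordinate limit:", BAI_MAX_COORD)
print("Slice end:", slice_end)
print("Slice exceeds limit:", slice_exceeds_bai_limit(start, span))
# Number of 16 kb windows that fit under the limit. Assumption: the index
# 32770 reported in the ArrayIndexOutOfBoundsException is such a window number.
print("Linear-index windows available:", BAI_MAX_COORD // BAI_LINEAR_WINDOW)
```

The slice starts just below the limit and ends above it, which would fit the failure appearing only on this interval of the chromosome.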
