GATK Exception in HaplotypeCaller

Sarah61 Pakistan Member
Hi,

I am using GATK 4.0.8.1 HaplotypeCaller to generate a g.vcf. I am running the following command:
gatk --java-options "-Xms24g -Xmx48g" HaplotypeCaller -R new_hg38.fa -I S11_.sorted.BQRC.bam -O S11.g.vcf -L ../../nextera-dna-exome-targeted-regions-manifest-v1-2.bed --native-pair-hmm-threads 6 --min-base-quality-score 20 -stand-call-conf 30 --dbsnp /All_20180418.chr.hg38.vcf.gz -ERC GVCF -G StandardAnnotation -G AS_StandardAnnotation --read-validation-stringency SILENT --TMP_DIR scratch-2


But GATK shuts down with an exception. Here is the log:

23:56:09.618 WARN GATKAnnotationPluginDescriptor - Redundant enabled annotation group (StandardAnnotation) is enabled for this tool by default
23:56:09.692 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/hpcc/tools/gatk-4.0.8.1/gatk-package-4.0.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
23:56:10.099 INFO HaplotypeCaller - ------------------------------------------------------------
23:56:10.099 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.8.1
23:56:10.099 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
23:56:10.100 INFO HaplotypeCaller - Executing as [email protected] on Linux v4.4.0-159-generic amd64
23:56:10.100 INFO HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_222-8u222-b10-1ubuntu1~16.04.1-b10
23:56:10.100 INFO HaplotypeCaller - Start Date/Time: September 16, 2019 11:56:09 PM PKT
23:56:10.100 INFO HaplotypeCaller - ------------------------------------------------------------
23:56:10.100 INFO HaplotypeCaller - ------------------------------------------------------------
23:56:10.101 INFO HaplotypeCaller - HTSJDK Version: 2.16.0
23:56:10.101 INFO HaplotypeCaller - Picard Version: 2.18.7
23:56:10.101 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
23:56:10.101 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
23:56:10.102 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
23:56:10.102 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
23:56:10.102 INFO HaplotypeCaller - Deflater: IntelDeflater
23:56:10.102 INFO HaplotypeCaller - Inflater: IntelInflater
23:56:10.102 INFO HaplotypeCaller - GCS max retries/reopens: 20
23:56:10.102 INFO HaplotypeCaller - Using google-cloud-java fork https://github.com/broadinstitute/google-cloud-java/releases/tag/0.20.5-alpha-GCS-RETRY-FIX
23:56:10.102 INFO HaplotypeCaller - Initializing engine
23:56:10.936 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/2d6b1dc8-eccd-46f4-a5b2-39966cd786c9/data-base/All_20180418.chr.hg38.vcf.gz
23:56:11.249 INFO FeatureManager - Using codec BEDCodec to read file file:///mnt/2d6b1dc8-eccd-46f4-a5b2-39966cd786c9/scratch-2/exome-run2/cleanfastq/part3/newwork/../../nextera-dna-exome-targeted-regions-manifest-v1-2.bed
23:56:13.499 INFO IntervalArgumentCollection - Processing 45326818 bp from intervals
23:56:13.651 WARN IndexUtils - Feature file "/mnt/2d6b1dc8-eccd-46f4-a5b2-39966cd786c9/data-base/All_20180418.chr.hg38.vcf.gz" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
23:56:14.042 INFO HaplotypeCaller - Shutting down engine
[September 16, 2019 11:56:14 PM PKT] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.07 minutes.
Runtime.totalMemory()=25739919360
java.lang.NullPointerException
at java.util.ComparableTimSort.countRunAndMakeAscending(ComparableTimSort.java:325)
at java.util.ComparableTimSort.sort(ComparableTimSort.java:202)
at java.util.Arrays.sort(Arrays.java:1312)
at java.util.Arrays.sort(Arrays.java:1506)
at java.util.ArrayList.sort(ArrayList.java:1462)
at java.util.Collections.sort(Collections.java:143)
at org.broadinstitute.hellbender.utils.IntervalUtils.sortAndMergeIntervals(IntervalUtils.java:459)
at org.broadinstitute.hellbender.utils.IntervalUtils.getIntervalsWithFlanks(IntervalUtils.java:955)
at org.broadinstitute.hellbender.utils.IntervalUtils.getIntervalsWithFlanks(IntervalUtils.java:970)
at org.broadinstitute.hellbender.engine.MultiIntervalLocalReadShard.<init>(MultiIntervalLocalReadShard.java:59)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.makeReadShards(AssemblyRegionWalker.java:195)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.onStartup(AssemblyRegionWalker.java:175)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:135)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:182)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:201)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)

Best Answers

  • SkyWarrior Turkey ✭✭✭
    Accepted Answer

    When you use a BED file to make these tools jump from region to region, HTSJDK needs a proper VCF index to handle the lookups. A proper VCF index for jumping between regions in turn requires a sequence dictionary in the header of the VCF file. You may want to add the sequence dictionary to your dbSNP input file with Picard UpdateVcfSequenceDictionary; this of course means re-sorting and re-indexing the VCF as well. As far as I remember, hg38 dbSNP files also lack the "chr" prefix in their chromosome names, which may require additional processing. That is from memory, though, from some time ago.

  • bhanuGandham Cambridge MA admin
    Accepted Answer

    Hi @Sarah61

    In addition to what @SkyWarrior suggested, please also check whether your nextera-dna-exome-targeted-regions-manifest-v1-2.bed file is compatible with the hg38 sequence dictionary of your reference.

    We have seen such errors previously when the bed file and reference files are from different builds. See this thread: https://gatkforums.broadinstitute.org/gatk/discussion/24414/java-lang-nullpointerexception-error/p1
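
The compatibility check suggested above can be sketched in a few lines of Python. This is only an illustration, not a GATK tool: it assumes the reference has a Picard-style dictionary (for example a new_hg38.dict next to new_hg38.fa, a file name not given in the thread), and the function names are made up for this example. It reports BED records whose contig is missing from the dictionary or whose end lies past the contig length, which is the kind of mismatch one would expect when the BED file and the reference come from different builds.

```python
import sys


def read_sequence_dict(dict_path):
    """Return {contig_name: length} from the @SQ lines of a .dict file."""
    contigs = {}
    with open(dict_path) as handle:
        for line in handle:
            if not line.startswith("@SQ"):
                continue
            # Each field after "@SQ" has the form TAG:VALUE, e.g. SN:chr1 or LN:248956422.
            tags = dict(
                field.split(":", 1)
                for field in line.rstrip("\n").split("\t")[1:]
                if ":" in field
            )
            contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def find_bad_bed_intervals(bed_path, contigs):
    """Yield (line_number, reason) for BED records that do not fit the dictionary."""
    with open(bed_path) as handle:
        for number, line in enumerate(handle, start=1):
            # Skip blank lines and the optional BED header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.split()[:3]
            start, end = int(start), int(end)
            if chrom not in contigs:
                yield number, f"contig {chrom!r} is not in the dictionary"
            elif end > contigs[chrom]:
                # BED ends are exclusive, so end == length is still valid.
                yield number, f"end {end} is past {chrom} length {contigs[chrom]}"
            elif start >= end:
                yield number, f"start {start} is not before end {end}"


# Hypothetical usage: python check_bed.py new_hg38.dict targets.bed
if __name__ == "__main__" and len(sys.argv) == 3:
    reference_contigs = read_sequence_dict(sys.argv[1])
    for line_number, reason in find_bad_bed_intervals(sys.argv[2], reference_contigs):
        print(f"BED line {line_number}: {reason}")
```

If the script prints nothing, the BED coordinates at least fit the reference dictionary, and the dbSNP file becomes the more likely suspect.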

Answers

  • Sarah61 Pakistan Member
    There is some kind of issue with the nextera-dna-exome-targeted-regions-manifest-v1-2.bed file. When I remove this option, the command starts running. Can anyone tell me what the issue with this file is? I ran the BaseRecalibrator and ApplyBQSR steps with this file, and those commands ran without any problem.

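The log warns that the dbSNP file "appears to contain no sequence dictionary", and SkyWarrior's answer raises the same point along with the "chr" naming question. Both can be inspected without GATK by reading the VCF header. The sketch below is an assumption-laden illustration (the function name is invented here): it collects the IDs of any ##contig header lines and the contig name of the first data record, so an empty list or a name without "chr" is easy to spot.

```python
import gzip
import re

# Matches header lines such as ##contig=<ID=chr1,length=248956422>
CONTIG_ID = re.compile(r"^##contig=<.*?\bID=([^,>]+)")


def summarize_vcf_contigs(vcf_path):
    """Return (header_contig_ids, contig_of_first_record) for a VCF or VCF.gz file."""
    # bgzip output is a series of gzip members, which Python's gzip module can read.
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    header_contigs = []
    first_record_contig = None
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                match = CONTIG_ID.match(line)
                if match:
                    header_contigs.append(match.group(1))
            elif line.startswith("#"):
                continue  # the #CHROM column header line
            else:
                # First data record: CHROM is the first tab-separated column.
                first_record_contig = line.split("\t", 1)[0]
                break
    return header_contigs, first_record_contig
```

For the file in this thread one might call summarize_vcf_contigs("/All_20180418.chr.hg38.vcf.gz"). An empty header list would be consistent with the warning in the log, and in that case adding the dictionary with Picard UpdateVcfSequenceDictionary, as suggested above, is the natural next step.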