BaseRecalibrator : getContigNames(SequenceDictionaryUtils.java:463)

Paul_Arthur (Paris, Member)
edited July 2019 in Ask the GATK team
Hi,

I'm trying to run the BaseRecalibrator tool on a BAM file, but the program stops before finishing. The messages returned by the tool were not enough for me to fix the error myself. I am running GATK version 4.1.2.0.

Here is the complete message:

```
16:09:12.733 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data2/home/pamesl/miniconda3/envs/smk_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 11, 2019 4:09:14 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
16:09:14.487 INFO BaseRecalibrator - ------------------------------------------------------------
16:09:14.488 INFO BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.1.2.0
16:09:14.488 INFO BaseRecalibrator - For support and documentation go to
16:09:14.488 INFO BaseRecalibrator - Executing as [email protected] on Linux v2.6.32-573.7.1.el6.x86_64 amd64
16:09:14.489 INFO BaseRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
16:09:14.489 INFO BaseRecalibrator - Start Date/Time: 11 juillet 2019 16:09:12 CEST
16:09:14.489 INFO BaseRecalibrator - ------------------------------------------------------------
16:09:14.489 INFO BaseRecalibrator - ------------------------------------------------------------
16:09:14.490 INFO BaseRecalibrator - HTSJDK Version: 2.19.0
16:09:14.490 INFO BaseRecalibrator - Picard Version: 2.19.0
16:09:14.490 INFO BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
16:09:14.491 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
16:09:14.491 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
16:09:14.491 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
16:09:14.491 INFO BaseRecalibrator - Deflater: IntelDeflater
16:09:14.491 INFO BaseRecalibrator - Inflater: IntelInflater
16:09:14.491 INFO BaseRecalibrator - GCS max retries/reopens: 20
16:09:14.491 INFO BaseRecalibrator - Requester pays: disabled
16:09:14.492 INFO BaseRecalibrator - Initializing engine
16:09:15.263 INFO FeatureManager - Using codec VCFCodec to read file file:///data1/scratch/pamesl/projet_cbf/data/dbSNP/dbsnp_138.hg19.vcf.gz
16:09:15.411 INFO FeatureManager - Using codec VCFCodec to read file file:///data1/scratch/pamesl/projet_cbf/data/mills_1000G/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf
16:09:15.428 INFO BaseRecalibrator - Shutting down engine
[11 juillet 2019 16:09:15 CEST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.05 minutes.
Runtime.totalMemory()=2224553984
java.lang.NullPointerException
at org.broadinstitute.hellbender.utils.SequenceDictionaryUtils.getContigNames(SequenceDictionaryUtils.java:463)
at org.broadinstitute.hellbender.utils.SequenceDictionaryUtils.getCommonContigsByName(SequenceDictionaryUtils.java:457)
at org.broadinstitute.hellbender.utils.SequenceDictionaryUtils.compareDictionaries(SequenceDictionaryUtils.java:234)
at org.broadinstitute.hellbender.utils.SequenceDictionaryUtils.validateDictionaries(SequenceDictionaryUtils.java:150)
at org.broadinstitute.hellbender.utils.SequenceDictionaryUtils.validateDictionaries(SequenceDictionaryUtils.java:98)
at org.broadinstitute.hellbender.engine.GATKTool.validateSequenceDictionaries(GATKTool.java:760)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:702)
at org.broadinstitute.hellbender.engine.ReadWalker.onStartup(ReadWalker.java:50)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:137)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
at org.broadinstitute.hellbender.Main.main(Main.java:291)
Using GATK jar /data2/home/pamesl/miniconda3/envs/smk_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /data2/home/pamesl/miniconda3/envs/smk_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar BaseRecalibrator -I /data1/scratch/pamesl/projet_cbf/data/bam/SJCBF016_G-C0DG1ACXX.5_marked_duplicates.bam -R /data1/scratch/pamesl/projet_cbf/data/hg19_data/reference_hg19/ucsc.hg19.fasta.gz --known-sites /data1/scratch/pamesl/projet_cbf/data/dbSNP/dbsnp_138.hg19.vcf.gz --known-sites /data1/scratch/pamesl/projet_cbf/data/mills_1000G/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf -O /data1/scratch/pamesl/projet_cbf/data/bam/recal_data_SJCBF016_G-C0DG1ACXX.5.table
```

I checked the validity of the BAM file SJCBF016_G-C0DG1ACXX.5_marked_duplicates.bam using the ValidateSamFile tool and got the following result:

```
No errors found
Tool returned:
0
```

I have a feeling that the problem comes from one of my input files: Mills_and_1000G_gold_standard.indels.hg19.sites.vcf, dbsnp_138.hg19.vcf.gz, or my reference file ucsc.hg19.fasta.gz. However, I don't know which one to investigate first.

Edit: I will run ValidateVariants on each VCF file and post the results tomorrow.

Best regards,

Paul-Arthur

Best Answer

  • Paul_Arthur (Paris)
    Accepted Answer
    After deleting and recreating the different files, the problem is solved. The defective file was the .dict file, although I don't understand where the initial error came from.
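
For readers who hit the same NullPointerException: the stack trace in the question fails inside SequenceDictionaryUtils while the sequence dictionaries are validated at tool startup, which fits a faulty .dict file. A .dict file is plain text in SAM header format, where each `@SQ` line carries a contig name (`SN`) and length (`LN`). The sketch below is a hypothetical sanity check, not part of GATK; the function names `parse_dict_lines` and `check_dict_file` and the sample values are assumptions made for illustration. It only looks for missing, malformed, or duplicated `@SQ` records and cannot tell whether the dictionary actually matches the reference.

```python
from typing import Dict, List, Tuple


def parse_dict_lines(lines: List[str]) -> Tuple[Dict[str, int], List[str]]:
    """Collect contig lengths from @SQ lines and list any problems found."""
    contigs: Dict[str, int] = {}
    problems: List[str] = []
    for number, raw in enumerate(lines, start=1):
        fields = raw.rstrip("\r\n").split("\t")
        if fields[0] != "@SQ":
            continue  # @HD and other record types are ignored in this sketch
        # Turn "SN:chr1", "LN:123" style fields into a tag -> value mapping.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        name, length = tags.get("SN"), tags.get("LN")
        if not name:
            problems.append(f"line {number}: @SQ record without SN tag")
        elif not (length or "").isdecimal() or int(length) <= 0:
            problems.append(f"line {number}: contig {name} has invalid LN {length!r}")
        elif name in contigs:
            problems.append(f"line {number}: duplicate contig {name}")
        else:
            contigs[name] = int(length)
    if not contigs:
        problems.append("no valid @SQ records (empty or truncated file?)")
    return contigs, problems


def check_dict_file(path: str) -> List[str]:
    """Read a .dict file from disk and return the list of problems (empty if none)."""
    with open(path, encoding="utf-8") as handle:
        _, problems = parse_dict_lines(handle.readlines())
    return problems


if __name__ == "__main__":
    # Made-up sample: the second @SQ record is missing its LN tag.
    sample = ["@HD\tVN:1.5\n", "@SQ\tSN:chrM\tLN:16571\n", "@SQ\tSN:chr1\n"]
    contigs, problems = parse_dict_lines(sample)
    print(contigs)
    print(problems)
```

Calling `check_dict_file` on the dictionary that sits next to the reference (its exact file name is not given in this thread) returns an empty list when every `@SQ` record looks well formed. If it reports problems, regenerating the dictionary, for example with the CreateSequenceDictionary tool, is the same remedy described in the accepted answer.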

Answers

  • Paul_Arthur (Paris, Member)
    Here are the outputs of ValidateVariants on my files:

    ```
    09:42:13.154 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data2/home/pamesl/miniconda3/envs/gatk4_4.1.2.0_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Jul 12, 2019 9:42:14 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    09:42:14.894 INFO ValidateVariants - ------------------------------------------------------------
    09:42:14.895 INFO ValidateVariants - The Genome Analysis Toolkit (GATK) v4.1.2.0
    09:42:14.895 INFO ValidateVariants - For support and documentation go to
    09:42:14.895 INFO ValidateVariants - Executing as [email protected] on Linux v2.6.32-573.7.1.el6.x86_64 amd64
    09:42:14.896 INFO ValidateVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
    09:42:14.896 INFO ValidateVariants - Start Date/Time: 12 juillet 2019 09:42:13 CEST
    09:42:14.896 INFO ValidateVariants - ------------------------------------------------------------
    09:42:14.896 INFO ValidateVariants - ------------------------------------------------------------
    09:42:14.897 INFO ValidateVariants - HTSJDK Version: 2.19.0
    09:42:14.897 INFO ValidateVariants - Picard Version: 2.19.0
    09:42:14.898 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    09:42:14.898 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    09:42:14.898 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    09:42:14.898 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    09:42:14.898 INFO ValidateVariants - Deflater: IntelDeflater
    09:42:14.898 INFO ValidateVariants - Inflater: IntelInflater
    09:42:14.899 INFO ValidateVariants - GCS max retries/reopens: 20
    09:42:14.899 INFO ValidateVariants - Requester pays: disabled
    09:42:14.899 INFO ValidateVariants - Initializing engine
    09:42:15.555 INFO FeatureManager - Using codec VCFCodec to read file file:///data1/scratch/pamesl/projet_cbf/data/dbSNP/dbsnp_138.hg19.vcf.gz
    09:42:15.921 INFO ValidateVariants - Done initializing engine
    09:42:15.921 INFO ProgressMeter - Starting traversal
    09:42:15.921 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
    09:42:25.927 INFO ProgressMeter - chr1:85219337 0.2 1773000 10633746.5
    09:42:35.926 INFO ProgressMeter - chr1:203432070 0.3 3720000 11157768.4
    09:42:45.928 INFO ProgressMeter - chr2:41628100 0.5 5644000 11285366.7
    09:42:55.932 INFO ProgressMeter - chr2:139260385 0.7 7620000 11427143.2
    09:43:05.937 INFO ProgressMeter - chr2:236480298 0.8 9564000 11473128.6
    09:43:15.939 INFO ProgressMeter - chr3:82594206 1.0 11532000 11528541.4
    09:43:25.942 INFO ProgressMeter - chr3:181681522 1.2 13506000 11573099.5
    09:43:35.945 INFO ProgressMeter - chr4:72864798 1.3 15489000 11613266.0
    09:43:45.948 INFO ProgressMeter - chr4:170394397 1.5 17453000 11631843.8
    09:43:55.951 INFO ProgressMeter - chr5:73338396 1.7 19428000 11653304.0
    09:44:05.952 INFO ProgressMeter - chr5:168904051 1.8 21399000 11668999.4
    09:44:15.952 INFO ProgressMeter - chr6:76816892 2.0 23373000 11683481.8
    09:44:25.954 INFO ProgressMeter - chr6:170470765 2.2 25329000 11687340.9
    09:44:35.957 INFO ProgressMeter - chr7:90139660 2.3 27303000 11698277.6
    09:44:45.959 INFO ProgressMeter - chr8:16590512 2.5 29272000 11705834.5
    09:44:55.961 INFO ProgressMeter - chr8:114349378 2.7 31246000 11714321.4
    09:45:05.963 INFO ProgressMeter - chr9:81277834 2.8 33239000 11728514.1
    09:45:15.967 INFO ProgressMeter - chr10:29180143 3.0 35206000 11732400.2
    09:45:25.968 INFO ProgressMeter - chr10:126674848 3.2 37170000 11734991.9
    09:45:35.972 INFO ProgressMeter - chr11:81324426 3.3 39136000 11737865.5
    09:45:45.979 INFO ProgressMeter - chr12:38295747 3.5 41091000 11737044.1
    09:45:55.985 INFO ProgressMeter - chr12:132001225 3.7 43051000 11737767.2
    09:46:05.987 INFO ProgressMeter - chr13:110019237 3.8 45006000 11737327.5
    09:46:15.989 INFO ProgressMeter - chr14:104967730 4.0 46957000 11735924.8
    09:46:25.992 INFO ProgressMeter - chr16:5281264 4.2 48917000 11736793.7
    09:46:35.992 INFO ProgressMeter - chr17:7334268 4.3 50895000 11741793.6
    09:46:45.996 INFO ProgressMeter - chr18:22883277 4.5 52861000 11743626.8
    09:46:56.000 INFO ProgressMeter - chr19:35290321 4.7 54825000 11744900.5
    09:47:06.001 INFO ProgressMeter - chr20:62705429 4.8 56805000 11749557.9
    09:47:16.004 INFO ProgressMeter - chrX:22158309 5.0 58786000 11753948.1
    09:47:26.004 INFO ProgressMeter - chrY:14482729 5.2 60807000 11765946.5
    09:47:26.248 INFO ProgressMeter - chrY:59338394 5.2 60860307 11767002.0
    09:47:26.249 INFO ProgressMeter - Traversal complete. Processed 60860307 total variants in 5.2 minutes.
    09:47:26.249 INFO ValidateVariants - Shutting down engine
    [12 juillet 2019 09:47:26 CEST] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 5.22 minutes.
    Runtime.totalMemory()=1924136960
    Using GATK jar /data2/home/pamesl/miniconda3/envs/gatk4_4.1.2.0_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /data2/home/pamesl/miniconda3/envs/gatk4_4.1.2.0_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar ValidateVariants -V /data1/scratch/pamesl/projet_cbf/data/dbSNP/dbsnp_138.hg19.vcf.gz
    ```

    and

    ```
    09:43:43.281 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data2/home/pamesl/miniconda3/envs/gatk4_4.1.2.0_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Jul 12, 2019 9:43:45 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    09:43:45.017 INFO ValidateVariants - ------------------------------------------------------------
    09:43:45.018 INFO ValidateVariants - The Genome Analysis Toolkit (GATK) v4.1.2.0
    09:43:45.018 INFO ValidateVariants - For support and documentation go to
    09:43:45.019 INFO ValidateVariants - Executing as [email protected] on Linux v2.6.32-573.7.1.el6.x86_64 amd64
    09:43:45.019 INFO ValidateVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
    09:43:45.019 INFO ValidateVariants - Start Date/Time: 12 juillet 2019 09:43:43 CEST
    09:43:45.019 INFO ValidateVariants - ------------------------------------------------------------
    09:43:45.020 INFO ValidateVariants - ------------------------------------------------------------
    09:43:45.021 INFO ValidateVariants - HTSJDK Version: 2.19.0
    09:43:45.021 INFO ValidateVariants - Picard Version: 2.19.0
    09:43:45.021 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    09:43:45.021 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    09:43:45.021 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    09:43:45.021 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    09:43:45.021 INFO ValidateVariants - Deflater: IntelDeflater
    09:43:45.022 INFO ValidateVariants - Inflater: IntelInflater
    09:43:45.022 INFO ValidateVariants - GCS max retries/reopens: 20
    09:43:45.022 INFO ValidateVariants - Requester pays: disabled
    09:43:45.022 INFO ValidateVariants - Initializing engine
    09:43:45.692 INFO FeatureManager - Using codec VCFCodec to read file file:///data1/scratch/pamesl/projet_cbf/data/mills_1000G/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf
    09:43:45.823 INFO ValidateVariants - Done initializing engine
    09:43:45.824 INFO ProgressMeter - Starting traversal
    09:43:45.825 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
    09:43:50.024 INFO ProgressMeter - chrX:151966000 0.1 1274580 18221300.9
    09:43:50.024 INFO ProgressMeter - Traversal complete. Processed 1274580 total variants in 0.1 minutes.
    09:43:50.024 INFO ValidateVariants - Shutting down engine
    [12 juillet 2019 09:43:50 CEST] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 0.11 minutes.
    Runtime.totalMemory()=2469920768
    Using GATK jar /data2/home/pamesl/miniconda3/envs/gatk4_4.1.2.0_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /data2/home/pamesl/miniconda3/envs/gatk4_4.1.2.0_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar ValidateVariants -V /data1/scratch/pamesl/projet_cbf/data/mills_1000G/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf
    ```