GATK 4.0.0.0 HaplotypeCaller Error with -L chrM

When running the GATK 4.0.0.0 HaplotypeCaller with the following:

gatk HaplotypeCaller \
-R $REF \
-I $CRAM_DIR/$SAMPLE.cram \
--emit-ref-confidence GVCF \
-ploidy 1 \
-L $TARGETS \
-O gvcf.$CENTER/$SAMPLE.raw.$TARGETS.g.vcf.gz

When $TARGETS is chrM, the error shown below is generated. When $TARGETS is chr1, no error occurs.

Any suggestions?

Below are the outputs with chrM and chr1, respectively.

Running Haplotyper Caller to create g.VCF file

Using GATK jar /share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -jar /share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar HaplotypeCaller -R /restricted/projectnb/casa/ref/GRCh38_full_analysis_set_plus_decoy_hla.fa -I /restricted/projectnb/casa/wgs.hg38/adni/cram/ADNI_941_S_4420.hg38.realign.bqsr.cram --emit-ref-confidence GVCF -ploidy 1 -L chrM -O gvcf.adni/ADNI_941_S_4420.hg38.realign.bqsr.raw.chrM.g.vcf.gz
21:27:26.515 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
21:27:26.660 INFO HaplotypeCaller - ------------------------------------------------------------
21:27:26.661 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.0.0
21:27:26.661 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
21:27:26.661 INFO HaplotypeCaller - Executing as [email protected] on Linux v2.6.32-696.10.3.el6.x86_64 amd64
21:27:26.661 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_151-b12
21:27:26.661 INFO HaplotypeCaller - Start Date/Time: January 13, 2018 9:27:26 PM EST
21:27:26.661 INFO HaplotypeCaller - ------------------------------------------------------------
21:27:26.661 INFO HaplotypeCaller - ------------------------------------------------------------
21:27:26.662 INFO HaplotypeCaller - HTSJDK Version: 2.13.2
21:27:26.662 INFO HaplotypeCaller - Picard Version: 2.17.2
21:27:26.662 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 1
21:27:26.662 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:27:26.662 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:27:26.662 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:27:26.663 INFO HaplotypeCaller - Deflater: IntelDeflater
21:27:26.663 INFO HaplotypeCaller - Inflater: IntelInflater
21:27:26.663 INFO HaplotypeCaller - GCS max retries/reopens: 20
21:27:26.663 INFO HaplotypeCaller - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
21:27:26.663 INFO HaplotypeCaller - Initializing engine
21:27:28.356 INFO IntervalArgumentCollection - Processing 16569 bp from intervals
21:27:28.375 INFO HaplotypeCaller - Done initializing engine
21:27:28.438 INFO HaplotypeCallerEngine - Currently, physical phasing is only available for diploid samples.
21:27:28.438 INFO HaplotypeCallerEngine - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
21:27:28.438 INFO HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output
21:27:29.435 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
21:27:29.440 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
21:27:29.457 WARN NativeLibraryLoader - Unable to load libgkl_pairhmm_omp.so from native/libgkl_pairhmm_omp.so (/tmp/farrell/libgkl_pairhmm_omp6337809329824575621.so: /usr/lib64/libgomp.so.1: version `GOMP_4.0' not found (required by /tmp/farrell/libgkl_pairhmm_omp6337809329824575621.so))
21:27:29.458 INFO PairHMM - OpenMP multi-threaded AVX-accelerated native PairHMM implementation is not supported
21:27:29.458 INFO NativeLibraryLoader - Loading libgkl_pairhmm.so from jar:file:/share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm.so
21:27:29.513 WARN IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
21:27:29.514 WARN IntelPairHmm - Ignoring request for 4 threads; not using OpenMP implementation
21:27:29.514 INFO PairHMM - Using the AVX-accelerated native PairHMM implementation
21:27:29.609 INFO ProgressMeter - Starting traversal
21:27:29.610 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
21:27:30.691 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0
21:27:30.691 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.0
21:27:30.692 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.00 sec
21:27:30.692 INFO HaplotypeCaller - Shutting down engine
[January 13, 2018 9:27:30 PM EST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.07 minutes.
Runtime.totalMemory()=1702887424
java.lang.IllegalArgumentException: contig must be non-null and not equal to *, and start must be >= 1
contig = null
start = 1
at org.broadinstitute.hellbender.utils.read.SAMRecordToGATKReadAdapter.setPosition(SAMRecordToGATKReadAdapter.java:105)
at org.broadinstitute.hellbender.utils.clipping.ClippingOp.applyREVERT_SOFTCLIPPED_BASES(ClippingOp.java:177)
at org.broadinstitute.hellbender.utils.clipping.ClippingOp.apply(ClippingOp.java:82)
at org.broadinstitute.hellbender.utils.clipping.ReadClipper.clipRead(ReadClipper.java:145)
at org.broadinstitute.hellbender.utils.clipping.ReadClipper.clipRead(ReadClipper.java:126)
at org.broadinstitute.hellbender.utils.clipping.ReadClipper.revertSoftClippedBases(ReadClipper.java:437)
at org.broadinstitute.hellbender.utils.clipping.ReadClipper.revertSoftClippedBases(ReadClipper.java:447)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.AssemblyBasedCallerUtils.finalizeRegion(AssemblyBasedCallerUtils.java:83)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.AssemblyBasedCallerUtils.assembleReads(AssemblyBasedCallerUtils.java:243)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.callRegion(HaplotypeCallerEngine.java:505)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.apply(HaplotypeCaller.java:218)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:295)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:271)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:893)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:136)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:179)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:198)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:152)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:195)
at org.broadinstitute.hellbender.Main.main(Main.java:275)

When running with -L chr1, no error occurs:


Processing sample: ADNI_941_S_4420.hg38.realign.bqsr

Date: date

Running Haplotyper Caller to create g.VCF file

Using GATK jar /share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -jar /share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar HaplotypeCaller -R /restricted/projectnb/casa/ref/GRCh38_full_analysis_set_plus_decoy_hla.fa -I /restricted/projectnb/casa/wgs.hg38/adni/cram/ADNI_941_S_4420.hg38.realign.bqsr.cram --emit-ref-confidence GVCF -ploidy 1 -L chr1 -O gvcf.adni/ADNI_941_S_4420.hg38.realign.bqsr.raw.chr1.g.vcf.gz
21:29:03.400 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
21:29:03.568 INFO HaplotypeCaller - ------------------------------------------------------------
21:29:03.568 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.0.0
21:29:03.568 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
21:29:03.568 INFO HaplotypeCaller - Executing as [email protected] on Linux v2.6.32-696.10.3.el6.x86_64 amd64
21:29:03.568 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_151-b12
21:29:03.569 INFO HaplotypeCaller - Start Date/Time: January 13, 2018 9:29:03 PM EST
21:29:03.569 INFO HaplotypeCaller - ------------------------------------------------------------
21:29:03.569 INFO HaplotypeCaller - ------------------------------------------------------------
21:29:03.570 INFO HaplotypeCaller - HTSJDK Version: 2.13.2
21:29:03.570 INFO HaplotypeCaller - Picard Version: 2.17.2
21:29:03.570 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 1
21:29:03.570 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:29:03.570 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:29:03.570 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:29:03.571 INFO HaplotypeCaller - Deflater: IntelDeflater
21:29:03.571 INFO HaplotypeCaller - Inflater: IntelInflater
21:29:03.571 INFO HaplotypeCaller - GCS max retries/reopens: 20
21:29:03.571 INFO HaplotypeCaller - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
21:29:03.571 INFO HaplotypeCaller - Initializing engine
21:29:05.158 INFO IntervalArgumentCollection - Processing 248956422 bp from intervals
21:29:05.181 INFO HaplotypeCaller - Done initializing engine
21:29:05.240 INFO HaplotypeCallerEngine - Currently, physical phasing is only available for diploid samples.
21:29:05.241 INFO HaplotypeCallerEngine - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
21:29:05.241 INFO HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output
21:29:06.153 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
21:29:06.155 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
21:29:06.171 WARN NativeLibraryLoader - Unable to load libgkl_pairhmm_omp.so from native/libgkl_pairhmm_omp.so (/tmp/farrell/libgkl_pairhmm_omp7461956491412004263.so: /usr/lib64/libgomp.so.1: version `GOMP_4.0' not found (required by /tmp/farrell/libgkl_pairhmm_omp7461956491412004263.so))
21:29:06.171 INFO PairHMM - OpenMP multi-threaded AVX-accelerated native PairHMM implementation is not supported
21:29:06.171 INFO NativeLibraryLoader - Loading libgkl_pairhmm.so from jar:file:/share/pkg/gatk/4.0.0.0/install/bin/gatk-package-4.0.0.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm.so
21:29:06.223 WARN IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
21:29:06.224 WARN IntelPairHmm - Ignoring request for 4 threads; not using OpenMP implementation
21:29:06.224 INFO PairHMM - Using the AVX-accelerated native PairHMM implementation
21:29:06.334 INFO ProgressMeter - Starting traversal
21:29:06.335 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
21:29:16.518 INFO ProgressMeter - chr1:108813 0.2 530 3122.9
21:29:26.541 INFO ProgressMeter - chr1:636367 0.3 2560 7601.7
^C---------------------------------------------------------

Answers

  • SkyWarrior (Turkey), Member ✭✭✭
    edited January 2018

    Can you check the reference genome hashes in the CRAM file header using cramtools or samtools?

    The chrM contig may be missing from the hash list, or your reference's chrM hash may not match the chrM hash in the CRAM.
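One way to carry out the check suggested above is to compare the M5 (MD5) tag on the chrM @SQ header line of the CRAM with the value computed from the reference. The following is a minimal sketch rather than a verified procedure: `sq_md5` is a hypothetical helper name, and the sketch assumes `samtools` is on the PATH and that the CRAM header carries M5 tags.

```shell
# sq_md5 (hypothetical helper): read SAM-style header text on stdin and
# print the M5 (MD5) value of the @SQ line whose SN matches the contig.
sq_md5() {
  awk -F'\t' -v contig="$1" '
    $1 == "@SQ" {
      name = ""; md5 = ""
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^SN:/) name = substr($i, 4)
        if ($i ~ /^M5:/) md5 = substr($i, 4)
      }
      if (name == contig) print md5
    }'
}

# Intended use (assumes samtools is installed; variables as in the question):
#   samtools view -H "$CRAM_DIR/$SAMPLE.cram" | sq_md5 chrM   # hash recorded in the CRAM
#   samtools dict "$REF" | sq_md5 chrM                        # hash computed from the reference
```

If the two commands print different values, or the first prints nothing, the reference and the CRAM disagree about chrM.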

  • jfarrell, Member ✭✭

    HaplotypeCaller runs fine with version 3.7 on the chrM contig, so I don't believe it is related to the reference.

    INFO 15:18:09,837 HelpFormatter - ----------------------------------------------------------------------------------
    INFO 15:18:09,840 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.7-0-gcfedb67, Compiled 2016/12/12 11:21:18
    INFO 15:18:09,840 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
    INFO 15:18:09,840 HelpFormatter - For support and documentation go to https://software.broadinstitute.org/gatk
    INFO 15:18:09,841 HelpFormatter - [Sun Jan 14 15:18:09 EST 2018] Executing on Linux 2.6.32-696.10.3.el6.x86_64 amd64
    INFO 15:18:09,841 HelpFormatter - Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14
    INFO 15:18:09,846 HelpFormatter - Program Args: -T HaplotypeCaller -R /restricted/projectnb/casa/ref/GRCh38_full_analysis_set_plus_decoy_hla.fa -I /restricted/projectnb/casa/wgs.hg38/adni/cram/ADNI_941_S_4420.hg38.realign.bqsr.cram --emitRefConfidence GVCF -ploidy 1 -L chrM -o gvcf.adni/ADNI_941_S_4420.hg38.realign.bqsr.37.chrM.g.vcf.gz
    INFO 15:18:09,852 HelpFormatter - Executing as [email protected] on Linux 2.6.32-696.10.3.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14.
    INFO 15:18:09,852 HelpFormatter - Date/Time: 2018/01/14 15:18:09
    INFO 15:18:09,853 HelpFormatter - ----------------------------------------------------------------------------------
    INFO 15:18:09,853 HelpFormatter - ----------------------------------------------------------------------------------
    WARN 15:18:09,860 GATKVCFUtils - Creating Tabix index for gvcf.adni/ADNI_941_S_4420.hg38.realign.bqsr.37.chrM.g.vcf.gz, ignoring user-specified index type and parameter
    INFO 15:18:09,873 GenomeAnalysisEngine - Strictness is SILENT
    INFO 15:18:11,526 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500
    INFO 15:18:11,533 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 15:18:12,577 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 1.04
    INFO 15:18:13,327 HCMappingQualityFilter - Filtering out reads with MAPQ < 20
    INFO 15:18:14,468 IntervalUtils - Processing 16569 bp from intervals
    INFO 15:18:14,603 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
    INFO 15:18:15,077 GenomeAnalysisEngine - Done preparing for traversal
    INFO 15:18:15,077 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 15:18:15,078 ProgressMeter - | processed | time | per 1M | | total | remaining
    INFO 15:18:15,078 ProgressMeter - Location | active regions | elapsed | active regions | completed | runtime | runtime
    INFO 15:18:15,079 HaplotypeCaller - Currently, physical phasing is not available when ploidy is different than 2; therefore it won't be performed
    INFO 15:18:15,079 HaplotypeCaller - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
    INFO 15:18:15,080 HaplotypeCaller - All sites annotated with PLs forced to true for reference-model confidence output
    WARN 15:18:15,194 InbreedingCoeff - Annotation will not be calculated. InbreedingCoeff requires at least 10 unrelated samples.
    INFO 15:18:15,485 HaplotypeCaller - Using global mismapping rate of 45 => -4.5 in log10 likelihood units
    Using AVX accelerated implementation of PairHMM
    INFO 15:18:17,077 VectorLoglessPairHMM - libVectorLoglessPairHMM unpacked successfully from GATK jar file
    INFO 15:18:17,078 VectorLoglessPairHMM - Using vectorized implementation of PairHMM
    WARN 15:18:21,758 HaplotypeScore - Annotation will not be calculated, must be called from UnifiedGenotyper, not org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller
    INFO 15:18:45,082 ProgressMeter - chrM:3069 0.0 30.0 s 49.6 w 18.5% 2.7 m 2.2 m
    INFO 15:19:15,084 ProgressMeter - chrM:7391 0.0 60.0 s 99.2 w 44.6% 2.2 m 74.0 s

    With version 4.0, I found that using -L chr.list, with chr.list containing:

    chr22
    chrM

    runs fine and completes genotyping of chrM.

    When chr22 is removed, leaving just chrM, it fails. The failure seems related to chrM being first in the list of contigs for the -L parameter.
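The workaround described above could be scripted roughly as follows. This is a sketch under assumptions: `write_interval_list` is a hypothetical helper, the output file name in the commented command is a placeholder, and the resulting GVCF will also contain chr22 records.

```shell
# write_interval_list (hypothetical helper): write one contig name per line
# to the file given as the first argument.
write_interval_list() {
  list_file=$1
  shift
  printf '%s\n' "$@" > "$list_file"
}

# Workaround from the post: list chr22 ahead of chrM.
write_interval_list chr.list chr22 chrM

# The list is then passed to -L in place of a single contig, for example:
#   gatk HaplotypeCaller -R "$REF" -I "$CRAM_DIR/$SAMPLE.cram" \
#     --emit-ref-confidence GVCF -ploidy 1 -L chr.list \
#     -O "$SAMPLE.raw.g.vcf.gz"
```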

  • Sheila (Broad Institute), Member, Broadie ✭✭✭✭✭

    @jfarrell
    Hi,

    Can you submit some test data? Instructions are here.

    Thanks,
    Sheila

  • jfarrell, Member ✭✭

    Before I submit the test data, I found these bug reports for what appears to be the same issue.

    https://github.com/broadinstitute/gatk/issues/3845

    https://github.com/broadinstitute/gatk/issues/3466

    https://github.com/broadinstitute/gatk/pull/4080

    There was a fix about 10 days ago. Was the fix part of the release of GATK 4.0?

  • jfarrell, Member ✭✭

    The fix was already included in version 4.0.0.0, so issue 3845 has been re-opened.

    I uploaded some test data (a slice of a cram file with a script that triggers the error). The name of the file is issue.3845.tar.gz

  • jfarrell, Member ✭✭
    Accepted Answer

    Issue 3845 is now closed with a fix. It looks like the fix will be available in the next release (4.0.1.0).
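To check whether a given run used a release at or after 4.0.1.0, the version can be read from the banner line GATK prints at startup (as in the logs above) and compared. The sketch below is illustrative only: both function names are hypothetical, `gatk.log` is a placeholder for a saved log, and `sort -V` is a GNU coreutils extension that may not exist on every system.

```shell
# gatk_version_from_log (hypothetical helper): pull "4.0.0.0" out of a banner
# line such as "... The Genome Analysis Toolkit (GATK) v4.0.0.0".
gatk_version_from_log() {
  sed -n 's/.*(GATK) v\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# version_ge A B (hypothetical helper): succeed when version A is at least B.
# Relies on "sort -V" (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Intended use:
#   v=$(gatk_version_from_log < gatk.log)
#   if version_ge "$v" 4.0.1.0; then echo "fix expected"; else echo "upgrade needed"; fi
```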
