GATK v4.0.1.1 HaplotypeCaller

I am using GATK v4.0.1.1 HaplotypeCaller for variant analysis (paired-end DNA sequencing data mapped to the reference using BWA mem).

The command I used:
gatk HaplotypeCaller -R Reference.fna -I input.bam -O output.vcf

It runs for a short while (a couple of seconds) but does not produce an output file. No error message was given. Am I doing this right? Any help is appreciated.

Answers

  • Sheila (admin) · Broad Institute · Member, Broadie, Moderator

    @SKW
    Hi,

    What do you mean by "does not produce an output"? Can you post the entire log output (all the lines after the tool starts to run)?

    Thanks,
    Sheila

  • Hi Sheila,
    After running the command there's no output file to be found. Any advice would be appreciated.
    Thank you!
    SKW

    Here is the entire log output:

    [[email protected] project]$ /local/cluster/bin/gatk HaplotypeCaller -R Reference_genome.fna -I GH-50-a_aln_pairs_mapped_sorted.bam -O GH-50-a_GATK_output.vcf
    Using GATK jar /local/cluster/gatk-4.0.1.1/gatk-package-4.0.1.1-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -jar /local/cluster/gatk-4.0.1.1/gatk-package-4.0.1.1-local.jar HaplotypeCaller -R Reference_genome.fna -I GH-50-a_aln_pairs_mapped_sorted.bam -O GH-50-a_GATK_output.vcf
    18:50:41.895 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/local/cluster/gatk-4.0.1.1/gatk-package-4.0.1.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
    18:50:42.173 INFO HaplotypeCaller - ------------------------------------------------------------
    18:50:42.174 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.1.1
    18:50:42.174 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
    18:50:42.175 INFO HaplotypeCaller - Executing as [email protected] on Linux v3.10.0-327.el7.x86_64 amd64
    18:50:42.175 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_71-b15
    18:50:42.176 INFO HaplotypeCaller - Start Date/Time: February 19, 2018 6:50:41 PM PST
    18:50:42.176 INFO HaplotypeCaller - ------------------------------------------------------------
    18:50:42.176 INFO HaplotypeCaller - ------------------------------------------------------------
    18:50:42.177 INFO HaplotypeCaller - HTSJDK Version: 2.14.1
    18:50:42.177 INFO HaplotypeCaller - Picard Version: 2.17.2
    18:50:42.177 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 1
    18:50:42.177 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    18:50:42.178 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    18:50:42.178 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    18:50:42.178 INFO HaplotypeCaller - Deflater: IntelDeflater
    18:50:42.178 INFO HaplotypeCaller - Inflater: IntelInflater
    18:50:42.178 INFO HaplotypeCaller - GCS max retries/reopens: 20
    18:50:42.178 INFO HaplotypeCaller - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
    18:50:42.179 INFO HaplotypeCaller - Initializing engine
    18:50:43.288 INFO HaplotypeCaller - Done initializing engine
    18:50:43.393 INFO HaplotypeCallerEngine - Disabling physical phasing, which is supported only for reference-model confidence output
    18:50:44.112 INFO HaplotypeCaller - Shutting down engine
    [February 19, 2018 6:50:44 PM PST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.04 minutes.
    Runtime.totalMemory()=2451570688
    java.lang.IllegalArgumentException: samples cannot be empty
    at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:681)
    at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.ReferenceConfidenceModel.<init>(ReferenceConfidenceModel.java:161)
    at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.initialize(HaplotypeCallerEngine.java:182)
    at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.<init>(HaplotypeCallerEngine.java:160)
    at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.<init>(HaplotypeCallerEngine.java:151)
    at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.onTraversalStart(HaplotypeCaller.java:197)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:891)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:136)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:179)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:198)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:153)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:195)
    at org.broadinstitute.hellbender.Main.main(Main.java:277)
    14.625u 1.297s 0:05.17 307.7% 0+0k 0+1184io 0pf+0w

  • SkyWarrior ✭✭✭ · Turkey · Member
    edited February 2018

    There is no output because there is an error.

    java.lang.IllegalArgumentException: samples cannot be empty

    This is an error statement.

    There seems to be a problem with the command or with your input files.
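
For readers scanning a long GATK log, the failure lines quoted in this thread share two recognizable patterns: a Java exception name, and the phrase "A USER ERROR has occurred". A minimal sketch of a filter for them follows. The function name find_gatk_errors and the log filename are assumptions made for illustration; they are not part of GATK.

```shell
# Print log lines that look like GATK failures: Java exceptions
# or the "A USER ERROR has occurred" banner.
# Reads the log on stdin. Exit status follows grep: 0 if a match was found, 1 if not.
find_gatk_errors() {
    grep -E 'Exception|A USER ERROR has occurred'
}

# Hypothetical use: save the full log to a file (name assumed), then scan it.
#   gatk HaplotypeCaller -R Reference.fna -I input.bam -O output.vcf > haplotypecaller.log 2>&1
#   find_gatk_errors < haplotypecaller.log
```

Run against the log posted above, this would surface the "samples cannot be empty" line that is otherwise easy to miss below the INFO messages.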

  • Sheila (admin) · Broad Institute · Member, Broadie, Moderator
    edited February 2018

    @SKW
    Hi,

    Indeed, @SkyWarrior is correct. Can you run ValidateSamFile on your input BAM file?

    Thanks,
    Sheila
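
A frequent cause of "samples cannot be empty", though not confirmed for this poster's file, is a BAM whose header has no @RG read-group line carrying an SM (sample) tag, for example when BWA mem was run without its -R option. Alongside ValidateSamFile, the header can be inspected directly. The sketch below assumes samtools is installed; the function name list_rg_samples is hypothetical.

```shell
# List the sample names (SM tags) declared on @RG lines of a SAM header read from stdin.
# Prints each distinct sample name once; prints nothing if no read group has an SM tag.
list_rg_samples() {
    awk -F'\t' '
        /^@RG/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SM:/) print substr($i, 4)   # drop the "SM:" prefix
        }' | sort -u
}

# Hypothetical use (assumes samtools is on the PATH and input.bam is your alignment file):
#   samtools view -H input.bam | list_rg_samples
```

If nothing is printed, the BAM has no sample name for HaplotypeCaller to use, and adding read groups (for instance with Picard's AddOrReplaceReadGroups) would be a reasonable next step to try.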

  • Razi · Member
    Hi,

    I want to create GVCF files for Fst estimation. I am using version 4.1.2.0.

    This is my command:
    gatk HaplotypeCaller -R ref.fasta -I input.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -ERC GVCF -variant_index_type LINEAR -variant_index_parameter 128000 -o out.vcf

    But, I got this error:
    A USER ERROR has occurred: genotyping_mode is not a recognized option!

    What should I do? What's wrong? Would you please advise me?

    Thanks,

    Razi
  • Razi · Member
    Hi,
    Regarding my last question, I changed the command to:

    gatk HaplotypeCaller -R ref.fasta -I input.bam -ERC GVCF -O out.g.vcf

    Now it is working. Do you think there is any other problem? I don't want surprises after a long run. I would like to have the right files as input for Fst!

    Best,
    Razi
  • bhanuGandham (admin) · Cambridge MA · Member, Administrator, Broadie, Moderator

    @Razi

    Looks good to me. That is the basic HaplotypeCaller command, so it should work.
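
As a quick sanity check before a long downstream analysis, one can confirm that the output really is a GVCF. The sketch below assumes that a file written with -ERC GVCF declares the <NON_REF> symbolic allele in its header; the function name looks_like_gvcf is hypothetical, and passing this check does not validate the file in any deeper sense.

```shell
# Succeed (exit 0) if the VCF text on stdin has a header line declaring the
# <NON_REF> symbolic allele, which is expected in GVCF-mode output.
looks_like_gvcf() {
    grep -q '^##ALT=<ID=NON_REF'
}

# Hypothetical use on the file from the command above:
#   looks_like_gvcf < out.g.vcf && echo "header declares <NON_REF>"
```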
