Picard CreateSequenceDictionary shows ERROR: Option 'REFERENCE' is required.

I am running Picard on my university cluster, which has picard/2.9.2 installed.

picard CreateSequenceDictionary \
R=newref_495.fa \
O=reference_495.dict

This produces the error:
ERROR: Option 'REFERENCE' is required.

I have already supplied the REFERENCE option (R=). How can I solve this?

Answers

  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭

    Hi @Rittika,

    Can you try changing your reference extension to fasta instead of fa? Otherwise, try using a newer version of Picard (we are on 2.18.27 currently). The issue discussed here may be related. I say this because typical invocations involve java -jar picard.jar ....

  • bhanuGandham (Cambridge MA) Member, Administrator, Broadie, Moderator

    Hi @Rittika

    From the error, it looks like the tool is unable to find the reference file in the folder where you are running this command. Could you please run ls in that folder and send us the result?

  • Rittika Member
    Hi @shlee,

    I tried using java -jar picard.jar and changed the reference to fasta. I got the error:
    Error: Unable to access jarfile picard.jar

    When I tried with just picard, it again gave me the error that REFERENCE is required.

    Hi @bhanuGandham ,
    [[email protected] ElephantERR2260495]$ ls | grep new
    495_bwa_mem_ERR_newref.e43721
    495_bwa_mem_ERR_newref.e57441
    495_bwa_mem_ERR_newref.o43721
    495_bwa_mem_ERR_newref.o57441
    bwa_mem_ERR_newref_496.e4028
    bwa_mem_ERR_newref_496.o4028
    new_bwamemout_495.bam
    new_bwamemout_495.sorted.bam
    new_exon_finder.py
    new_picard_bwamemout_495.sorted.bam
    newref_495.fa
    newref_495.fa.amb
    newref_495.fa.ann
    newref_495.fa.bwt
    newref_495.fa.pac
    newref_495.fa.sa
    newref_495.fasta

    Here are the ls details. I have tried putting the absolute path in the command as well.
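
Java prints "Unable to access jarfile" when the path given after -jar does not point to a readable file, so java -jar picard.jar only works from the directory that actually holds the jar. A minimal sketch of a pre-flight check follows. The helper name picard_jar_ok and the PICARD_JAR variable are assumptions made for illustration (neither is defined by Picard), and /path/to/picard.jar is a placeholder for wherever the jar lives on your system:

```shell
# Hypothetical helper: succeed only if the Picard jar is a readable regular file.
# PICARD_JAR is an assumed variable name, not something Picard itself defines.
picard_jar_ok() {
    jar="${1:-${PICARD_JAR:-picard.jar}}"
    if [ -f "$jar" ] && [ -r "$jar" ]; then
        echo "found: $jar"
    else
        echo "missing: $jar" >&2
        return 1
    fi
}

# Example (placeholder path): only launch Picard when the jar can be read.
# picard_jar_ok /path/to/picard.jar && \
#     java -jar /path/to/picard.jar CreateSequenceDictionary R=newref_495.fasta O=newref_495.dict
```

If the check fails, pass the full path to the jar rather than the bare file name.
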
  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭

    That's odd behavior @Rittika. I think the best thing to get you moving forward is for you to download a new jar.

    You have two choices for this.

    One, GATK4 wraps Picard tools that you invoke with the gatk launch script. I like to create a symlink to the gatk launch script on my system so that I can run gatk CreateSequenceDictionary ... from anywhere. Otherwise, invoke by providing the path to the gatk script in the downloaded folder.

    I see that the example commands in the CreateSequenceDictionary tool documentation on the GATK forum still reflect only the standalone Picard syntax. However, the argument descriptions give the correct syntax. So, when invoking Picard tools from the GATK4 jar, be sure to use GATK syntax, e.g. type -R hg38.fasta instead of R=hg38.fasta.

    Two, you can find standalone Picard jars at https://github.com/broadinstitute/picard/releases. Be sure to download the picard.jar version. Invocation is java -jar picard.jar.

    Your v2.9.2 goes waaaay back to May 8, 2017. Let us know if updating to a newer version of the tools fixes things for you. Be sure to run with the correct Java environment. The latest versions still use Java 8.
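
The difference between the two choices comes down to the launcher and the argument style. As a rough sketch, the helper below only prints the command for each style so it can be inspected before running. The function name dict_cmd and the PICARD_JAR variable are illustrative assumptions; the argument forms are the ones used in this thread (-R/-O with the gatk script, R=/O= with the standalone jar):

```shell
# Print (do not run) the CreateSequenceDictionary command for each launcher.
# dict_cmd and PICARD_JAR are illustrative names, not part of GATK or Picard.
dict_cmd() {
    style="$1"; ref="$2"; out="$3"
    case "$style" in
        gatk)   # GATK4 launch script: GATK-style flags
            echo "gatk CreateSequenceDictionary -R $ref -O $out" ;;
        picard) # standalone jar: KEY=value arguments
            echo "java -jar ${PICARD_JAR:-picard.jar} CreateSequenceDictionary R=$ref O=$out" ;;
        *)
            echo "unknown style: $style" >&2
            return 1 ;;
    esac
}

dict_cmd gatk   newref_495.fasta newref_495.dict
dict_cmd picard newref_495.fasta newref_495.dict
```

Mixing the two (for example, R= with the gatk script) is the thing to avoid.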

  • Twesidave Member
    Hello, I am trying to use CreateSequenceDictionary to prep my hg19ref.fa file for downstream variant calling analyses. However, the resulting .dict file has records for only chr1.

    @HD VN:1.6
    @SQ SN:chr1 LN:249250621 M5:1b22b98cdeb4a9304cb5d48026a85128 UR:file:/gatk/my_data/refs/hg19ref.fa

    How can I ensure that SAM-like headers for other chromosomes are included?
  • bhanuGandham (Cambridge MA) Member, Administrator, Broadie, Moderator

    Hi @Twesidave

    Please post the exact command you are using and the version of GATK.

  • Twesidave Member
    Hi @bhanuGandham

    I am running the command in the GATK Docker container:

    (gatk) [email protected]:/gatk# gatk CreateSequenceDictionary -R my_data/refs/hg19ref.fa -O my_data/hg19ref.dict
    Using GATK jar /gatk/gatk-package-4.1.0.0-local.jar
    Running:
    -----
    ----
    ---

    (gatk) [email protected]:/gatk# cat my_data/hg19ref.dict
    @HD VN:1.6
    @SQ SN:chr1 LN:249250621 M5:1b22b98cdeb4a9304cb5d48026a85128 UR:file:/gatk/my_data/refs/hg19ref.fa
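
A quick sanity check for this symptom is to compare the number of sequence headers in the FASTA (lines starting with >) with the number of @SQ lines in the .dict, since the dictionary should hold one @SQ record per sequence. A minimal sketch, assuming a hypothetical helper named check_dict (not part of GATK or Picard):

```shell
# Compare the number of FASTA records with the number of @SQ lines in a .dict.
# check_dict is an illustrative name; it returns non-zero when the counts differ.
check_dict() {
    fasta="$1"; dict="$2"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    n_fa=$(grep -c '^>' "$fasta" || true)
    n_sq=$(grep -c '^@SQ' "$dict" || true)
    echo "fasta=$n_fa dict=$n_sq"
    [ "$n_fa" -eq "$n_sq" ]
}

# Example with the paths from this thread:
# check_dict my_data/refs/hg19ref.fa my_data/hg19ref.dict
```

If the FASTA count is also 1, the problem lies in the reference file rather than in the tool. If the FASTA has more records than the dictionary, the dictionary is incomplete.
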
  • bhanuGandham (Cambridge MA) Member, Administrator, Broadie, Moderator

    @Twesidave

    Was this issue resolved, and is it connected with your other question in this thread: https://gatkforums.broadinstitute.org/gatk/discussion/comment/58240#Comment_58240 ?

  • Twesidave Member

    @bhanuGandham

    It was not resolved. I created the sequence dictionary manually and ran into the issue in the other thread. Would be nice to have it automated next time :smile:

  • bhanuGandham (Cambridge MA) Member, Administrator, Broadie, Moderator

    Hi @Twesidave

    Can you please share your input reference file? I will try to replicate the issue on my end. You can find details on how to share the data here: https://software.broadinstitute.org/gatk/guide/article?id=1894

  • Twesidave Member

    Hi @bhanuGandham,

    I am having issues uploading the files to the ftp server. But it is the hg19 assembly. For the moment I have resorted to using the UCSC sequence dictionary from the resource bundle.
