Help with GenotypeGVCFs in GATK4

I am trying to run GenotypeGVCFs on a database I just created using GenomicsDBImport. However, I am getting an error I don't know how to fix. The error is:

A USER ERROR has occurred: n is not a recognized option

The script I'm using is here:

singularity exec gatk-4.img \
/opt/gatk/gatk GenotypeGVCFs \
-R $REFERENCE/Gasterosteus_aculeatus_gasAcu1.fasta \
-V gendb://$DATA_DIR/genomicsdb_array \
-G StandardAnnotation -newQual \
-O $OUTDIR/"$Raw_newname"

I have to use GATK in a singularity image on the cluster I have access to. To me, it's unclear what 'n' is referring to. Also, during the GenomicsDBImport step, I used the line below:

--genomicsdb-workspace-path $path/chr1/DE_M_chr1_DB \

Yet there is no directory called "DE_M_chr1_DB"; instead, there is only a directory called genomicsdb_array and 4 files. Is this correct?

Thanks for your help!

Answers

  • Hi @Sheila

    I realised that when I came in the next morning. I'm using GATK 4.0.2.1.

  • Sheila (Broad Institute; Member, Broadie admin)

    @DMJThorburn
    Hi,

    Great. So, I am assuming things are working now?

    -Sheila

  • Hi,
    I get the same error message (A USER ERROR has occurred: n is not a recognized option). I am using GATK 4.0.5.1. My script:
    gatk GenotypeGVCFs \
    -R /ftp.broadinstitute.org/bundle/2.8/hg19/ucsc.hg19.fasta \
    -V gendb://chrUn_gl000249/genomicsdb_array \
    -G StandardAnnotation -newQual \
    -O vcf/chrUn_gl000249_output.vcf

    I would greatly appreciate it if anyone could help me out.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    @y3333gatk Use -new-qual instead of -newQual
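
    For anyone landing here with the same error, here is a minimal sketch of the corrected call. It reuses the arguments from the script above and only renames the flag. The "n" in the message is most likely the first letter of newQual: GATK4 argument names are hyphenated, and the single-dash -newQual appears to be read as a run of one-letter options starting with -n. The helper name genotype_gvcfs_cmd is made up for illustration, and it only prints the command so it can be checked before running.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GATK): prints a GenotypeGVCFs command line
# that uses the GATK 4.0.x spelling -new-qual instead of -newQual.
# Arguments: reference FASTA, path after gendb://, output VCF.
genotype_gvcfs_cmd() {
  local ref="$1" workspace="$2" out="$3"
  echo gatk GenotypeGVCFs \
    -R "$ref" \
    -V "gendb://$workspace" \
    -G StandardAnnotation -new-qual \
    -O "$out"
}

# Example with the paths from the post above; this prints the command only.
genotype_gvcfs_cmd \
  /ftp.broadinstitute.org/bundle/2.8/hg19/ucsc.hg19.fasta \
  chrUn_gl000249/genomicsdb_array \
  vcf/chrUn_gl000249_output.vcf
```

    If the printed line looks right, it can be pasted into the shell or run through the singularity wrapper used in the original question.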

  • Angry_Panda (Member)
    edited August 2018

    Dear GATK team,
    I tried to follow this and modify the code to make it work with GATK4, but I got this error report:
    A USER ERROR has occurred: An index is required but was not found for file drivingVariantFile:/Users/angry_panda/Documents/cromwell/jointCallingGenotypes2/cromwell-executions/jointCallingGenotypes/cbf1d4f5-9dcb-4acd-b3b8-4d45f5bd6fcd/call-GenotypeGVCFs/inputs/976973035/GVCF_cohort.g.vcf.gz. Support for unindexed block-compressed files has been temporarily disabled. Try running IndexFeatureFile on the input
    Here is the code that may be causing the problem. I left out some of the declarations to keep it short:

      call CombineGVCFs {
        input: GVCFs=HaplotypeCallerERC.GVCF,
        RefFasta=refFasta, 
        GATK=gatk, 
        RefIndex=refIndex, 
        RefDict=refDict, 
        sampleName="GVCF"  
      }
      call GenotypeGVCFs { 
        input: GVCF=CombineGVCFs.GVCF, 
            sampleName="CEUtrio", 
            RefFasta=refFasta, 
            GATK=gatk, 
            RefIndex=refIndex, 
            RefDict=refDict 
      }
    }
    task CombineGVCFs {
      command {
        java -jar ${GATK} \
            CombineGVCFs \
            -R ${RefFasta} \
            -V ${sep=" -V " GVCFs} \
            -O ${sampleName}_cohort.g.vcf.gz
      }
      output {
        File GVCF = "${sampleName}_cohort.g.vcf.gz"
      }
    }
    task GenotypeGVCFs {
      command {
        java -jar ${GATK} \
            GenotypeGVCFs \
            -R ${RefFasta} \
            -V ${GVCF} \
            -O ${sampleName}_rawVariants.vcf
      }
      output {
        File rawVCF = "${sampleName}_rawVariants.vcf"
      }
    }
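
    On the index error quoted above: the message itself suggests running IndexFeatureFile on the input. Below is a minimal sketch of a check that could run before GenotypeGVCFs. The helper names index_path and has_index are hypothetical. The .tbi and .idx suffixes are the index files GATK normally writes next to block-compressed and plain-text VCFs. In a Cromwell run, one possible cause is that the index written next to the CombineGVCFs output is not declared as a task output and so never reaches the GenotypeGVCFs inputs directory; that is an assumption about this workflow, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of GATK.

# index_path: print the index file name expected next to a VCF.
# Block-compressed files (.gz) use a tabix index (.tbi); plain VCFs use .idx.
index_path() {
  case "$1" in
    *.gz) printf '%s.tbi\n' "$1" ;;
    *)    printf '%s.idx\n' "$1" ;;
  esac
}

# has_index: succeed only if the expected index file exists.
has_index() {
  [ -f "$(index_path "$1")" ]
}

# Example usage (placeholder file name). The input flag was -F in the 4.0.x
# releases discussed here; confirm with `gatk IndexFeatureFile --help`.
# has_index GVCF_cohort.g.vcf.gz || gatk IndexFeatureFile -F GVCF_cohort.g.vcf.gz
```

    The check only looks for the file by name; it does not verify that the index is up to date with the VCF.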
    
  • bshifaw (Member, Broadie, Moderator admin)
    edited August 2018

    Hi @Angry_Panda

    There is already a joint genotyping workflow that uses GATK4 available in our gatk-workflow git organization. You can use this instead of going through the hassle of writing your own. You can also visit this page to view all the workflows that we make available under the Reference Implementations subsection. If you need help running the workflows, you can check out this tutorial.

    The input variables inside your task blocks need to be defined. If you're new to WDL, you may want to run through the Quick Start Guide.
    For example:

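    task GenotypeGVCFs {

      # Every variable referenced as ${...} in the command block
      # must be declared here as a task input.
      File GATK
      File RefFasta
      File RefIndex
      File RefDict
      String sampleName
      Array[File] GVCFs

      # Note: -T and -o are GATK3-style arguments. With a GATK4 jar,
      # drop -T and use -O, as in the task posted above.
      command {
        java -jar ${GATK} \
            -T GenotypeGVCFs \
            -R ${RefFasta} \
            -V ${sep=" -V " GVCFs} \
            -o ${sampleName}_rawVariants.vcf
      }
      output {
        File rawVCF = "${sampleName}_rawVariants.vcf"
      }
    }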
    