
A USER ERROR has occurred: v is not a recognized option

I got this error when collating all the normal VCFs into a single callset with CreateSomaticPanelOfNormals.
The command was "Java -jar -Xmx16g ./gatk-package-4.1.2.0-local.jar CreateSomaticPanelOfNormals \
-vcfs tutorial_11136/3_HG00190.vcf.gz \
-vcfs tutorial_11136/4_NA19771.vcf.gz \
-vcfs tutorial_11136/5_HG02759.vcf.gz \
-O 6_threesamplepon.vcf.gz"
I tried it in both Cygwin and PowerShell on my PC.
Any suggestions?

Answers

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)
  • SSH123 (Member)
    Then I tried "Java -jar -Xmx16g ./gatk-package-4.1.2.0-local.jar CreateSomaticPanelOfNormals \
    -V tutorial_11136/3_HG00190.vcf.gz \
    -V tutorial_11136/4_NA19771.vcf.gz \
    -V tutorial_11136/5_HG02759.vcf.gz \
    -O 6_threesamplepon.vcf.gz"

    A USER ERROR has occurred: Argument '[V, variant]' cannot be specified more than once.

    I also tried "Java -jar -Xmx16g ./gatk-package-4.1.2.0-local.jar CreateSomaticPanelOfNormals -V tutorial_11136/3_HG00190.vcf.gz tutorial_11136/4_NA19771.vcf.gz tutorial_11136/5_HG02759.vcf.gz -O 6_threesamplepon.vcf.gz"

    A USER ERROR has occurred: Invalid argument 'tutorial_11136/4_NA19771.vcf.gz'.

    Any ideas?
  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @SSH123

    As mentioned in the tool docs, there are three steps to the workflow:

    Step 1. Run Mutect2 in tumor-only mode for each normal sample.
    Step 2. Create a GenomicsDB from the normal Mutect2 calls.
    Step 3. Combine the normal calls using CreateSomaticPanelOfNormals.

    Please follow all three steps as shown in the tool docs to create the panel of normals. It looks like you skipped step 2, the GenomicsDB import.
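
A minimal sketch of the three steps above as shell commands. It is a dry run: the function only prints the commands so they can be reviewed first. The reference, interval list, sample names, and output name are placeholders (assumptions), and `pon_commands` is a hypothetical helper. The flags follow the usage shown in the GATK 4.1.x tool docs as far as we know, so check them against the docs for your version. Note that step 3 takes a single -V that points at the GenomicsDB workspace, not one -V per VCF.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the three panel-of-normals commands instead of running them.
# Every path and sample name below is a placeholder, not a value from this thread.

REF="reference.fasta"                 # placeholder reference FASTA
INTERVALS="intervals.interval_list"   # placeholder interval list
WORKSPACE="pon_db"                    # GenomicsDB workspace to be created
SAMPLES=(normal1 normal2 normal3)     # placeholder normal sample names

pon_commands() {
  local s v_args=""
  # Step 1: Mutect2 in tumor-only mode, one run per normal sample.
  for s in "${SAMPLES[@]}"; do
    echo "gatk Mutect2 -R $REF -I $s.bam -max-mnp-distance 0 -O $s.vcf.gz"
    v_args+=" -V $s.vcf.gz"
  done
  # Step 2: import the per-sample VCFs into one GenomicsDB workspace.
  echo "gatk GenomicsDBImport -R $REF -L $INTERVALS --genomicsdb-workspace-path $WORKSPACE$v_args"
  # Step 3: combine the normal calls, reading from the workspace with one -V.
  echo "gatk CreateSomaticPanelOfNormals -R $REF -V gendb://$WORKSPACE -O pon.vcf.gz"
}

pon_commands
```

Once the printed commands look right, they can be run one at a time, or the output can be piped to `bash`.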

  • SSH123 (Member)
    Code: gatk GenomicsDBImport -R human_g1k_v37.fasta -L Broad.human.exome.b37.interval_list \
    --genomicsdb-workspace-path 40pon_db \
    -V PON/1_B0026.vcf.gz \
    -V PON/1_B0098.vcf.gz \

    Two questions:
    1) I get "A USER ERROR has occurred: Bad input: GenomicsDBImport does not support GVCFs with MNPs. MNP found at 1:12854139 in VCF /data/cephfs/punim0912/PON/1_S1621.vcf.gz." All the 1_xxxxx.vcf.gz files were created in tumor-only mode as in Step 1. How can I solve this error?
    2) I used -L Broad.human.exome.b37.interval_list, but my data is WGS. Will this exome interval list cut off my non-exome regions? Is there a WGS interval file for b37?
    Thanks!
  • SSH123 (Member)
    I've extracted the line from 1_S1621.vcf.gz:
    1 12854139 . CC TG . . DP=63;ECNT=4;MBQ=38,39;MFRL=234,297;MMQ=47,24;MPOS=7;POPAF=7.30;TLOD=3.94 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:59,2:0.047:61:38,1:20,1:37,22,2,0
    I'll re-run step 1 for this particular file (BAM to VCF in tumor-only mode) and see if it works. Thanks!
  • SSH123 (Member)
    Still the same error at the same location. Any ideas?
  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @SSH123

    This is a known issue with GenomicsDBImport. You must use -max-mnp-distance 0 in the Mutect2 step.
    Take a look at this thread and the GATK docs for more info: https://gatkforums.broadinstitute.org/gatk/discussion/23914/pon-mutect2-include-mnps-and-crash-genomicsdbimport#latest
    https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_hellbender_tools_walkers_mutect_CreateSomaticPanelOfNormals.php
    https://software.broadinstitute.org/gatk/documentation/article?id=24057
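
To see which records would trigger the MNP error before (or after) re-running Mutect2, a VCF can be scanned for MNP-like records. The sketch below assumes an MNP is a record whose REF and ALT alleles are plain bases of the same length, longer than one base, as in the CC to TG record quoted earlier in this thread. `find_mnps` is a hypothetical helper, not a GATK tool, and it may not match GATK's own definition in every edge case.

```shell
#!/usr/bin/env bash
# find_mnps (hypothetical helper): reads uncompressed VCF text on stdin and
# prints CHROM, POS, REF, ALT for each record where REF and at least one ALT
# allele are plain bases of equal length greater than 1.
find_mnps() {
  awk -F'\t' '
    /^#/ { next }                        # skip header lines
    {
      n = split($5, alts, ",")           # ALT may list several alleles
      for (i = 1; i <= n; i++) {
        if (length($4) > 1 &&
            length($4) == length(alts[i]) &&
            alts[i] ~ /^[ACGTNacgtn]+$/) {
          print $1, $2, $4, alts[i]
          break                          # report each record once
        }
      }
    }'
}

# Usage (assumes gzip is installed; the path is the one from the error message):
#   gzip -dc /data/cephfs/punim0912/PON/1_S1621.vcf.gz | find_mnps
```

If a VCF produced with -max-mnp-distance 0 still prints records here, it is worth checking that GenomicsDBImport is reading the regenerated file and not the old one.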

    Note: We recommend searching for errors in the forum to see if a solution has already been provided in other threads before posting a new question.

  • SSH123 (Member)
    gatk GenomicsDBImport ran out of memory.
    Hi, I was trying to combine 20 VCF files into a panel of normals using the command below:
    gatk GenomicsDBImport -R human_g1k_v37.fasta -L Broad.human.exome.b37.interval_list --genomicsdb-workspace-path pon_db -V foo1 ... -V foo20

    I gave it 80G of memory with:
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=4
    #SBATCH --mem-per-cpu=10G

    That still does not seem to be enough:
    kernel: Task in /job_12183650/step_batch killed as a result of limit of /job_12183650/step_batch
    kernel: memory: usage 41943040kB, limit 41943040kB, failcnt 0
    kernel: memory+swap: usage 41943040kB, limit 41943040kB, failcnt 1525142
    kernel: kmem: usage 95668kB, limit 9007199254740988kB, failcnt 0
    kernel: Memory cgroup stats for /job_12183650/step_batch: cache:12008KB rss:41837448KB rss_huge:7421952KB shmem:0KB mapped_file:264KB dirty:3828KB writeback:1584KB swap:0KB inactive_anon:3513436KB active_anon:38323952KB inactive_file:5212KB active_file:4772KB unevictable:0KB

    Inside the pon_db folder there is a huge number of entries named like "1$100111872$100111920 $166944458$166944507 ...", plus files named callset.json, __tiledb_workspace.tdb, vcfheader.vcf, and vidmap.json. Is this normal, or is something wrong with the command? Thanks for your help!
  • bshifaw (Member, Broadie, Moderator)

    Hi @SSH123,

    I found the following in a different workflow that uses the same tool. Try specifying the memory in your command, and make sure it is about 10 GB lower than the maximum memory of the server.

    # The memory setting here is very important and must be several GB lower
    # than the total memory allocated to the VM because this tool uses
    # a significant amount of non-heap memory for native libraries.
    # Also, testing has shown that the multithreaded reader initialization
    # does not scale well beyond 5 threads, so don't increase beyond that.
    
    gatk --java-options "-Xmx4g -Xms4g" GenomicsDBImport -R human_g1k_v37.fasta -L Broad.human.exome.b37.interval_list --genomicsdb-workspace-path pon_db -V foo1 ... -V foo20
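
A short worked example of the arithmetic. The kernel log quoted earlier reports a limit of 41943040 kB, which is 40 GiB, not 80G. One plausible reading (an assumption about how the cluster applies the request) is that 4 tasks at 10G per CPU gives 40G per node, and a single JVM runs on one node only. The sketch below converts the logged limit and subtracts the roughly 10 GB of headroom suggested above. `kb_to_gib` and `heap_gib` are hypothetical helpers, and the right heap size still depends on the data; the command above uses 4g.

```shell
#!/usr/bin/env bash
# kb_to_gib (hypothetical helper): convert a cgroup limit in kB, as printed
# in the kernel log, to whole GiB.
kb_to_gib() {
  echo $(( $1 / 1024 / 1024 ))
}

# heap_gib (hypothetical helper): memory limit minus headroom reserved for
# non-heap memory used by native libraries. Fails if nothing is left.
heap_gib() {
  local limit_gib=$1 headroom_gib=$2
  local heap=$(( limit_gib - headroom_gib ))
  if [ "$heap" -lt 1 ]; then
    echo "headroom leaves no room for the Java heap" >&2
    return 1
  fi
  echo "$heap"
}

LIMIT_GIB=$(kb_to_gib 41943040)        # limit taken from the kernel log
HEAP_GIB=$(heap_gib "$LIMIT_GIB" 10)   # about 10 GB of headroom
echo "limit=${LIMIT_GIB}GiB -> gatk --java-options \"-Xmx${HEAP_GIB}g\" GenomicsDBImport ..."
```

Under these assumptions the upper bound for -Xmx on this job would be about 30g. To give one process more memory, the request would need to raise the per-node allocation instead of adding nodes.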
    