Build the SNP recalibration model error

Hi,

I am trying to build the SNP recalibration model by running the following GATK command:

./gatk-4.0.3.0/gatk VariantRecalibrator \
-R human_g1k_v37_decoy.fasta \
-input /mergedFiles.vcf \
--resource hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.vcf \
--resource omni,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.sites.vcf \
--resource 1000G,known=false,training=true,truth=false,prior=10.0 1000G_phase1.snps.high_confidence.vcf \
--resource dbsnp,known=true,training=false,truth=false,prior=2.0 dbsnp_135.b37.vcf \
-an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR -an InbreedingCoeff \
-mode SNP \
-tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 \
--recalFile recalibrate_SNP.recal \
-tranchesFile output.tranches \
--rscriptFile output.plots.R

But I am getting the following error.

Error:


A USER ERROR has occurred: Invalid argument 'hapmap_3.3.b37.sites.vcf'.


Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.

I used human_g1k_v37_decoy.fasta for alignment and am therefore using the same reference for recalibration. I would like to turn the raw variants into analysis-ready variants by applying filtering and annotation. Please let me know if you can point me to a Best Practices approach.

Thanks

GitHub issue 3077, filed by Sheila. State: closed. Closed by: sooheelee.

Answers

  • Hi,
    Thanks. The hapmap error is gone, but I am getting a similar error with the other resources, i.e. 12.0 1000G_omni2.5.b37.vcf, 10.0 1000G_phase1.snps.high_confidence.b37.vcf, 2.0 dbsnp_135.b37.vcf

    If I use the following command:
    --resource omni,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.vcf \
    ERROR
    A USER ERROR has occurred: Invalid argument '1000G_omni2.5.b37.vcf'.

    Using the following command:
    --resource omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.b37.vcf \
    Error:
    A USER ERROR has occurred: Argument resource has a bad value: omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.b37.vcf. Problem constructing FeatureInput from the string 'omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.b37.vcf'.
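
In the first command the file name follows the tags after a space, so the shell hands it to GATK as a separate word, which appears to be what the `Invalid argument` message is reporting. The second attempt joins tags and file with a colon into one token. A minimal sketch of building that one-token form; `resource_arg` is a hypothetical helper, and the syntax GATK actually expects depends on the version, so `gatk VariantRecalibrator --help` remains the authority:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GATK): join a resource label, its tags,
# and the VCF path into the single colon-separated token used in this thread.
# Arguments: name known training truth prior vcf_path
resource_arg() {
  local name="$1" known="$2" training="$3" truth="$4" prior="$5" vcf="$6"
  printf '%s,known=%s,training=%s,truth=%s,prior=%s:%s\n' \
    "$name" "$known" "$training" "$truth" "$prior" "$vcf"
}
```

It could be used as `--resource "$(resource_arg omni false true false 12.0 1000G_omni2.5.b37.vcf)"`, with the quotes keeping the whole value as one shell word.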

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @sksinghal
    Hi,

    Can you try validating your input VCF with ValidateVariants? Also, can you try deleting the VCF index and re-generating it?

    Thanks,
    Sheila

  • Hello Sheila,

    Thanks for your response and help. My error is still unsolved.

    As per your suggestion, I ran ValidateVariants on the input VCF, and the output below shows that there is no problem with the VCF file.

    As for regenerating the VCF index, one of the posts on the GATK forum (https://gatkforums.broadinstitute.org/gatk/discussion/5426/generate-an-idx-file-for-a-vcf) says that GATK will auto-generate an index for a VCF file.

    20:32:09.929 INFO NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/sandeepsinghal/Documents/Sequencing/Tools/gatk-4.0.3.0/gatk-package-4.0.3.0-local.jar!/com/intel/gkl/native/libgkl_compression.dylib
    20:32:10.089 INFO ValidateVariants - ------------------------------------------------------------
    20:32:10.089 INFO ValidateVariants - The Genome Analysis Toolkit (GATK) v4.0.3.0
    20:32:10.089 INFO ValidateVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
    20:32:10.089 INFO ValidateVariants - Executing as [email protected] on Mac OS X v10.10.5 x86_64
    20:32:10.089 INFO ValidateVariants - Java runtime: Java HotSpot(TM) 64-Bit Server VM v10+46
    20:32:10.090 INFO ValidateVariants - Start Date/Time: April 30, 2018 at 8:32:09 PM MDT
    20:32:10.090 INFO ValidateVariants - ------------------------------------------------------------
    20:32:10.090 INFO ValidateVariants - ------------------------------------------------------------
    20:32:10.091 INFO ValidateVariants - HTSJDK Version: 2.14.3
    20:32:10.091 INFO ValidateVariants - Picard Version: 2.17.2
    20:32:10.091 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    20:32:10.091 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    20:32:10.091 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    20:32:10.091 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    20:32:10.091 INFO ValidateVariants - Deflater: IntelDeflater
    20:32:10.091 INFO ValidateVariants - Inflater: IntelInflater
    20:32:10.091 INFO ValidateVariants - GCS max retries/reopens: 20
    20:32:10.092 INFO ValidateVariants - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
    20:32:10.092 INFO ValidateVariants - Initializing engine
    20:32:10.257 INFO FeatureManager - Using codec VCFCodec to read file file:///Users/sandeepsinghal/Documents/Sequencing/Reference/dbsnp_138.b37.vcf
    20:32:10.439 INFO FeatureManager - Using codec VCFCodec to read file file:///Users/sandeepsinghal/Documents/Sequencing/KevinProject/mergedFilesFilteredPass.vcf
    20:32:10.461 INFO ValidateVariants - Done initializing engine
    20:32:10.461 INFO ProgressMeter - Starting traversal
    20:32:10.461 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
    20:32:20.930 INFO ProgressMeter - 1:181144537 0.2 13000 74512.8
    20:32:30.996 INFO ProgressMeter - 2:135745545 0.3 24000 70127.6
    20:32:41.771 INFO ProgressMeter - 3:122439559 0.5 36000 68987.5
    20:32:52.593 INFO ProgressMeter - 4:174223282 0.7 46000 65508.4
    20:33:02.594 INFO ProgressMeter - 6:31802807 0.9 56000 64450.5
    20:33:13.085 INFO ProgressMeter - 7:74197992 1.0 66000 63234.5
    20:33:23.939 INFO ProgressMeter - 8:144657152 1.2 76000 62060.2
    20:33:34.100 INFO ProgressMeter - 10:78846389 1.4 87000 62411.1
    20:33:44.786 INFO ProgressMeter - 12:6219987 1.6 100000 63609.9
    20:33:55.235 INFO ProgressMeter - 14:19114712 1.7 111000 63565.4
    20:34:05.555 INFO ProgressMeter - 16:10575824 1.9 124000 64642.8
    20:34:16.464 INFO ProgressMeter - 18:21741429 2.1 140000 66665.1
    20:34:26.907 INFO ProgressMeter - 20:51871073 2.3 156000 68598.6
    20:34:36.362 INFO ProgressMeter - X:148436333 2.4 167667 68951.5
    20:34:36.362 INFO ProgressMeter - Traversal complete. Processed 167667 total variants in 2.4 minutes.
    20:34:36.362 INFO ValidateVariants - Shutting down engine
    [April 30, 2018 at 8:34:36 PM MDT] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 2.44 minutes.

    Please let me know if there is anything else I can try.

    Thanks
    Sandeep

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)
    edited May 2018

    @sksinghal
    Hi Sandeep,

    Hmm. I don't know what is going on. I suspect it is some syntax error that I cannot spot. Let me ask someone on the team.

    In the meantime, have you tried deleting the VCF indices and letting the tool re-generate them? I think that is possible in GATK4. If not, can you use IndexFeatureFile?

    Thanks,
    Sheila

  • shlee (Cambridge; Member, Broadie ✭✭✭✭✭)

    Hi @sksinghal,

    It appears you are using Java v10:

    Java HotSpot(TM) 64-Bit Server VM v10+46

    GATK4 requires Java v8. Please try with the correct version of Java and see if you still get the error. If you don't want to mess with your system's Java version, I recommend the GATK4 Docker, which will have the correct environment.

    The Downloads page gives a link to the GATK Docker repo and you can use Tutorial#11090 to get started with the GATK4 docker.
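
The runtime line in each GATK log (for example `Java HotSpot(TM) 64-Bit Server VM v10+46`) shows which Java is in use. A minimal sketch for reducing such a version string to its major number; `java_major` is a hypothetical helper and assumes the two common schemes, `1.8.0_181` (Java 8 and earlier) and `10+46` (Java 9 and later):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the major Java version for a version string
# such as "v1.8.0_181-b13" or "v10+46" (as shown in GATK log headers).
java_major() {
  local v="${1#v}"            # drop the leading "v" used in GATK logs
  case "$v" in
    1.*) v="${v#1.}" ;;       # legacy scheme: 1.8.0_181 -> 8.0_181
  esac
  printf '%s\n' "${v%%[!0-9]*}"   # keep only the leading digits
}
```

For example, `java_major 'v10+46'` prints `10` and `java_major 'v1.8.0_181-b13'` prints `8`, the version this thread says GATK4 requires.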

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @sksinghal
    Hi again,

    Are you running VariantRecalibrator on a VCF from Mutect2? You should use FilterMutectCalls instead.

    -Sheila

  • Hi Sheila,

    Thanks again for your help. I ran FilterMutectCalls on just the merged VCF file, but my output file is empty.

    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -jar /Users/sandeepsinghal/Documents/Sequencing/Tools/gatk-4.0.2.1/gatk-package-4.0.2.1-local.jar FilterMutectCalls -V /mergedFilesFilteredPass.vcf -O /mergedFilesFilteredPass2.vcf
    

    12:21:25.885 INFO NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/sandeepsinghal/Documents/Sequencing/Tools/gatk-4.0.2.1/gatk-package-4.0.2.1-local.jar!/com/intel/gkl/native/libgkl_compression.dylib
    12:21:27.171 INFO FilterMutectCalls - ------------------------------------------------------------
    12:21:27.171 INFO FilterMutectCalls - The Genome Analysis Toolkit (GATK) v4.0.2.1
    12:21:27.171 INFO FilterMutectCalls - For support and documentation go to https://software.broadinstitute.org/gatk/
    12:21:27.172 INFO FilterMutectCalls - Executing as [email protected] on Mac OS X v10.10.5 x86_64
    12:21:27.172 INFO FilterMutectCalls - Java runtime: Java HotSpot(TM) 64-Bit Server VM v10+46
    12:21:27.172 INFO FilterMutectCalls - Start Date/Time: May 9, 2018 at 12:21:25 PM MDT
    12:21:27.172 INFO FilterMutectCalls - ------------------------------------------------------------
    12:21:27.172 INFO FilterMutectCalls - ------------------------------------------------------------
    12:21:27.173 INFO FilterMutectCalls - HTSJDK Version: 2.14.3
    12:21:27.173 INFO FilterMutectCalls - Picard Version: 2.17.2
    12:21:27.173 INFO FilterMutectCalls - HTSJDK Defaults.COMPRESSION_LEVEL : 1
    12:21:27.173 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    12:21:27.173 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    12:21:27.173 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    12:21:27.173 INFO FilterMutectCalls - Deflater: IntelDeflater
    12:21:27.173 INFO FilterMutectCalls - Inflater: IntelInflater
    12:21:27.173 INFO FilterMutectCalls - GCS max retries/reopens: 20
    12:21:27.173 INFO FilterMutectCalls - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
    12:21:27.173 INFO FilterMutectCalls - Initializing engine
    12:21:27.329 INFO FeatureManager - Using codec VCFCodec to read file file:///Users/sandeepsinghal/Documents/Sequencing/SysPiplineR/VAR-Seq/KalvinData/data/mergedFilesFilteredPass.vcf
    12:21:27.349 INFO FilterMutectCalls - Done initializing engine
    12:21:27.384 INFO FilterMutectCalls - Shutting down engine
    [May 9, 2018 at 12:21:27 PM MDT] org.broadinstitute.hellbender.tools.walkers.mutect.FilterMutectCalls done. Elapsed time: 0.03 minutes.
    Runtime.totalMemory()=322961408
    java.lang.NullPointerException
    at org.broadinstitute.hellbender.tools.walkers.mutect.FilterMutectCalls.onTraversalStart(FilterMutectCalls.java:101)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:891)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:135)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:180)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:199)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:159)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:202)
    at org.broadinstitute.hellbender.Main.main(Main.java:288)

    The output is attached for your review. Please let me know what is wrong with my approach.

    Thanks
    Sandeep

  • Hi Sheila,
    This question may have been asked before, but if you do not mind my asking again: I am looking for workflows, strategies, or ideas for filtering and annotating Mutect2-generated VCF files. Please point me to a tutorial or some directions.

    Thanks
    Sandeep

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @sksinghal
    Hi Sandeep,

    Can you try Soo Hee's suggestion and change to Java 8? This may be causing the issue, as Java 10 is not supported.

    For Mutect2 tutorials, have a look here and in the Presentations section.

    -Sheila

  • Hi Sheila,

    I'll try your suggestion and get back to you.

    Thanks
    Sandeep

  • uma (virginia; Member)

    Hi,

    I am running GATK 4.0.4 and having a similar issue for recalibration. My command is as below and the $variables are pre-defined:
    cd ${FINAL}; ${GATK} VariantRecalibrator -R ${REF} -V AM.genotype.vcf.gz \
    --resource omni,known=false,training=true,truth=true,prior=12.0 ${BUNDLE}/1000G_omni2.5.b37.vcf \
    --resource dbsnp,known=true,training=false,truth=false,prior=2.0 ${BUNDLE}/dbsnp_138.b37.vcf \
    -an DP \
    -an QD \
    -an FS \
    -an SOR \
    -an MQ \
    -an MQRankSum \
    -an ReadPosRankSum \
    -mode SNP \
    -tranche 100.0 \
    -tranche 99.5 \
    -tranche 99.75 \
    --output recalibrate_SNP.recal \
    --tranches-file recalibrate_SNP.tranches \
    --rscript-file recalibrate_SNP_plots.R

    The error is:


    A USER ERROR has occurred: Invalid argument 'AM.genotype.vcf.gz'.


    The VCF was generated using CombineGVCFs followed by joint genotyping, and I also cross-checked it with ValidateVariants. Could you suggest what might be the problem?

    Thanks,
    Uma

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @uma
    Hi Uma,

    Hmm. This usually occurs when the spaces are being misread in the command line. Are you copying and pasting this command from somewhere else? If so, can you try typing it out?

    Thanks,
    Sheila

  • sormond (Member)

    Hi,

    I am having the same problem as sksinghal:

    "A USER ERROR has occurred: Argument resource has a bad value: omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf.gz. Problem constructing FeatureInput from the string 'omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf.gz"

    Running command:

    gatk VariantRecalibrator -R ~/MH/reference/GATK_Bundle/bwa/Homo_sapiens_assembly38.fasta -V ~/MH/A/BWA/GATK/HaplotypeCaller/Combined/variants.vcf --resource hapmap,known=false,training=true,truth=true,prior=15.0:hapmap_3.3.hg38.vcf.gz --resource omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf.gz --resource 1000G,known=false,training=true,truth=false,prior=10.0:1000G_phase1.snps.high_confidence.hg38.vcf.gz --resource dbsnp,known=true,training=false,truth=false,prior=2.0:dbsnp_146.hg38.vcf.gz -an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR -mode SNP --output ~/MH/A/BWA/GATK/HaplotypeCaller/Combined/VariantRecalibrator/A.recal --tranches-file ~/MH/A/BWA/GATK/HaplotypeCaller/Combined/VariantRecalibrator/A.tranches --rscript-file ~/MH/A/BWA/GATK/HaplotypeCaller/Combined/VariantRecalibrator/A.plots.R

    I ran ValidateVariants as suggested:

    11:00:38.936 INFO ValidateVariants - Done initializing engine
    11:00:38.936 INFO ProgressMeter - Starting traversal
    11:00:38.937 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
    11:00:48.949 INFO ProgressMeter - chr1:43590791 0.2 12000 71920.9
    11:00:59.452 INFO ProgressMeter - chr1:102919168 0.3 19000 55569.1
    11:01:09.807 INFO ProgressMeter - chr1:183215293 0.5 29000 56365.4
    11:01:21.274 INFO ProgressMeter - chr1:246688787 0.7 38000 53854.9
    11:01:32.983 INFO ProgressMeter - chr2:55244246 0.9 46000 51067.6
    11:01:44.609 INFO ProgressMeter - chr2:125050139 1.1 54000 49336.1
    11:01:55.196 INFO ProgressMeter - chr2:186625661 1.3 61000 47994.3
    11:02:06.895 INFO ProgressMeter - chr3:9122413 1.5 71000 48432.2
    11:02:18.064 INFO ProgressMeter - chr3:66383941 1.7 79000 47817.4
    11:02:28.683 INFO ProgressMeter - chr3:129504134 1.8 86000 47017.7
    11:02:39.037 INFO ProgressMeter - chr3:188762617 2.0 92000 45961.7
    11:02:49.495 INFO ProgressMeter - chr4:43943462 2.2 100000 45956.6
    11:02:59.721 INFO ProgressMeter - chr4:104735109 2.3 106000 45175.9
    11:03:11.460 INFO ProgressMeter - chr4:174766987 2.5 112000 44058.9
    11:03:22.010 INFO ProgressMeter - chr5:41770571 2.7 119000 43784.1
    11:03:34.032 INFO ProgressMeter - chr5:117607701 2.9 125000 42834.1
    11:03:44.472 INFO ProgressMeter - chr5:174650847 3.1 132000 42687.4
    11:03:55.206 INFO ProgressMeter - chr6:50625835 3.3 141000 43104.1
    11:04:06.627 INFO ProgressMeter - chr6:119961200 3.5 147000 42467.1
    11:04:17.131 INFO ProgressMeter - chr7:4361919 3.6 155000 42622.8
    11:04:27.741 INFO ProgressMeter - chr7:61382683 3.8 163000 42744.0
    11:04:38.402 INFO ProgressMeter - chr7:123129324 4.0 171000 42845.5
    11:04:48.577 INFO ProgressMeter - chr8:12185312 4.2 179000 43022.0
    11:05:00.242 INFO ProgressMeter - chr8:75593324 4.4 186000 42708.7
    11:05:11.403 INFO ProgressMeter - chr8:141306208 4.5 192000 42280.5
    11:05:21.510 INFO ProgressMeter - chr9:67718957 4.7 200000 42466.9
    11:05:32.540 INFO ProgressMeter - chr9:128068097 4.9 208000 42506.5
    11:05:43.523 INFO ProgressMeter - chr10:43119934 5.1 219000 43140.7
    11:05:54.248 INFO ProgressMeter - chr10:102617189 5.3 228000 43385.7
    11:06:04.549 INFO ProgressMeter - chr11:18863171 5.4 238000 43855.9
    11:06:15.304 INFO ProgressMeter - chr11:76695297 5.6 247000 44059.0
    11:06:26.469 INFO ProgressMeter - chr12:926781 5.8 255000 44024.7
    11:06:36.568 INFO ProgressMeter - chr12:55427174 6.0 265000 44459.4
    11:06:46.670 INFO ProgressMeter - chr12:113725883 6.1 272000 44380.0
    11:06:57.630 INFO ProgressMeter - chr13:54121031 6.3 282000 44680.0
    11:07:07.827 INFO ProgressMeter - chr13:113243006 6.5 287000 44279.9
    11:07:18.046 INFO ProgressMeter - chr14:72729327 6.7 295000 44348.8
    11:07:28.053 INFO ProgressMeter - chr15:41895294 6.8 304000 44583.9
    11:07:39.257 INFO ProgressMeter - chr15:99131555 7.0 314000 44823.0
    11:07:49.800 INFO ProgressMeter - chr16:56969811 7.2 324000 45118.8
    11:08:00.251 INFO ProgressMeter - chr17:9921752 7.4 335000 45545.8
    11:08:11.072 INFO ProgressMeter - chr17:73227904 7.5 346000 45915.5
    11:08:21.831 INFO ProgressMeter - chr18:50126767 7.7 356000 46144.5
    11:08:32.135 INFO ProgressMeter - chr19:17250085 7.9 367000 46534.4
    11:08:43.143 INFO ProgressMeter - chr20:9563034 8.1 382000 47335.2
    11:08:53.400 INFO ProgressMeter - chr21:8991233 8.2 392000 47566.8
    11:09:03.456 INFO ProgressMeter - chr22:30525489 8.4 404000 48045.8
    11:09:15.823 INFO ProgressMeter - chrX:90716091 8.6 414000 48057.0
    11:09:21.795 INFO ProgressMeter - chrUn_JTFH01001512v1_decoy:871 8.7 429348 49269.4
    11:09:21.795 INFO ProgressMeter - Traversal complete. Processed 429348 total variants in 8.7 minutes.
    11:09:21.795 INFO ValidateVariants - Shutting down engine
    [5 July 2018 11:09:21 AM] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 8.76 minutes.

    I am using Java8, GATK4.

    Not sure where to go from here.

    Kind Regards,

    Shannon

  • shlee (Cambridge; Member, Broadie ✭✭✭✭✭)

    Hi @sormond,

    Your error:

    A USER ERROR has occurred: Argument resource has a bad value: omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf.gz. Problem constructing FeatureInput from the string 'omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf.gz
    

    Indicates an issue or absence of a VCF index. Please generate or regenerate your indices with IndexFeatureFile.
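
Before regenerating indices, it can help to see which input files lack one. A minimal sketch, assuming the usual sidecar naming (`.idx` next to a plain `.vcf`, `.tbi` next to a `.vcf.gz`); `expected_index` and `missing_indices` are hypothetical helpers, not GATK commands:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the index file name expected next to a VCF.
# Assumes ".idx" for plain VCF and ".tbi" for block-compressed VCF.
expected_index() {
  case "$1" in
    *.vcf.gz) printf '%s.tbi\n' "$1" ;;
    *.vcf)    printf '%s.idx\n' "$1" ;;
    *)        return 1 ;;     # not a VCF name we recognize
  esac
}

# Print every VCF among the arguments whose expected index does not exist.
missing_indices() {
  local f idx
  for f in "$@"; do
    idx=$(expected_index "$f") || continue
    [ -e "$idx" ] || printf '%s\n' "$f"
  done
}
```

Files it reports can then be indexed with IndexFeatureFile as suggested above. The name of the input flag may differ between GATK4 releases, so `gatk IndexFeatureFile --help` is the safest reference.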

  • sormond (Member)

    Hi @shlee,

    I have an index for my vcf generated in the previous step (GenotypeGVCFs). I have regenerated the index using IndexFeatureFile and still get this exact error. I have also done this for all resource VCFs. I don't know what to do now...

    Kind Regards,

    Shannon

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    Hi @sormond, what happens if you exclude that one resource from your command? Does it run or error out on the other resources?

  • sormond (Member)

    Hi @Geraldine_VdAuwera

    It produces the same error message but for the 1000G phase 1 file:

    A USER ERROR has occurred: Argument resource has a bad value: 1000G,known=false,training=true,truth=false,prior=10.0:1000G_phase1.snps.high_confidence.hg38.vcf.gz. Problem constructing FeatureInput from the string '1000G,known=false,training=true,truth=false,prior=10.0:1000G_phase1.snps.high_confidence.hg38.vcf.gz'.

    When I remove that 1000G phase 1 file as well, VariantRecalibrator works (at least it starts fine; I obviously haven't run the process to completion). So there seems to be a problem with these two files.

    Just to be clear, I freshly downloaded the files again just now and am still having this problem. I thought it might be because I downloaded the files onto a Mac and then transferred them to a Linux system, but I have ruled this out by downloading directly onto the Linux system from your GATK bundle (ftp://ftp.broadinstitute.org/bundle/hg38/).

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    Ok — try decompressing the files so they’re regular vcf instead of gz.

  • sormond (Member)

    Hi @Geraldine_VdAuwera ,

    When I decompress both of those 1000G resource files, I get the same message:

    A USER ERROR has occurred: Argument resource has a bad value: omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf. Problem constructing FeatureInput from the string 'omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf'.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    Huh. The only other thing I can think of is what Sheila advised another user earlier in this thread — try retyping the command from scratch (don’t just copy/paste it) in case there’s a weird character that’s somehow screwing up the string parsing. And to verify orthogonally that that’s the case (rather than an actual problem with the files) you can try swapping out the filenames to only use those that work when you remove the two lines.

  • sormond (Member)

    @Geraldine_VdAuwera ,

    I have tried both of those things, and checked the code multiple times. Just re-typed again and same error message.

    Kind Regards,

    Shannon

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @sormond
    Hi Shannon,

    I am not sure what is going on here. Let me ask someone else and get back to you.

    -Sheila

  • ahmed_chakroun (Tunisia; Member)

    Hi,

    I have the exact same problem with both files used in these options:

    --resource omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf.gz --resource 1000G,known=false,training=true,truth=false,prior=10.0:1000G_phase1.snps.high_confidence.hg38.vcf.gz

    knowing that I have already tried all the suggestions (IndexFeatureFile, gunzip files, etc).

    Otherwise, VariantRecalibrator seems to work well once these options are simply removed. However, I am not sure that continuing without them is recommended within the Best Practices framework for human WES analysis. Is it?

    Regards.

    Ahmed

  • ahmed_chakroun (Tunisia; Member)

    Hi,

    I have run the full command with --java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true' and here is the stdout

    ***********************************************************************
    
    A USER ERROR has occurred: Argument resource has a bad value: omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf.gz. Problem constructing FeatureInput from the string 'omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf.gz'.
    
    ***********************************************************************
    org.broadinstitute.barclay.argparser.CommandLineException$BadArgumentValue: Argument resource has a bad value: omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf.gz. Problem constructing FeatureInput from the string 'omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf.gz'.
            at org.broadinstitute.barclay.argparser.CommandLineArgumentParser.constructFromString(CommandLineArgumentParser.java:1124)
            at org.broadinstitute.barclay.argparser.CommandLineArgumentParser.setArgument(CommandLineArgumentParser.java:681)
            at org.broadinstitute.barclay.argparser.CommandLineArgumentParser.parseArguments(CommandLineArgumentParser.java:427)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.parseArgs(CommandLineProgram.java:221)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:195)
            at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
            at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
            at org.broadinstitute.hellbender.Main.main(Main.java:289)
    

    I hope this helps.

    Regards.

    Ahmed

  • ahmed_chakroun (Tunisia; Member)

    Hi,

    I think it is about the Java version. Running GATK in a Docker container solved the '-resource' issue for me, following what shlee already suggested here.

    Regards.

    Ahmed

  • faust.rmp (Germany; Member)

    @shlee said:
    Hi @sksinghal,

    It appears you are using Java v10:

    Java HotSpot(TM) 64-Bit Server VM v10+46

    GATK4 requires Java v8. Please try with the correct version of Java and see if you still get the error. If you don't want to mess with your system's Java version, I recommend the GATK4 Docker, which will have the correct environment.

    The Downloads page gives a link to the GATK Docker repo and you can use Tutorial#11090 to get started with the GATK4 docker.

    Hi, I have the same problem with the following Java version:
    15:44:44.691 INFO IndexFeatureFile - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_181-b13

    I have all feature file indices (tbi) and the VCF file passes validation.

    Any suggestion?

    Thanks in advance!

  • shlee (Cambridge; Member, Broadie ✭✭✭✭✭)

    Hi @faust.rmp,

    Can you post the command that is causing the error as well as the exact error message? Also, please tell us the version of GATK you are running.

    If your error is the same as one above, please note Sheila's comment that such errors typically indicate typos, e.g. spacing errors. If you feel your spacing is correct, the only thing I can think of is for you to type out the command instead of copy-pasting if indeed you are copy-pasting the command. Some programs convert between dashes and hyphens and this can trip GATK tools.
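
One way to check a saved command for the dash/hyphen substitutions mentioned above is to search it for bytes outside printable ASCII. A minimal sketch, assuming the command is saved in a text file; `find_non_ascii` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list lines (with line numbers) of a file that contain
# any byte outside printable ASCII and ordinary whitespace, such as an
# en dash or a non-breaking space pasted in from a word processor or web page.
# Exit status follows grep: 0 if something was found, 1 if the file is clean.
find_non_ascii() {
  LC_ALL=C grep -an '[^[:print:][:space:]]' "$1"
}
```

Running `find_non_ascii my_command.sh` (the file name is only an example) prints nothing for a clean script. Any line it does print is worth retyping by hand.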

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)
    edited August 2018

    Hi everyone,

    Can you also try setting the priors as ints (e.g. 15, 12, 10, 7)? It looks like some of you specified doubles (e.g. 15.0, 12.0, 10.0, 8.0). Not sure if this will help, but it is worth a try.

    -Sheila

  • nagoli (Berlin; Member)

    Is there any solution to it? I have the same problem with the same files from the resource bundle. I did everything that was recommended like everyone else, I tried different versions of GATK, I tried running it in a docker container...all to no avail.

    Regards

    Oliver

  • shlee (Cambridge; Member, Broadie ✭✭✭✭✭)

    Hi Oliver @nagoli,

    This thread is rather long and hard to follow. Can you repost your question in a brand new thread and provide details of exactly what your problem is (error message, exact command you used) and which version of GATK you are using?

  • meraj (India; Member)

    @ahmed_chakroun said:
    Hi,

    I have the exact same problem with both files used in these options:

    --resource omni,known=false,training=true,truth=false,prior=12.0:1000G_omni2.5.hg38.vcf.gz --resource 1000G,known=false,training=true,truth=false,prior=10.0:1000G_phase1.snps.high_confidence.hg38.vcf.gz

    knowing that I have already tried all the suggestions (IndexFeatureFile, gunzip files, etc).

    Otherwise, VariantRecalibrator seems to work well once these options are simply removed. However, I am not sure that continuing without them is recommended within the Best Practices framework for human WES analysis. Is it?

    Regards.

    Ahmed

    Hi,
    I am also getting a similar error message. However, the first part was solved by changing the file name from 1000G_omni2.5.hg38.vcf.gz to omni2.5.hg38.vcf.gz. After this, I get the error for the 1000G_phase1.snps.high_confidence.hg38.vcf.gz file but not for the "omni" file.

    Does changing the name work for you as well?
    regards,
    Meraj

  • meraj (India; Member)

    @Geraldine_VdAuwera
    Hi,
    I replaced the filename prefix 1000G with thouzndG in the command (I guessed it was a syntax error), and now it works.
    Thanks.
    Best,
    Meraj
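
The renaming workaround reported above can be tried without copying or modifying the downloaded files by creating symbolic links under new names. A minimal sketch; `alias_resource` is a hypothetical helper, and whether GATK accepts a symlinked index in every version is an assumption that has not been verified here:

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a symlink to a resource file under a new name
# in the same directory, plus links for its .tbi/.idx index if one exists.
# Arguments: path_to_existing_file new_file_name
alias_resource() {
  local src="$1" new_name="$2" dir base ext
  dir=$(dirname "$src")
  base=$(basename "$src")
  # Relative link target, so the links stay valid if the directory is moved.
  ln -s "$base" "$dir/$new_name"
  for ext in tbi idx; do
    if [ -e "$src.$ext" ]; then
      ln -s "$base.$ext" "$dir/$new_name.$ext"
    fi
  done
}
```

For instance, `alias_resource 1000G_omni2.5.hg38.vcf.gz omni2.5.hg38.vcf.gz` would leave the original in place and add `omni2.5.hg38.vcf.gz` (and its index link) beside it, matching the name change described in the post above.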

  • MehulS (Member)
    edited January 22

    I'm getting errors similar to those above when using the -D option in GenotypeGVCFs. My command included -D "/media/Seagate Backup Plus Drive/dbsnp.vcf.gz"

    It says: Argument dbSNP has a bad value: /media/Seagate Backup Plus Drive/dbsnp.vcf.gz. Problem constructing FeatureInput from the string '/media/Seagate Backup Plus Drive/dbsnp.vcf.gz'

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator, admin)

    Hi @MehulS

    Please post the exact command, the version of gatk you are using and the entire error log.

  • wangchengshi (Member)

    # SNP VQSR
    /trainee/wes20190401/bin/gatk-4.1.0.0/gatk-4.1.0.0/gatk VariantRecalibrator \
    -R /trainee/ref/Homo_sapiens_assembly38.fasta \
    -V /trainee/ckzhu/wes/L01501/V100004251L01501/gatk/V100004251L01501.HC.vcf.gz \
    --resource hapmap,known=false,training=true,truth=true,prior=15.0:/trainee/ref/hapmap_3.3.hg38.vcf \
    --resource omni,known=false,training=true,truth=false,prior=12.0:/trainee/ref/1000G_omni2.5.hg38.vcf \
    --resource 1000G,known=false,training=true,truth=false,prior=10.0:/trainee/ref/1000G_phase1.snps.high_confidence.hg38.vcf \
    --resource dbsnp,known=true,training=false,truth=false,prior=2.0:/trainee/ref/dbsnp_146.hg38.vcf \
    -an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR \
    -mode SNP \
    -O /trainee/ckzhu/wes/L01501/gatk/V100004251L01501.snp.recal \
    --tranches-file /trainee/ckzhu/wes/L01501/gatk/V100004251L01501.snp.tranches \
    --rscript-file /trainee/ckzhu/wes/L01501/gatk/V100004251L01501.snp.plots.R


    A USER ERROR has occurred: Couldn't read file file:///trainee/ref/1000G_omni2.5.hg38.vcf.gz. Error was: It doesn't exist.

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator, admin)