

When I run:
java -jar /data/software/GATK/GenomeAnalysisTK.jar -I sample_1_marked.bam -R /data/scratch/mxiong/ref/NCBI_GRCh38/genome.fa -T BaseRecalibrator -knownSites /data/scratch/mxiong/ref/NCBI_GRCh38/All_20150114.vcf -o sample_1_BaseRecalibrator.table

I get the error:
INFO 09:44:18,529 HelpFormatter - --------------------------------------------------------------------------------
INFO 09:44:18,531 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.6-5-gba531bd, Compiled 2013/07/18 18:05:31
INFO 09:44:18,531 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 09:44:18,531 HelpFormatter - For support and documentation go to
INFO 09:44:18,534 HelpFormatter - Program Args: -I sample_1_marked.bam -R /data/scratch/mxiong/ref/NCBI_GRCh38/genome.fa -T BaseRecalibrator -knownSites /data/scratch/mxiong/ref/NCBI_GRCh38/All_20150114.vcf -o sample_1_BaseRecalibrator.table
INFO 09:44:18,534 HelpFormatter - Date/Time: 2015/01/30 09:44:18
INFO 09:44:18,534 HelpFormatter - --------------------------------------------------------------------------------
INFO 09:44:18,535 HelpFormatter - --------------------------------------------------------------------------------
INFO 09:44:18,544 ArgumentTypeDescriptor - Dynamically determined type of /data/scratch/mxiong/ref/NCBI_GRCh38/All_20150114.vcf to be VCF
INFO 09:44:18,586 GenomeAnalysisEngine - Strictness is SILENT
INFO 09:44:18,695 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 09:44:18,701 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 09:44:18,729 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.03
INFO 09:44:18,738 RMDTrackBuilder - Creating Tribble index in memory for file /data/scratch/mxiong/ref/NCBI_GRCh38/All_20150114.vcf
WARN 09:44:23,631 RestStorageService - Error Response: PUT '/GATK_Run_Reports/' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 1354, Content-MD5: bjETF8CVdO+JCA4f863abA==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: 6e311317c09574ef89080e1ff3adda6c, Date: Fri, 30 Jan 2015 15:44:23 GMT, Authorization: AWS AKIAIMHBU7X642TCHQ2A:jnCxXiM7CxJeA+5g1YqKx7Fkp9w=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-358.14.1.el6.x86_64; amd64; en; JVM 1.7.0_25), Host:, Expect: 100-continue], Response Headers: [x-amz-request-id: 79AFC335EE127933, x-amz-id-2: V5djJVO8pbxaphF/VMxfYumCS+Z/lAkI5A5jNWoDrzMtCi7xLdCPifIxyBlfuUci+7OWttT98gg=, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Fri, 30 Jan 2015 15:44:23 GMT, Connection: close, Server: AmazonS3]

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 2.6-5-gba531bd):
ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
ERROR Please do not post this error to the GATK forum
ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions
ERROR MESSAGE: The provided VCF file is malformed at approximately line number 622846: Duplicate allele added to VariantContext: T
ERROR ------------------------------------------------------------------------------------------

I used
Genome assembly:
Human dbSNP Build 142 data:

Can these files be used directly for GATK analysis?
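
To narrow down which records trigger "Duplicate allele added to VariantContext", one option is to scan the known-sites VCF for lines whose REF and ALT alleles are not all distinct. The following is only a minimal sketch, not part of GATK: the function names `open_vcf` and `find_duplicate_alleles` are made up for illustration, and it assumes a tab-separated VCF where alleles are compared case-insensitively and an ALT allele equal to REF also counts as a duplicate.

```python
import gzip
import sys


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF in text mode (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_duplicate_alleles(lines):
    """Yield (line_number, chrom, pos, duplicated_alleles) for suspicious records.

    Assumption: a record is suspicious when the same allele appears more than
    once among REF and the comma-separated ALT alleles, ignoring case.
    """
    for line_number, line in enumerate(lines, start=1):
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        chrom, pos, _id, ref, alt = fields[:5]
        alleles = [ref.upper()] + [a.upper() for a in alt.split(",") if a != "."]
        seen = set()
        duplicates = []
        for allele in alleles:
            if allele in seen and allele not in duplicates:
                duplicates.append(allele)
            seen.add(allele)
        if duplicates:
            yield line_number, chrom, pos, duplicates


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_dups.py All_20150114.vcf
    with open_vcf(sys.argv[1]) as handle:
        for line_number, chrom, pos, duplicates in find_duplicate_alleles(handle):
            print(f"line {line_number}: {chrom}:{pos} duplicate allele(s) {','.join(duplicates)}")
```

Since the error reports only an approximate line number (622846), the printed line numbers can be compared against it to find the offending record. Whether to drop or repair such records before passing the file to `-knownSites` is a separate decision.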
