Error in genotyping on a single diploid genome

Gangpao (Peking), Member

Hi there!

I would like to run the whole pipeline on a single sample to test it.
Something goes wrong at the genotyping step.

The commands are:

Step9. Call Variants

java -Xmx5g -jar $gatk -T HaplotypeCaller -R $ref_dir/ucsc.hg19.fasta -I $bam_dir/T-SZ-03-1.dedupped.realigned.recal.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o $other_dir/sample.raw.snps.indels.vcf

Joint Genotyping

java -Xmx5g -jar $gatk -T GenotypeGVCFs -R $ref_dir/ucsc.hg19.fasta --variant $other_dir/sample.raw.snps.indels.vcf -o $other_dir/test.raw.snps.indels.g.vcf

Error in GenotypeGVCFs :

INFO 09:25:33,478 HelpFormatter - --------------------------------------------------------------------------------
INFO 09:25:33,480 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.3-0-g37228af, Compiled 2014/10/24 01:07:22
INFO 09:25:33,480 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 09:25:33,481 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 09:25:33,484 HelpFormatter - Program Args: -T GenotypeGVCFs -R /media/LAB636/01.06.job/Ref/ucsc.hg19.fasta --variant /media/LAB636/01.06.job/test/data/other/sample.raw.snps.indels.vcf -o /media/LAB636/01.06.job/test/data/other/test.raw.snps.indels.g.vcf
INFO 09:25:33,486 HelpFormatter - Executing as [email protected] on Linux 2.6.32-38-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_31-b13.
INFO 09:25:33,486 HelpFormatter - Date/Time: 2015/03/26 09:25:33
INFO 09:25:33,487 HelpFormatter - --------------------------------------------------------------------------------
INFO 09:25:33,487 HelpFormatter - --------------------------------------------------------------------------------
INFO 09:25:33,823 GenomeAnalysisEngine - Strictness is SILENT
INFO 09:25:33,893 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 09:25:34,936 GenomeAnalysisEngine - Preparing for traversal
INFO 09:25:34,943 GenomeAnalysisEngine - Done preparing for traversal
INFO 09:25:34,944 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 09:25:34,944 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 09:25:34,945 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
INFO 09:25:35,005 GenotypeGVCFs - Notice that the -ploidy parameter is ignored in GenotypeGVCFs tool as this is automatically determined by the input variant files

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 3.3-0-g37228af):
ERROR
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR
ERROR MESSAGE: The list of input alleles must contain <NON_REF> as an allele but that is not the case at position 150; please use the Haplotype Caller with gVCF output to generate appropriate records
ERROR ------------------------------------------------------------------------------------------
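
For context on the message above: gVCF records written by HaplotypeCaller carry a symbolic <NON_REF> allele in the ALT column, and GenotypeGVCFs looks for it. A rough way to check whether a given file has such records is sketched below. The function name count_non_ref_records is a hypothetical helper, not part of GATK, and it assumes an uncompressed, tab-delimited VCF.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GATK): count the variant records whose
# ALT column contains the symbolic <NON_REF> allele found in gVCF output.
# Assumes an uncompressed, tab-delimited VCF file.
count_non_ref_records() {
    local vcf="$1"
    # Skip header lines (they start with '#'); column 5 is ALT.
    awk -F'\t' '!/^#/ && index($5, "<NON_REF>") > 0 { n++ } END { print n + 0 }' "$vcf"
}

# Example call, using the path variable from the post:
# count_non_ref_records "$other_dir/sample.raw.snps.indels.vcf"
```

A count of 0 would suggest the file is a regular VCF rather than a gVCF, which matches the situation the error message describes.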

Here is what confuses me:
1. As you know, I am running the pipeline on just one sample. Do I need to do joint genotyping after calling variants with HC?
2. The documentation here says: "Genotypes any number of gVCF files that were produced by the Haplotype Caller into a single joint VCF file." Does this mean the input should be gVCF files? The output of HC is a VCF file.
https://www.broadinstitute.org/gatk/guide/tooldocs/org_broadinstitute_gatk_tools_walkers_variantutils_GenotypeGVCFs.php
3. What leads to the error message above?

Thanks! :)
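
For readers hitting the same error, here is a minimal sketch of what "Haplotype Caller with gVCF output" could look like in GATK 3.x, followed by genotyping of the resulting gVCF. It is a dry run that only prints the commands. The variables $gatk, $ref_dir, $bam_dir and $other_dir are taken from the post; the output names sample.g.vcf and sample.genotyped.vcf are hypothetical, and the index arguments reflect my understanding of what GATK 3.3 expects for gVCF output, so please check them against the tool documentation for your version.

```shell
#!/usr/bin/env bash
# Sketch only: prints the two commands instead of running them, so it can
# be inspected without a GATK installation. Output file names are
# hypothetical; everything else is copied from the commands in the post.
print_gvcf_workflow() {
    cat <<'EOF'
java -Xmx5g -jar $gatk -T HaplotypeCaller \
    -R $ref_dir/ucsc.hg19.fasta \
    -I $bam_dir/T-SZ-03-1.dedupped.realigned.recal.bam \
    --emitRefConfidence GVCF \
    --variant_index_type LINEAR --variant_index_parameter 128000 \
    -o $other_dir/sample.g.vcf
java -Xmx5g -jar $gatk -T GenotypeGVCFs \
    -R $ref_dir/ucsc.hg19.fasta \
    --variant $other_dir/sample.g.vcf \
    -o $other_dir/sample.genotyped.vcf
EOF
}

print_gvcf_workflow
```

With a single sample, the HaplotypeCaller command from the original post already writes a genotyped VCF, so the GenotypeGVCFs step is probably only needed if you want to follow the gVCF workflow (for example, to add more samples later).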
