output_mode option in HaplotypeCaller using gVCF mode

Lisa0508, Ann Arbor, MI (Member)

Dear GATK team,
I want to generate a gVCF file for each data set, but I am not sure whether I should still use the --output_mode EMIT_ALL_SITES argument in my command line. In a previous thread, you mentioned that "HaplotypeCaller used to have that option, but it was removed when we introduced the reference model (gVCF) option. Have a look at the documentation that explains this here: http://www.broadinstitute.org/gatk/guide/article?id=2940". I followed the link, but the page was not accessible, so I would like to confirm that I am using the right arguments. Here is my command, with the --output_mode option removed. Will that be all right?
java -Xmx12g -jar $GATK_JARS/GenomeAnalysisTK.jar \
-T HaplotypeCaller \
-R ucsc.hg19.fasta \
-I sample1.realigned.dedup.sorted.bam \
--genotyping_mode DISCOVERY \
-stand_emit_conf 10 \
-stand_call_conf 20 \
--emitRefConfidence GVCF \
--variant_index_type LINEAR \
--variant_index_parameter 128000 \
-o raw_var_sample1.g.vcf

Best Answer


  • tommycarstensen, United Kingdom (Member)

    @Lisa0508 I'm a user like you. The HC documentation doesn't actually say that the default mode is EMIT_VARIANTS_ONLY. It's safe for you to leave --output_mode EMIT_ALL_SITES out. I'm not sure how it works with --emitRefConfidence GVCF anyway. Best of luck and have fun with GATK and your variant calling.
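
    If you want to check for yourself that the run without --output_mode still produced reference-model output, one rough approach is to count the records carrying the <NON_REF> symbolic allele, which HaplotypeCaller writes in the ALT column in GVCF mode. This is only a sketch: the function name count_non_ref_records is made up for illustration, and it assumes an uncompressed, tab-delimited .g.vcf like the one named in your -o argument.

```shell
# Hypothetical helper (not part of GATK): count gVCF records whose ALT
# column contains the <NON_REF> symbolic allele.
count_non_ref_records() {
  # Skip header lines (those starting with '#'), then test column 5 (ALT).
  # "n + 0" makes awk print 0 rather than an empty string when nothing matches.
  awk -F '\t' '!/^#/ && $5 ~ /<NON_REF>/ { n++ } END { print n + 0 }' "$1"
}
```

    Calling count_non_ref_records raw_var_sample1.g.vcf prints a single number. A nonzero count suggests the reference confidence records were written. A count of zero would point to --emitRefConfidence GVCF not having taken effect.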
