The current GATK version is 3.3-0

# HaplotypeCaller gives different results when the --dbsnp argument is included

Hong Kong, Posts: 7, Member
edited November 2013

Hi, I used the following two commands to call variants from exactly the same file with HaplotypeCaller, but I got different results: case 1 is not consistent with case 2. On some chromosomes case 1 reports more variants than case 2, and on others fewer. The differences amount to only a few variants per chromosome. Any idea why? I expected the results to be identical, because the only change I intended was adding the --dbsnp information.

======================================================================

    ${java7} -jar $GATK/GenomeAnalysisTK.jar \
        -T HaplotypeCaller \
        -R $reference_genome \
        -I $input_file \
        -L X \
        --genotyping_mode DISCOVERY \
        -stand_emit_conf 10 \
        -stand_call_conf 50 \
        -o vcf_out/chrX.vcf

=======================================================================

    ${java7} -jar $GATK/GenomeAnalysisTK.jar \
        -T HaplotypeCaller \
        -R $reference_genome \
        -I $input_file \
        -L X \
        -nct 8 \
        --dbsnp $dbsnp \
        --genotyping_mode DISCOVERY \
        -stand_emit_conf 10 \
        -stand_call_conf 50 \
        -o vcf_out/chrX.vcf

Post edited by jacobhsu on

• Posts: 683, GATK Developer, mod

The difference has nothing to do with the --dbsnp argument; it comes from -nct. Any parallelization makes the calling non-deterministic.

Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT
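
To see how far two runs actually diverge, the call sets can be compared site by site. The following is only a minimal sketch using standard Unix tools, not a GATK feature: the function names `vcf_sites` and `compare_vcf_sites` and the example paths are hypothetical, and it assumes each run was written to its own output file (both commands above write to `vcf_out/chrX.vcf`, so the second would otherwise overwrite the first). A site is keyed on CHROM, POS, REF and ALT only, so the ID column is ignored and any rsIDs attached to records do not count as differences.

```shell
#!/bin/sh
# Sketch: compare the variant sites of two VCF files.
# A site is identified by CHROM, POS, REF and ALT (VCF columns 1, 2, 4, 5).

# Print the sorted, de-duplicated site keys of one VCF, skipping header lines.
vcf_sites() {
    grep -v '^#' "$1" | cut -f1,2,4,5 | LC_ALL=C sort -u
}

# Report how many sites are unique to each file and how many are shared.
compare_vcf_sites() {
    sites_a=$(mktemp)
    sites_b=$(mktemp)
    vcf_sites "$1" > "$sites_a"
    vcf_sites "$2" > "$sites_b"
    echo "only_in_first=$(LC_ALL=C comm -23 "$sites_a" "$sites_b" | wc -l | tr -d ' ')"
    echo "only_in_second=$(LC_ALL=C comm -13 "$sites_a" "$sites_b" | wc -l | tr -d ' ')"
    echo "shared=$(LC_ALL=C comm -12 "$sites_a" "$sites_b" | wc -l | tr -d ' ')"
    rm -f "$sites_a" "$sites_b"
}

# Example call (hypothetical paths):
#   compare_vcf_sites case1/chrX.vcf case2/chrX.vcf
```

Running the same command twice and comparing the two outputs this way, once with -nct and once without, would show whether the discrepancy follows the threading option rather than --dbsnp.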

• Hong Kong, Posts: 7, Member

Do you mean that if I run the case 2 analysis again, I may get yet another result? In that scenario, how can we know which result is more reliable?

• Hong Kong, Posts: 7, Member

Dear ebanks,