The current GATK version is 3.7-0
Examples: Monday, today, last week, Mar 26, 3/26/04

#### Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

You can opt in to receive email notifications, for example when your questions get answered or when there are new announcements, by following the instructions given here.

#### ☞ Did you remember to?

1. Search using the upper-right search box, e.g. using the error message.
3. Include tool and Java versions.
4. Tell us whether you are following GATK Best Practices.
5. Include relevant details, e.g. platform, DNA- or RNA-Seq, WES (+capture kit) or WGS (PCR-free or PCR+), paired- or single-end, read length, expected average coverage, somatic data, etc.
6. For tool errors, include the error stacktrace as well as the exact command.
7. For format issues, include the result of running ValidateSamFile for BAMs or ValidateVariants for VCFs.
8. For weird results, include an illustrative example, e.g. attach IGV screenshots according to Article#5484.
9. For a seeming variant that is uncalled, include results of following Article#1235.

#### ☞ Formatting tip!

Wrap blocks of code, error messages and BAM/VCF snippets--especially content with hashes (#)--with lines with three backticks ( ` ) each to make a code block as demonstrated here.

GATK 3.7 is here! Be sure to read the Version Highlights and optionally the full Release Notes.

# some genotype calls are wrong, why?

Member
edited October 2012

Hello the team,

For some genotypes, it seems are wrong, I know it's model based, and base q, map q, etc are considered in the model.
But my case are special,
the format is (ref allele count)/(alternative allele count) genotype call:
22/24 0/0
109/125 0/0
85/109 0/0
26/32 0/0
40/161 0/0
195/6 1/1
239/5 1/1
83/6 1/1
46/28 1/1

In one case, the two variants are adjacent to each other.
In some case, they are one base indels.

Thanks,

Jim

Post edited by Geraldine_VdAuwera on
Tagged:

Hi Jim,

Can you give some more supporting info? For example, some IGV or pileup screenshots that show the reads and base calls at those loci? Without seeing more it's impossible to interpret the genotype numbers.

• Member

Hi Geraldine,
Thanks. Here is the genotype call and attached is the IGV snapshot

chr10:124329672 T T/T
chr10:124329673 C C/C

I can email you a small slice of the BAM file, since it has some meta info, I do not want to post it here.

Thank you,
-Jim

What is your command-line for calling? Are you possibly running with the generalized ploidy model and asking to genotype a ploidy of 1?

• Member

Eric,
Thank you. I am using genotype_likelihoods_model, here is the cmd:
java -Xmx4g -jar GATK
-T UnifiedGenotyper -nt 8 -R HG19REF -I test.bam -o test.vcf
-metrics OUTPUT/test.txt -dcov 250
-L chr10 -glm BOTH -stand_call_conf 30 -stand_emit_conf 30
-baq CALCULATE_AS_NECESSARY
-G Standard -A AlleleBalance -A DepthOfCoverage
-A HomopolymerRun -A QualByDepth

-Jim

• Member

The version is version 1.6-11-g3b2fab9 on Linux 64

Hi Jim,

I would try updating to the latest version of the code and turning off BAQ (which has been known to kill multi-nucleotide polymorphisms). If that doesn't work, it would be good to see what the HaplotypeCaller calls at this position.

• Member

Eric,
Thanks a lot! That's exactly the cause. Can you give me a clue about the effect on false positive calls if I turn it off? To my understanding (please correct me if I am wrong or incomplete), BAQ was introduced to reduce false positive calls around alignment induced indels. I wonder how much benefits it has.
-Jim

BAQ has tons of benefit for regular SNPs, but it tends to kill more complex events like MNPs (as you saw firsthand) and indels. This is precisely why we have developed the HaplotypeCaller as the successor to the UnifiedGenotyper.

• Member

Eric,
Thanks for the insights.
So HaplotypeCaller kept the benefits and eliminated the drawbacks on complex events? Is 1000genome project using HaplotypeCaller?
I'd like to know more about it. Could you point me to some docs or presentations about it, or the comparison of it and UnifiedGenotyper.
What is the major difference?
Thanks,
-Jim

• Member

thanks!