GATK licensing moves to direct-through-Broad model -- read about it on the GATK blog

some genotype calls are wrong, why?

jimmikeselfjimmikeself Posts: 10Member
edited October 2012 in Ask the GATK team

Hello the team,

For some genotypes, it seems are wrong, I know it's model based, and base q, map q, etc are considered in the model.
I also read this link:
http://gatkforums.broadinstitute.org/discussion/1235/why-didnt-the-unified-genotyper-call-my-snp-i-can-see-it-right-there-in-igv#latest
But my case are special,
the format is (ref allele count)/(alternative allele count) genotype call:
22/24 0/0
109/125 0/0
85/109 0/0
26/32 0/0
40/161 0/0
195/6 1/1
239/5 1/1
83/6 1/1
46/28 1/1

In one case, the two variants are adjacent to each other.
In some case, they are one base indels.

Thanks,

Jim

Post edited by Geraldine_VdAuwera on

Comments

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 7,818Administrator, GATK Dev admin

    Hi Jim,

    Can you give some more supporting info? For example, some IGV or pileup screenshots that show the reads and base calls at those loci? Without seeing more it's impossible to interpret the genotype numbers.

    Geraldine Van der Auwera, PhD

  • jimmikeselfjimmikeself Posts: 10Member

    Hi Geraldine,
    Thanks. Here is the genotype call and attached is the IGV snapshot

    chr10:124329672 T T/T
    chr10:124329673 C C/C

    I can email you a small slice of the BAM file, since it has some meta info, I do not want to post it here.

    Thank you,
    -Jim

    igv_wrongcall.png
    1680 x 885 - 27K
  • ebanksebanks Broad InstitutePosts: 684Member, Administrator, GATK Dev, Broadie, Moderator, DSDE Dev, GP Member admin

    What is your command-line for calling? Are you possibly running with the generalized ploidy model and asking to genotype a ploidy of 1?

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

  • jimmikeselfjimmikeself Posts: 10Member

    Eric,
    Thank you. I am using genotype_likelihoods_model, here is the cmd:
    java -Xmx4g -jar GATK
    -T UnifiedGenotyper -nt 8 -R HG19REF -I test.bam -o test.vcf
    -metrics OUTPUT/test.txt -dcov 250
    -L chr10 -glm BOTH -stand_call_conf 30 -stand_emit_conf 30
    -baq CALCULATE_AS_NECESSARY
    -G Standard -A AlleleBalance -A DepthOfCoverage
    -A HomopolymerRun -A QualByDepth

    -Jim

  • jimmikeselfjimmikeself Posts: 10Member

    The version is version 1.6-11-g3b2fab9 on Linux 64

  • ebanksebanks Broad InstitutePosts: 684Member, Administrator, GATK Dev, Broadie, Moderator, DSDE Dev, GP Member admin

    Hi Jim,

    I would try updating to the latest version of the code and turning off BAQ (which has been known to kill multi-nucleotide polymorphisms). If that doesn't work, it would be good to see what the HaplotypeCaller calls at this position.

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

  • jimmikeselfjimmikeself Posts: 10Member

    Eric,
    Thanks a lot! That's exactly the cause. Can you give me a clue about the effect on false positive calls if I turn it off? To my understanding (please correct me if I am wrong or incomplete), BAQ was introduced to reduce false positive calls around alignment induced indels. I wonder how much benefits it has.
    -Jim

  • ebanksebanks Broad InstitutePosts: 684Member, Administrator, GATK Dev, Broadie, Moderator, DSDE Dev, GP Member admin

    BAQ has tons of benefit for regular SNPs, but it tends to kill more complex events like MNPs (as you saw firsthand) and indels. This is precisely why we have developed the HaplotypeCaller as the successor to the UnifiedGenotyper.

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

  • jimmikeselfjimmikeself Posts: 10Member

    Eric,
    Thanks for the insights.
    So HaplotypeCaller kept the benefits and eliminated the drawbacks on complex events? Is 1000genome project using HaplotypeCaller?
    I'd like to know more about it. Could you point me to some docs or presentations about it, or the comparison of it and UnifiedGenotyper.
    What is the major difference?
    Thanks,
    -Jim

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 7,818Administrator, GATK Dev admin
  • jimmikeselfjimmikeself Posts: 10Member

    thanks!

Sign In or Register to comment.