We've moved!
This site is now read-only. You can find our new documentation site and support forum for posting questions here.
Be sure to read our welcome blog!

PhaseByTransmission refuses to call a real de novo variant

vplagnolvplagnol Member
edited September 2013 in Ask the GATK team

(EDIT: solution found and explained below, mostly an error on my end, sorry)

I have what I know is a de novo variant (validated) and GATK PhaseByTransmission refuses to see it.
Here is what I am starting with in my VCF file:
7 151092903 . G A 338.83 PASS . GT:AD:DP:GQ:PL 0/0:12,0:12:36:0,36,414 0/0:20,0:20:60:0,60,669 0/1:6,15:20:99:389,0,108


  • the father is 12 ref, 0 alt
  • the mother is 20 ref, 0 alt
  • the offspring is 6 ref, 15 alt

When I run
java -Xmx2g -jar GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar -R fasta/human_g1k_v37.fasta -T PhaseByTransmission --DeNovoPrior 0.00001 -V trio1_1553_1554_1555_small.recode.vcf -ped trio1_1553_1554_1555.ped -o trio1_1553_1554_1555.vcf --MendelianViolationsFile trio1_1553_1554_1555_noMendel.tab

I get the following output VCF line:
7 151092903 . G A 338.83 PASS . GT:AD:DP:GQ:PL:TP 1|0:12,0:12:0:0,36,414:13 0|0:20,0:20:60:0,60,669:13 1|0:6,15:20:99:389,0,108:13

So the father is eventually called a het.This happens even when I set the prior to a low value of 10^-5. That does not seem like the right behavior to me, a more appropriate call would be to call both parents ref homs. The genotype likelihood certainly suggest that for a 10^-5 prior of de novo event, this would make sense.

EDIT: OK, I wish I could remove this post. I don't think I can but I can edit the answer at least. I was just misreading the genotype likelihood. The evidence in favour of a homozygous call in the father is in fact weaker than I thought. A prior of de novo calls of 5x10^-4 fixes things, and with that threshold I am getting a proper de novo call at this location. I apologize for the pointless post!

Post edited by vplagnol on

Issue · Github
by Geraldine_VdAuwera

Issue Number
Last Updated

Best Answer


  • Thanks Laurent.

    I guess all I'd say (besides that I should look at the numbers more carefully before posting) is that the default prior is quite stringent.
    In a Bayesian way the numbers are properly calibrated, but most users in that case are likely to be more interested in the high sensitivity than the high specificity, i.e. be OK with some false positive as long as calls like this one (which is somewhat convincing) are picked up. I guess what threw me off is that I had already reduced the prior quite a bit and it still was not enough.

    This being said, parameters must be set and it's always arbitrary. I just wonder if a more relaxed default parameter wouldn't be more appropriate in this case.

    Thanks again for the support.


  • ebanksebanks Broad InstituteMember, Broadie, Dev ✭✭✭✭

    Thanks for the feedback, Vincent. Ultimately the parameters are set based on empirical evidence in human data - but we'll be sure to update the docs accordingly.

Sign In or Register to comment.