
PhaseByTransmission refuses to call a real de novo variant

vplagnol, Member
edited September 2013 in Ask the GATK team

(EDIT: solution found and explained below, mostly an error on my end, sorry)

I have what I know is a de novo variant (validated) and GATK PhaseByTransmission refuses to see it.
Here is what I am starting with in my VCF file:
7 151092903 . G A 338.83 PASS . GT:AD:DP:GQ:PL 0/0:12,0:12:36:0,36,414 0/0:20,0:20:60:0,60,669 0/1:6,15:20:99:389,0,108


  • the father is 12 ref, 0 alt
  • the mother is 20 ref, 0 alt
  • the offspring is 6 ref, 15 alt
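For readers who want to pull these numbers out of the record themselves, here is a minimal Python sketch. It assumes the three sample columns are in the order father, mother, offspring (as in the list above), and it splits on whitespace because that is how the line was pasted here; a real VCF is tab-delimited. The helper name `parse_samples` is made up for this illustration.

```python
# The input VCF record quoted above, as pasted (whitespace-separated here).
record = (
    "7 151092903 . G A 338.83 PASS . GT:AD:DP:GQ:PL "
    "0/0:12,0:12:36:0,36,414 0/0:20,0:20:60:0,60,669 0/1:6,15:20:99:389,0,108"
)


def parse_samples(line, names):
    """Map each sample name to a dict of its FORMAT fields (values kept as strings)."""
    fields = line.split()
    keys = fields[8].split(":")  # FORMAT column: GT:AD:DP:GQ:PL
    samples = {}
    for name, column in zip(names, fields[9:]):
        samples[name] = dict(zip(keys, column.split(":")))
    return samples


# Assumed sample order: father, mother, offspring.
trio = parse_samples(record, ["father", "mother", "child"])

for name, sample in trio.items():
    ref_reads, alt_reads = (int(x) for x in sample["AD"].split(","))
    print(name, sample["GT"], "ref:", ref_reads, "alt:", alt_reads, "PL:", sample["PL"])
```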

When I run
java -Xmx2g -jar GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar -R fasta/human_g1k_v37.fasta -T PhaseByTransmission --DeNovoPrior 0.00001 -V trio1_1553_1554_1555_small.recode.vcf -ped trio1_1553_1554_1555.ped -o trio1_1553_1554_1555.vcf --MendelianViolationsFile

I get the following output VCF line:
7 151092903 . G A 338.83 PASS . GT:AD:DP:GQ:PL:TP 1|0:12,0:12:0:0,36,414:13 0|0:20,0:20:60:0,60,669:13 1|0:6,15:20:99:389,0,108:13

So the father ends up being called a het. This happens even when I set the prior to a low value of 10^-5. That does not seem like the right behavior to me; a more appropriate result would be to call both parents hom-ref. The genotype likelihoods certainly suggest that this would make sense for a de novo prior of 10^-5.

EDIT: OK, I wish I could remove this post. I don't think I can, but I can at least edit it. I was just misreading the genotype likelihoods. The evidence in favour of a homozygous reference call in the father is in fact weaker than I thought. A de novo prior of 5x10^-4 fixes things, and with that value I am getting a proper de novo call at this location. I apologize for the pointless post!
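To make the arithmetic concrete, here is a rough Python sketch of the trade-off. It is a simplified model and not the actual PhaseByTransmission code: it assumes every Mendelian-consistent trio configuration gets relative weight 1 and every violating configuration gets the de novo prior as its weight, and then picks the configuration with the highest likelihood times weight. The function names are made up for this illustration. Under those assumptions the father's PL of 36 for 0/1 corresponds to a likelihood ratio of 10^-3.6 (about 2.5x10^-4), so a prior of 10^-5 loses to "father is het" while 5x10^-4 wins.

```python
import math
from itertools import product

# Phred-scaled likelihoods (PL) for genotypes 0/0, 0/1, 1/1, from the input record.
PL = {
    "father": (0, 36, 414),
    "mother": (0, 60, 669),
    "child": (389, 0, 108),
}

# Alt alleles that each genotype (given as an alt-allele count) can transmit.
TRANSMITTABLE = {0: {0}, 1: {0, 1}, 2: {1}}


def is_mendelian(father, mother, child):
    """True if the child genotype can be formed from one allele of each parent."""
    return any(
        a + b == child
        for a in TRANSMITTABLE[father]
        for b in TRANSMITTABLE[mother]
    )


def best_trio(de_novo_prior):
    """Return the (father, mother, child) alt-allele counts with the best score.

    Simplifying assumption: weight 1 for consistent configurations and
    de_novo_prior for any violating configuration. GATK's own weighting
    may differ in detail.
    """
    best_score, best_config = None, None
    for f, m, c in product(range(3), repeat=3):
        log10_likelihood = -(PL["father"][f] + PL["mother"][m] + PL["child"][c]) / 10
        weight = 1.0 if is_mendelian(f, m, c) else de_novo_prior
        score = log10_likelihood + math.log10(weight)
        if best_score is None or score > best_score:
            best_score, best_config = score, (f, m, c)
    return best_config


print(best_trio(1e-5))  # father het, mother hom-ref, child het
print(best_trio(5e-4))  # both parents hom-ref, child het (de novo)
```

In this toy model the switch happens at a prior of about 2.5x10^-4, which is consistent with 10^-5 not being enough and 5x10^-4 giving the de novo call.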


Best Answer


  • Thanks Laurent.

    I guess all I'd say (besides that I should look at the numbers more carefully before posting) is that the default prior is quite stringent.
    From a Bayesian point of view the numbers are properly calibrated, but most users in this situation are likely to be more interested in high sensitivity than in high specificity, i.e. they would be OK with some false positives as long as calls like this one (which is somewhat convincing) are picked up. I guess what threw me off is that I had already reduced the prior quite a bit and it still was not enough.

    This being said, parameters must be set and it's always arbitrary. I just wonder if a more relaxed default parameter wouldn't be more appropriate in this case.

    Thanks again for the support.


  • ebanks, Broad Institute (Member, Broadie, Dev)

    Thanks for the feedback, Vincent. Ultimately the parameters are set based on empirical evidence in human data - but we'll be sure to update the docs accordingly.
