Panel of Normals and Matched-Normal relation

Hello GATK team ! (again)

I am now running my pipeline again with more samples, coming from the GDC Data Portal. This time I will have a matched normal for every sample. I am also considering building a panel of normals (PoN).

My question is: could any bias arise from the fact that the matched normal used for variant calling is also present in the PoN? It would of course be much more convenient to build the PoN only once and use the same one for every variant call...

Another question: how is the filtering step with the PoN carried out? Is every variant in the PoN filtered out (that is, by default, every variant present in at least 2 of the samples used to make the PoN)? In other words, is it just a hard filter, or is there a statistical approach behind it?
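
To make the second question concrete, here is a minimal Python sketch of what a pure hard filter would look like. It is a toy illustration only, not how GATK or Mutect implements the PoN: the functions `build_pon` and `hard_filter`, the `min_samples` parameter, and the example sites are all hypothetical.

```python
from collections import Counter

def build_pon(normal_callsets, min_samples=2):
    """Toy PoN: keep sites called in at least `min_samples` normals.

    Each call set is a set of (chrom, pos, ref, alt) tuples (hypothetical format).
    """
    counts = Counter(site for calls in normal_callsets for site in set(calls))
    return {site for site, n in counts.items() if n >= min_samples}

def hard_filter(tumor_calls, pon):
    """Drop every tumor call whose site is in the PoN, with no statistics involved."""
    return [site for site in tumor_calls if site not in pon]

# Three made-up normals; only chr1:100 A>T is seen in two of them.
normals = [
    {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    {("chr1", 100, "A", "T")},
    {("chr3", 300, "C", "A")},
]
pon = build_pon(normals)

tumor_calls = [
    ("chr1", 100, "A", "T"),  # in the PoN, so removed
    ("chr2", 200, "G", "C"),  # seen in only one normal, so kept
    ("chr7", 700, "T", "G"),  # never seen in a normal, so kept
]
kept = hard_filter(tumor_calls, pon)
```

Under this toy rule, a site seen only in a single normal (for example in one patient's matched normal) would not enter the PoN on its own.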

Thank you very much in advance ! Regards,

Alexandre Coudray

by Sheila

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    Hi Alexandre,

    I am just confirming with the team. I will get back to you soon.


  • ac67479, Austin (Member)

    Just to summarize (maybe it was not very clear):

    I have 10 glioblastoma samples with 10 matched normals. It would be very convenient to use these 10 matched normals to build the PoN (instead of building 10 different PoNs or getting other data for the PoN). I can also imagine that with more data it would be even more convenient. But I assume there would be a small bias because the matched normal is in the PoN. Maybe there are parameters that could be adjusted in the variant calling or while generating the PoN?

    Thank you very much !


  • shlee, Cambridge (Member, Broadie, Moderator, admin)
    edited December 2016

    Hi @ac67479,

    Please check out the thread at The discussion is about a Mutect PON and has some tidbits of information and links that are relevant to your question.

