Panel of Normals (PON)

A Panel of Normals (PON) is a type of resource used in somatic variant analysis. Depending on the type of variant you're looking for, the PON will be generated differently. What all PONs have in common is that (1) they are made from normal samples (in this context, "normal" means derived from healthy tissue that is believed not to have any somatic alterations) and (2) their main purpose is to capture recurrent technical artifacts in order to improve the results of the variant calling analysis.

As a result, the most important criteria for choosing normals to include in any PON are the technical properties of how the data was generated. It's very important to use normals that are as technically similar as possible to the tumor (same exome or genome preparation methods, sequencing technology and so on). Additionally, the samples should come from subjects who were young and healthy, to minimize the chance of including a sample from someone who has an undiagnosed tumor. Normals are typically derived from blood samples.

There is no definitive rule for how many samples should be used to make a PON (even a small PON is better than no PON) but in practice we recommend aiming for a minimum of 40.

At the Broad Institute, we typically make a standard PON for a given version of the pipeline (corresponding to the combination of all protocols used in production to generate the sequence data, starting from sample preparation and including the analysis software) and use it to process all tumor samples that go through that version of the pipeline. Because we process many samples in the same way, we are able to make PONs composed of hundreds of samples.

Variant type-specific recommendations are given below.


Short variants (SNVs and indels)

For short variant discovery, the PON is created by running the variant caller Mutect2 individually on a set of normal samples and combining the resulting variant calls according to some criteria (e.g. excluding any sites that are not present in at least 2 normals) as defined in the Best Practices documentation. This produces a sites-only VCF file that can be used as a PON for Mutect2.
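The combining step can be illustrated with a small Python sketch. This is not the GATK implementation; it only shows the idea of keeping sites seen in a minimum number of normals. The function name build_panel_sites, the min_normals parameter, and the tuple representation of a site are assumptions made for this illustration, and the input is assumed to be variant calls already parsed from each normal's VCF.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# A site is represented here as (chrom, pos, ref, alt); this is an
# illustrative choice, not a format defined by GATK.
Site = Tuple[str, int, str, str]


def build_panel_sites(
    calls_per_normal: Dict[str, Iterable[Site]],
    min_normals: int = 2,
) -> List[Site]:
    """Return the sites called in at least `min_normals` normal samples."""
    counts: Counter = Counter()
    for sites in calls_per_normal.values():
        # Use a set so that a site counts at most once per normal sample.
        counts.update(set(sites))
    return sorted(site for site, n in counts.items() if n >= min_normals)


# Hypothetical calls from three normals.
calls = {
    "normal_A": [("chr1", 100, "A", "T"), ("chr2", 500, "G", "C")],
    "normal_B": [("chr1", 100, "A", "T")],
    "normal_C": [
        ("chr1", 100, "A", "T"),
        ("chr2", 500, "G", "C"),
        ("chr3", 42, "T", "TA"),
    ],
}

panel = build_panel_sites(calls, min_normals=2)
for chrom, pos, ref, alt in panel:
    print(chrom, pos, ref, alt)
```

The site on chr3 is dropped because only one normal supports it. Raising min_normals makes the panel more conservative, which is the kind of control over the minimum support level discussed in the comments below.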


Copy Number Variants

For CNV discovery, the PON is created by running the initial coverage collection tools individually on a set of normal samples and combining the resulting copy ratio data using a dedicated PON creation tool. This produces a binary file that can be used as a PON.
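To give a feel for what combining coverage data from normals can achieve, here is a deliberately simplified Python sketch. It is an assumption-laden stand-in, not the method used by the dedicated PON creation tool, which may be considerably more sophisticated. The sketch scales each sample by its own median coverage, takes the per-bin median across normals as the panel, and divides a tumor's scaled coverage by that panel. All function names and the example counts are hypothetical.

```python
from statistics import median
from typing import Dict, List


def normalize(counts: List[float]) -> List[float]:
    """Scale per-bin read counts by the sample's median count."""
    m = median(counts)
    if m <= 0:
        raise ValueError("median coverage must be positive")
    return [c / m for c in counts]


def build_cnv_panel(normal_counts: Dict[str, List[float]]) -> List[float]:
    """Per-bin median of normalized coverage across all normals."""
    normalized = [normalize(c) for c in normal_counts.values()]
    # zip(*...) groups the values for the same bin from every normal.
    return [median(bin_values) for bin_values in zip(*normalized)]


def copy_ratios(tumor_counts: List[float], panel: List[float]) -> List[float]:
    """Tumor normalized coverage divided by the panel value for each bin."""
    tumor = normalize(tumor_counts)
    return [t / p if p > 0 else float("nan") for t, p in zip(tumor, panel)]


# Hypothetical read counts over three bins; the middle bin is
# systematically over-covered in every normal (a technical artifact).
normals = {
    "n1": [100, 200, 100],
    "n2": [50, 100, 50],
    "n3": [100, 220, 100],
}
cnv_panel = build_cnv_panel(normals)
ratios = copy_ratios([100, 400, 100], cnv_panel)
print(cnv_panel)
print(ratios)
```

In this toy example the tumor's raw coverage in the middle bin is four times that of its neighbors, but half of that excess is also present in every normal, so the panel attributes it to a recurrent technical effect rather than to the tumor.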

Comments

  • Hi,
    I have a question on generating a Panel of Normals for somatic variant detection from WGS data. My issue is as follows:

    1. I'm using the new 10x WGS Chromium sequencing strategy, which doesn't have much released data yet.

    2. As such, the only normal sequencing runs I have are from my study, and I've read that you don't want to base the PoN on data included in the study, as you could bias your results.

    So should I either:

    A. Use your 1000 genomes from a different chemistry
    B. Generate a PoN from my data and use it regardless of bias
    C. Generate a custom Panel of Normals for each sample, leaving that individual's own normal sample out of the samples used to create the PoN that will be applied to it
    D. Not use a PoN at all

    GitHub issue 2888, filed by shlee (state: closed; closed by sooheelee)
  • shlee (Cambridge) · Member, Administrator, Broadie, Moderator

    Hi @jgockley,

    I think you'll find the paragraph about PoNs at the end of Section 2 of Article#11136 helpful:

    Ideally, the PoN includes samples that are technically representative of the tumor case sample--i.e. samples sequenced on the same platform using the same chemistry, e.g. exome capture kit, and analyzed using the same toolchain. However, even an unmatched PoN will be remarkably effective in filtering a large proportion of sequencing artifacts. This is because mapping artifacts and polymerase slippage errors occur for pretty much the same genomic loci for short read sequencing approaches.

  • matti (Finland) · Member

    Hi,
    could you please elaborate on why the minN parameter of CombineVariants has been removed from CreateSomaticPanelOfNormals, and/or how a user can control, in GATK4, the minimum number of input files that must support a certain site?

    GitHub issue 2996, filed by shlee (state: closed; closed by sooheelee)
  • shlee (Cambridge) · Member, Administrator, Broadie, Moderator

    Hi @matti,

    GATK4 CreateSomaticPanelOfNormals is a different tool from GATK3 CombineVariants. Its sole purpose is to create a panel of normals from variant sites present in a minimum of two samples. CombineVariants is still in the process of being ported over to GATK4. However, it sounds like you would like to be able to vary this number in CreateSomaticPanelOfNormals? If you can confirm, then I can ask our developers if they can implement such a feature.

  • matti (Finland) · Member

    Hi @shlee, being able to vary the minimum support level (i.e. the number of files that support a certain site) would be of great importance for us.

    GitHub issue 4552, filed by shlee (state: closed; closed by davidbenjamin)
  • shlee (Cambridge) · Member, Administrator, Broadie, Moderator
    edited March 22

    Hi Matti (@matti),

    I've put in a feature request on your behalf at https://github.com/broadinstitute/gatk/issues/4552. You can check the status of the request and add comments to it directly in the issue ticket. All you need is a Github account.

    I just realized you are the Matti I met in Helsinki. I hope the research is going well and that the GRCh38 version of MutSig is working well for you. Please send my regards to the workshop crew.

    Soo Hee

  • matti (Finland) · Member
    edited March 22

    Hi Shlee (@shlee),
    yep, it's me :smile: Our research is going well and we are super happy users of GATK and MutSig. I will forward your regards to Eija and the others.
