Calling Somatic Variants without matched normals using GATK.

Hi,
I am new to the world of bioinformatics. I currently have sequencing data (WES) from about 45 pediatric brain tumor samples (archived FFPE), and I am keen on identifying mutational burden and mutational signatures in these samples. I don't necessarily want to discover a novel mutation and describe its biological relevance; rather, I want to use the pattern of mutational signatures to identify the causes of recurrence in tumors. The problem is that, as with most archived FFPE samples, I don't have matched normal tissue. I am looking for the best approach to call somatic variants in these samples. Is using gnomAD for filtering my best option? Is it a good resource for pediatric tumors? If not, what other potential sources are there?

Thank you for your advice.


  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    Hi Aditi,

    Have a look at this tutorial.


  • fi1d18 (Member)
    Sorry, please help me.

    I have WGS .bam files for 3 patients (tumour and its matched derived model, namely an organoid), but I don't have a matched normal sample. If I call variants for each patient (tumour and its matched organoid), how can I use read counts at germline heterozygous positions estimated by GATK 3.2-2 to compensate for the absence of a matched normal sample? I have heard that people use a dbSNP VCF instead of the matched normal small-variant VCF, but I don't know where to start. Which GATK tool does that?
  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator, admin)

    Hi @fi1d18

    As mentioned by Sheila, have a look at this tutorial.

    Section 2 outlines how to create the panel of normals resource using the tumor-only mode of Mutect2.

  • davidben (Boston; Member, Broadie, Dev)

    @fi1d18 You can use the standard tumor-only mode, whose command line is:

    gatk Mutect2 -R $ref -I tumor.bam -pon $pon --germline-resource $af_only_gnomad -O unfiltered.vcf
    gatk FilterMutectCalls -V unfiltered.vcf -O filtered.vcf

    where $af_only_gnomad (the af-only-gnomad VCF) can be obtained from the GATK resource bundle and $pon should be there as well (if it's not, we need to upload it!). It is very important not to try some custom germline filtering, such as using dbSNP as the matched normal, because only Mutect2's built-in --germline-resource argument applies a principled statistical model.

    If you're instead using the matched organoid as the normal, your command line would be:

    gatk Mutect2 -R $ref -I tumor.bam -I organoid.bam -normal organoid_sample \
        -pon $pon --germline-resource $af_only_gnomad -O unfiltered.vcf
    gatk FilterMutectCalls -V unfiltered.vcf -O filtered.vcf
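
The value given to -normal in the command above is a sample name (the SM field of the BAM's @RG header lines), not a file name. As a minimal sketch, assuming samtools is installed, the helper below pulls the distinct SM values out of a SAM/BAM header read from standard input. The function name sample_names_from_header is hypothetical and not part of GATK or samtools.

```shell
# Print the distinct sample names (SM tags) found in @RG header lines.
# Reads SAM header text from standard input.
sample_names_from_header() {
    grep '^@RG' |          # keep only read-group lines
        tr '\t' '\n' |     # one tag per line
        sed -n 's/^SM://p' |  # keep the SM values, drop the tag prefix
        sort -u            # collapse duplicates across read groups
}
```

A possible usage is `samtools view -H organoid.bam | sample_names_from_header`; the single name it prints is what would go after -normal. If it prints more than one name, the BAM holds several samples and needs a closer look before calling.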

    Finally, since this is FFPE, you will need to run CollectF1R2Counts and LearnReadOrientationModel and feed the result into Mutect2. This is described in our documentation. We're aware that this is cumbersome, and we will streamline it in the 4.1.1 or 4.1.2 release. You can run the featured workflow on Firecloud / Terra to make this all much easier.
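
For readers on a later release, here is a hedged sketch of the tumor-only workflow with the read-orientation filter. It assumes the streamlined interface of GATK 4.1.1.0 and later, where Mutect2 writes the F1R2 counts itself (--f1r2-tar-gz) and the learned priors are passed to FilterMutectCalls (--ob-priors). That differs from the 4.1.0.0 procedure described in the post, so check `gatk Mutect2 --help` for your version. The function names, the DRY_RUN switch, and the output file names are hypothetical.

```shell
# Print each command; execute it unless DRY_RUN is set (hypothetical helper).
run() {
    echo "$*"
    if [ -z "${DRY_RUN:-}" ]; then "$@"; fi
}

# Arguments: reference, tumor BAM, panel of normals, germline resource, output prefix.
call_ffpe_tumor_only() {
    ref=$1; tumor=$2; pon=$3; gnomad=$4; out=$5
    # 1. Call variants and collect F1R2 counts in the same pass.
    run gatk Mutect2 -R "$ref" -I "$tumor" -pon "$pon" \
        --germline-resource "$gnomad" \
        --f1r2-tar-gz "$out.f1r2.tar.gz" -O "$out.unfiltered.vcf" &&
    # 2. Learn the read-orientation artifact priors from those counts.
    run gatk LearnReadOrientationModel -I "$out.f1r2.tar.gz" \
        -O "$out.read-orientation-model.tar.gz" &&
    # 3. Filter, passing the learned priors to the orientation-bias filter.
    run gatk FilterMutectCalls -R "$ref" -V "$out.unfiltered.vcf" \
        --ob-priors "$out.read-orientation-model.tar.gz" -O "$out.filtered.vcf"
}
```

Setting DRY_RUN=1 and calling `call_ffpe_tumor_only ref.fasta tumor.bam pon.vcf.gz af-only-gnomad.vcf.gz sample1` only prints the three commands, which makes it easy to review them before running anything on real data.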
