Filter by TLOD only in Mutect2

cmartinezruiz (United Kingdom; Member)
Hello
I am using Mutect2 in GATK v4.1.4.0 to look for somatic variants in several tumor samples with matched germline. Because of the nature of the samples, I know I can trust variants with relatively low VAF, so I wanted to relax the filtering to allow tumor variants with an LOD similar to that of germline variants (~2.2). In previous versions of Mutect2 I would simply have set --tlod to 2.2 during the filtering step. The newest versions of Mutect2, however, no longer have this option and rely instead on a beta score (--f-score-beta) to tighten or relax the false discovery rate during the filtering step.
The issue with the beta score is that if I relax the filtering to a point where variants with TLOD >= 2.2 pass the filter, I end up with many variants with very low values in other fields (e.g. STRANDQ=1).
I could relax the filter to allow TLOD >= 2.2 and then manually filter the resulting VCF again to remove variants with low values in other fields, but this seems a rather convoluted way of approaching the issue, and it feels like there should be a better way to do it.
In short, is there a way in the latest Mutect2 versions to allow for variants with low TLOD to pass the filtering step without relaxing all filters in the other fields?
Thank you!
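
The manual second pass described in the question could look something like the sketch below. It is only an illustration in plain Python, not a GATK feature: the function names, the STRANDQ cutoff of 20, and the choice to require every ALT allele to clear the TLOD cutoff are all assumptions made for the example. It also assumes that TLOD and STRANDQ appear in the INFO column (column 8) of the filtered VCF, and it ignores the FILTER column entirely.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO column into a key -> value dict (flags map to '')."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def keep_record(line: str, min_tlod: float = 2.2, min_strandq: float = 20.0) -> bool:
    """Return True if a VCF data line clears both cutoffs.

    min_strandq=20.0 is a hypothetical threshold, not a GATK default.
    Records missing TLOD or STRANDQ are dropped rather than guessed at.
    """
    info = parse_info(line.rstrip("\n").split("\t")[7])
    # TLOD has one value per ALT allele; require all of them to pass.
    tlods = [float(x) for x in info.get("TLOD", "").split(",") if x]
    # A missing STRANDQ becomes NaN, which fails every >= comparison.
    strandq = float(info.get("STRANDQ") or "nan")
    return bool(tlods) and min(tlods) >= min_tlod and strandq >= min_strandq


# Two made-up records: one with solid strand quality, one with STRANDQ=1.
rec_ok = "chr1\t100\t.\tA\tT\t.\t.\tDP=400;STRANDQ=45;TLOD=3.10"
rec_low_strandq = "chr1\t200\t.\tG\tC\t.\t.\tDP=400;STRANDQ=1;TLOD=2.50"

kept = [r for r in (rec_ok, rec_low_strandq) if keep_record(r)]
```

With these two made-up records, only the first survives, because the second fails the assumed STRANDQ cutoff even though its TLOD is above 2.2.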

Answers

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi,

    The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal/erroneous results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.

    Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.

    We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.

    For context, see this [announcement](https://software.broadinstitute.org/gatk/blog?id=24419) and check out our [support policy](https://gatkforums.broadinstitute.org/gatk/discussion/24417/what-types-of-questions-will-the-gatk-frontline-team-answer/p1?new=1).

  • davidben (Boston; Member, Broadie, Dev)

    @cmartinezruiz You could try setting -log-snv-prior and -log-indel-prior to higher (less negative) values than their defaults of -13.8, but this strikes me as sketchy. A TLOD of 2.2 means that the likelihood of somatic variation is only 100 times that of sequencing error, and thus such variants are only really believable if you have an overwhelmingly high rate of somatic variation -- at least one in 100 sites.

    I have to wonder why these variants are compelling if their TLOD is so low. It is possible for low-AF variants to have a high TLOD in the case of high depth or high-quality reads, but if you have neither, what is there to distinguish these from sequencing error?
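
    The arithmetic behind the "one in 100 sites" remark can be sketched as a simple two-hypothesis Bayes calculation. This is a simplification for illustration, not the model FilterMutectCalls actually fits. It assumes that TLOD is a log10 likelihood ratio and that the -13.8 default is a natural-log prior (e^-13.8 is about 1e-6); the function names are made up for the example.

```python
import math


def posterior_somatic(tlod_log10: float, ln_prior: float) -> float:
    """Posterior probability of a real variant: posterior odds = LR * prior odds."""
    likelihood_ratio = 10.0 ** tlod_log10      # assumes TLOD is on a log10 scale
    prior = math.exp(ln_prior)                 # assumes the prior is a natural log
    prior_odds = prior / (1.0 - prior)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


def break_even_prior(tlod_log10: float) -> float:
    """Prior at which the posterior is exactly 0.5 (posterior odds of 1)."""
    likelihood_ratio = 10.0 ** tlod_log10
    return 1.0 / (1.0 + likelihood_ratio)


# TLOD of 2.2 under the default-sized prior quoted above.
p_default = posterior_somatic(2.2, -13.8)

# How common somatic variants would need to be for TLOD 2.2 to be a coin flip.
p_needed = break_even_prior(2.2)
```

    Under these assumptions, a TLOD of 2.2 with a prior of about one in a million gives a posterior well below 0.1%, and the prior would have to be on the order of one site in 100 to 200 before such a call is even as likely real as not.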

  • cmartinezruiz (United Kingdom; Member)
    Thanks @davidben, I think I had misunderstood what TLOD was doing then. Is LOD set to 2.2 for the germline variants because we expect a high proportion of those, then?

    I am trying to run Mutect2 on healthy tissue to detect somatic variants. To be clear, I am looking for somatic variants present only in the focal tissue, so essentially I am running Mutect2 using blood DNA as the normal and DNA from the focal healthy tissue as the "tumor". Both blood and focal tissue were sequenced at an average coverage of 400x. I expect to find relatively few variants, at low AF.

    I assumed that because Mutect2 was designed for tumor samples, the filtering would be very stringent to account for the noisiness of cancer data. Because I expect healthy tissue to be more homogeneous than cancer tissue, I was looking for a way to relax those filters. I assumed that TLOD would be the variable to look at, but it looks like I was mistaken.

    What would be the best approach in this case, then? Is the default filtering in Mutect2 not too stringent for detecting variants in healthy tissue?

    Thanks again!
  • cmartinezruiz (United Kingdom; Member)
    @davidben Yes, I see now how that makes sense. I had understood the filtering step in the complete opposite way. Thank you very much!