
Estimating error/mismatch rate by Mutect2

amjadd (Finland; Member ✭✭)

So Mutect2 finally calls variants from multiple tumors, and it's great (at least for the SNVs). Now assume we have tumorA and tumorB (from the same individual), and Mutect2 calls a variant in tumorA (high allele fraction), but tumorB has only 1-2 reads supporting the variant allele. To decide whether the variant really exists in tumorB, we need to estimate the mismatch/error rate and compare the 1-2 reads seen in tumorB with what that error rate predicts. We can in general estimate the mismatch rate using Picard CollectAlignmentSummaryMetrics, but the numbers don't seem to reflect the real error rate emitted by Mutect2, likely because Mutect2 applies its own read filters beforehand.

Is there a tool/trick to get the background error rate under Mutect2 filters?
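The comparison described in the question can be sketched as a binomial tail probability: given a per-read error rate, how likely is it to see at least k variant-supporting reads at a site purely from sequencing errors? This is only a minimal sketch, not something Mutect2 does or reports. The depth, read count and mismatch rate are hypothetical placeholders, and dividing the mismatch rate by 3 assumes errors are spread evenly over the three alternative bases.

```python
from math import comb


def prob_at_least(k: int, depth: int, per_allele_error: float) -> float:
    """Return P(X >= k) for X ~ Binomial(depth, per_allele_error)."""
    p_below = sum(
        comb(depth, i) * per_allele_error**i * (1 - per_allele_error) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_below


# Hypothetical inputs, not taken from any Mutect2 output.
mismatch_rate = 0.0025          # overall per-base mismatch rate (placeholder)
per_allele = mismatch_rate / 3  # assumption: errors split evenly over 3 alt bases
depth = 100                     # reads covering the site in tumorB (placeholder)
alt_reads = 2                   # reads supporting the variant allele in tumorB

p_value = prob_at_least(alt_reads, depth, per_allele)
print(f"P(>= {alt_reads} error reads at depth {depth}) = {p_value:.4g}")
```

A small probability suggests the supporting reads are unlikely to be errors alone, but the result is only as good as the error rate fed into it, which is exactly the number the question is trying to pin down.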

Answers

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)
    edited July 11

    Hi @amjadd

    Can you please post the variant record you are referring to, the version of Mutect2 you are using, and the entire command? Please also post the mismatch rate from Picard CollectAlignmentSummaryMetrics and the real error rate emitted by Mutect2, to help us answer this question.

  • amjadd (Finland; Member ✭✭)

    Hi @bhanuGandham
    Thanks for answering. Does Mutect2 emit any error rate? If it does, then that's what I am looking for. Please tell me where I can find it.
    I am using GATK v4.1.2, and the mismatch rate reported by CollectAlignmentSummaryMetrics is ~0.0025.

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)
    edited July 11

    Hi @amjadd

    I am checking with the dev team about that. In the meantime, would you please elaborate on what you mean by

    but the numbers don't seem to reflect the real error rate emitted by Mutect2

    What real error rate are you comparing to?

  • amjadd (Finland; Member ✭✭)

    @bhanuGandham Ah, now I get it. Basically I have multiple tumor samples per individual, so when I look at mutations that are absent from a given tumor sample, the mismatches at those sites seem to occur at a lower rate than what Picard has reported.

  • amjadd (Finland; Member ✭✭)

    To give an example, given an error rate of 0.0025 you would expect ~7.5 mismatches at a site covered by 3000 reads, and 7.5/3 = 2.5 reads supporting any one variant allele (assuming errors are spread evenly over the three alternative bases). What I see is mostly zeros in many calls.
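The arithmetic in this example can be checked with a short sketch. A rate of 0.0025 over 3000 reads works out to 7.5 mismatches in total, or 2.5 per alternative base if errors are spread evenly over the three non-reference bases (an assumption). Under the same simple binomial model, the chance of seeing no variant-supporting read at all is small, which is why "mostly zeros" points to a lower effective error rate. The function names are hypothetical helpers, not part of GATK or Picard.

```python
def expected_alt_reads(depth: int, mismatch_rate: float, n_alt_bases: int = 3) -> float:
    """Expected number of error reads supporting one specific alternative base."""
    return depth * mismatch_rate / n_alt_bases


def prob_zero_alt_reads(depth: int, mismatch_rate: float, n_alt_bases: int = 3) -> float:
    """P(no read shows a given alternative base), assuming independent errors."""
    per_allele = mismatch_rate / n_alt_bases
    return (1.0 - per_allele) ** depth


# Numbers quoted in the thread: Picard mismatch rate and an example depth.
expected = expected_alt_reads(3000, 0.0025)
p_zero = prob_zero_alt_reads(3000, 0.0025)

print(f"expected error reads per alt allele: {expected:.2f}")
print(f"probability of zero such reads:      {p_zero:.3f}")
```

If most sites at this depth show zero supporting reads, the per-allele error rate after Mutect2's filtering is probably well below 0.0025/3. Note that the Picard figure is an average over all aligned bases, so it need not match the rate at any particular site.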

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    @amjadd

    I checked with the developer. Mutect2 does not emit any error rate comparable to the one generated by CollectAlignmentSummaryMetrics; however, it does report information about read filtering on stdout.

  • amjadd (Finland; Member ✭✭)

    So there is currently no way to get that number? For example, could I make a new bam with all the Mutect read filters, and then run CollectAlignmentSummaryMetrics on it?

  • amjadd (Finland; Member ✭✭)

    Thank you for your answer @bhanuGandham

    Now the real question is: how do I produce a BAM file with the Mutect read filters applied?

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    @amjadd

    Can you please explain what you mean by "make a new bam with all the Mutect read filters"? I am not sure I understand the requirement for this.

  • amjadd (Finland; Member ✭✭)

    @bhanuGandham Sorry for the late reply. I meant the read filters that Mutect applies before calling variants and reporting the number of reads.

    From Mutect2 page:

    Read filters
    These Read Filters are automatically applied to the data by the Engine before processing by MuTect2.
    
        MalformedReadFilter
        BadCigarFilter
        UnmappedReadFilter
        NotPrimaryAlignmentFilter
        FailsVendorQualityCheckFilter
        DuplicateReadFilter
        MappingQualityUnavailableFilter
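One possible way to approximate this, offered as a sketch rather than a confirmed workflow: GATK4's PrintReads accepts `--read-filter` arguments, so a filtered BAM could be written first and then passed to CollectAlignmentSummaryMetrics. The filter names quoted above come from the GATK3 MuTect2 page. The GATK4 names in the code are assumed equivalents and should be checked against the tool documentation for the version in use. GATK4 Mutect2 may also apply additional default filters (for example a minimum mapping quality) that are not listed here. File names are placeholders.

```python
import shlex

# Assumed GATK4 counterparts of the GATK3 filters quoted above; verify them
# against the read filter documentation for your GATK version before use.
GATK4_READ_FILTERS = (
    "WellformedReadFilter",                # GATK3: MalformedReadFilter
    "GoodCigarReadFilter",                 # GATK3: BadCigarFilter
    "MappedReadFilter",                    # GATK3: UnmappedReadFilter
    "NotSecondaryAlignmentReadFilter",     # GATK3: NotPrimaryAlignmentFilter
    "PassesVendorQualityCheckReadFilter",  # GATK3: FailsVendorQualityCheckFilter
    "NotDuplicateReadFilter",              # GATK3: DuplicateReadFilter
    "MappingQualityAvailableReadFilter",   # GATK3: MappingQualityUnavailableFilter
)


def print_reads_command(in_bam, out_bam, read_filters=GATK4_READ_FILTERS):
    """Build a `gatk PrintReads` argument list that applies the given filters."""
    cmd = ["gatk", "PrintReads", "-I", in_bam, "-O", out_bam]
    for name in read_filters:
        cmd += ["--read-filter", name]
    return cmd


def metrics_command(bam, reference, out_txt):
    """Build a `gatk CollectAlignmentSummaryMetrics` argument list."""
    return ["gatk", "CollectAlignmentSummaryMetrics",
            "-R", reference, "-I", bam, "-O", out_txt]


if __name__ == "__main__":
    # Placeholder file names; print the commands instead of running them.
    for cmd in (
        print_reads_command("tumorB.bam", "tumorB.filtered.bam"),
        metrics_command("tumorB.filtered.bam", "reference.fasta", "tumorB.metrics.txt"),
    ):
        print(shlex.join(cmd))
```

The commands are only printed so they can be inspected first; passing each list to `subprocess.run(cmd, check=True)` would execute them. Even with matching read filters, the resulting mismatch rate would not account for Mutect2's local reassembly or its base quality thresholds, so it remains an approximation.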
    
  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @amjadd

    As I mentioned earlier, I don't think there is a way to do that with Mutect2. Take a look at the tools listed here to see whether any of them can provide the metrics you are looking for: https://software.broadinstitute.org/gatk/documentation/tooldocs/current/#DiagnosticsandQualityControl
