disable strand bias and clustered position filters, mutect

Liz10683, United States, Member

Hi, I would like to analyze a dataset consisting of RADseq (Restriction-site Associated DNA) tags from tumor and normal samples. With RADseq, the reads start at restriction enzyme cut sites in the genome, so two assumptions are violated: that mutations will be covered by reads from both directions, and that they will fall at staggered positions within the reads. Is there a way to override the strand bias and clustered position filters in the MuTect pipeline?

Answers

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie

    Hi Liz,

    Sorry for the very late reply. Generally speaking, the various filters used internally by MuTect can be disabled by setting their value to something that ends up having no effect. The difficulty is that the disabling value can be very different for different filters, and we have not yet documented them. For example, the strand bias filter can be disabled using --strand_artifact_lod -99999, but the same value won't work for the clustered position filter. We plan on improving this in future versions, but for now we recommend using text processing on the call stats output file to re-filter the variant calls. For example, in this case you would write a script that assigns the KEEP decision to any variant lines that have only clustered_read_position or strand_artifact in the failure_reasons field.
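The re-filtering script suggested above could look something like the following minimal sketch. It rests on assumptions that are not confirmed in this thread: that the call stats file is tab-delimited, that any leading comment lines start with `#`, that the columns are named `failure_reasons` and `judgement`, and that multiple failure reasons are comma-separated. The function name `refilter` is just an illustrative choice. Check these against your own output before relying on it.

```python
import sys

# Assumption: these are the reason names as they appear in failure_reasons.
IGNORABLE = {"clustered_read_position", "strand_artifact"}


def refilter(lines, ignorable=IGNORABLE):
    """Yield call stats lines, setting judgement to KEEP where every
    failure reason is in `ignorable`. All other lines pass through unchanged."""
    header = None
    for line in lines:
        stripped = line.rstrip("\n")
        # Pass comment lines (assumed to start with '#') and blank lines through.
        if stripped.startswith("#") or not stripped:
            yield stripped
            continue
        fields = stripped.split("\t")
        if header is None:
            # First non-comment line is taken to be the column header.
            header = fields
            # Raises ValueError if either column is missing from the file.
            reasons_idx = header.index("failure_reasons")
            judgement_idx = header.index("judgement")
            yield stripped
            continue
        # Assumption: multiple reasons are separated by commas.
        reasons = {r for r in fields[reasons_idx].split(",") if r}
        if reasons and reasons <= ignorable:
            fields[judgement_idx] = "KEEP"
        yield "\t".join(fields)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python refilter.py input_call_stats.txt output_call_stats.txt
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        for out_line in refilter(src):
            dst.write(out_line + "\n")
```

Lines rejected for any other reason, alone or in combination with these two, keep their original judgement. If the file has no `failure_reasons` column, the script stops with a `ValueError` rather than silently changing nothing.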

  • perrye, New Haven, CT, Member

    The output file that I generate with MuTect doesn't seem to have a failure_reasons field. Where can I find this information? My output has the following columns:
    contig position context ref_allele alt_allele tumor_name normal_name
    score dbsnp_site covered power tumor_power normal_power
    total_pairs improper_pairs map_Q0_reads t_lod_fstar tumor_f
    contaminant_fraction contaminant_lod
    t_ref_count t_alt_count t_ref_sum t_alt_sum t_ref_max_mapq t_alt_max_mapq t_ins_count t_del_count
    normal_best_gt init_n_lod n_ref_count n_alt_count n_ref_sum n_alt_sum judgement

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie

    Hey @perrye, we'll answer you tomorrow in the other discussion where you commented. In future, please try not to double-post closely related questions. It makes it harder for us to manage the forum and make sure everyone gets the answers they need. We answer everything eventually -- but sometimes we need to look things up so you need to be a little patient.

  • Hi!
    I have the same issue with somatic data from amplicon sequencing, where these two filters should not be applied. I have installed the new version (4.5) with MuTect2.
    Does this new version help with enabling or disabling individual filters? I looked for parameters that do so but could not find any.

    Thank you for your answer.
    Manon

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie

    I assume you mean GATK 3.5, unless you have access to a time machine :)

    No, at this time it is not possible to disable MuTect2's internal filters; but note that the filters have changed somewhat. Have a look at this presentation for an overview of what has changed.

  • manon_sourdeix, France, Member

    Just saw that you answered this thread! I meant GATK 3.5 :)

    Thank you for the presentation, I will have a read!

  • @Geraldine_VdAuwera said: Generally speaking, the various filters used internally by MuTect can be disabled by setting their value to something that ends up having no effect. The difficulty is that the disabling value can be very different for different filters, and we have not yet documented them. For example, the strand bias filter can be disabled using --strand_artifact_lod -99999, but the same value won't work for the clustered position filter. We plan on improving this in future versions, but for now we recommend using text processing on the call stats output file to re-filter the variant calls. For example, in this case you would write a script that assigns the KEEP decision to any variant lines that have only clustered_read_position or strand_artifact in the failure_reasons field.

    Hi Geraldine,
    Sorry for repeating the same question, but maybe there have been some changes since this was originally posted.
    I am not sure which MuTect version @Liz10683 used; I am currently using MuTect 1.1.7 (the latest MuTect1 version I could find).
    The filters I am trying to disable or loosen are also the ones related to clustered positions ("--pir_median_threshold" and "--pir_mad_threshold").
    Thank you for your help!
    Eugenie
