Removing low-quality reads the way MuTect does

Before variant calling, MuTect first removes low-quality reads; see the short read pre-processing section at nature.com/nbt/journal/v31/n3/extref/nbt.2514-S1.pdf. I want to apply this pre-processing method to my own BAM files and have been trying to implement it in Perl, but I don't know how to program these two rules:

(c) if there is an overlapping read pair, and both reads agree the read with the highest quality score is retained otherwise both are discarded.

(b) if there is an overlapping read pair, and both reads agree the read with the highest quality score is retained otherwise the read that disagrees with the reference is retained.

Can anybody help me understand them? Thanks very much for your help!

Best,
Jing

Answers

  • jingmeng (Australia, Member)

    @Sheila said:
    @jingmeng
    Hi Jing,

    I hope this example will help you.

    Position       123456789
    Reference is   ATGCATGCA
    ForwardRead is ATGCAT
    ReverseRead is    CTTGCA

    The forward read covers positions 1 to 6 of the reference and the reverse read covers positions 4 to 9, so positions 4, 5 and 6 overlap in the two reads. Notice that at position 4 both reads agree the base is a C. But at position 5, the forward read shows an A while the reverse read shows a T.

    For your first case (c), at position 4, the read which has the higher quality C base will be used. At position 5, neither read will be used.

    For your second case (b), at position 4, the read which has the higher quality C base will be used. At position 5, the reverse read will be used because its T mismatches the reference base A.

    I hope this helps.

    Sheila
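
Sheila's per-position walk-through can be sketched in Python as follows. This is only an illustration of the two rules as she explains them, not MuTect's actual implementation. The function names (`place`, `resolve_overlap`), the policy labels, the base qualities, and the tie-breaking choices are all assumptions made for the example. In particular, the quoted rules do not say what happens on a quality tie, or when the two reads disagree with each other and both also differ from the reference, so the sketch keeps the forward read on a tie and discards both reads in the latter case.

```python
REFERENCE = "ATGCATGCA"


def place(seq, start, quals):
    """Map 1-based reference positions to (base, quality) for one read.

    Assumes an ungapped alignment beginning at `start` (hypothetical helper).
    """
    return {start + i: (b, q) for i, (b, q) in enumerate(zip(seq, quals))}


def resolve_overlap(reference, fwd, rev, on_disagree):
    """Decide, per overlapping position, which read's base is used.

    on_disagree is "discard" (case c) or "keep_nonref" (case b).
    Returns {position: (read_name, base)}, or None where nothing is kept.
    """
    if on_disagree not in ("discard", "keep_nonref"):
        raise ValueError("unknown policy: " + on_disagree)
    kept = {}
    for pos in sorted(fwd.keys() & rev.keys()):
        (f_base, f_qual), (r_base, r_qual) = fwd[pos], rev[pos]
        ref_base = reference[pos - 1]
        if f_base == r_base:
            # Reads agree: keep the higher quality base.
            # Assumption: a tie goes to the forward read.
            if f_qual >= r_qual:
                kept[pos] = ("forward", f_base)
            else:
                kept[pos] = ("reverse", r_base)
        elif on_disagree == "discard":
            kept[pos] = None  # case (c): neither read is used
        elif f_base == ref_base:
            kept[pos] = ("reverse", r_base)  # case (b): reverse mismatches ref
        elif r_base == ref_base:
            kept[pos] = ("forward", f_base)  # case (b): forward mismatches ref
        else:
            # Assumption: both differ from the reference, so discard both.
            kept[pos] = None
    return kept


# Base qualities below are made up for illustration.
forward = place("ATGCAT", 1, [30, 30, 30, 30, 30, 30])
reverse = place("CTTGCA", 4, [35, 30, 20, 30, 30, 30])

for policy in ("discard", "keep_nonref"):
    print(policy, resolve_overlap(REFERENCE, forward, reverse, policy))
```

With these made-up qualities, position 4 goes to the reverse read in both cases because its C has the higher quality. Position 5 is dropped under case (c) and taken from the reverse read under case (b). For real BAM files, the bases, qualities, and reference positions would have to come from the aligned records, taking CIGAR operations into account, which this sketch leaves out.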

    Hi Sheila,

    Thanks very much. Your reply is very clear.

    Jing
