Haplotype Caller

System Administrator
This discussion was created from comments split from: New to the forum? Ask your questions here!.

Comments

  • lucas_rocm AMNH, USA Member
    edited December 3

    [HARD FILTER QUESTION]

    Hi there,

    I am calling SNPs (HaplotypeCaller, GATK3) in a sample of 70 low-coverage (3-5x) genomes of a non-model organism. When I plot the distribution of QD, I get a really odd distribution (see attached; please remove the spaces from the link) (h t t p s://us.v-cdn.net/5019796/uploads/editor/lj/cepi62d878al.png). I'm not sure which filter value to choose (2 does not seem stringent enough). Do you have any advice?
    P.S. this is prior to BQSR.

    Thanks for the help.

  • bhanuGandham Member, Administrator, Broadie, Moderator

    Hi @lucas_rocm

    Because this is low-coverage data, we do expect to see this kind of distribution. Choosing a filter value is a judgment call and takes some trial and error. You want to remove as many bad variant calls as possible without losing good data. Try making the QD threshold more stringent (maybe around 4), see how many variants that filters out, and then make a call based on that.
    Without looking at the data myself, that is unfortunately all I can suggest.

    Regards
    Bhanu
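
The trial-and-error step suggested above (tighten the QD cutoff, then check how many variants it removes) could be sketched roughly as follows. This is only an illustration, not a GATK tool: the helper names `extract_qd` and `count_filtered` are hypothetical, the example records are made up, and it assumes an uncompressed VCF whose INFO column carries a `QD=` annotation. The thresholds 2 and 4 come from the thread; 6 is added purely for comparison.

```python
import re

# Match the QD annotation only at the start of INFO or right after a ';',
# so that a key merely ending in "QD" is not picked up by mistake.
QD_PATTERN = re.compile(r"(?:^|;)QD=([^;]+)")


def extract_qd(vcf_lines):
    """Collect QD values from the INFO column (8th field) of VCF records."""
    values = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        match = QD_PATTERN.search(fields[7])
        if match is None:
            continue  # site has no QD annotation
        try:
            values.append(float(match.group(1)))
        except ValueError:
            continue  # non-numeric QD value, ignore
    return values


def count_filtered(qd_values, thresholds):
    """For each threshold, count sites a 'QD < threshold' filter would remove."""
    return {t: sum(1 for qd in qd_values if qd < t) for t in thresholds}


# Made-up records for illustration only (not from the poster's data).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\t.\tDP=10;QD=1.5",
    "chr1\t200\t.\tC\tT\t80\t.\tDP=12;QD=3.2",
    "chr1\t300\t.\tG\tA\t900\t.\tDP=30;QD=25.0",
    "chr1\t400\t.\tT\tC\t40\t.\tQD=5.1;DP=8",
]

qd = extract_qd(example)
print(count_filtered(qd, [2, 4, 6]))  # prints {2: 1, 4: 2, 6: 3}
```

On a real call set, the same two functions could be run over the lines of the VCF file, and the counts compared against the total number of sites to see how much data each cutoff would cost.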
