fcuisine

Hello,
I see that the old documentation for GATK 1.6 is no longer available (http://gatkforums.broadinstitute.org/discussion/1615/old-doc). I was hoping to find out how the UnifiedGenotyper handled zero-quality mappings. The most recent documentation for 2.x says there are
"internal quality control metrics (like MAPQ > 17, for example)"

I did not see a command-line argument for a minimum mapping quality when I ran UnifiedGenotyper (GATK v1.6-7-g2be5704), and I wanted to confirm that the UnifiedGenotyper in that version ignores bases from reads with zero mapping quality in the genotype calculations (and perhaps those with mapping quality <= 17 as well). I confess I'm also unsure whether bases from zero-mapping-quality reads are included in the VCF DP field for INFO and/or FORMAT.

Sorry for the extensive question, and thanks for the forum, and for any help you could give.

Answers

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    Hi there,

    The UG will indeed ignore bases from reads with zero mapping quality. This is implemented using read filters: each GATK tool, including the UG, applies certain read filters by default (they are listed in the console output when you run the tool), which filter out things like duplicate reads, unmapped reads and so on. You can further control which reads are seen by the tool by adding or removing read filters, or by modifying the threshold values where applicable. In your case, you may be interested in using the mapping quality read filter:

    http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_filters_MappingQualityFilter.html
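To make the idea of a read filter concrete, here is a minimal Python sketch. It is not GATK code: the `Read` class, the function names, and the example reads are all hypothetical, and the "keep reads at or above the threshold" rule is an assumption about how such a filter behaves. The threshold of 10 is the default that the linked documentation page reports for the MappingQualityFilter; it may differ in other GATK versions.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Read:
    """Hypothetical stand-in for an aligned read; only the fields used here."""
    name: str
    mapq: int  # mapping quality as stored in the BAM record


def mapping_quality_filter(reads: Iterable[Read], min_mapq: int = 10) -> List[Read]:
    """Keep reads whose mapping quality is at least min_mapq (assumed rule)."""
    return [r for r in reads if r.mapq >= min_mapq]


def count_mq0(reads: Iterable[Read]) -> int:
    """Count reads with mapping quality zero, in the spirit of an MQ0 tally."""
    return sum(1 for r in reads if r.mapq == 0)


# Made-up reads covering one site, for illustration only.
pileup = [Read("r1", 0), Read("r2", 9), Read("r3", 10), Read("r4", 60)]

kept = mapping_quality_filter(pileup)
print([r.name for r in kept])  # reads that would be passed on to the tool
print(count_mq0(pileup))       # zero-quality reads among the unfiltered input
```

The point of the sketch is only that filtering happens before the tool sees the reads, so a read removed by a filter cannot contribute to whatever the tool computes downstream. Counting zero-quality reads is shown as a separate step on the unfiltered input; whether GATK tallies MQ0 that way is not something this sketch establishes.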

  • fcuisine (Member)

    Geraldine_VdAuwera, thank you very much for your help.

    Though I think that answers my question, I should have clarified that I was looking at VCFs from UG runs I did some time ago, using GATK 1.6. Looking at the console output (stdout was written to file), I don't see a list of filters with defaults. Near the end of the output, counts were given for reads "failing BadMateFilter" and reads "failing UnmappedReadFilter." I also checked the VCF file header, but did not see the filters there, either.

    Sorry to trouble you further, but I hoped I might just confirm (in case I misunderstood your answer) that, in those "old" UG (GATK v1.6) runs already done, the default read filters kept bases from zero-mapping-quality reads out of the genotyping (and the DP counts, too, I assume). Another item that made me want to confirm was the inclusion of MQ0 counts (a useful number) in the VCF INFO fields, which I assume does not imply that zero-mapping-quality bases were used in genotyping.

    I also see that the info at the link you provided notes that the MappingQualityFilter has a default threshold of 10. I wonder if you could let me know if that holds for the older version I used, GATK v1.6-7-g2be5704, or if a different value was in use.

    Again, thanks very much, and apologies if my questions are too numerous for the forum format.

  • fcuisine (Member)

    Thanks again, your help is much appreciated.
