Looking for picard per_target_coverage output field definitions

lracacho (Ottawa; Member)

Hello, I ran Picard CollectHsMetrics with the optional PER_TARGET_COVERAGE output and I am looking for the definitions of the following fields: "%gc", "mean_coverage", "normalized_coverage" (the definition for this field was already posted), "min_normalized_coverage", "max_normalized_coverage", "min_coverage", "max_coverage", "pct_0x" and "read_count".
Cheers
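
For readers who want to inspect these columns programmatically, here is a minimal Python sketch that reads a per-target coverage file as tab-separated text. The sample rows are made up for illustration, and the header is an assumption based on the column names listed in the question (plus chrom, start, end, length and name); check it against your own output before relying on it. The helper names `read_per_target` and `targets_with_gaps` are hypothetical.

```python
import csv
import io

# Made-up example content; a real file would be the PER_TARGET_COVERAGE output.
# The header below is assumed, not copied from Picard documentation.
SAMPLE = (
    "chrom\tstart\tend\tlength\tname\t%gc\tmean_coverage\tnormalized_coverage\t"
    "min_normalized_coverage\tmax_normalized_coverage\tmin_coverage\tmax_coverage\t"
    "pct_0x\tread_count\n"
    "chr1\t1000\t1599\t600\ttargetA\t0.45\t24.0\t1.2\t0.6\t1.6\t12\t32\t0\t192\n"
    "chr1\t5000\t5199\t200\ttargetB\t0.62\t3.5\t0.175\t0\t0.5\t0\t10\t0.4\t9\n"
)

# Columns converted to float; the rest stay as strings.
NUMERIC = (
    "%gc", "mean_coverage", "normalized_coverage",
    "min_normalized_coverage", "max_normalized_coverage",
    "min_coverage", "max_coverage", "pct_0x", "read_count",
)


def read_per_target(handle):
    """Parse a tab-separated per-target coverage table into a list of dicts."""
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        row = dict(rec)
        for key in NUMERIC:
            row[key] = float(rec[key])
        rows.append(row)
    return rows


def targets_with_gaps(rows):
    """Return the names of targets whose pct_0x value is above zero."""
    return [r["name"] for r in rows if r["pct_0x"] > 0]


rows = read_per_target(io.StringIO(SAMPLE))
for r in rows:
    print(r["name"], r["min_coverage"], r["max_coverage"], r["read_count"])
```

To use it on a real file, replace `io.StringIO(SAMPLE)` with `open(path, newline="")`.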

Answers

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @lracacho,

    Here is a link to the document with the metrics definitions you are looking for: https://broadinstitute.github.io/picard/picard-metric-definitions.html

  • lracacho (Ottawa; Member)

    Thank you for the quick reply!

    I have previously looked at the picard-metric-definitions page, but I could not find the field definitions specifically for the "per_target_coverage" output. In addition to the regular HsMetrics output, per_target_coverage and per_base_coverage are optional outputs.

    I was able to locate a similar field to "pct_0x". Under the heading for tool TargetedPcrMetrics, ZERO_CVG_TARGETS_PCT is defined as "The fraction of targets that did not reach coverage=1 over any base." Is this the same definition used for the per_target_coverage field "pct_0x"?

    Cheers

  • lracacho (Ottawa; Member)

    Thank you! I have one more question: what is the relationship between max_coverage and read_count? For example, for one target region I have a read_count of 192, a min_coverage of 12 and a max_coverage of 32. So read_count can't be the average coverage across the target region.

  • AdelaideR (Unconfirmed, Member, Broadie, Moderator)

    The coverage is relevant to the "active region" as defined by the HaplotypeCaller.

    MAX_TARGET_COVERAGE: The maximum coverage of reads that mapped to target regions of an experiment. This can be affected by two other parameters, MINIMUM_MAPPING_QUALITY (default=20) and MINIMUM_BASE_QUALITY (default=20): reads with a low mapping quality and bases with a low base quality are not considered in the target coverage calculations. Is your mapping quality lower for some samples in this region?

    Take a look at this conversation to see if that answers the question.
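
To illustrate why a read count can be much larger than the maximum coverage, and how a mapping-quality threshold can change the numbers, here is a toy Python sketch. It is not Picard's implementation: reads are simplified to hypothetical `(start, end, mapq)` tuples; base quality, clipping and duplicate handling are ignored; and whether Picard applies the same filter to read_count is not something this sketch settles. The function name `target_stats` and the pct_0x interpretation (fraction of target bases with zero depth) are assumptions.

```python
def target_stats(reads, target_start, target_end, min_mapq=20):
    """Toy per-target summary.

    reads: iterable of (start, end, mapq) with 1-based inclusive coordinates.
    Reads below min_mapq are skipped entirely (an assumption of this sketch).
    """
    length = target_end - target_start + 1
    depth = [0] * length          # per-base depth across the target
    read_count = 0                # reads overlapping the target at all
    for start, end, mapq in reads:
        if mapq < min_mapq:
            continue
        lo = max(start, target_start)
        hi = min(end, target_end)
        if lo > hi:               # read does not overlap the target
            continue
        read_count += 1
        for pos in range(lo, hi + 1):
            depth[pos - target_start] += 1
    return {
        "read_count": read_count,
        "min_coverage": min(depth),
        "max_coverage": max(depth),
        "mean_coverage": sum(depth) / length,
        "pct_0x": depth.count(0) / length,
    }


# Made-up data: 100 bp reads tiled twice over a 600 bp target (12 reads),
# plus three low-MAPQ reads stacked on the first 100 bases.
tiled = [(s, s + 99, 60) for s in range(1, 600, 100) for _ in range(2)]
noisy = tiled + [(1, 100, 5)] * 3

filtered = target_stats(noisy, 1, 600)                # default min_mapq=20
unfiltered = target_stats(noisy, 1, 600, min_mapq=0)  # keep every read
print(filtered)
print(unfiltered)
```

In the filtered case, 12 reads give a depth of only 2 at every base, because each read covers just a sixth of the target. By the same arithmetic, a count of 192 reads alongside per-base depths between 12 and 32 would be unsurprising for a target several times longer than the reads.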