
DepthOfCoverage for whole genome: skipped few regions

Neelam, Blacksburg, Member

I used the following command to get coverage data for the entire genome:
java -Xmx2g -jar ~/GenomeAnalysisTK-2.5-2-gf57256b/GenomeAnalysisTK.jar -T DepthOfCoverage -R ~/Gmax_189.fa -o DoCov_dedup -I dedup.bam.list

Coverage was not reported for 36025109 bases. Many of the skipped bases are adjacent to one another, forming large contiguous regions. I do not understand why these regions were skipped. If no reads mapped there, I would expect the coverage value to be reported as 0.

Appreciate some insight into this issue! Thank you.
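For reference, the skipped stretches can be listed with a short script. This is only a sketch, and it assumes things not stated above: that the per-locus output is tab-separated with a first column of the form `contig:position`, that rows are sorted by position within each contig, and that contig lengths are known (for instance from the reference's `.fai` index). The function name `missing_runs` is made up for illustration.

```python
def missing_runs(rows, contig_lengths):
    """Yield (contig, start, end) for 1-based inclusive runs with no row.

    rows: iterable of per-locus lines, assumed to start with "contig:pos".
    contig_lengths: dict mapping contig name to its length in bases.
    """
    last_seen = {}  # contig -> last position that had a row
    for line in rows:
        locus = line.split("\t", 1)[0].strip()
        if ":" not in locus:
            continue  # header line or blank line
        contig, pos = locus.rsplit(":", 1)
        pos = int(pos)
        prev = last_seen.get(contig, 0)
        if pos > prev + 1:
            yield contig, prev + 1, pos - 1  # gap between consecutive rows
        last_seen[contig] = pos
    # Positions after the last reported row of each contig are also missing.
    for contig, length in contig_lengths.items():
        prev = last_seen.get(contig, 0)
        if prev < length:
            yield contig, prev + 1, length


if __name__ == "__main__":
    demo = ["Locus\tTotal_Depth", "chr1:1\t5", "chr1:2\t0", "chr1:5\t3"]
    for contig, start, end in missing_runs(demo, {"chr1": 6}):
        print(f"{contig}:{start}-{end}")
```

Comparing the reported runs against the reference sequence at those coordinates should show what the skipped positions have in common.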

Answers

  • Neelam, Blacksburg, Member

    Thank you. Those are all runs of ambiguous bases. What happens with a single ambiguous base in the middle of canonical bases?

  • Neelam, Blacksburg, Member

    I found that DepthOfCoverage does report ambiguous bases in the coverage output file.
    Attached are snapshots of (1) the coverage output for a small region on chromosome 1:
    image

    and (2) the alignments in this region:
    image

    Position 21956 is the start of an ambiguous region more than 8K long in the reference genome.
    How can I remove these positions from the depth output file without also removing the regions that have no mapped reads?
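One possible post-processing approach, independent of GATK itself, is to find the runs of `N` in the reference FASTA and drop only those rows from the depth file. The sketch below assumes a plain-text FASTA and a tab-separated per-locus file whose first column looks like `contig:position`; the function names `n_runs` and `drop_ambiguous` are hypothetical helpers, not part of any GATK tool. Zero-depth rows at canonical bases are kept.

```python
import bisect
import re


def n_runs(fasta_lines):
    """Return {contig: [(start, end), ...]}, 1-based inclusive runs of N/n."""
    runs = {}
    contig, offset = None, 0
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            contig = line[1:].split()[0]  # name is the first word of the header
            offset = 0
            runs[contig] = []
            continue
        if not line or contig is None:
            continue
        for m in re.finditer(r"[Nn]+", line):
            start, end = offset + m.start() + 1, offset + m.end()
            if runs[contig] and runs[contig][-1][1] == start - 1:
                # Run continues from the previous FASTA line: merge.
                runs[contig][-1] = (runs[contig][-1][0], end)
            else:
                runs[contig].append((start, end))
        offset += len(line)
    return runs


def drop_ambiguous(depth_lines, runs):
    """Yield depth lines whose locus does not fall inside an N run."""
    starts = {c: [s for s, _ in r] for c, r in runs.items()}
    for line in depth_lines:
        locus = line.split("\t", 1)[0].strip()
        if ":" in locus:  # data row; header lines pass through unchanged
            contig, pos = locus.rsplit(":", 1)
            pos = int(pos)
            # Find the last run starting at or before this position.
            i = bisect.bisect_right(starts.get(contig, []), pos) - 1
            if i >= 0 and pos <= runs[contig][i][1]:
                continue  # reference base is ambiguous, skip the row
        yield line


if __name__ == "__main__":
    fasta = [">chr1 demo", "ACGNN", "NNACG"]
    depth = ["Locus\tTotal_Depth", "chr1:3\t7", "chr1:4\t0", "chr1:8\t0"]
    for row in drop_ambiguous(depth, n_runs(fasta)):
        print(row)
```

Because the lookup is by reference base rather than by depth value, a single ambiguous base in the middle of canonical bases is treated the same way as a long run.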
