DepthOfCoverage for whole genome: a few regions skipped

Neelam (Blacksburg, Member)

I used the following command to get coverage data for the entire genome:
java -Xmx2g -jar ~/GenomeAnalysisTK-2.5-2-gf57256b/GenomeAnalysisTK.jar -T DepthOfCoverage -R ~/Gmax_189.fa -o DoCov_dedup -I dedup.bam.list

Coverage was not reported for 36,025,109 bases. Many of the skipped bases are adjacent to one another, forming large contiguous regions. I do not understand why these regions were skipped. Normally, if no reads map to a position, I would expect its coverage to be reported as 0.

I would appreciate any insight into this issue. Thank you.

Answers

  • Neelam (Blacksburg, Member)

    Thank you. Those are all strings of ambiguous bases. What happens with a single ambiguous base in the middle of canonical bases?
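
One way to check both cases (long stretches and isolated ambiguous bases) is to scan the reference FASTA directly. The following is a minimal sketch, not part of GATK: the function name `find_ambiguous_runs` is hypothetical, and it assumes that any character other than A, C, G, or T counts as ambiguous, which may or may not match what DepthOfCoverage treats as ambiguous.

```python
def find_ambiguous_runs(fasta_lines, canonical="ACGT"):
    """Yield (contig, start, end) for every run of non-canonical bases.

    Coordinates are 1-based and inclusive. A single ambiguous base is
    reported as a run with start == end.
    Assumption: anything outside `canonical` (N, R, Y, ...) is ambiguous.
    """
    contig = None
    pos = 0            # 1-based position of the last base read on this contig
    run_start = None   # start of the run currently open, if any
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Close a run that reaches the end of the previous contig.
            if run_start is not None:
                yield contig, run_start, pos
                run_start = None
            contig = line[1:].split()[0]
            pos = 0
            continue
        for base in line.upper():
            pos += 1
            if base in canonical:
                if run_start is not None:
                    yield contig, run_start, pos - 1
                    run_start = None
            elif run_start is None:
                run_start = pos
    # Close a run that reaches the end of the file.
    if run_start is not None:
        yield contig, run_start, pos


if __name__ == "__main__":
    import sys
    # Usage (path is whatever reference you passed to -R):
    #   python find_ambiguous_runs.py Gmax_189.fa > ambiguous.intervals
    with open(sys.argv[1]) as handle:
        for name, start, end in find_ambiguous_runs(handle):
            print(f"{name}:{start}-{end}")
```

The output uses the `contig:start-end` interval notation, so the same list can be compared against the skipped positions, or possibly supplied as an exclusion list (for example through `-XL`, if your GATK version supports it).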

  • Neelam (Blacksburg, Member)

    I found that DepthOfCoverage does report ambiguous bases in the coverage output file.
    Attached are snapshots of (1) the coverage output for a small region on chromosome 1:
    image

    and (2) the alignments in this region:
    image

    Position 21956 is the start of an ambiguous region that is more than 8 kb long in the reference genome.
    How can I remove these positions from the depth output file without also removing the regions that simply have no mapped reads?
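
One possible approach is to post-filter the per-locus output against a list of known ambiguous intervals, so that zero-coverage positions over canonical bases are kept. The sketch below is an illustration, not a GATK feature. It assumes the per-locus file is tab-separated, starts with a header line, and has the locus in the first column as `contig:position`. The function name `drop_ambiguous_loci`, the contig name `Gm01`, and the interval end used in the example are all hypothetical.

```python
def drop_ambiguous_loci(depth_lines, ambiguous):
    """Yield depth lines whose locus is outside every ambiguous interval.

    depth_lines: iterable of lines from the per-locus output
                 (assumed: header first, then 'contig:pos<TAB>...').
    ambiguous:   dict mapping contig -> list of (start, end),
                 1-based inclusive.
    Rows with zero depth over canonical bases are kept unchanged.
    """
    lines = iter(depth_lines)
    header = next(lines, None)
    if header is None:
        return
    yield header
    for line in lines:
        locus = line.split("\t", 1)[0]
        # rpartition tolerates contig names that themselves contain ':'
        contig, _, pos_text = locus.rpartition(":")
        pos = int(pos_text)
        intervals = ambiguous.get(contig, ())
        if any(start <= pos <= end for start, end in intervals):
            continue  # locus lies inside an ambiguous stretch: drop it
        yield line


if __name__ == "__main__":
    # Illustrative data only; the interval end is made up for the example.
    example = [
        "Locus\tTotal_Depth\n",
        "Gm01:21954\t12\n",
        "Gm01:21955\t0\n",   # no mapping, canonical base: kept
        "Gm01:21956\t0\n",   # inside the ambiguous region: dropped
    ]
    regions = {"Gm01": [(21956, 30000)]}
    for row in drop_ambiguous_loci(example, regions):
        print(row, end="")
```

The linear scan over intervals is kept for readability. For a whole genome with many intervals, sorting them and using a binary search (for example with the `bisect` module) would likely be faster.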