Trying to identify low-coverage regions (bases) in my sample from a target BED file and a BAM file.

I have a BED file of my targets (start and stop coordinates of each exon of my genes of interest) and a BAM file generated from my Ion PGM run. I am trying to identify any location in my BED file where I have less than 20x coverage, so that I can fill in those regions with traditional Sanger sequencing.

Basically, if any bases within an exon are covered at less than 20x, I want to know which ones.

Also, if I add a descriptive column to the BED file containing “gene-exon”, it would be even more helpful if that label could be included in the report. I have been reading posts online for days, and I am lost here. Can anybody help, please?





  Kurt (Member)

    I'm not sure there is any one tool that does exactly what you want. I had to do something similar for a project, but with Illumina data. I start with DepthOfCoverage in GATK to generate a per-base depth report, then merge in the annotation from the source BED file.

    DepthOfCoverage has multiple possible outputs, but one of them is the per-base depth for everything represented in the BED file.
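
The merge step described above could look roughly like the Python sketch below. It is not the actual script used in that project. It assumes the per-base depth has already been reduced to a tab-separated table of chromosome, 1-based position, and depth (the real DepthOfCoverage output layout may differ and would need adapting), and that the BED file carries a "gene-exon" label in its fourth column. The function names and the `Target` tuple are made up for illustration.

```python
from collections import namedtuple

# Hypothetical record for one BED line; "name" holds the gene-exon label.
Target = namedtuple("Target", "chrom start end name")


def parse_bed(lines):
    """Parse BED lines. The 4th column (e.g. 'GENE1-exon1') is optional."""
    targets = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        name = fields[3] if len(fields) > 3 else "."
        targets.append(Target(fields[0], int(fields[1]), int(fields[2]), name))
    return targets


def parse_depth(lines):
    """Parse 'chrom<TAB>pos<TAB>depth' lines (assumed layout, 1-based pos)."""
    depth = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        chrom, pos, d = line.split("\t")[:3]
        depth[(chrom, int(pos))] = int(d)
    return depth


def low_coverage_runs(targets, depth, min_depth=20):
    """Yield (chrom, first_pos, last_pos, name, lowest_depth) for every run
    of consecutive target bases covered below min_depth.

    Positions are 1-based and inclusive. Bases absent from the depth table
    are treated as depth 0.
    """
    for t in targets:
        run_start = None
        run_min = None
        # BED is 0-based half-open, so the target covers start+1 .. end.
        for pos in range(t.start + 1, t.end + 1):
            d = depth.get((t.chrom, pos), 0)
            if d < min_depth:
                if run_start is None:
                    run_start, run_min = pos, d
                else:
                    run_min = min(run_min, d)
            elif run_start is not None:
                yield (t.chrom, run_start, pos - 1, t.name, run_min)
                run_start = None
        if run_start is not None:
            yield (t.chrom, run_start, t.end, t.name, run_min)
```

Feeding it the parsed BED and depth tables, for example `list(low_coverage_runs(parse_bed(open("targets.bed")), parse_depth(open("depth.tsv"))))` with placeholder file names, gives one tuple per stretch below 20x, each tagged with its gene-exon label, which is the list of regions to fill in by Sanger sequencing.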

  Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    DepthOfCoverage is a good start; you can also look at DiagnoseTargets to narrow down your list of intervals first.

