CoverageBySample tool

redred Member Posts: 12
edited February 2013 in Ask the GATK team

I am also trying to check the coverage at each position of my reference using the CoverageBySample tool (with and without the -L argument):

java -Xmx30g -jar GenomeAnalysisTK.jar \
-T CoverageBySample \
-R ref.fasta \
-I input.bam \
-o output.cov

The output (below) gives the right coverage, but it does not include the positions on the reference and it skips all positions with no coverage. Is there any way to get these positions in the output file?

eo78       10
eo78       10
eo78       10
eo78       10
eo78       10
eo78       11
eo78       12
eo78       12
eo78       12


  • Geraldine_VdAuwera Administrator, Dev Posts: 11,127

    Do you not get ref position info when you use an intervals list?

    Anyway this is a fairly primitive tool -- you would be better off using DepthOfCoverage or DiagnoseTargets.

    Geraldine Van der Auwera, PhD

  • jamesrpriest Member Posts: 11

    FYI: CoverageBySample does not appear to be compatible with ReduceReads. I have been using this tool to calculate read depths over my exome intervals:

    java -Xmx2g -jar /home/jpriest/priest_apps/GATKv2.3.9/GenomeAnalysisTK.jar \
    -T CoverageBySample \
    -I "$i" \
    -R /home/jpriest/reference_genomes/hg19/ucsc.hg19.fasta \
    -L /home/jpriest/reference_genomes/exome_intervals/ \
    -o "$i".basecoverage.txt

    awk '{print $2, $1}' "$i".basecoverage.txt | sort -n | awk 'BEGIN{c=0;sum=0;}{a[c++]=$1;sum+=$1;}END{ave=sum/c;if((c%2)==1){median=a[int(c/2)];}else{median=(a[c/2]+a[c/2-1])/2;}print sum," ",c," ",ave," ",median}' > "$i".coveragestats.txt

    However, the mean read depth is much lower after ReduceReads is applied:

    file 1: before RR 68.2454 --> after RR 7.69979

    file 2: before RR 205.4 --> after RR 6.95658

    Are there any suggestions for making accurate read depth calculations on ReduceReads-compressed BAM files?
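For readers who want to reuse the summary step from the post above, here is a minimal sketch of the same calculation (sum, count, mean, median of the depth column) wrapped in a shell function. The function name `coverage_stats` is hypothetical, and the sketch assumes the two-column layout shown earlier in this thread (sample name, then depth). It only summarises whatever depths it is given, so it does not correct the lower values reported for reduced BAM files.

```shell
# Summarise per-base depth: prints "sum count mean median".
# Assumes two whitespace-separated columns: sample name, then depth.
# coverage_stats is a hypothetical helper name, not part of GATK.
coverage_stats() {
    awk '{ print $2 }' "$@" | sort -n | awk '
        { depth[n++] = $1; sum += $1 }
        END {
            if (n == 0) { print "0 0 NA NA"; exit }
            if (n % 2 == 1) median = depth[int(n / 2)]
            else            median = (depth[n / 2] + depth[n / 2 - 1]) / 2
            print sum, n, sum / n, median
        }'
}

# Example input: the nine depth values from the first post in this thread.
printf 'eo78 %s\n' 10 10 10 10 10 11 12 12 12 | coverage_stats
# prints: 97 9 10.7778 10
```

The function reads from standard input when called without arguments, or from one or more files, for example `coverage_stats "$i".basecoverage.txt`. Positions that the tool skipped because they had no coverage are not in the input, so the mean and median describe covered positions only.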

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,127

    Hi James,

    That's correct, this tool is not adapted to use reduced data. This is an older tool that is no longer updated and we plan to retire it soon, so you should switch to using DiagnoseTargets instead.

    Geraldine Van der Auwera, PhD
