# CoverageBySample tool

I am also trying to check the coverage at each position of my reference using the CoverageBySample tool (with and without the -L argument):

java -Xmx30g -jar GenomeAnalysisTK.jar \
-T CoverageBySample \
-R ref.fasta \
-I input.bam \
-o output.cov


The output (below) gives the right coverage, but it omits the positions on the reference and skips every position with no coverage. Is there any way to get these positions into the output file?

eo78       10
eo78       10
eo78       10
eo78       10
eo78       10
eo78       11
eo78       12
eo78       12
eo78       12

Do you not get ref position info when you use an intervals list?

Anyway, this is a fairly primitive tool; you would be better off using DepthOfCoverage or DiagnoseTargets.

FYI: CoverageBySample does not appear to be compatible with ReduceReads. I have been using this tool to calculate read depths over my exome intervals:

java -Xmx2g -jar /home/jpriest/priest_apps/GATKv2.3.9/GenomeAnalysisTK.jar \
-T CoverageBySample \
-I "$i" \
-R /home/jpriest/reference_genomes/hg19/ucsc.hg19.fasta \
-L /home/jpriest/reference_genomes/exome_intervals/SeqCap_EZ_Exome_v2.samformat.new.intervals \
-o "$i".basecoverage.txt

awk '{print $2,$1}' "$i".basecoverage.txt | sort -n | awk 'BEGIN{c=0;sum=0;}{a[c++]=$1;sum+=$1;}END{ave=sum/c;if((c%2)==1){median=a[int(c/2)];}else{median=(a[c/2]+a[c/2-1])/2;}print sum," ",c," ",ave," ",median}' > "$i".coveragestats.txt
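In case the awk one-liner is hard to follow, here is a rough Python sketch of the same summary (total depth, number of positions, mean, median). It assumes the two-column "sample depth" layout shown in the output earlier in this thread; the function name `coverage_stats` is made up for illustration and is not part of GATK. Note that if the tool skips zero-coverage positions, as reported above, the mean only reflects covered positions.

```python
import statistics


def coverage_stats(lines):
    """Summarise per-base depths from 'sample depth' lines.

    Assumes whitespace-separated columns with the depth in column 2,
    as in the CoverageBySample output posted above (an assumption,
    not a documented format guarantee).
    Returns (total_depth, n_positions, mean_depth, median_depth).
    """
    depths = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        depths.append(int(fields[1]))
    if not depths:
        raise ValueError("no coverage lines found")
    return (
        sum(depths),
        len(depths),
        statistics.mean(depths),
        statistics.median(depths),
    )


# Example usage (file name follows the "$i".basecoverage.txt pattern above):
# with open("sample.bam.basecoverage.txt") as handle:
#     print(coverage_stats(handle))
```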

However, the mean read depth appears to be much lower after ReduceReads is applied:

file 1: before RR 68.2454 --> after RR 7.69979

file 2: before RR 205.4 --> after RR 6.95658

Are there any suggestions for calculating accurate read depths on BAM files that have been through ReduceReads?