CoverageBySample tool

redred (Member, 12 posts)
edited February 2013 in Ask the GATK team

I am also trying to check the coverage at each position of my reference using the CoverageBySample tool (with and without the -L argument):

java -Xmx30g -jar GenomeAnalysisTK.jar \
-T CoverageBySample \
-R ref.fasta \
-I input.bam \
-o output.cov

The output (below) gives the right coverage, but it omits the positions on the reference and skips every position with no coverage. Is there any way to get these positions in the output file?

eo78       10
eo78       10
eo78       10
eo78       10
eo78       10
eo78       11
eo78       12
eo78       12
eo78       12

Answers

  • Geraldine_VdAuwera (Administrator, GATK Developer, 6,822 posts)

    Do you not get ref position info when you use an intervals list?

    Anyway, this is a fairly primitive tool; you would be better off using DepthOfCoverage or DiagnoseTargets.
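
    For reference, a DepthOfCoverage run on the files from the question might look like the sketch below. The file names are reused from the original command, the output prefix output.depth and the -Xmx4g heap size are placeholders, and the script only prints the command unless GenomeAnalysisTK.jar is present in the working directory. DepthOfCoverage walks the reference locus by locus, so its per-base table should list the position of each site, and uncovered sites should appear with a depth of 0; add -L with an intervals file to restrict the run.

```shell
# Placeholder file names, reused from the question above.
GATK_JAR="GenomeAnalysisTK.jar"
REF="ref.fasta"
BAM="input.bam"
OUT="output.depth"   # assumed output prefix, not from the original post

# Build the argument list once so it can be reviewed before running.
set -- java -Xmx4g -jar "$GATK_JAR" -T DepthOfCoverage -R "$REF" -I "$BAM" -o "$OUT"
DOC_CMD="$*"
echo "$DOC_CMD"

# Run only when the GATK jar is actually present.
if [ -f "$GATK_JAR" ]; then
    "$@"
fi
```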

    Geraldine Van der Auwera, PhD

  • jamesrpriest (Member, 11 posts)

    FYI: CoverageBySample does not appear to be compatible with ReduceReads. I have been using this tool to calculate read depths over my exome intervals:

    java -Xmx2g -jar /home/jpriest/priest_apps/GATKv2.3.9/GenomeAnalysisTK.jar \
      -T CoverageBySample \
      -I "$i" \
      -R /home/jpriest/reference_genomes/hg19/ucsc.hg19.fasta \
      -L /home/jpriest/reference_genomes/exome_intervals/SeqCap_EZ_Exome_v2.samformat.new.intervals \
      -o "$i".basecoverage.txt

    awk '{print $2, $1}' "$i".basecoverage.txt | sort -n |
      awk 'BEGIN{c=0;sum=0;}{a[c++]=$1;sum+=$1;}END{ave=sum/c;if((c%2)==1){median=a[int(c/2)];}else{median=(a[c/2]+a[c/2-1])/2;}print sum," ",c," ",ave," ",median}' > "$i".coveragestats.txt
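
    To make the statistics step easier to check, here is the same sort-and-awk logic run on the nine lines of CoverageBySample output posted at the top of this thread (sample name, then depth). It is only a sketch: the variable name STATS is made up for the example, and the median is taken as the middle value of the sorted depths, or the mean of the two middle values when the count is even.

```shell
# Toy input in the two-column layout shown earlier (sample name, depth).
# Put the depth first, sort numerically, then report sum, count, mean, median.
STATS=$(printf '%s\n' 'eo78 10' 'eo78 10' 'eo78 10' 'eo78 10' 'eo78 10' \
                      'eo78 11' 'eo78 12' 'eo78 12' 'eo78 12' |
    awk '{print $2, $1}' | sort -n | awk '
        { a[c++] = $1; sum += $1 }          # collect sorted depths and their total
        END {
            if (c == 0) exit 1              # nothing to summarise
            median = (c % 2) ? a[int(c/2)] : (a[c/2] + a[c/2 - 1]) / 2
            print sum, c, sum / c, median
        }')
echo "$STATS"   # prints: 97 9 10.7778 10
```

    The four fields are the summed depth, the number of positions, the mean, and the median. Because CoverageBySample skips positions with no coverage, a mean computed this way is a mean over the covered positions only.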

    However, the mean read depth appears much lower after ReduceReads (RR) is applied:

    file 1: before RR 68.2454 --> after RR 7.69979

    file 2: before RR 205.4 --> after RR 6.95658

    Are there any suggestions for calculating accurate read depths from BAM files that have been processed with ReduceReads?

  • Geraldine_VdAuwera (Administrator, GATK Developer, 6,822 posts)

    Hi James,

    That's correct; this tool is not adapted to work with reduced data. It is an older tool that is no longer updated, and we plan to retire it soon, so you should switch to DiagnoseTargets instead.

    Geraldine Van der Auwera, PhD
