Depth of Coverage - Only one type of output

tinu Member
edited April 2014 in Ask the GATK team

I am using DepthOfCoverage for the first time.

I used the following command to get the coverage, but I am getting only one output file, which gives coverage per sample. I am supplying a gene list in REFSEQ format along with a sorted interval BED file. However, I am not getting any file with per-gene coverage.

Could anyone clarify whether I am missing something in the arguments?

java -Xmx8g -jar GenomeAnalysisTK.jar -R Homo_sapiens_assembly19.fasta -T DepthOfCoverage --calculateCoverageOverGenes:REFSEQ Genes_refgene.txt --outputFormat table -o Coverage_summary -I BAM.list -L genes.bed

Thanks,
Tinu

Answers

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Hi Tinu,

    Have a look at the tool documentation for DepthOfCoverage; there is a separate argument for the list of genes that is distinct from the -L intervals list argument.

  • tinu Member

    Hi Geraldine,

    I looked at the DepthOfCoverage documentation and the article at http://www.broadinstitute.org/gatk/guide/article?id=40

    I used -geneList or --calculateCoverageOverGenes to get coverage per gene, along with -L to specify the intervals.

    I have pasted the contents of the files Genes_refgene.txt and genes.bed below. It is still not clear to me why I am not getting any output file with per-gene coverage.

    java -Xmx8g -jar GenomeAnalysisTK.jar -R Homo_sapiens_assembly19.fasta -T DepthOfCoverage --calculateCoverageOverGenes:REFSEQ Genes_refgene.txt --outputFormat table -o Coverage_summary -I BAM.list -L genes.bed

    head Genes_refgene.txt
    
    791     NM_139135       1       +       27022521        27108601        27022894        27107247        20      "27022521,27056141,27057642,27059166,27087346,27087874,27088642,27089463,27092711,27092947,27094280,27097609,27098990,27099302,27099836,27100070,27100292,27101470,27102067,27105513,"       "27024031,27056354,27058095,27059283,27087587,27087964,27088810,27089776,27092857,27093057,27094490,27097817,27099123,27099478,27099987,27100208,27100389,27101711,27102198,27108601,"    0       ARID1A  cmpl    cmpl    "0,0,0,0,0,1,1,1,2,1,0,0,1,2,1,2,2,0,1,0,"
    791     NM_006015       1       +       27022521        27108601        27022894        27107247        20      "27022521,27056141,27057642,27059166,27087346,27087874,27088642,27089463,27092711,27092947,27094280,27097609,27098990,27099302,27099836,27100070,27100292,27100819,27102067,27105513,"       "27024031,27056354,27058095,27059283,27087587,27087964,27088810,27089776,27092857,27093057,27094490,27097817,27099123,27099478,27099987,27100208,27100389,27101711,27102198,27108601,"    0       ARID1A  cmpl    cmpl    "0,0,0,0,0,1,1,1,2,1,0,0,1,2,1,2,2,0,1,0,"
    1464    NM_002524       1       -       115247084       115259515       115251155       115258781       7       "115247084,115250774,115251151,115252189,115256420,115258670,115259278,"     "115250671,115250813,115251275,115252349,115256599,115258798,115259515,"        0       NRAS    cmpl    cmpl    "-1,-1,0,2,0,0,-1,"
    
    
    head genes.bed
    1       27022533        27022634        ARID1A
    1       27022883        27022964        ARID1A
    1       27023112        27023498        ARID1A
    1       27023499        27023788        ARID1A
    1       27056114        27056354        ARID1A
    1       27057616        27058125        ARID1A
    

    Thanks,
    Tinu

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Ah, sorry, I didn't read your command line properly the first time. You were indeed doing the right thing already.

    I'm not sure what's happening here. Could you post the full log and the list of files produced?

  • tinu Member

    I did not get any log files. I got only one output file, 'Coverage_summary', which has per-base coverage for all samples.

  • tinu Member

    I am using GATK 3.0

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Hey @tinu, sorry I forgot to get back to you yesterday.

    Can you post the entire log output? I want to see if there's a warning in there about whether your gene list was loaded properly or not.

  • tinu Member

    Hi Geraldine,

    OK, I realize that the problem was with the refseq file: it was not sorted properly.

    It was sorted by chromosome, but not by position within each chromosome. Now I am getting the following output files:

    Coverage_summary.sample_cumulative_coverage_counts

    Coverage_summary.sample_cumulative_coverage_proportions

    Coverage_summary.sample_gene_summary

    Coverage_summary.sample_interval_statistics

    Coverage_summary.sample_interval_summary

    Coverage_summary.sample_statistics

    Coverage_summary.sample_summary

    Thank you for the support and for following up on this.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Oh good, I'm glad it's resolved. Thanks for reporting your solution, I'll keep this possibility in mind if anyone else has trouble with the refseq file.
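
For anyone who runs into the same problem, here is a minimal Python sketch of the fix described above: check whether a refseq gene table is ordered by position within each chromosome and, if not, sort it. This is not part of GATK. The function names are made up for illustration, and the column layout (chromosome in the third field, transcript start and end in the fifth and sixth) is an assumption based on the head of Genes_refgene.txt pasted earlier in the thread. The contig order must be supplied by the caller and should match the reference (for example, taken from its .fai or .dict file).

```python
"""Sketch: sort a refseq gene table by contig, then by transcript start."""


def _position_key(line, contig_rank):
    # Assumed layout, based on the pasted head of Genes_refgene.txt:
    # field 3 = chromosome, field 5 = transcript start, field 6 = transcript end.
    fields = line.split()
    return (contig_rank[fields[2]], int(fields[4]), int(fields[5]))


def _records(lines):
    # Keep data lines only: blank lines and '#' header lines are skipped.
    return [ln for ln in lines if ln.strip() and not ln.startswith("#")]


def is_position_sorted(lines, contig_order):
    """Return True if records are ordered by contig, then by start position."""
    contig_rank = {name: i for i, name in enumerate(contig_order)}
    keys = [_position_key(ln, contig_rank) for ln in _records(lines)]
    return all(a <= b for a, b in zip(keys, keys[1:]))


def sort_refseq_lines(lines, contig_order):
    """Return the data lines sorted by contig order, then start, then end.

    Raises KeyError if a record names a contig missing from contig_order.
    Header and blank lines are dropped from the result.
    """
    contig_rank = {name: i for i, name in enumerate(contig_order)}
    return sorted(_records(lines), key=lambda ln: _position_key(ln, contig_rank))
```

To use it, read the lines of the gene table with `open(path).readlines()`, pass them to `sort_refseq_lines` together with the list of contig names in reference order, and write the returned lines to a new file that is then given to --calculateCoverageOverGenes:REFSEQ.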
