Hi,

I am learning to use the DepthOfCoverage function to obtain gene coverage information for a collection of bacterial contigs to which metagenomic reads were mapped. The original post introducing this function is here: http://gatkforums.broadinstitute.org/discussion/40/depthofcoverage-v3-0-how-much-data-do-i-have#latest

In the post, you mentioned the gene list, as follows:

-geneList /path/to/gene/list.txt

The provided gene list must be of the following format:

585     NM_001005484    chr1    +       58953   59871   58953   59871   1       58953,  59871,  0       OR4F5   cmpl    cmpl    0,
587     NM_001005224    chr1    +       357521  358460  357521  358460  1       357521, 358460, 0       OR4F3   cmpl    cmpl    0,

I have two questions:

1. Can you please provide headers to the values in each column?
2. I am working with bacterial genomic contigs. Can you please specify what basic information is needed for a gene list (e.g., name of contig, name of gene, location of gene in the contig, from... to ..., etc.)?

Thanks so much!

Leo

Geraldine Van der Auwera, PhD

Hi Geraldine,
Thanks. I am working on a genome assembled from metagenomes, so I do not have a RefSeq for this "genome". I have contigs and coding regions predicted from the contigs. I mapped the raw metagenomic reads to the contigs and would like to get coverage for all genes, so I would have to generate a custom gene list. Thanks!
Leo

Hi lkchan, did you find a solution to your problem? I am trying to do exactly this right now: I need to construct a custom gene list based on a GFF3/BED file, but I just can't find the headers. It is a .tsv file, so I could easily construct it, right? But the example above doesn't tell us which column is which. Thank you!
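
For anyone building such a file by hand: the two example rows above have 16 whitespace-separated fields, and they appear to line up with the UCSC refGene table layout (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, score, name2, cdsStartStat, cdsEndStat, exonFrames). Below is a minimal Python sketch, under that assumption, that turns one BED6 line into one such row for a single-exon bacterial gene. The function name `bed6_to_refgene_row`, the placeholder `bin` value of 0, and the contig and gene names are all hypothetical; whether DepthOfCoverage accepts a file built this way has not been confirmed in this thread.

```python
# Column names of the UCSC refGene table, which the example rows above
# appear to follow (assumption: the thread does not confirm this).
REFGENE_COLUMNS = [
    "bin", "name", "chrom", "strand", "txStart", "txEnd",
    "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds",
    "score", "name2", "cdsStartStat", "cdsEndStat", "exonFrames",
]


def bed6_to_refgene_row(bed_line, bin_value=0):
    """Convert one BED6 line (chrom, start, end, name, score, strand)
    into a tab-separated 16-column row for a single-exon gene.

    bin_value is a placeholder (assumption: the consumer ignores it).
    """
    chrom, start, end, name, _score, strand = (
        bed_line.rstrip("\n").split("\t")[:6]
    )
    fields = [
        str(bin_value), name, chrom, strand,
        start, end,              # txStart, txEnd
        start, end,              # cdsStart, cdsEnd (whole gene is coding)
        "1",                     # exonCount: one exon per bacterial gene
        start + ",", end + ",",  # exonStarts, exonEnds (comma-terminated)
        "0",                     # score
        name,                    # name2: reuse the gene identifier
        "cmpl", "cmpl",          # cdsStartStat, cdsEndStat
        "0,",                    # exonFrames
    ]
    return "\t".join(fields)


# Hypothetical contig and gene names, for illustration only.
example_bed = "contig_1\t100\t400\tgene_0001\t0\t+"
row = bed6_to_refgene_row(example_bed)
print(row)
```

BED and refGene coordinates are both 0-based with an exclusive end, so the values are copied unchanged. GFF3 coordinates are 1-based and inclusive, so when starting from GFF3 the start would need to be reduced by 1. It is probably also worth sorting the rows in the same contig order as the reference before passing the file to `-geneList`.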