I am learning to use the DepthOfCoverage function to obtain per-gene coverage for a collection of bacterial contigs to which metagenomic reads were mapped. The original post introducing this function is here:

In the post, you mentioned the gene list, as follows:

-geneList /path/to/gene/list.txt

The provided gene list must be of the following format:

585     NM_001005484    chr1    +       58953   59871   58953   59871   1       58953,  59871,  0       OR4F5   cmpl    cmpl    0,
587     NM_001005224    chr1    +       357521  358460  357521  358460  1       357521, 358460, 0       OR4F3   cmpl    cmpl    0,

I have two questions:

  1. Can you please provide headers for the values in each column?
  2. I am working with bacterial genomic contigs. Can you please specify the minimum information needed for a gene list (e.g., contig name, gene name, and the start and end positions of the gene on the contig)?

Thanks so much!


Best Answer


  Geraldine_VdAuwera
    This article explains how to work with RefSeq gene lists:

    Geraldine Van der Auwera, PhD

  lkchan

    Hi Geraldine,
    Thanks. I am working on a genome assembled from metagenomes, so I do not have a RefSeq file for this "genome". I have contigs and the coding regions predicted from them. I mapped the raw metagenomic reads to the contigs and would like to get coverage for all genes, so I would have to generate a custom gene list. Thanks!

  marlaux

    Hi lkchan, did you find a solution to your problem? I am trying to do exactly this right now. I need to construct a custom gene list from a GFF3/BED file, but I can't find the column headers. It is a .tsv file, so I should be able to build it easily, right? But the example above doesn't say which column is which. Thank you!
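
The two example rows in the question have the same 16-column layout as the UCSC refGene table, whose columns are bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, score, name2, cdsStartStat, cdsEndStat and exonFrames. Below is a minimal Python sketch of building such rows from a BED file of predicted genes. It rests on several assumptions that are not confirmed anywhere in this thread: that DepthOfCoverage accepts a custom file in this layout, that each bacterial gene can be written as a single exon whose coding region spans the whole feature, and that reusing the gene ID for both name and name2 is acceptable. The function names and the example BED line are hypothetical. BED coordinates are 0-based and half-open, as in refGene, so they are copied unchanged; GFF3 starts are 1-based and would need 1 subtracted first.

```python
import csv
import io

# Column names as listed in the UCSC refGene table schema.
REFGENE_COLUMNS = [
    "bin", "name", "chrom", "strand", "txStart", "txEnd",
    "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds",
    "score", "name2", "cdsStartStat", "cdsEndStat", "exonFrames",
]


def ucsc_bin(start, end):
    """Standard UCSC bin number for a 0-based, half-open interval."""
    start_bin = start >> 17
    end_bin = (end - 1) >> 17
    for offset in (585, 73, 9, 1, 0):
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= 3
        end_bin >>= 3
    raise ValueError("interval is outside the standard binning range")


def bed_to_refgene_row(contig, start, end, gene, strand="+"):
    """Build one refGene-style row for a single-exon gene.

    Assumption: the coding region equals the whole feature, and the
    gene ID is used for both 'name' and 'name2'.
    """
    return [
        str(ucsc_bin(start, end)), gene, contig, strand,
        str(start), str(end),      # txStart, txEnd
        str(start), str(end),      # cdsStart, cdsEnd
        "1",                       # exonCount
        f"{start},", f"{end},",    # exonStarts, exonEnds (trailing comma)
        "0", gene, "cmpl", "cmpl", "0,",
    ]


def convert_bed(bed_text):
    """Convert tab-separated BED text (at least 4 columns) to refGene-style rows."""
    rows = []
    for fields in csv.reader(io.StringIO(bed_text), delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue
        contig, start, end, gene = fields[0], int(fields[1]), int(fields[2]), fields[3]
        strand = fields[5] if len(fields) > 5 else "+"
        rows.append(bed_to_refgene_row(contig, start, end, gene, strand))
    return rows


# Hypothetical BED line for one predicted gene on one contig.
example_bed = "contig_1\t58953\t59871\tgene_0001\t0\t+\n"
for row in convert_bed(example_bed):
    print("\t".join(row))
```

The bin function reproduces the first column of both example rows in the question (585 and 587), which supports reading that column as the UCSC bin value. The rows would presumably also need to be sorted by contig and start position in the same order as the reference, but that requirement should be checked against the GATK documentation.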

