
Depth of Coverage - reported only first exon

MarcelaD (Posts: 14; Member)
edited May 2013 in Ask the GATK team

Hi there,

This is my interval_list:

chr1 762095 762275 LINC00115|NR_024321
chr1 762280 762414 LINC00115|NR_024321
chr1 762420 762565 LINC00115|NR_024321
chr1 777259 777349 LOC643837
chr1 777391 777481 LOC643837
chr1 777482 777642 LOC643837
chr1 783061 783151 LOC643837
chr1 792270 792446 LOC643837
chr1 861266 861496 NM_152486|SAMD11
chr1 865582 865787 NM_152486|SAMD11
chr1 866331 866507 NM_152486|SAMD11

And this is the output from the sample_interval_summary:

chr1:762095-762275 ...
chr1:762280-762414 ...
chr1:762420-762565 ...
chr1:777259-777349 ...
chr1:783061-783151 ...
chr1:792270-792446 ...
chr1:861266-861496 ...
chr1:865582-865787 ...
chr1:866331-866507 ...

Why am I missing two exons (chr1:777391-777481 and chr1:777482-777642)?
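To pin down exactly which intervals are absent from the summary, the -L list can be diffed against the summary targets. The following is a minimal Python sketch, not part of GATK: the helper names `parse_interval_line` and `missing_intervals` are hypothetical, and it assumes the interval list is whitespace-separated `chrom start end name` and that the first column of each summary row has the form `chrom:start-end`.

```python
def parse_interval_line(line):
    """Turn a 'chrom start end [name]' line into a 'chrom:start-end' key."""
    chrom, start, end = line.split()[:3]
    return f"{chrom}:{int(start)}-{int(end)}"


def missing_intervals(interval_lines, summary_targets):
    """Return requested intervals with no matching summary row, in input order."""
    reported = set(summary_targets)
    requested = [parse_interval_line(l) for l in interval_lines if l.strip()]
    return [key for key in requested if key not in reported]


# Interval list copied from the post above.
intervals = """\
chr1 762095 762275 LINC00115|NR_024321
chr1 762280 762414 LINC00115|NR_024321
chr1 762420 762565 LINC00115|NR_024321
chr1 777259 777349 LOC643837
chr1 777391 777481 LOC643837
chr1 777482 777642 LOC643837
chr1 783061 783151 LOC643837
chr1 792270 792446 LOC643837
chr1 861266 861496 NM_152486|SAMD11
chr1 865582 865787 NM_152486|SAMD11
chr1 866331 866507 NM_152486|SAMD11
"""

# Target column of the sample_interval_summary rows shown in the post.
summary = [
    "chr1:762095-762275", "chr1:762280-762414", "chr1:762420-762565",
    "chr1:777259-777349", "chr1:783061-783151", "chr1:792270-792446",
    "chr1:861266-861496", "chr1:865582-865787", "chr1:866331-866507",
]

missing = missing_intervals(intervals.splitlines(), summary)
print(missing)  # ['chr1:777391-777481', 'chr1:777482-777642']
```

Both missing intervals belong to LOC643837. The sketch only identifies them; it does not explain why DepthOfCoverage dropped them.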

This is my command:

java -Xmx32g -jar /local/apps/gatk/2.5-2-gf57256b/GenomeAnalysisTK.jar -I sample.bam -R .../genome.fa -T DepthOfCoverage -o jtn -geneList hg19.tsv -L exons.list --omitDepthOutputAtEachBase --includeDeletions --interval_merging OVERLAPPING_ONLY -l INFO

Thanks for your input!

/M


Answers

  • Geraldine_VdAuwera (Posts: 6,073; Administrator, GATK Developer)

    Hi Marcela,

    Please see the discussion here: http://gatkforums.broadinstitute.org/discussion/1831/depth-of-coverage-only-first-gene-summary-output

    I believe this is the same problem and can be solved the same way.

    Geraldine Van der Auwera, PhD

  • MarcelaD (Posts: 14; Member)

    Thanks for your quick answer!

    The issue is that I do use a geneList file (hg19.tsv), passed with -geneList:

    666     LINC00115       chr1    +       762095  762565  762095  762565  3       762095,762280,762420,   762275,762414,762565,   0       |chr1:762095-762565|LINC00115|NR_024321|  cmpl    cmpl    0,0,0,
    666     LOC643837       chr1    +       777259  777642  777259  777642  3       777259,777391,777482,   777349,777481,777642,   0       |chr1:777259-777642|LOC643837|NR_015368|NR_047519|NR_047520|NR_047521|NR_047522|NR_047523|NR_047524|NR_047525|NR_047526|  cmpl    cmpl    0,0,0,
    666     LOC643837       chr1    +       783061  792446  783061  792446  2       783061,792270,  783151,792446,  0       |chr1:783061-792446|LOC643837|NR_015368|NR_047519|NR_047520|NR_047521|NR_047522|NR_047523|NR_047524|NR_047525|    cmpl    cmpl    0,0,
    666     NM_152486       chr1    +       861266  879593  861266  879593  13      861266,865582,866331,871064,874367,874612,876485,877519,877806,878173,878532,878657,879125,  861496,865787,866507,871262,874575,874816,876719,877733,878088,878465,878652,878777,879593,     0       |chr1:861266-879593|NM_152486|SAMD11|   cmpl    cmpl    0,0,0,0,0,0,0,0,0,0,0,0,0,

    So I do have both a list of genes (hg19.tsv) and a list of exons, i.e. the interval list (exons.list).

    And it only happens for some genes; for LINC00115, for instance, I do get the coverage for each exon.

    Thanks!

  • Carneiro (Posts: 275; Administrator, GATK Developer)

    Did this solve your problem? I'm afraid I didn't understand your answer.

  • MarcelaD (Posts: 14; Member)

    Hi,

    Sorry if I didn't explain myself clearly; let me try again.

    This is my interval list of exons (-L):

    chr1 762095 762275 LINC00115|NR_024321 
    chr1 762280 762414 LINC00115|NR_024321 
    chr1 762420 762565 LINC00115|NR_024321 
    chr1 777259 777349 LOC643837 
    chr1 777391 777481 LOC643837 
    chr1 777482 777642 LOC643837 
    chr1 783061 783151 LOC643837 
    chr1 792270 792446 LOC643837 
    chr1 861266 861496 NM_152486|SAMD11 
    chr1 865582 865787 NM_152486|SAMD11 
    chr1 866331 866507 NM_152486|SAMD11

    Together with my -geneList (see above), I would expect 5 lines in the sample_interval_summary for LOC643837. Instead I get 3: 1 for the first transcript (its last 2 exons are missing) and 2 for the second (the correct output):

    chr1:762095-762275 ... 
    chr1:762280-762414 ... 
    chr1:762420-762565 ... 
    **chr1:777259-777349** ... 
    **chr1:783061-783151** ... 
    **chr1:792270-792446** ... 
    chr1:861266-861496 ... 
    chr1:865582-865787 ... 
    chr1:866331-866507 ...

    Why is that?

    Thanks again /M
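The per-transcript counts described above (1 of 3 exons reported for the first LOC643837 row, 2 of 2 for the second) can be checked by expanding the exon columns of the geneList rows and looking each exon up in the summary. This is a rough Python sketch under stated assumptions: `parse_gene_row` and `exon_report` are hypothetical helpers, the column positions (name in column 2, chromosome in column 3, comma-terminated exon starts and ends in columns 10 and 11) are inferred from the rows quoted in this thread, and the coordinates are compared exactly as written, with no 0-based/1-based adjustment.

```python
def parse_gene_row(row):
    """Return (name, [exon keys]) for one whitespace-delimited geneList row.

    Column positions are an assumption based on the rows quoted in the thread.
    """
    fields = row.split()
    name, chrom = fields[1], fields[2]
    starts = [int(x) for x in fields[9].split(",") if x]
    ends = [int(x) for x in fields[10].split(",") if x]
    return name, [f"{chrom}:{s}-{e}" for s, e in zip(starts, ends)]


def exon_report(rows, summary_targets):
    """For each row, count exons present in the summary and list the absent ones."""
    reported = set(summary_targets)
    report = []
    for row in rows:
        name, exons = parse_gene_row(row)
        absent = [e for e in exons if e not in reported]
        report.append((name, len(exons) - len(absent), len(exons), absent))
    return report


# The two LOC643837 rows from hg19.tsv, copied from the thread.
rows = [
    "666 LOC643837 chr1 + 777259 777642 777259 777642 3 "
    "777259,777391,777482, 777349,777481,777642, 0 "
    "|chr1:777259-777642|LOC643837|NR_015368|NR_047519|NR_047520|NR_047521|"
    "NR_047522|NR_047523|NR_047524|NR_047525|NR_047526| cmpl cmpl 0,0,0,",
    "666 LOC643837 chr1 + 783061 792446 783061 792446 2 "
    "783061,792270, 783151,792446, 0 "
    "|chr1:783061-792446|LOC643837|NR_015368|NR_047519|NR_047520|NR_047521|"
    "NR_047522|NR_047523|NR_047524|NR_047525| cmpl cmpl 0,0,",
]

# Target column of the sample_interval_summary rows shown in the thread.
summary = [
    "chr1:762095-762275", "chr1:762280-762414", "chr1:762420-762565",
    "chr1:777259-777349", "chr1:783061-783151", "chr1:792270-792446",
    "chr1:861266-861496", "chr1:865582-865787", "chr1:866331-866507",
]

report = exon_report(rows, summary)
for name, found, total, absent in report:
    print(f"{name}: {found}/{total} exons reported; missing: {absent}")
```

Run against the data in this thread, the first row comes back with 1 of 3 exons reported and the second with 2 of 2, matching the description. The sketch only confirms the symptom and says nothing about its cause inside DepthOfCoverage.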
