DepthOfCoverage calculateCoverageOverGenes missing genes
I've been learning how to use DepthOfCoverage to calculate coverage across genes. I noticed that some genes were covered by the BAM and present in both the interval_list file and the geneList file, but were not reported in the gene_summary table. Most of these were non-coding RNAs. In reviewing the geneList file, I noticed that for the non-coding RNAs the Coding region start position is 1 base higher than the Coding region end position (which also puts the Coding region start position higher than the Transcription end position). I adjusted the file so that the Coding region is identical to the Transcription region for the non-coding RNAs, and this resolved the issue for most of the genes. The remaining genes that are still not reported in the gene_summary table all appear to overlap exons or UTRs of other genes. My questions are:
- Is it the case that if two exons overlap, only one will be reported on, or is something else going on?
- Which region of each gene does the tool report on: the whole transcribed region, or just the coding region?
- Is it safe to set the coding region of non-coding RNAs equal to the transcribed region for the coverage analysis?
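
For reference, the geneList adjustment described above can be sketched roughly as follows. This is only an illustration under assumptions: it assumes a tab-separated file in a UCSC refGene-style layout where the transcription start/end and coding start/end sit in columns 4 to 7 (0-based), which may not match every geneList file, and the function names are hypothetical.

```python
# Assumed 0-based column positions (UCSC refGene-style layout).
# These are an assumption; adjust them to match your geneList file.
TX_START, TX_END, CDS_START, CDS_END = 4, 5, 6, 7


def fix_noncoding_row(row):
    """Return a copy of `row` whose coding region equals the transcribed
    region whenever the coding start lies beyond the coding end."""
    fixed = list(row)
    if int(fixed[CDS_START]) > int(fixed[CDS_END]):
        fixed[CDS_START] = fixed[TX_START]
        fixed[CDS_END] = fixed[TX_END]
    return fixed


def fix_gene_list(lines):
    """Apply fix_noncoding_row to every data line; blank lines and
    lines starting with '#' are passed through unchanged."""
    fixed_lines = []
    for line in lines:
        stripped = line.rstrip("\n")
        if not stripped or stripped.startswith("#"):
            fixed_lines.append(stripped)
            continue
        fields = stripped.split("\t")
        fixed_lines.append("\t".join(fix_noncoding_row(fields)))
    return fixed_lines
```

Entries whose coding start is not beyond the coding end are left as they are, so protein-coding genes are untouched.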