Depth of Coverage: Only first gene summary output
I've been trying to run DepthOfCoverage (DOC) on a sorted RefSeq gene list downloaded from the UCSC Table Browser, as described in the DOC guide doc. After sorting/formatting, only the first record in the RefSeq gene list gets summarized in the foo.sample_gene_summary file. I've tried multiple single-gene files and they all work individually, but when a file has more than one record, only the first record gets summarized. I'm using 2.2-8 (tried 2.0-25 as well) and I'm totally flabbergasted. Below is an example command line.
java -jar ./GenomeAnalysisTK-2.2-8-gec077cd/GenomeAnalysisTK.jar \
  -T DepthOfCoverage \
  -R human_g1k_v37_decoy.fasta \
  -I in.bam \
  -geneList:REFSEQ 2genes.txt \
  -L 22:1615633-18714498 \
  -mmq 1 \
  --outputFormat table \
  -o foo \
  -l DEBUG
Also attached is the 2genes.txt file (I've pasted it at the end as well, in case there is a problem with the attachment).
Below is a little bit from the logging when DEBUG is switched on (can add more if needed).
INFO 15:13:12,293 RMDTrackBuilder - Loading Tribble index from disk for file /isilon/sequencing/Kurt/Genes/2genes.txt
DEBUG 15:13:12,364 DepthOfCoverage - Refseq init done.
DEBUG 15:13:12,364 DepthOfCoverage - Examining 22:1615633-18714498
DEBUG 15:13:12,365 DepthOfCoverage - Annotation list is anonymous
DEBUG 15:13:12,366 DepthOfCoverage - We do overlap 000 NM_001136213 22 - 16256331 16287937 16258185 16287885 11 16256331,16258184,16266928,16268136,16269872,16275206,16277747,16279194,16282144,16282477,16287253, 16256677,16258303,16267095,16268181,16269943,16275277,16277885,16279301,16282318,16282592,16287937, 0 POTEH cmpl cmpl -1,2,0,0,1,2,2,0,0,2,0,
INFO 15:13:12,462 DepthOfCoverage - Printing summary info
INFO 15:13:12,479 DepthOfCoverage - Printing locus summary
INFO 15:13:12,772 ProgressMeter - done 1.71e+07 2.3 m 8.1 s 100.0% 2.3 m 0.0 s
INFO 15:13:12,773 ProgressMeter - Total runtime 138.10 secs, 2.30 min, 0.04 hours
INFO 15:13:12,928 MicroScheduler - 808 reads were filtered out during traversal out of 49775 total (1.62%)
INFO 15:13:12,928 MicroScheduler - -> 607 reads (1.22% of total) failing DuplicateReadFilter
INFO 15:13:12,928 MicroScheduler - -> 201 reads (0.40% of total) failing UnmappedReadFilter
INFO 15:13:12,929 NSRuntimeProfile - Input time: 22.5 s (18.59%)
INFO 15:13:12,929 NSRuntimeProfile - Map time: 36.2 s (29.93%)
INFO 15:13:12,929 NSRuntimeProfile - Reduce time: 59.8 s (49.47%)
INFO 15:13:12,929 NSRuntimeProfile - Outside time: 2.4 s ( 2.01%)
DEBUG 15:13:12,931 GATKRunReport - Aggregating data for run report
DEBUG 15:13:13,112 GATKRunReport - Posting report of type STANDARD

2genes.txt:
709 NM_001136213 22 - 16256331 16287937 16258185 16287885 11 16256331,16258184,16266928,16268136,16269872,16275206,16277747,16279194,16282144,16282477,16287253, 16256677,16258303,16267095,16268181,16269943,16275277,16277885,16279301,16282318,16282592,16287937, 0 POTEH cmpl cmpl -1,2,0,0,1,2,2,0,0,2,0,
90 NM_018943 22 + 18593452 18614498 18593631 18613903 5 18593452,18604245,18606922,18609120,18613609, 18593634,18604468,18607071,18609801,18614498, 0 TUBA8 cmpl cmpl 0,0,1,0,0,
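To rule out a formatting problem in the gene list itself, here is a rough Python sanity check of the two records pasted above. It is only a sketch and does not reproduce anything GATK does internally. It assumes the usual UCSC refGene column order (bin, name, chrom, strand, txStart, txEnd, ..., name2 in the 13th column) and that txStart is 0-based while the -L interval is 1-based and inclusive. The helper names (parse_refseq_line, starts_sorted, overlaps_interval) are made up for this example.

```python
from typing import List, NamedTuple

# The two records from 2genes.txt, whitespace-separated as pasted in the post.
RECORDS = [
    "709 NM_001136213 22 - 16256331 16287937 16258185 16287885 11 "
    "16256331,16258184,16266928,16268136,16269872,16275206,16277747,16279194,16282144,16282477,16287253, "
    "16256677,16258303,16267095,16268181,16269943,16275277,16277885,16279301,16282318,16282592,16287937, "
    "0 POTEH cmpl cmpl -1,2,0,0,1,2,2,0,0,2,0,",
    "90 NM_018943 22 + 18593452 18614498 18593631 18613903 5 "
    "18593452,18604245,18606922,18609120,18613609, "
    "18593634,18604468,18607071,18609801,18614498, "
    "0 TUBA8 cmpl cmpl 0,0,1,0,0,",
]


class Gene(NamedTuple):
    name: str
    symbol: str
    chrom: str
    tx_start: int  # assumed 0-based, as in UCSC tables
    tx_end: int


def parse_refseq_line(line: str) -> Gene:
    """Parse one record, assuming the UCSC refGene column order."""
    fields = line.split()  # handles tabs or spaces
    if len(fields) < 13:
        raise ValueError(f"expected at least 13 columns, got {len(fields)}")
    return Gene(
        name=fields[1],
        symbol=fields[12],
        chrom=fields[2],
        tx_start=int(fields[4]),
        tx_end=int(fields[5]),
    )


def starts_sorted(genes: List[Gene]) -> bool:
    """True if consecutive records on the same contig have non-decreasing starts.

    Contig order relative to the reference is NOT checked here.
    """
    for prev, cur in zip(genes, genes[1:]):
        if prev.chrom == cur.chrom and cur.tx_start < prev.tx_start:
            return False
    return True


def overlaps_interval(gene: Gene, chrom: str, start: int, end: int) -> bool:
    """True if the transcript overlaps the 1-based inclusive interval chrom:start-end."""
    return gene.chrom == chrom and gene.tx_start + 1 <= end and gene.tx_end >= start


if __name__ == "__main__":
    genes = [parse_refseq_line(line) for line in RECORDS]
    print("starts sorted within contig:", starts_sorted(genes))
    for g in genes:
        # Interval taken from the -L argument in the command line above.
        print(g.symbol, "overlaps -L interval:", overlaps_interval(g, "22", 1615633, 18714498))
```

Under those assumptions, both records parse, are in start order, and fall inside the -L interval, so nothing in this simple check explains why only the first one is summarized.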
Is this just user error?