mutect runtime performance poor
Hi, I'm trying to run MuTect on a tumor/normal pair of input data files (10G each). The machine has 16G of memory and 4 CPUs:
NAG2:/big/mutect/data/large2/1234$ uname -a
Linux #24-Ubuntu SMP Fri Jan 7 18:30:50 UTC 2011 x86_64 GNU/Linux
However, MuTect takes around 8 days on average to complete on these files. I'm relatively new to MuTect, so I'm not sure if this is typical. From other posts on the forum, it looks like others have seen runtimes in hours, not days. Any suggestions on what I may be doing wrong, or what I could change to improve performance? I've also tried varying the -Xmx memory setting, but saw no real change in performance. Here is the specific command I'm using:
/big/mutect/software/java/jre1.6.0_34/bin/java -Xmx12g -jar /big/mutect/software/muTect-1.1.4.jar -T MuTect -R /big/referenceGenome/b37/human_g1k_v37.fasta -XL 2 --cosmic /big/referenceGenome/cosmic/b37_cosmic_v54_120711.vcf --dbsnp /big/referenceGenome/dbsnp/dbsnp_137.b37.vcf --input_file:normal /big/mutect/data/large2/1234/control1234.bam --input_file:tumor /big/mutect/data/large2/1234/tumor1234.bam --out call_stats2014JAN13.out --coverage_file