SVPreprocess: memory issue and -jobNative options
I'm trying to process 150 WGS 30X BAMs in one batch. Five samples failed with "TERM_MEMLIMIT: job killed after reaching LSF memory usage limit." I have some questions.
1) In one thread you said that the pipeline internally sets the heap sizes for the various Java processes to good default values. Is there any way to set the -Xmx parameter manually for the downstream processes (not for the main script)? For the failed samples it was automatically set to 2 GB:
"FunctionEdge - Error: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' ..."
2) I need all the jobs to be submitted to specific nodes.
I specified -jobNative "-m \"node1 node2 node3\"" but it didn't work (with or without the backslashes). What is the correct way to pass such arguments? And how do I pass a memory usage argument like -R "rusage[mem=16000]"?
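For what it's worth, here is a minimal sketch of the quoting behavior I suspect is the culprit (the `show_args` helper is just for illustration, not part of the pipeline): a `-jobNative` value containing spaces has to reach Queue as a single shell argument, otherwise only `-m` and the first host name survive as the option value.

```shell
# Hypothetical illustration: print each argument on its own line in brackets,
# so we can see exactly how the shell splits a -jobNative value.
show_args() { printf '[%s]\n' "$@"; }

# Unquoted: the shell splits this into four separate arguments (wrong).
show_args -m node1 node2 node3

# Quoted: one single argument, which is what -jobNative needs to receive.
show_args "-m node1 node2 node3"
```

If the escaped inner quotes are being forwarded literally to bsub, that could also explain the failure, since bsub would then see `"node1` rather than a host-list.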
3) In general, what is the most effective way to preprocess 150 samples? It looks like the whole preprocessing step failed only because of these five samples. Is it possible to avoid repeating the procedure from scratch and instead reuse the intermediate results?