ReduceReads is time consuming,

ugoodlfyugoodlfy Posts: 2Member
edited February 2013 in Ask the GATK team

HI all,

I am analyzing some whole genome sequencing datas .After preprocessing by Queue got a large bam file on sample level (~ 200GB/sample ) and I wanted to use ReaduceReads module to reduce the bam file size. and running following command: /usr/java/latest/bin/java -Xmx16g -jar /path_to_GenomeAnalysisTK-2.3-9/GenomeAnalysisTK.jar -R /path_to_human_g1k_v37.fasta -T ReduceReads -I /path_to_Queue/project.sample.clean.dedup.recal.bam -o sample.reduced.bam --generate_md5

After 8 hours , the estimated time goes to 6.9 days.

INFO 20:02:25,508 ProgressMeter - 1:120660726 5.63e+07 6.5 h 7.0 m 3.9% 7.0 d 6.7 d INFO 20:03:25,509 ProgressMeter - 1:120660726 5.63e+07 6.5 h 7.0 m 3.9% 7.0 d 6.7 d INFO 20:04:25,510 ProgressMeter - 1:120660726 5.63e+07 6.6 h 7.0 m 3.9% 7.0 d 6.8 d INFO 20:05:25,511 ProgressMeter - 1:120660726 5.63e+07 6.6 h 7.0 m 3.9% 7.0 d 6.8 d INFO 20:06:25,512 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.0 m 3.9% 7.1 d 6.8 d INFO 20:07:25,528 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.0 m 3.9% 7.1 d 6.8 d INFO 20:08:25,529 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.1 m 3.9% 7.1 d 6.8 d INFO 20:09:25,530 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.1 m 3.9% 7.1 d 6.8 d INFO 20:10:25,531 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.1 m 3.9% 7.1 d 6.9 d INFO 20:11:25,532 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.1 m 3.9% 7.2 d 6.9 d INFO 20:12:25,533 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.1 m 3.9% 7.2 d 6.9 d INFO 20:13:25,534 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.2 m 3.9% 7.2 d 6.9 d INFO 20:14:25,535 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.2 m 3.9% 7.2 d 6.9 d

The tool version is GenomeAnalysisTK-2.3-9

Is there anything wrong with my command ? How could I speed up this procedure? Thanks a lot .

Post edited by ugoodlfy on
Tagged:

Best Answer

Answers

Sign In or Register to comment.