ReduceReads is time consuming

ugoodlfy Posts: 2 Member
edited February 2013 in Ask the GATK team

Hi all,

I am analyzing some whole genome sequencing data. After preprocessing with Queue, I got a large sample-level BAM file (~200 GB/sample), and I wanted to use the ReduceReads module to reduce the BAM file size. I am running the following command:

/usr/java/latest/bin/java -Xmx16g -jar /path_to_GenomeAnalysisTK-2.3-9/GenomeAnalysisTK.jar -R /path_to_human_g1k_v37.fasta -T ReduceReads -I /path_to_Queue/project.sample.clean.dedup.recal.bam -o sample.reduced.bam --generate_md5

After 8 hours, the estimated time has gone up to 6.9 days.

INFO 20:02:25,508 ProgressMeter - 1:120660726 5.63e+07 6.5 h 7.0 m 3.9% 7.0 d 6.7 d
INFO 20:03:25,509 ProgressMeter - 1:120660726 5.63e+07 6.5 h 7.0 m 3.9% 7.0 d 6.7 d
INFO 20:04:25,510 ProgressMeter - 1:120660726 5.63e+07 6.6 h 7.0 m 3.9% 7.0 d 6.8 d
INFO 20:05:25,511 ProgressMeter - 1:120660726 5.63e+07 6.6 h 7.0 m 3.9% 7.0 d 6.8 d
INFO 20:06:25,512 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.0 m 3.9% 7.1 d 6.8 d
INFO 20:07:25,528 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.0 m 3.9% 7.1 d 6.8 d
INFO 20:08:25,529 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.1 m 3.9% 7.1 d 6.8 d
INFO 20:09:25,530 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.1 m 3.9% 7.1 d 6.8 d
INFO 20:10:25,531 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.1 m 3.9% 7.1 d 6.9 d
INFO 20:11:25,532 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.1 m 3.9% 7.2 d 6.9 d
INFO 20:12:25,533 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.1 m 3.9% 7.2 d 6.9 d
INFO 20:13:25,534 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.2 m 3.9% 7.2 d 6.9 d
INFO 20:14:25,535 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.2 m 3.9% 7.2 d 6.9 d
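For anyone wanting to sanity-check where the multi-day figure comes from, here is a rough Python sketch that extrapolates the total runtime from a single ProgressMeter line. It assumes the columns after the location are: reads processed, elapsed runtime, time per million reads, percent completed, total runtime, and remaining time. That column order is my reading of the log, not something stated in it, and the function name `extrapolate_total_days` is just illustrative.

```python
import re

# Assumed column layout (not confirmed by the log itself):
# location, reads processed, elapsed, time per 1M reads, % completed, ...
LINE_RE = re.compile(
    r"ProgressMeter - (?P<loc>\S+)\s+(?P<reads>\S+)\s+"
    r"(?P<elapsed>[\d.]+) (?P<eunit>[smhdw])\s+"
    r"(?P<per_m>[\d.]+) (?P<punit>[smhdw])\s+"
    r"(?P<pct>[\d.]+)%"
)

# Conversion factors from the log's unit suffixes to hours.
HOURS_PER_UNIT = {"s": 1 / 3600, "m": 1 / 60, "h": 1.0, "d": 24.0, "w": 168.0}


def extrapolate_total_days(line):
    """Estimate total runtime in days as elapsed / fraction completed."""
    m = LINE_RE.search(line)
    if m is None:
        return None  # not a ProgressMeter line in the assumed format
    elapsed_h = float(m.group("elapsed")) * HOURS_PER_UNIT[m.group("eunit")]
    fraction = float(m.group("pct")) / 100.0
    if fraction == 0:
        return None  # no progress yet, nothing to extrapolate from
    return elapsed_h / fraction / 24.0


line = ("INFO 20:14:25,535 ProgressMeter - 1:120677835 5.63e+07 "
        "6.7 h 7.2 m 3.9% 7.2 d 6.9 d")
print(round(extrapolate_total_days(line), 1))  # 7.2
```

With 6.7 h elapsed at 3.9% completed, the simple linear extrapolation gives about 7.2 days in total, which matches the second-to-last column of the last log line.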

The tool version is GenomeAnalysisTK-2.3-9

Is there anything wrong with my command? How can I speed up this procedure? Thanks a lot.
