Parallelization using intervals

Hi,
We have developed a workflow that parallelizes GATK realignment and recalibration by interval. We run the pipeline on SMP machines with 128G/512G of RAM. With hg19 the number of intervals is limited, even including the random chromosomes, but we are having issues with the 1000 Genomes reference, which contains 59 small GL* contigs. When we fork one process per interval with -Xmx2G each, for around 89 intervals in total (the chromosomes plus the GL* contigs), the pipeline is terminated most of the time due to insufficient memory.

For example, each forked process runs a command like this:
java -Xmx2G -jar $gatk_dir -T RealignerTargetCreator $bam_dir -R $Assembly_file -known $known_vcf -L $var -o $lib_path_output/IndelRealigner_$var.intervals
The per-interval outputs are merged before moving to the next step.

The -L option is a value like "3:1-198022430", one for each chromosome. Is there any way to combine all the random chromosomes into one process? Please advise. Thanks.

GL000207.1:1-4262;GL000226.1:1-15008;GL000229.1:1-19913;GL000231.1:1-27386;GL000210.1:1-27682;GL000239.1:1-33824;GL000235.1:1-34474;GL000201.1:1-36148;GL000247.1:1-36422;GL000245.1:1-36651;GL000197.1:1-37175;GL000203.1:1-37498;GL000246.1:1-38154;GL000249.1:1-38502;GL000196.1:1-38914;GL000248.1:1-39786;GL000244.1:1-39929;GL000238.1:1-39939;GL000202.1:1-40103;GL000234.1:1-40531;GL000232.1:1-40652;GL000206.1:1-41001;GL000240.1:1-41933;GL000236.1:1-41934;GL000241.1:1-42152;GL000243.1:1-43341;GL000242.1:1-43523;GL000230.1:1-43691;GL000237.1:1-45867;GL000233.1:1-45941;GL000204.1:1-81310;GL000198.1:1-90085;GL000208.1:1-92689;GL000191.1:1-106433;GL000227.1:1-128374;GL000228.1:1-129120;GL000214.1:1-137718;GL000221.1:1-155397;GL000209.1:1-159169;GL000218.1:1-161147;GL000220.1:1-161802;GL000213.1:1-164239;GL000211.1:1-166566;GL000199.1:1-169874;GL000217.1:1-172149;GL000216.1:1-172294;GL000215.1:1-172545;GL000205.1:1-174588;GL000219.1:1-179198;GL000224.1:1-179693;GL000223.1:1-180455;GL000195.1:1-182896;GL000212.1:1-186858;GL000222.1:1-186861;GL000200.1:1-187035;GL000193.1:1-189789;GL000194.1:1-191469;GL000225.1:1-211173;GL000192.1:1-547496

Answers
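
One possible approach, sketched here under the assumption that this is GATK 3.x, where -L also accepts a file ending in .list or .intervals with one interval per line: write all the GL* intervals into a single file and pass that file to one process instead of forking one process per contig. The file name GL_contigs.list and the output name IndelRealigner_GL.intervals are made up for illustration, and the interval string is shortened to three entries. The GATK command is only echoed, not executed, and it reuses the variables from the command in the question.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Semicolon-separated GL* intervals, as listed in the question
# (shortened to three entries here for illustration).
gl_intervals="GL000207.1:1-4262;GL000226.1:1-15008;GL000229.1:1-19913"

# Hypothetical file name; the .list extension is assumed to be
# read by GATK 3.x as "one interval per line".
gl_list="GL_contigs.list"

# Turn the semicolon-separated string into one interval per line.
printf '%s\n' "$gl_intervals" | tr ';' '\n' > "$gl_list"

# Print (do not run) the single command that would cover all GL* contigs.
# The escaped variables are the ones used in the original command.
echo "java -Xmx2G -jar \$gatk_dir -T RealignerTargetCreator \$bam_dir -R \$Assembly_file -known \$known_vcf -L $gl_list -o \$lib_path_output/IndelRealigner_GL.intervals"
```

With this layout the GL* contigs would take one forked job rather than 59, so the total number of concurrent JVMs drops to roughly the number of main chromosomes plus one. Whether that is enough to stay within memory depends on how many jobs run at once.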