Problem running RealignerTargetCreator

I have a problem running RealignerTargetCreator. I am trying to carry out a variant analysis following your Best Practices pipeline.
After mapping my reads to the hg19 reference genome with bowtie and converting the results to sorted .bam format,
I used Picard MarkDuplicates on the sorted .bam file to mark PCR duplicates as a preliminary step.
Then I tried to run RealignerTargetCreator on the resulting file (sample_sorted.marked.bam), which is about 1 GB in size.

This is the command I ran:

java -Xmx4g -jar /programs/GenomeAnalysisTK.jar -T RealignerTargetCreator -R /Genomes/hg19/hg19.fa -o /GATK_tryouts/sample_sorted.bam.list -I /GATK_tryouts/sample_sorted.marked.bam --num_threads 6

The program seems to start smoothly, but it never finishes (I let it run for about 9 days before killing the process). While it is running, there is no trace of any output in the directory where it is supposed to appear.

This is the STDOUT message printed on the screen while it is supposed to be running:

INFO 15:21:00,592 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:21:00,595 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.4-9-g532efad, Compiled 2013/03/19 07:35:36
INFO 15:21:00,595 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 15:21:00,595 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 15:21:00,601 HelpFormatter - Program Args: -T RealignerTargetCreator -R /Genomes/hg19/hg19.fa -o /GATK_tryouts/sample_sorted.bam.list -I /GATK_tryouts/sample_sorted.marked.bam--num_threads 6

INFO 15:21:00,601 HelpFormatter - Date/Time: 2013/04/08 15:21:00
INFO 15:21:00,601 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:21:00,601 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:21:01,489 GenomeAnalysisEngine - Strictness is SILENT

But as I said, nothing happens after this message.

Is there something wrong with the command I used to run the program? How can I tell whether it is working?
Is there any guidance on how long this process should take with 6 threads and 4 or 8 GB of RAM?

I'm really worried about this issue since it blocks my whole analysis, so please help me if you can.

Best wishes

JL

Answers

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    Hi JL,

    There is definitely something wrong here. This analysis should not take that long, and even if it did, it would output regular progress estimates. Have you tried running it single-threaded to test whether it is the multi-threading that is causing the hang? Some platforms do not handle multithreading very well.
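
    A minimal sketch of that single-threaded test, reusing the paths from the original post (adjust them to your own setup). It is simply the original command with --num_threads left out. The rtc_cmd helper is a hypothetical name introduced here for illustration; it only prints the command so you can inspect it before launching anything.

    ```shell
    #!/bin/sh
    # Paths copied from the original post; change them to match your system.
    GATK_JAR=/programs/GenomeAnalysisTK.jar
    REF=/Genomes/hg19/hg19.fa
    BAM=/GATK_tryouts/sample_sorted.marked.bam
    OUT=/GATK_tryouts/sample_sorted.bam.list

    # Hypothetical helper: prints the single-threaded command.
    # It is the original command without --num_threads.
    rtc_cmd() {
        echo "java -Xmx4g -jar $GATK_JAR -T RealignerTargetCreator -R $REF -I $BAM -o $OUT"
    }

    # Dry run: show the command without executing it.
    rtc_cmd
    ```

    To launch the run for real, execute the printed line directly, or use eval "$(rtc_cmd)". If the hang persists without --num_threads, multithreading is probably not the cause.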

  • jllavin (Member)
    edited April 2013

    Thank you for your swift answer Geraldine,

    The truth is that the first time I ran it, I ran it single-threaded, and the same thing happened. That's why I decided to run it multi-threaded...
    I've just launched the single-threaded analysis once more, replacing sample_sorted.marked.bam with another file of the same kind (created following the same procedure described earlier in this post).

    My system configuration is the following (in case it helps):

    LSB Version: :core-3.1-amd64:core-3.1-ia32:core-3.1-noarch:graphics-3.1-amd64:graphics-3.1-ia32:graphics-3.1-noarch
    Distributor ID: ScientificSL
    Description: Scientific Linux SL release 5.5 (Boron)
    Release: 5.5

    Thanks for your kind support, Geraldine. I hope we can fix this ;)
