Bug Bulletin: we have identified a bug that affects indexing when producing gzipped VCFs. This will be fixed in the upcoming 3.2 release; in the meantime you need to reindex gzipped VCFs using Tabix.

Realigner target creator takes to lon time for chromosome 1

monsunmonsun Posts: 4Member

Hello,

I try to run the realigner target creator with the data divided on chromosomes. It seems to run smooth but the runtime for chromosome 1 seems to never end. I have checked my bam files and as expected the one with chr 1 is a little bit bigger than chr 2 but nothing proportional to the differences in predicted runtime. The version i use is GATK 2.3.0 and this is how i run it:

java -Xmx12g -jar programs/GenomeAnalysisTK-2.3-0/GenomeAnalysisTK.jar -l INFO -T VariantRecalibrator -R Homo_sapiens.GRCh37.57_dna_concat.fa -recalFile allchr_varrecal_BOTH_comb_ref.intervals -rscriptFile 15_allchr_varrecal_BOTH_comb_ref.intervals.plots.R -tranchesFile allchr_varrecal_BOTH_comb_ref.intervals.tranches -resource:hapmap,VCF,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.vcf -resource:omni,VCF,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.sites.vcf -resource:dbsnp,VCF,known=true,training=false,truth=false,prior=8.0 dbsnp_135.b37.vcf -resource:mills,VCF,known=true,training=true,truth=true,prior=12.0 Mills_and_1000G_gold_standard.indels.hg19.sites.vcf -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ --mode BOTH -nt 8 -input allchr_real_recal_resrt_raw_BOTH_comb_ref.vcf -L Agilent_SureSelect.V4.GRCh37.70_targets_nochr.bed.pad100.interval_list --pedigreeValidationType SILENT --pedigree my_fam.fam

predicted runtime looks like:

INFO 15:11:33,778 ProgressMeter - 1:33587200 3.36e+07 2.3 h 4.1 m 1.1% 8.8 d 8.7 d INFO 15:11:33,778 ProgressMeter - 11:73495886 1.82e+09 2.3 h 4.5 s 61.0% 3.7 h 87.4 m INFO 15:11:33,778 ProgressMeter - 10:6623707 1.68e+09 2.3 h 4.9 s 54.5% 4.2 h 114.3 m INFO 15:11:33,778 ProgressMeter - 11:572298 1.82e+09 2.3 h 4.5 s 58.7% 3.9 h 96.4 m INFO 15:11:33,779 ProgressMeter - 11:122078003 1.82e+09 2.3 h 4.5 s 62.6% 3.6 h 81.7 m INFO 15:11:53,780 ProgressMeter - 11:10589211 1.82e+09 2.3 h 4.5 s 59.0% 3.9 h 95.3 m INFO 15:11:53,780 ProgressMeter - 11:105031869 1.82e+09 2.3 h 4.5 s 62.1% 3.7 h 83.9 m INFO 15:11:53,781 ProgressMeter - 11:46994477 1.82e+09 2.3 h 4.5 s 60.2% 3.8 h 90.8 m INFO 15:12:03,779 ProgressMeter - 11:8677862 1.82e+09 2.3 h 4.5 s 58.9% 3.9 h 95.7 m INFO 15:12:03,779 ProgressMeter - 11:130690257 1.82e+09 2.3 h 4.5 s 62.9% 3.6 h 81.1 m INFO 15:12:13,779 ProgressMeter - 11:84816026 1.82e+09 2.3 h 4.5 s 61.4% 3.7 h 86.4 m INFO 15:12:13,779 ProgressMeter - 10:17039143 1.68e+09 2.3 h 4.9 s 54.8% 4.2 h 113.3 m INFO 15:12:23,780 ProgressMeter - 11:18556991 1.82e+09 2.3 h 4.5 s 59.3% 3.9 h 94.6 m INFO 15:12:23,780 ProgressMeter - 11:113365440 1.82e+09 2.3 h 4.5 s 62.3% 3.7 h 83.2 m INFO 15:12:23,782 ProgressMeter - 11:55284794 1.82e+09 2.3 h 4.5 s 60.4% 3.8 h 90.1 m INFO 15:12:33,780 ProgressMeter - 1:33783808 3.38e+07 2.3 h 4.1 m 1.1% 8.8 d 8.7 d INFO 15:12:33,780 ProgressMeter - 12:3311608 1.95e+09 2.3 h 4.2 s 63.1% 3.6 h 80.5 m INFO 15:12:43,779 ProgressMeter - 11:93292307 1.82e+09 2.3 h 4.6 s 61.7% 3.7 h 85.8 m INFO 15:12:43,780 ProgressMeter - 10:24841443 1.68e+09 2.3 h 4.9 s 55.1% 4.2 h 112.5 m INFO 15:12:43,780 ProgressMeter - 11:19408689 1.82e+09 2.3 h 4.6 s 59.3% 3.9 h 94.8 m INFO 15:12:53,965 ProgressMeter - 11:26512308 1.82e+09 2.3 h 4.6 s 59.5% 3.9 h 94.0 m INFO 15:12:53,965 ProgressMeter - 11:121738354 1.82e+09 2.3 h 4.6 s 62.6% 3.7 h 82.6 m INFO 15:12:53,965 ProgressMeter - 11:63468191 1.82e+09 2.3 h 4.6 s 60.7% 3.8 h 89.4 m INFO 15:13:03,781 ProgressMeter - 12:11960820 1.95e+09 2.3 h 4.3 s 63.4% 3.6 h 79.8 m INFO 15:13:13,780 ProgressMeter - 11:101551879 1.82e+09 2.3 h 4.6 s 61.9% 3.7 h 85.1 m INFO 15:13:13,780 ProgressMeter - 10:32238974 1.68e+09 2.3 h 4.9 s 55.3% 4.2 h 111.8 m INFO 15:13:13,780 ProgressMeter - 11:27204534 1.82e+09 2.3 h 4.6 s 59.5% 3.9 h 94.1 m INFO 15:13:23,966 ProgressMeter - 11:71529440 1.82e+09 2.3 h 4.6 s 61.0% 3.8 h 88.8 m INFO 15:13:23,968 ProgressMeter - 11:34170083 1.82e+09 2.3 h 4.6 s 59.8% 3.9 h 93.4 m INFO 15:13:23,977 ProgressMeter - 11:129889715 1.82e+09 2.3 h 4.6 s 62.9% 3.7 h 81.9 m INFO 15:13:33,781 ProgressMeter - 1:33980416 3.40e+07 2.3 h 4.1 m 1.1% 8.8 d 8.7 d

Any suggestions?

Regards,

Måns

Best Answers

Answers

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 5,235Administrator, GSA Member admin

    Hi there,

    Your question is about RealignerTargetCreator but your command line is for VariantRecalibrator... Depending on which it is you're having problems with the recommendations are very different. So please check what you're writing before posting your question.

    Geraldine Van der Auwera, PhD

  • monsunmonsun Posts: 4Member
    edited March 2013

    Sorry! I was a bit fast there, this is the command line:

    java -Xmx3g -Djava.io.tmpdir=$SLURM_JOB_ID/1/ -jar GenomeAnalysisTK-2.3-0/GenomeAnalysisTK.jar -l INFO -T RealignerTargetCreator -R Homo_sapiens.GRCh37.57_dna_concat.fa -known 1000G_phase1.indels.hg19.vcf -known Mills_and_1000G_gold_standard.indels.hg19.sites.vcf -I my_individual_sorted_pmd_chr1.bam -o my_individual_sorted_pmd_chr1_real.intervals &
    

    Do not know if it's worth mentioning, but the run is done on 8 cores simultaneous, one chromosome for each core.

    Thank you for the fast reply!

    Post edited by Geraldine_VdAuwera on
  • monsunmonsun Posts: 4Member

    OK, this is what I can see about downsampling in the output:

    INFO 12:54:33,665 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000, Using the new downsampling implementation

    I don't find a downsampling flag in http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_indels_RealignerTargetCreator.html how do i change this?

    However i will try and do as you say with running this chromosome on its own node.

    Thank you!

Sign In or Register to comment.