Service Notice: Due to the blizzard currently hammering the US Northeast, the Broad is shut down and the GATK forum will be mostly unattended while we hunker down and sip hot cocoa with marshmallows. Assuming the power stays on and we're able to dig ourselves out of the snow when it's all over, normal service should resume Wednesday or Thursday.

Changes in the logfile and a long wait between reading the BAMs and starting to Genotype

yfarjounyfarjoun Broad InstitutePosts: 15GATK Developer mod

I've noticed a change in the log files that now shows a period of "starting" before actually genotyping. This can be hours...or minutes... I looked in previous logs and saw that there aren't such line in the log, but the clock jumps a similar amount of time. I was wondering if this is a known phenomenon and what it the GATK doing during that time. Is there anything we can do to reduce that waiting period?

Here are examples:

old (calling with 4500 samples, glm BOTH):

INFO 12:21:37,515 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 1157.62 INFO 12:26:52,780 RMDTrackBuilder - Loading Tribble index from disk for file /seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.dbsnp.vcf WARN 12:26:54,098 VCFStandardHeaderLines$Standards - Repairing standard header line for field AF because -- count types disagree; header has UNBOUNDED but standard is A INFO 12:26:54,457 GenomeAnalysisEngine - Processing 43138 bp from intervals INFO 12:26:54,473 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] INFO 12:26:54,473 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining INFO 21:37:08,951 ProgressMeter - 17:26944207 0.00e+00 9.2 h 54587.4 w 0.0% Infinity w Infinity w INFO 21:40:40,002 ProgressMeter - 17:26945957 1.82e+02 9.2 h 301.8 w 0.9% 6.2 w 6.1 w INFO 21:41:53,697 ProgressMeter - 17:26946820 6.49e+02 9.2 h 84.8 w 1.5% 3.6 w 3.5 w

and here newer (calling 1500 sample, glm SNP):

INFO 18:42:42,025 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 24.55 INFO 18:42:47,648 RMDTrackBuilder - Loading Tribble index from disk for file /seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.dbsnp.vcf WARN 18:42:47,972 VCFStandardHeaderLines$Standards - Repairing standard header line for field AF because -- count types disagree; header has UNBOUNDED but standard is A INFO 18:42:48,192 GenomeAnalysisEngine - Processing 43138 bp from intervals INFO 18:42:48,206 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] INFO 18:42:48,206 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining INFO 18:43:18,301 ProgressMeter - starting 0.00e+00 30.1 s 49.8 w 100.0% 30.1 s 0.0 s INFO 18:43:48,310 ProgressMeter - starting 0.00e+00 60.1 s 99.4 w 100.0% 60.1 s 0.0 s INFO 18:44:18,344 ProgressMeter - starting 0.00e+00 90.1 s 149.0 w 100.0% 90.1 s 0.0 s INFO 18:44:48,353 ProgressMeter - starting 0.00e+00 2.0 m 198.7 w 100.0% 2.0 m 0.0 s INFO 18:45:18,363 ProgressMeter - starting 0.00e+00 2.5 m 248.3 w 100.0% 2.5 m 0.0 s INFO 18:45:48,373 ProgressMeter - starting 0.00e+00 3.0 m 297.9 w 100.0% 3.0 m 0.0 s INFO 18:46:18,382 ProgressMeter - starting 0.00e+00 3.5 m 347.5 w 100.0% 3.5 m 0.0 s INFO 18:46:48,391 ProgressMeter - starting 0.00e+00 4.0 m 397.1 w 100.0% 4.0 m 0.0 s INFO 18:47:18,401 ProgressMeter - starting 0.00e+00 4.5 m 446.7 w 100.0% 4.5 m 0.0 s INFO 18:47:48,411 ProgressMeter - starting 0.00e+00 5.0 m 496.4 w 100.0% 5.0 m 0.0 s INFO 18:48:18,426 ProgressMeter - starting 0.00e+00 5.5 m 546.0 w 100.0% 5.5 m 0.0 s INFO 18:48:48,436 ProgressMeter - starting 0.00e+00 6.0 m 595.6 w 100.0% 6.0 m 0.0 s INFO 18:49:18,446 ProgressMeter - starting 0.00e+00 6.5 m 645.2 w 100.0% 6.5 m 0.0 s INFO 18:49:48,455 ProgressMeter - starting 0.00e+00 7.0 m 694.9 w 100.0% 7.0 m 0.0 s INFO 18:50:18,465 ProgressMeter - starting 0.00e+00 7.5 m 744.5 w 100.0% 7.5 m 0.0 s INFO 18:50:48,475 ProgressMeter - starting 0.00e+00 8.0 m 794.1 w 100.0% 8.0 m 0.0 s INFO 18:51:18,484 ProgressMeter - starting 0.00e+00 8.5 m 843.7 w 100.0% 8.5 m 0.0 s INFO 18:51:48,510 ProgressMeter - starting 0.00e+00 9.0 m 893.4 w 100.0% 9.0 m 0.0 s INFO 18:52:18,520 ProgressMeter - starting 0.00e+00 9.5 m 943.0 w 100.0% 9.5 m 0.0 s INFO 18:52:48,530 ProgressMeter - starting 0.00e+00 10.0 m 992.6 w 100.0% 10.0 m 0.0 s INFO 18:53:18,541 ProgressMeter - starting 0.00e+00 10.5 m 1042.2 w 100.0% 10.5 m 0.0 s INFO 18:53:48,552 ProgressMeter - starting 0.00e+00 11.0 m 1091.8 w 100.0% 11.0 m 0.0 s INFO 18:54:18,561 ProgressMeter - starting 0.00e+00 11.5 m 1141.5 w 100.0% 11.5 m 0.0 s INFO 18:54:48,570 ProgressMeter - starting 0.00e+00 12.0 m 1191.1 w 100.0% 12.0 m 0.0 s INFO 18:55:18,579 ProgressMeter - starting 0.00e+00 12.5 m 1240.7 w 100.0% 12.5 m 0.0 s INFO 18:55:48,590 ProgressMeter - 18:12725582 1.00e+01 13.0 m 129.0 w 0.5% 45.4 h 45.2 h

I've also noticed that output seems to be quite wrong about how much work it has left to do...perhaps these two things are related?

I would be grateful for any information.

Best Answer

Answers

Sign In or Register to comment.