VariantRecalibrator returning empty result files, no error, just "Killed"

LindsayLiang Member
edited September 2017 in Ask the GATK team

Hi, I'm running the VariantRecalibrator step on a fairly small data set (50 samples in the cohort, but only chr21 from a whole-exome sequencing project). GATK terminates early and returns empty result files without throwing any errors; the only message is "Killed".

The output is as follows:

INFO  17:53:58,900 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  17:53:58,908 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.7-0-gcfedb67, Compiled 2016/12/12 11:21:18 
INFO  17:53:58,908 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute 
INFO  17:53:58,909 HelpFormatter - For support and documentation go to 
INFO  17:53:58,909 HelpFormatter - [Thu Sep 21 17:53:58 UTC 2017] Executing on Linux 4.9.41-moby amd64 
INFO  17:53:58,910 HelpFormatter - OpenJDK 64-Bit Server VM 1.8.0_102-8u102-b14.1-1~bpo8+1-b14 
INFO  17:53:58,914 HelpFormatter - Program Args: -T VariantRecalibrator -R /vqsr_snp_model/localDir/human_g1k_v37.fasta -nt 8 -mode SNP -input /vqsr_snp_model/localDir/ -recalFile /vqsr_snp_model/localDir/Output/ -tranchesFile /vqsr_snp_model/localDir/Output/ -rscriptFile /vqsr_snp_model/localDir/Output/ --use_annotation QD --use_annotation MQ --use_annotation MQRankSum --use_annotation FS --use_annotation SOR --resource:hapmap,known=false,training=true,truth=true,prior=15.0 /vqsr_snp_model/localDir/hapmap_3.3.b37.vcf --resource:omni,known=false,training=true,truth=true,prior=12.0 /vqsr_snp_model/localDir/1000G_omni2.5.b37.vcf --resource:1000G,known=false,training=true,truth=false,prior=10.0 /vqsr_snp_model/localDir/1000G_phase1.snps.high_confidence.b37.vcf --resource:dbsnp,known=true,training=false,truth=false,prior=2.0 /vqsr_snp_model/localDir/dbsnp_138.b37.vcf 
INFO  17:53:58,929 HelpFormatter - Executing as [email protected] on Linux 4.9.41-moby amd64; OpenJDK 64-Bit Server VM 1.8.0_102-8u102-b14.1-1~bpo8+1-b14. 
INFO  17:53:58,929 HelpFormatter - Date/Time: 2017/09/21 17:53:58 
INFO  17:53:58,930 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  17:53:58,930 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  17:53:58,983 GenomeAnalysisEngine - Strictness is SILENT 
INFO  17:53:59,163 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000 
INFO  17:53:59,805 MicroScheduler - Running the GATK in parallel mode with 8 total threads, 1 CPU thread(s) for each of 8 data thread(s), of 4 processors available on this machine 
WARN  17:53:59,805 MicroScheduler - Number of requested GATK threads 8 is more than the number of available processors on this machine 4 
INFO  17:54:00,034 GenomeAnalysisEngine - Preparing for traversal 
INFO  17:54:00,042 GenomeAnalysisEngine - Done preparing for traversal 
INFO  17:54:00,043 ProgressMeter -                 | processed |    time |    per 1M |           |   total | remaining 
INFO  17:54:00,043 ProgressMeter -        Location |     sites | elapsed |     sites | completed | runtime |   runtime 
INFO  17:54:00,054 TrainingSet - Found hapmap track:    Known = false   Training = true     Truth = true    Prior = Q15.0 
INFO  17:54:00,054 TrainingSet - Found omni track:  Known = false   Training = true     Truth = true    Prior = Q12.0 
INFO  17:54:00,055 TrainingSet - Found 1000G track:     Known = false   Training = true     Truth = false   Prior = Q10.0 
INFO  17:54:00,055 TrainingSet - Found dbsnp track:     Known = true    Training = false    Truth = false   Prior = Q2.0 

The input was:

java -Xmx12g 
-jar /usr/GenomeAnalysisTK.jar 
-T VariantRecalibrator 
-R human_g1k_v37.fasta
-nt 8 
-mode SNP
--use_annotation QD 
--use_annotation MQ 
--use_annotation MQRankSum 
--use_annotation FS 
--use_annotation SOR 
--use_annotation ReadPosRankSum
--resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.vcf 
--resource:omni,known=false,training=true,truth=true,prior=12.0 1000G_omni2.5.b37.vcf
--resource:1000G,known=false,training=true,truth=false,prior=10.0 1000G_phase1.snps.high_confidence.b37.vcf
--resource:dbsnp,known=true,training=false,truth=false,prior=2.0 dbsnp_138.b37.vcf


Edit: I ran the same command again on all chromosomes of the whole-exome data, and the same thing occurred.
