Test-drive the GATK tools and Best Practices pipelines on Terra


Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.

VariantRecalibrator Error

My task:

task VariantRecalibratorSNPs {

  File Raw_VCF
  String CohortName
  String Chromosome
  File? Parallelization
  String? InbreedingCoeff

  Map[String, String] Paths
  Array[String] RuntimeParams

  command {
    ${Paths["java"]} -Xmx4G -jar ${Paths["gatk"]} \
      -T VariantRecalibrator \
      -R ${Paths["refFasta"]} \
      -input ${Raw_VCF} \
      -recalFile ${CohortName}_${Chromosome}_SNPs.recal \
      -tranchesFile ${CohortName}_${Chromosome}_SNPs.tranches \
      -nt 4 \
      -L ${default=Chromosome Parallelization} \
      -resource:hapmap,known=false,training=true,truth=true,prior=15.0 ${Paths["hapmap"]} \
      -resource:omni,known=false,training=true,truth=true,prior=12.0 ${Paths["omni"]} \
      -resource:1000G,known=false,training=true,truth=false,prior=10.0 ${Paths["1000G"]} \
      -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 ${Paths["dbsnp"]} \
      -an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR -an DP ${default="" InbreedingCoeff}\
      -mode SNP \
  }

  runtime {
    runtime_minutes: RuntimeParams[3]
    cpus: RuntimeParams[12]
    requested_memory_mb_per_core: RuntimeParams[21]
    queue: RuntimeParams[30]
  }

  output {
    File recal_SNPs_VCF = "${CohortName}_${Chromosome}_SNPs.recal"
    File tranches_SNPs_VCF = "${CohortName}_${Chromosome}_SNPs.tranches"
  }
}

GATK Error (full stderr attached):

##### ERROR --
##### ERROR stack trace
org.broadinstitute.gatk.utils.exceptions.ReviewedGATKException: Unable to retrieve result
        at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:190)
        at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:323)
        at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:123)
        at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
        at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
        at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:108)
Caused by: java.lang.IllegalArgumentException: No data found.
        at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibratorEngine.generateModel(VariantRecalibratorEngine.java:88)
        at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:536)
        at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:191)
        at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.notifyTraversalDone(HierarchicalMicroScheduler.java:226)
        at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:183)
        ... 5 more
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 3.8-0-ge9d806836):
##### ERROR
##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions https://software.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Unable to retrieve result
##### ERROR ------------------------------------------------------------------------------------------

Interesting to note, my VariantRecalibrator for Indels works just fine:

task VariantRecalibratorIndels {

  File Raw_VCF
  String CohortName
  String Chromosome
  File? Parallelization
  String? InbreedingCoeff

  Map[String, String] Paths
  Array[String] RuntimeParams

  command {
    ${Paths["java"]} -Xmx4G -jar ${Paths["gatk"]} \
      -T VariantRecalibrator \
      -R ${Paths["refFasta"]} \
      -input ${Raw_VCF} \
      -recalFile ${CohortName}_${Chromosome}_Indels.recal \
      -tranchesFile ${CohortName}_${Chromosome}_Indels.tranches \
      -nt 4 \
      -L ${default=Chromosome Parallelization} \
      --maxGaussians 4 \
      -resource:mills,known=false,training=true,truth=true,prior=12.0 ${Paths["mills"]} \
      -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 ${Paths["dbsnp"]} \
      -an QD -an DP -an FS -an SOR -an ReadPosRankSum -an MQRankSum ${default="" InbreedingCoeff} \
      -mode INDEL \
  }

  runtime {
    runtime_minutes: RuntimeParams[4]
    cpus: RuntimeParams[13]
    requested_memory_mb_per_core: RuntimeParams[22]
    queue: RuntimeParams[31]
  }

  output {
    File recal_Indels_VCF = "${CohortName}_${Chromosome}_Indels.recal"
    File tranches_Indels_VCF = "${CohortName}_${Chromosome}_Indels.tranches"
  }
}

I am running all the most recent versions. Attaching my entire script for reference.

Thanks a lot,

Alon

Best Answer

Answers

Sign In or Register to comment.