
VariantRecalibrator: Values for MQRankSum annotation not detected for ANY training variant in the i

Hi, I am running GATK and have run into a problem. Please help me find a solution, thank you.
When I ran HaplotypeCaller, I used the following command:
java -Xmx2g -jar $GATK -T HaplotypeCaller -R $genome -I L02.dup.relig.recal.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf -log HaplotypeCaller.log
java -Xmx2g -jar $GATK -T VariantRecalibrator -R $genome -input raw_variants.vcf -resource:hapmap,known=false,training=true,truth=true,prior=15.0 $hapmap -resource:omni,known=false,training=true,truth=false,prior=12.0 $omni -resource:1000G,known=false,training=true,truth=false,prior=10.0 $phase1snp -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 $dbsnp -an DP -an QD -an FS -an MQRankSum -an ReadPosRankSum -mode SNP -tranche 100.0 -tranche 99.0 -tranche 99.0 -tranche 90.0 -recalFile L02_snp.recal -tranchesFile L02_snp.tranches -rscriptFile L02_snp.plots.R

Then I ran VariantRecalibrator (the second command above), and it fails at the end with the following error:
Bad input: Values for MQRankSum annotation not detected for ANY training variant in the input callset. VariantAnnotator may be used to add these annotations.
The same problem also occurs for -an ReadPosRankSum.
I ran VariantAnnotator as the message suggested, like this:
java -Xmx2g -jar ../../../softwares/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar -T VariantAnnotator -R /media/shy/shy1/mystuff/GATK_ref/ucsc_hg19/ucsc.hg19.fasta -V raw_variants.vcf --dbsnp ../../GATK_ref/hg19/dbsnp_138.hg19.vcf -o raw_variants_anno.vcf -A BaseCounts -A BaseQualityRankSumTest -A ChromosomeCounts -A FisherStrand -A GCContent -A HaplotypeScore -A MappingQualityRankSumTest -A ReadPosRankSumTest -A RMSMappingQuality -A QualByDepth -A AlleleBalanceBySample -A DepthPerAlleleBySample -nt 4
I then ran VariantRecalibrator again, but it still does not work.

How can I solve this problem? The GATK version is 3.5.
Thank you!
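
Before re-running VariantRecalibrator, it can help to confirm whether the annotation named in the error actually appears in the records of the call set. The sketch below is not part of GATK: `count_info_key` is a hypothetical helper that counts the data records of a VCF whose INFO column carries a given key, matching the key exactly and case-sensitively. It assumes an uncompressed, tab-delimited VCF with INFO in column 8.

```shell
# count_info_key <vcf> <key>
# Hypothetical helper (not a GATK tool): prints how many data records
# contain <key> in their INFO column (column 8). Exact, case-sensitive match.
count_info_key() {
    vcf="$1"
    key="$2"
    awk -F '\t' -v key="$key" '
        /^#/ { next }                          # skip header lines
        {
            n = split($8, fields, ";")         # INFO entries are ;-separated
            for (i = 1; i <= n; i++) {
                split(fields[i], kv, "=")      # kv[1] is the key (or a flag name)
                if (kv[1] == key) { hits++; break }
            }
        }
        END { print hits + 0 }                 # print 0 when nothing matched
    ' "$vcf"
}
```

For example, `count_info_key raw_variants.vcf MQRankSum` prints the number of records carrying that key. A result of 0 would be consistent with the error above; a non-zero count does not by itself guarantee that the annotated sites overlap the training resources.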

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator)

    @gabengdou
    Hi,

    Can you tell us what is in your input VCF? How many samples and are they whole exome or whole genome? Can you confirm that the MQRankSum annotation is present for some of the sites?

    Thanks,
    Sheila

  • Hi,
    I came across the same error while running GATK v3.7. I did joint genotyping on 291 exomes using HaplotypeCaller, and then ran VariantRecalibrator. My input command and the error are posted below.
    java -jar /mnt/boubyan/OPT/Software/GATK/GenomeAnalysisTK.jar -T VariantRecalibrator -R /mnt/boubyan/data/Genomics/Gaurav/Genome/ucsc_hg19/ucsc.hg19.fasta -input /mnt/alpha/KW_exomes/291_KW_exomes/GVCF_ann_files/combined_vcf/291.vcf -resource:hapmap,known=false,training=true,truth=true,prior=15.0 /mnt/boubyan/OPT/Software/GATK/resources/hapmap_3.3.hg19.sites.vcf -resource:omni,known=false,training=true,truth=true,prior=12.0 /mnt/boubyan/OPT/Software/GATK/resources/1000G_omni2.5.hg19.sites.vcf -resource:1000G,known=false,training=true,truth=false,prior=10.0 /mnt/boubyan/OPT/Software/GATK/resources/1000G_phase1.snps.high_confidence.hg19.sites.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 /mnt/boubyan/OPT/Software/GATK/resources/dbsnp_138.hg19.vcf -an QD -an FS -an SOR -an MQ -an MQrankSum -an ReadPosRankSum -an InbreedingCoeff -mode SNP -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 -recalFile /mnt/alpha/KW_exomes/291_KW_exomes/GVCF_ann_files/combined_vcf/VQSR_FILTERED/recalibrate_SNP.recal -tranchesFile /mnt/alpha/KW_exomes/291_KW_exomes/GVCF_ann_files/combined_vcf/VQSR_FILTERED/recalibrate_SNP.tranches -rscriptFile /mnt/alpha/KW_exomes/291_KW_exomes/GVCF_ann_files/combined_vcf/VQSR_FILTERED/recalibrate_SNP_plots.R

    ERROR ------------------------------------------------------------------------------------------
    ERROR A USER ERROR has occurred (version 3.7-0-gcfedb67):
    ERROR
    ERROR This means that one or more arguments or inputs in your command are incorrect.
    ERROR The error message below tells you what is the problem.
    ERROR
    ERROR If the problem is an invalid argument, please check the online documentation guide
    ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
    ERROR
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions https://software.broadinstitute.org/gatk
    ERROR
    ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
    ERROR
    ERROR MESSAGE: Bad input: Values for MQrankSum annotation not detected for ANY training variant in the input callset. VariantAnnotator may be used to add these annotations.

    Please help,
    Thanks

  • Sheila, Broad Institute (Member, Broadie, Moderator)

    @sej1985
    Hi,

    Can you confirm that MQRankSum is indeed present in your VCF records? What happens if you run ValidateVariants on your input VCF?
    Have a look at this thread for some more tips.

    One last thing you can try is adding `--MQCapForLogitJitterTransform` to your command.

    -Sheila

    P.S. Can you also try setting --maxGaussians to a smaller value and check whether that helps?
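
One more thing that may be worth ruling out is a spelling mismatch between the names passed with `-an` and the INFO keys declared in the VCF header. The 3.7 command above passes `-an MQrankSum`, while the error quoted in the original question spells the annotation `MQRankSum`. The sketch below uses a hypothetical helper, `check_an_names` (not a GATK tool), which assumes an uncompressed VCF whose header declares keys in the usual `##INFO=<ID=NAME,...>` form and whose annotation names are plain alphanumerics. It reports, with a case-sensitive match, whether each name is declared.

```shell
# check_an_names <vcf> <name>...
# Hypothetical helper (not a GATK tool): for each annotation name, report
# whether the VCF header declares an INFO key with exactly that spelling.
check_an_names() {
    vcf="$1"
    shift
    for name in "$@"; do
        # Header declarations look like: ##INFO=<ID=MQRankSum,Number=1,...>
        if grep -q "^##INFO=<ID=${name}," "$vcf"; then
            echo "$name: declared"
        else
            echo "$name: NOT declared"
        fi
    done
}
```

Calling it with the same names given to `-an`, for instance `check_an_names 291.vcf QD FS SOR MQ MQrankSum ReadPosRankSum InbreedingCoeff`, would flag any name whose spelling or capitalization differs from the header. A name being declared in the header does not prove that any record carries a value for it, so this complements rather than replaces checking the records themselves.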
