Problem with HaplotypeCaller and GenotypeGVCFs

Hi, just wondering if you have any experience with this problem.

I am following the GATK Best Practices for a targeted sequencing experiment.

After the GenotypeGVCFs phase, the majority of variants are marked "MQ=NaN".

Inspection of the g.vcf files reveals that the only sites which have a numerical MQ are those which are homozygous for the alternative allele.

I've included below the three commands used to get to that point.

I have tried this with and without -A AS_RMSMappingQuality.

Thanks so much for your help!

for filename in *.bam; do $gatk -T HaplotypeCaller -nct 15 -R ucsc.hg19.fasta -I "${filename}" -L bedfile.bed -ERC GVCF -o "${filename}.g.vcf"; done

ls *.g.vcf > gee_vee_cee_eff.list

$gatk -T GenotypeGVCFs -R ucsc.hg19.fasta -nt 15 -V gee_vee_cee_eff.list -o GVCFs_jointcalls.vcf
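The inspection described above can be made reproducible with a small tally of genotype class against MQ status. This is only a sketch and not a GATK tool: the function name mq_by_genotype is hypothetical, and it assumes an uncompressed, single-sample, diploid VCF or g.vcf in which GT is the first FORMAT field.

```shell
# Tabulate genotype class against MQ status for one uncompressed,
# single-sample VCF/GVCF. Hypothetical helper, not part of GATK.
# Usage: mq_by_genotype sample.g.vcf
mq_by_genotype() {
  awk -F'\t' '
    /^#/ { next }                          # skip header lines
    {
      # Classify the MQ annotation in the INFO column (column 8).
      mq = "absent"
      n = split($8, info, ";")
      for (i = 1; i <= n; i++) {
        if (info[i] ~ /^MQ=/) {
          mq = (info[i] == "MQ=NaN") ? "NaN" : "numeric"
        }
      }
      # Classify the genotype from the first sample column (column 10).
      # Assumes diploid calls with GT as the first FORMAT field.
      split($10, sample, ":")
      gt = sample[1]
      gsub(/\|/, "/", gt)                  # treat phased and unphased alike
      split(gt, a, "/")
      if (gt ~ /\./)                        cls = "no-call"
      else if (a[1] == "0" && a[2] == "0")  cls = "hom-ref"
      else if (a[1] == a[2])                cls = "hom-var"
      else                                  cls = "het"
      count[cls "\t" mq]++
    }
    END { for (k in count) print k "\t" count[k] }
  ' "$1" | LC_ALL=C sort
}
```

Running it on one of the per-sample files, for example mq_by_genotype sample.bam.g.vcf (a placeholder file name), prints one line per combination of genotype class (het, hom-ref, hom-var, no-call) and MQ status (numeric, NaN, absent), followed by the record count.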

Answers

  • shlee (Cambridge) Member, Broadie, Moderator

    Hi @markdoherty,

    Can you post the records from the input GVCFs and from the final cohort VCF for the site in question? Also, when you say

    "Inspection of g.vcf reveals that the only sites which have a numerical MQ are those which are homozygous for the alternative allele"

    does this mean that the MQ=NaN sites are a mix of het and hom-ref call sites? Do all of these sites have MQ=NaN, or just some of them?
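    One way to collect the requested records is to pull every line that covers the site from each per-sample g.vcf and from the cohort VCF. The following is a rough sketch rather than a GATK command: the function name records_at_site is hypothetical, it assumes uncompressed VCF text files, and it matches a record either by its POS or by the END= value that GVCF reference blocks carry in INFO. It does not account for the span of deletions.

```shell
# Print every record covering one position from each listed VCF,
# prefixed by the file name. Hypothetical helper, not part of GATK.
# Usage: records_at_site CHROM POS file1.vcf [file2.vcf ...]
records_at_site() {
  chrom=$1; pos=$2; shift 2
  for f in "$@"; do
    awk -F'\t' -v c="$chrom" -v p="$pos" -v name="$f" '
      /^#/ { next }                        # skip header lines
      $1 == c {
        start = $2 + 0
        end = start
        # GVCF reference blocks store their last position as END= in INFO.
        if (match($8, /(^|;)END=[0-9]+/)) {
          s = substr($8, RSTART, RLENGTH)
          sub(/.*END=/, "", s)
          end = s + 0
        }
        if (start <= p + 0 && p + 0 <= end) print name ": " $0
      }
    ' "$f"
  done
}
```

    For example, records_at_site chr1 12345 *.g.vcf GVCFs_jointcalls.vcf would list the matching line from every input GVCF and from the joint-called VCF; chr1 and 12345 are placeholders for the actual site.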
