
Problem with HaplotypeCaller and GenotypeGVCFs

Hi, I'm just wondering if you have any experience with this problem.

I am following the GATK Best Practices for a targeted sequencing experiment.

After the GenotypeGVCFs step, the majority of variants are marked "MQ=NaN".

Inspection of g.vcf reveals that the only sites which have a numerical MQ are those which are homozygous for the alternative allele.

I've included below the three commands used to get to that point.

I have tried this with and without -A AS_RMSMappingQuality.

Thanks so much for your help!

for filename in *.bam; do $gatk -T HaplotypeCaller -nct 15 -R ucsc.hg19.fasta -I "${filename}" -L bedfile.bed -ERC GVCF -o "${filename}.g.vcf"; done

ls *.g.vcf > gee_vee_cee_eff.list

$gatk -T GenotypeGVCFs -R ucsc.hg19.fasta -nt 15 -V gee_vee_cee_eff.list -o GVCFs_jointcalls.vcf

Answers

  • shlee, Cambridge (Member, Broadie)

    Hi @markdoherty,

    Can you post the records from the input GVCFs and the final cohort VCF for the site in question? Also, when you say

    "Inspection of g.vcf reveals that the only sites which have a numerical MQ are those which are homozygous for the alternative allele"

    does this mean that the MQ=NaN sites are a mix of het and hom-ref call sites? Do all of these have MQ=NaN, or just some of them?
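
One way to check which genotype classes carry MQ=NaN is to tally the records of a single-sample g.vcf by genotype and MQ status. The following is only a rough sketch, not a GATK tool: the function names (genotype_class, mq_status, tally_mq_by_genotype) are made up for illustration, and it assumes an uncompressed, tab-delimited, single-sample GVCF in which the genotype is in the first sample column. Records with no MQ key in INFO are counted as "absent".

```python
from collections import Counter


def genotype_class(gt):
    """Classify a GT string such as 0/0, 0/1, 1/1 or 1|2 (hypothetical helper)."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "no-call"
    if all(a == "0" for a in alleles):
        return "hom-ref"
    if len(set(alleles)) == 1:
        return "hom-alt"
    return "het"


def mq_status(info):
    """Return 'NaN', 'numeric' or 'absent' for the MQ key of an INFO field."""
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "MQ":
            return "NaN" if value.lower() == "nan" else "numeric"
    return "absent"


def tally_mq_by_genotype(lines):
    """Count records per (genotype class, MQ status) using the first sample column."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns, nothing to classify
        fmt_keys = fields[8].split(":")
        if "GT" not in fmt_keys:
            continue
        gt = fields[9].split(":")[fmt_keys.index("GT")]
        counts[(genotype_class(gt), mq_status(fields[7]))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (assumed): python tally_mq.py sample.bam.g.vcf
    with open(sys.argv[1]) as handle:
        for (gt_class, status), n in sorted(tally_mq_by_genotype(handle).items()):
            print(f"{gt_class}\t{status}\t{n}")
```

If the het and hom-ref rows show only NaN while hom-alt shows only numeric values, that matches the pattern described in the question; a mix within one class would suggest something site-specific instead.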
