Problem with HaplotypeCaller and GenotypeGVCFs

Hi, just wondering if you have any experience with this problem.

I am following GATK best practices for a targeted sequencing experiment.

After the GenotypeGVCFs step, the majority of variants are annotated with "MQ=NaN".

Inspection of g.vcf reveals that the only sites which have a numerical MQ are those which are homozygous for the alternative allele.

I've included below the three commands used to get to that point.

I have tried this with and without -A AS_RMSMappingQuality.

Thanks so much for your help!

for filename in *.bam; do $gatk -T HaplotypeCaller -nct 15 -R ucsc.hg19.fasta -I "${filename}" -L bedfile.bed -ERC GVCF -o "${filename}.g.vcf"; done

ls *.g.vcf > gee_vee_cee_eff.list

$gatk -T GenotypeGVCFs -R ucsc.hg19.fasta -nt 15 -V gee_vee_cee_eff.list -o GVCFs_jointcalls.vcf

Answers

  • shlee, Cambridge (Member, Broadie, Moderator)

    Hi @markdoherty,

    Can you post the records for the input GVCFs and the final cohort VCF for the site in question? Also, when you say

        "Inspection of g.vcf reveals that the only sites which have a numerical MQ are those which are homozygous for the alternative allele"

    does this mean that the MQ=NaN sites are a mix of het and hom-ref call sites? Do all of these have MQ=NaN, or just some of them?
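
One way to answer that question is to tally the MQ status per genotype class in a single-sample g.vcf. The snippet below is only a rough sketch, not a GATK tool: the function name mq_by_genotype is made up for illustration, and it assumes an uncompressed file with GT as the first FORMAT field and the sample in column 10. Records with no MQ annotation at all are reported as "absent" rather than NaN.

```shell
# Hypothetical helper (not part of GATK): tally MQ status per genotype class.
# Assumptions: uncompressed VCF/GVCF, GT is the first FORMAT field,
# and the sample of interest is in column 10.
mq_by_genotype() {
    awk -F'\t' '
        /^#/ { next }                          # skip header lines
        {
            # Classify the MQ annotation in the INFO column (column 8).
            mq = "absent"
            n = split($8, info, ";")
            for (i = 1; i <= n; i++) {
                if (info[i] ~ /^MQ=/) {
                    mq = (info[i] == "MQ=NaN") ? "NaN" : "numeric"
                }
            }
            # Classify the genotype from the first subfield of the sample column.
            split($10, fields, ":")
            gt = fields[1]
            m = split(gt, allele, "[/|]")
            if (m != 2 || index(gt, ".") > 0) cls = "other"
            else if (allele[1] == "0" && allele[2] == "0") cls = "hom-ref"
            else if (allele[1] == allele[2]) cls = "hom-alt"
            else cls = "het"
            count[cls "\t" mq]++
        }
        END { for (k in count) print k "\t" count[k] }
    ' "$1" | LC_ALL=C sort
}
```

Calling it as mq_by_genotype sample1.bam.g.vcf (file name is a placeholder) prints one tab-separated line per genotype class and MQ status with a record count. On a multi-sample VCF it only looks at the first sample, so it is best run on the per-sample GVCFs.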
