Missing genotypes despite high number of reads

Dear GATK team,

GenotypeGVCFs sets all genotypes to missing across a large part of a gene I am interested in, even though every individual appears to have plenty of reads at those sites. The problem occurs with both version 3.5 and version 3.7. It makes no difference whether HaplotypeCaller was run with -ERC BP_RESOLUTION or in the standard GVCF mode, or whether I run GenotypeGVCFs on just one individual or on many at once. Could it be related to the fact that this is a mitochondrial region, which has higher sequencing depth and theoretically no heterozygous positions?

Here is an example from the VCF file produced by GenotypeGVCFs:
chrM 16758 . A . 67.16 . DP=14015 GT:AD:DP ./.:444,9:453 ./.:257,0:257 (...)

In the g.vcf file of the first individual this site looks like this (the 9 alternative allele reads are likely contamination):
chrM 16758 . A . . . GT:AD:DP:GQ:PL 0/0:444,9:453:0:0,0,0

And in the second individual g.vcf:
chrM 16758 . A . . . GT:AD:DP:GQ:PL 0/0:257,0:257:0:0,0,0

Any advice would be highly appreciated.

Many thanks!

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    @JoanaM
    Hi,

    The genotypes are being set to no-calls because the GQs are all 0. Notice that the PLs are all 0 as well. Can you post IGV screenshots of the original BAM file and the bamout file? Please include ~300 bases before and after the site of interest.

    Thanks,
    Sheila
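
To see how widespread the GQ=0 pattern is, a small script can check each sample field for a zero GQ or for PLs that are all identical. The following is only a rough sketch, not GATK code: the helper names `parse_sample` and `is_uninformative` are made up for illustration, the sample strings are copied from the g.vcf lines quoted above, and it assumes every record carries both GQ and PL with no missing (`.`) values.

```python
def parse_sample(format_field, sample_field):
    """Pair the colon-separated FORMAT keys with one sample's values."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def is_uninformative(call):
    """Return True when GQ is 0 or no genotype is favoured by the PLs.

    Assumes GQ and PL are present and numeric (hypothetical helper,
    not part of GATK).
    """
    gq = int(call["GQ"])
    pls = sorted(int(x) for x in call["PL"].split(","))
    # If the two smallest PLs are equal, the best genotype is not
    # distinguishable from the runner-up.
    flat_pls = len(pls) > 1 and pls[1] - pls[0] == 0
    return gq == 0 or flat_pls


FORMAT = "GT:AD:DP:GQ:PL"
samples = {
    "individual_1": "0/0:444,9:453:0:0,0,0",
    "individual_2": "0/0:257,0:257:0:0,0,0",
}

flags = {
    name: is_uninformative(parse_sample(FORMAT, field))
    for name, field in samples.items()
}

for name, flagged in flags.items():
    print(name, "uninformative genotype likelihoods:", flagged)
```

Both quoted records are flagged, which matches the no-calls in the GenotypeGVCFs output. Running the same check over the whole region would show whether the pattern is limited to a few sites or covers the entire gene.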