0/0 genotypes in GenotypeGVCFs

Hi,

I want to call both confident variants and confident hom-ref genotypes in GATK 3.x, much like EMIT_ALL_CONFIDENT_SITES in UnifiedGenotyper. I first run HaplotypeCaller with -ERC GVCF, then GenotypeGVCFs with -inv to emit all confident sites. However, the jointly called VCF contains only variant sites and sites where every sample is ./.; no site has a 0/0 genotype for all samples. As far as I can tell, all sites are present in the file. Why are there no sites with 0/0 for all samples in the GenotypeGVCFs output? Do I need to run HaplotypeCaller with -ERC BP_RESOLUTION first? Thanks!

aaron

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @aaronchu

    Hi Aaron,

    Can you please post a few records?

    Thanks,
    Sheila

  • aaronchu, US (Member)

    I attached six gvcf files for which I ran GenotypeGVCFs with -inv to jointly genotype all confident sites. GenotypeGVCFs command:

    java -jar /programs/GenomeAnalysisTK-3.1-1/GenomeAnalysisTK.jar -nt 4 -R hs37d5.fa -T GenotypeGVCFs -inv \
    -V /gvcf/SQC0208F68.hc.gvcf \
    -V /gvcf/SQC0209F68.hc.gvcf \
    -V /gvcf/SQC0210F68.hc.gvcf \
    -V /gvcf/SQC0214F68.hc.gvcf \
    -V /gvcf/SQC0215F68.hc.gvcf \
    -V /gvcf/SQC0216F68.hc.gvcf \
    -D /gatk/gatk_bundle_2.5_b37/dbsnp_137.b37.vcf -o sqcf68.hc.raw.vcf.gz

    I have also provided the 0/0 calls from UnifiedGenotyper run with EMIT_ALL_CONFIDENT_SITES (SNPs only). At all of these sites, the VCF from GenotypeGVCFs has ./. instead of 0/0.

    Thanks for looking into this!

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @aaronchu

    Hi,

    It turns out this is a limitation of the GATK right now, and we are working to improve it in a future version.

    -Sheila

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    You can also refer here for a similar Forum question. This may help to clarify things: http://gatkforums.broadinstitute.org/discussion/comment/14070#Comment_14070

  • aaronchu, US (Member)

    Thanks Sheila! The question you linked is relevant here. It seems the current GATK does not calculate confidence for invariant sites, so for now I can get confident 0/0 genotypes only at sites where at least one sample in the jointly called cohort carries a variant allele.
    -aaron
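
For anyone who wants to check their own joint-called VCF for the behaviour described in this thread, here is a minimal Python sketch that tallies sites by genotype pattern across all samples. It is not part of GATK; the function names (`genotype_alleles`, `classify_site`, `tally_sites`, `tally_vcf`) and the category labels are made up for illustration. It assumes a tab-delimited multi-sample VCF in which GT is the first FORMAT field, as the VCF specification requires when GT is present.

```python
import gzip
from collections import Counter


def genotype_alleles(sample_field):
    """Return the allele strings from one sample column, e.g. '0/1:5,5' -> ['0', '1']."""
    gt = sample_field.split(":", 1)[0]       # GT is assumed to be the first FORMAT field
    return gt.replace("|", "/").split("/")   # treat phased and unphased genotypes alike


def classify_site(sample_fields):
    """Label a site by the genotypes of all samples (labels are illustrative only)."""
    alleles = [a for field in sample_fields for a in genotype_alleles(field)]
    called = [a for a in alleles if a != "."]
    if not called:
        return "all_no_call"                 # every sample is ./.
    if any(a != "0" for a in called):
        return "variant"                     # at least one non-reference allele
    if len(called) == len(alleles):
        return "all_hom_ref"                 # every sample is 0/0
    return "hom_ref_and_no_call"             # mix of 0/0 and ./.


def tally_sites(lines):
    """Count sites per category from an iterable of VCF text lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue                         # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        counts[classify_site(cols[9:])] += 1 # sample columns start at column 10
    return counts


def tally_vcf(path):
    """Tally a plain or gzip-compressed VCF file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return tally_sites(handle)
```

Calling `tally_vcf("sqcf68.hc.raw.vcf.gz")` on the output of the command above would return a `Counter` keyed by category. If the behaviour matches what is reported in this thread, the `all_hom_ref` count should be zero while `all_no_call` and `variant` are populated, but that is something to verify on your own data rather than a guaranteed result.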
