
GenotypeGVCFs shows <NON-REF> flag despite all reads in all samples supporting the reference allele

Hello,
I have been jointly genotyping variants in my samples using GenotypeGVCFs but a few sites show a flag despite all reads in all samples supporting the reference allele. Do you know why this might be the case?

An example:
chr2 37521 . T 66.18 PASS AC=0;AF=0.00;AN=30;DP=320;InbreedingCoeff=-0.0000;MLEAC=0;MLEAF=0.00;MQ=64.28 GT:AD:DP:RGQ 0/0:22,0:22:63 0/0:27,0:27:66 0/0:17,0:17:48 0/0:17,0:17:42 0/0:31,0:31:63 0/0:25,0:25:60 0/0:15,0:15:36 0/0:22,0:22:63 0/0:26,0:26:72 0/0:22,0:22:60 0/0:28,0:28:81 0/0:22,0:22:60 0/0:20,0:20:57 0/0:12,0:12:33 0/0:14,0:14:42
Thank you in advance for your help.

Answers

  • shlee Cambridge Member, Broadie ✭✭✭✭✭
    edited March 2018

    Hi @w_anderson,

    Can you be more specific about what you mean by flag? I see the record you posted has no variant allele, just the REF allele:

    chr2 37521 . T 66.18 PASS 
    

    Can you tell us which version of GATK you are using and post the exact GenotypeGVCFs and HaplotypeCaller commands that gave you this result? Thanks.

  • w_anderson US Member
    edited March 2018

    I am sorry, my post does not seem to have rendered correctly. There is a "NON-REF" tag in the ALT field of the VCF:

    chr2 37521 . T NON_REF 66.18 PASS AC=0;AF=0.00;AN=30;DP=320;InbreedingCoeff=-0.0000;MLEAC=0;MLEAF=0.00;MQ=64.28 GT:AD:DP:RGQ 0/0:22,0:22:63 0/0:27,0:27:66 0/0:17,0:17:48 0/0:17,0:17:42 0/0:31,0:31:63 0/0:25,0:25:60 0/0:15,0:15:36 0/0:22,0:22:63 0/0:26,0:26:72 0/0:22,0:22:60 0/0:28,0:28:81 0/0:22,0:22:60 0/0:20,0:20:57 0/0:12,0:12:33 0/0:14,0:14:42
    

    We were following the GATK 3.6 Best Practices, using the following command:

    java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs \
    -R reference.fasta \
    --variant ind1.g.vcf \
    --variant ind2.g.vcf (...) \
    --variant ind15.g.vcf \
    -o output.vcf \
    --allSites \
    -nt 6
    

    I appreciate that this is not the latest version but we have been working on this dataset for a while and can't re-run the pipeline with the newest version.

  • shlee Cambridge Member, Broadie ✭✭✭✭✭

    @w_anderson,

    There appear to be a couple of inconsistencies in what you've posted. First, for a GVCF, the NON_REF allele ought to be symbolic, i.e. indicated by <NON_REF> as per VCF specifications.

    Second, your GenotypeGVCFs command uses the --allSites option. I asked previously that you also post your HaplotypeCaller command because these two steps go hand-in-hand. Did you use the -ERC BP_RESOLUTION mode of HaplotypeCaller? This is the mode that should be used with the GenotypeGVCFs --allSites option.

  • w_anderson US Member

    Yes, the <NON_REF> allele is indeed symbolic.

    HaplotypeCaller was run with GATK version 3.5, and the command was:

    java -jar GenomeAnalysisTK.jar -T HaplotypeCaller \
    -R reference.fasta \
    -I ind1.bam \
    --genotyping_mode DISCOVERY \
    --emitRefConfidence BP_RESOLUTION \
    -o ind1.g.vcf
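
To see how many sites in the joint-genotyped output look like the chr2:37521 record above, something along these lines could help. This is only a rough sketch in plain Python, not a GATK tool: the function names are made up for illustration, it assumes the VCF is tab-separated on disk (the record pasted in this thread shows spaces only because of forum formatting), and it accepts both `<NON_REF>` and the bracket-less `NON_REF` spelling in the ALT column.

```python
def is_all_ref_with_non_ref_alt(vcf_line):
    """True if ALT is only the NON_REF allele and no sample has alt reads in AD.

    Hypothetical helper for illustration; assumes a tab-separated VCF data line.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:  # need the 8 fixed columns, FORMAT, and at least one sample
        return False
    if fields[4] not in ("<NON_REF>", "NON_REF"):
        return False
    keys = fields[8].split(":")
    if "AD" not in keys:
        return False
    ad_idx = keys.index("AD")
    for sample in fields[9:]:
        values = sample.split(":")
        if ad_idx >= len(values):
            continue  # trailing FORMAT fields may be dropped for a sample
        alt_depths = values[ad_idx].split(",")[1:]  # first AD entry is the REF depth
        if any(depth not in ("0", ".") for depth in alt_depths):
            return False
    return True


def all_ref_non_ref_sites(lines):
    """Yield (CHROM, POS) for every matching data line, skipping header lines."""
    for line in lines:
        if line.startswith("#"):
            continue
        if is_all_ref_with_non_ref_alt(line):
            chrom, pos = line.split("\t", 2)[:2]
            yield chrom, pos
```

With the output of the GenotypeGVCFs command above, this could be used as `with open("output.vcf") as handle: sites = list(all_ref_non_ref_sites(handle))`.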
    