UnifiedGenotyper misses some alleles while using GENOTYPE_GIVEN_ALLELES mode

jchoo Member
edited July 2017 in Ask the GATK team

Dear GATK team,
We are using UnifiedGenotyper in GENOTYPE_GIVEN_ALLELES mode to do genotyping, but we found that not all of the given alleles were genotyped.
For example, the input VCF is:

    13      20763485        .       AG      A       30      PASS    AC=1;AF=0.500;AN=2;set=variant2 GT:GQ   ./.     0/1:30
    13      20763485        .       A       G       30      PASS    AC=1;AF=0.500;AN=2;set=variant  GT:GQ   0/1:30  ./.

In the output VCF, we only got a genotype for the AG>A deletion:

    13  20763485    rs80338943  AG  A   0   LowQual AC=0;AF=0.00;AN=2;BaseQRankSum=1.075;DB;DP=820;ExcessHet=3.0103;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=59.95;MQ0=0;MQRankSum=0.074;RPA=3,2;RU=G;ReadPosRankSum=1.087;SOR=1.546;STR  GT:AD:DP:GQ:PL  0/0:819,1:820:99:0,2462,36389

The other allele (A>G) was missed.
The command we used is:

    java -jar GenomeAnalysisTK-3.6.jar -T UnifiedGenotyper -mbq 10 -stand_call_conf 20 -dt NONE -R hs37d5.fa -I S44-EL-20-1.recal.bam -D dbsnp147_GRCH37_All_20160601.vcf --genotyping_mode GENOTYPE_GIVEN_ALLELES --alleles input.vcf -L input.vcf -o output.vcf --output_mode EMIT_ALL_SITES

Is there any option that makes UnifiedGenotyper output all alleles?
Thanks a lot
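
The input above lists the same position on two separate lines. As a quick check on an `--alleles` file, the following Python sketch reports every position that appears on more than one line. It is only an illustration: the function name `positions_with_multiple_records` is hypothetical, and the sketch assumes the first five VCF columns contain no embedded whitespace.

```python
from collections import defaultdict

def positions_with_multiple_records(vcf_lines):
    """Return {(chrom, pos): [(ref, alt), ...]} for positions listed on more than one line."""
    by_pos = defaultdict(list)
    for line in vcf_lines:
        # Skip blank lines and VCF header lines.
        if not line.strip() or line.startswith("#"):
            continue
        # Only the first five columns (CHROM, POS, ID, REF, ALT) are needed here.
        chrom, pos, _id, ref, alt = line.split()[:5]
        by_pos[(chrom, int(pos))].append((ref, alt))
    return {key: recs for key, recs in by_pos.items() if len(recs) > 1}

# The two input records quoted in the question.
example = [
    "13 20763485 . AG A 30 PASS AC=1;AF=0.500;AN=2;set=variant2 GT:GQ ./. 0/1:30",
    "13 20763485 . A G 30 PASS AC=1;AF=0.500;AN=2;set=variant GT:GQ 0/1:30 ./.",
]
duplicates = positions_with_multiple_records(example)
print(duplicates)
# {('13', 20763485): [('AG', 'A'), ('A', 'G')]}
```

For a real file, the lines of `input.vcf` could be passed in via `open("input.vcf")` instead of the inline list.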


  • shlee Cambridge Member, Broadie ✭✭✭✭✭

    Hi @jchoo,

    When using --genotyping_mode GENOTYPE_GIVEN_ALLELES, I believe the allele representations must match exactly. Is it possible that the --alleles file only has the first variant but not the second?

    Also, is there a reason why you are not using HaplotypeCaller?

  • Sheila Broad Institute Member, Broadie, Moderator admin


    Can you also post an IGV Screenshot of the BAM file at that position? If the A/G SNP is not present in the BAM file, it will not be output in the VCF.


  • @shlee @Sheila I found that if we combine the two variants into one line (a multiallelic record), GATK does output all of the alleles. However, another question is that the GQ of this variant is 0:
    9 34648361 rs111033738 GC AC,G 0 LowQual AC=0,0;AF=0.00,0.00;AN=2;DB;DP=663;ExcessHet=3.0103;FS=0.000;MLEAC=0,0;MLEAF=0.00,0.00;MQ=60.00;MQ0=0;SOR=0.693 GT:AD:DP:GQ:PL 0/0:0,0,0:662:0:0,0,0,1992,1992,29179
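
The workaround described in this post, writing same-position variants as one multiallelic line, can be sketched in Python. This is a minimal illustration, not a GATK feature: `merge_same_site` is a hypothetical helper, and the two biallelic inputs for the chromosome 9 site are an assumption, since the thread only shows the merged record. The idea is to use the longest REF and pad each ALT with the REF bases its own record did not spell out.

```python
def merge_same_site(records):
    """Merge biallelic (ref, alt) pairs that share CHROM and POS into (ref, [alts])."""
    # All REFs start at the same base, so the longest REF covers every record
    # and each shorter REF must be a prefix of it.
    merged_ref = max((ref for ref, _ in records), key=len)
    alts = []
    for ref, alt in records:
        if not merged_ref.startswith(ref):
            raise ValueError(f"REF {ref!r} is not a prefix of {merged_ref!r}")
        # Pad the ALT with the trailing REF bases this record left out.
        padded = alt + merged_ref[len(ref):]
        if padded not in alts:
            alts.append(padded)
    return merged_ref, alts

# Site from the original question: deletion AG>A plus SNP A>G at 13:20763485.
print(merge_same_site([("AG", "A"), ("A", "G")]))
# ('AG', ['A', 'GG'])

# Assumed biallelic inputs that would give the 'GC AC,G' record shown above.
print(merge_same_site([("G", "A"), ("GC", "G")]))
# ('GC', ['AC', 'G'])
```

Note that after merging, the SNP A>G is written as AG>GG, because the REF of the merged record has to span the deletion.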

  • Hi, my question is pretty simple. I have a batch of sites (some may have multiple alleles at one site), and I would like to genotype all of these sites and output every allele's depth, the genotype quality, and the genotype. Which GATK tool can do this?
    As far as I know, HC may miss some of the sites. UG can call all sites, but at a multiallelic site the GQ is zero for all alleles.

  • Sheila Broad Institute Member, Broadie, Moderator admin


    It looks like the tool is not confident in any of the reads supporting the alleles. Notice the DP of 662, but ADs of 0. Have a look at this article.

    We recommend HaplotypeCaller for germline variant calling. I am not sure what you mean by "miss parts of the sites". In some cases, when the tool is not confident in a variant call, it will not be emitted. If you need to emit low quality sites, you can lower the --standard_min_confidence_threshold_for_calling.
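
The DP versus AD mismatch pointed out in this reply can be checked across a whole output file with a short Python sketch. The helper name `ad_dp_summary` is hypothetical, and treating a missing value (`.`) as 0 is an assumption made for simplicity. The function pairs the FORMAT keys with the sample values and returns the summed AD next to DP, so records where reads were counted in DP but assigned to no allele stand out.

```python
def ad_dp_summary(format_col, sample_col):
    """Return (sum of AD, DP) for one sample column of a VCF record."""
    # FORMAT keys and sample values are both colon-separated, in the same order.
    fields = dict(zip(format_col.split(":"), sample_col.split(":")))
    # Assumption: a missing value "." is counted as 0.
    ad = [int(x) if x != "." else 0 for x in fields["AD"].split(",")]
    dp = int(fields["DP"]) if fields["DP"] != "." else 0
    return sum(ad), dp

# Multiallelic record from this thread: no reads assigned to any allele.
print(ad_dp_summary("GT:AD:DP:GQ:PL", "0/0:0,0,0:662:0:0,0,0,1992,1992,29179"))
# (0, 662)

# Biallelic record from the original question: AD accounts for every read in DP.
print(ad_dp_summary("GT:AD:DP:GQ:PL", "0/0:819,1:820:99:0,2462,36389"))
# (820, 820)
```

In a tab-delimited VCF with a single sample, the two arguments would be columns 9 and 10 of each record.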

