Why do you have to provide -I and -L if GENOTYPE_GIVEN_ALLELES is used: UnifiedGenotyper

Previously I have been running a command like this:
java -jar /path/GenomeAnalysisTK.jar \
    -T UnifiedGenotyper \
    -R /path/human_g1k_v37.fasta \
    -et NO_ET \
    -K /path/key \
    -out_mode EMIT_ALL_SITES \
    --input_file /path/bam \
    -L /path/intervals \
    -gt_mode GENOTYPE_GIVEN_ALLELES \
    --alleles /path/vcf \
    --dbsnp /path/dbsnp_135.b37.vcf \
    -o /path/my.vcf
But I was reading the documentation again and I read this statement:
GENOTYPE_GIVEN_ALLELES
only the alleles passed in from a VCF rod bound to the -alleles argument will be used for genotyping
Which led me to believe that there wasn't a need to include the lines:
--input_file /path/bam \
-L /path/intervals \
because they would be redundant. But when I try to run without those lines I get back an error message:
Walker requires reads but none were provided.
Can you give an explanation as to why both of those lines AND GENOTYPE_GIVEN_ALLELES would be needed?
Best Answer
Geraldine_VdAuwera Cambridge, MA admin
No, GENOTYPE_GIVEN_ALLELES just tells the UG "don't look at all possible alleles, just look at the alleles in this VCF and tell me which of those my samples have".
You still need to input the sample bamfiles that you want the program to genotype (using -I) and you can still restrict the analysis to a subset of intervals (using -L).
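To make the division of roles concrete, here is a sketch of the command from the question with each input annotated. It only assembles and prints the command (a dry run), so nothing is executed; the /path/... values are the placeholders from the question, and the ug_cmd variable name is just for illustration.

```shell
# Dry run: build the UnifiedGenotyper command piece by piece and print it.
# All /path/... values are placeholders copied from the question.
set -- -T UnifiedGenotyper
set -- "$@" -R /path/human_g1k_v37.fasta      # reference genome
set -- "$@" -I /path/bam                      # WHAT to genotype: the sample reads (short form of --input_file)
set -- "$@" -gt_mode GENOTYPE_GIVEN_ALLELES   # only consider the alleles supplied below
set -- "$@" --alleles /path/vcf               # WHICH alleles to test against the reads
set -- "$@" -L /path/intervals                # WHERE to look (optional restriction)
set -- "$@" -o /path/my.vcf                   # output VCF
ug_cmd="java -jar /path/GenomeAnalysisTK.jar $*"
echo "$ug_cmd"
```

Dropping the -I line from this sketch is what leads to the "Walker requires reads but none were provided" error: the alleles VCF says which variants to test, but there would be no reads to test them against.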
Answers
@hintzen, what exactly do you want the program to genotype if not sequence reads?
@Geraldine_VdAuwera I am not sure I understand. I want to genotype sequence reads, but doesn't giving GENOTYPE_GIVEN_ALLELES do the same job as -I and -L?
If I want to genotype given alleles and only look at the sites in the given VCF, do I need to provide mysites.vcf to both the --alleles and -L parameters?
@jlrflores You don't have to, but I think it will be faster if you do, because the program then doesn't have to traverse all the reads in your BAM. Does that make sense?
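As a rough sketch of that suggestion, the same sites file can be passed twice: once as the allele source and once as the interval list. This assumes your GATK version accepts a VCF file as an -L interval source; the SITES and sites_cmd variable names are illustrative, and the other /path/... values are the placeholders used earlier in the thread. The command is only printed, not run.

```shell
# Dry run: pass one sites VCF to both --alleles and -L.
SITES=/path/mysites.vcf                        # placeholder path to the sites VCF
set -- -T UnifiedGenotyper
set -- "$@" -R /path/human_g1k_v37.fasta
set -- "$@" -I /path/bam                       # reads are still required
set -- "$@" -gt_mode GENOTYPE_GIVEN_ALLELES
set -- "$@" --alleles "$SITES"                 # alleles to genotype
set -- "$@" -L "$SITES"                        # restrict traversal to the same sites
set -- "$@" -o /path/my.vcf
sites_cmd="java -jar /path/GenomeAnalysisTK.jar $*"
echo "$sites_cmd"
```

Keeping the path in a single variable avoids the two arguments drifting apart if the sites file is later swapped for another one.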