Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Caller input_prior option
I'm using HaplotypeCaller (but it could be also possible to use this option with UnifiedGenotyper) for a very special experimental design in a no-human species, where we have an expectation for the prior probabilities of each genotype. I'm planning to call SNPs for single diploid individuals using HaplotypeCaller and afterwards for the whole dataset with GenotypeGVCFs.
Nevertheless, I'm confused about the structure of the prior probabilities command line. In the documentation, it says: "Values will be added and Pr(AC=0) will be 1-sum, so that they sum up to one. f) If user-defined values add to more than one, an error will be produced". So I'll require to provide two prior probabilities out of the 3 for each genotype (0/0, 0/1 and 1/1). My first guess is that the prior that I don't need to provide is for the reference homozygous (0/0) based on the Pr(AC=0) specified in the documentation. I would like to know if this idea is correct.
My second problem if is the two input_prior options are positional parameters. If so, and if my first guess for the Pr(AC=0) is correct, do they represent the probability of 0/1 and 1/1, that is, Pr(AC=1) and Pr(AC=2)?
More concretely, I'm going to provide an example where you don't expect any heterozygous call. In that case, is it correct that the argument will be --input_prior 0.5 --input_prior 0?
Thank you very much.