dmittelman Member Posts: 3

When I use EMIT_ALL_CONFIDENT_SITES for SNPs, I get the expected very large list of genotypes, regardless of whether they differ from the reference. When I use the same command line but switch the model to indels, I only get a VCF of variant sites. Is the EMIT_ALL_CONFIDENT_SITES option not compatible with indel discovery?

I'm grateful for any clarification.

Best Answer


  • dmittelman Member Posts: 3

    Eric, thanks for the super fast response. Let me just ask: when calling indels, how do I know which regions are callable but simply do not vary from the reference?

  • ebanks Broad Institute Member, Administrator, Broadie, Moderator, Dev Posts: 698

    That's actually (unwittingly) a loaded question around here. I personally believe that this can be inferred from the depth (DP) of a region. However, enough folks here disagree with me that it is a problem we do plan on tackling in the future (as discussed in other similar threads, and with no promise as to an ETA).

    Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT
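
For readers who want to try the depth-based approach mentioned above, here is a minimal sketch in Python. It is not part of GATK and not an official method. It assumes you already have per-base depths as (chromosome, position, depth) records, for example from a tool such as `samtools depth`. The function name `callable_intervals` and the `min_depth` default of 10 are hypothetical choices for illustration. As noted in the reply, depth alone may not be a sufficient criterion for callability.

```python
from typing import Iterable, List, Tuple


def callable_intervals(
    depths: Iterable[Tuple[str, int, int]],
    min_depth: int = 10,  # assumed threshold; choose one suited to your data
) -> List[Tuple[str, int, int]]:
    """Merge consecutive positions with depth >= min_depth into intervals.

    `depths` yields (chrom, pos, depth) records sorted by chromosome and
    position. Returns (chrom, start, end) tuples, 1-based and inclusive.
    """
    intervals: List[Tuple[str, int, int]] = []
    current = None  # open interval as [chrom, start, end], or None

    for chrom, pos, depth in depths:
        if depth >= min_depth:
            # Extend the open interval only if this base is directly adjacent.
            if current is not None and current[0] == chrom and pos == current[2] + 1:
                current[2] = pos
            else:
                if current is not None:
                    intervals.append((current[0], current[1], current[2]))
                current = [chrom, pos, pos]
        elif current is not None:
            # A low-depth base closes the open interval.
            intervals.append((current[0], current[1], current[2]))
            current = None

    if current is not None:
        intervals.append((current[0], current[1], current[2]))
    return intervals


# Made-up example records, not real data.
example = [
    ("chr1", 100, 12),
    ("chr1", 101, 15),
    ("chr1", 102, 3),
    ("chr1", 103, 20),
    ("chr1", 104, 11),
    ("chr2", 1, 30),
]
print(callable_intervals(example))
```

Intervals that pass the depth threshold but contain no indel calls could then be treated as candidate "callable, non-variant" regions, subject to the caveats discussed in this thread.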

  • dmittelman Member Posts: 3

    Eric, thanks. This actually makes perfect sense; it is definitely not straightforward.

