When calling SNPs and indels, can GATK call these variants only in relation to the reference genome?
I am working with exon-capture next-generation sequencing (NGS) data from a non-model species. So far, I have assembled the reads against the reference genome of a distantly related species and fine-tuned my alignments around potential indel sites. I am now ready to identify SNPs (and possibly indels). However, I am only interested in sites that are variable within the species I am working with. I am not interested in sites that differ only between the reference genome and my NGS data, because the reference genome comes from a very distantly related species.
Can GATK identify SNPs among my samples without calling them relative to the reference genome?
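To make concrete what I mean by "variable within the species", here is a minimal Python sketch of the distinction I am after. It assumes the calls end up in a multi-sample VCF with a GT field for each individual, and it is a plain-text post-filter, not a GATK feature. The function names `called_alleles` and `is_variable_within_samples` and the example records are hypothetical, made up purely for illustration.

```python
def called_alleles(gt):
    """Return the set of allele indices in a VCF GT string such as '0/1' or '1|1'.

    Missing alleles ('.') are skipped.
    """
    return {a for a in gt.replace("|", "/").split("/") if a != "."}


def is_variable_within_samples(vcf_line):
    """Return True if the samples on this VCF record carry more than one allele.

    A site where every called sample is homozygous for the same ALT allele
    differs from the reference but is fixed within the samples, so it
    returns False.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")  # column 9 is FORMAT
    alleles = set()
    for sample in fields[9:]:  # sample columns start at column 10
        alleles |= called_alleles(sample.split(":")[gt_index])
    return len(alleles) > 1


# Hypothetical records (three samples each), for illustration only.
fixed_difference = "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t1/1\t1/1\t1/1"
polymorphic = "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t0/1\t1/1\t0/0"

print(is_variable_within_samples(fixed_difference))  # False
print(is_variable_within_samples(polymorphic))       # True
```

In other words, I would want to keep the second kind of record and discard the first. Note that this sketch ignores genotype quality and depth, and it treats a site with missing genotypes as fixed if the remaining samples agree.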