MuTect without dbSNP

Hi,
congratulations on the nice MuTect publication!

I want to use MuTect for contrastive SNP calling in a species without an existing high-quality SNP database (one is only partially available, for some chromosomes). Is it possible to run MuTect without one, and which parameters (e.g. --dbsnp_normal_lod, ...) must be adjusted? Any recommendations are highly appreciated!

Thanks a lot,
Thomas

P.S.: Is GATK SNP recalibration needed? Of course it would be better, but again dbSNP is missing.
P.P.S.: Alternatively, I could try to generate a SNP database myself. However, the small number of samples limits its quality.

Answers

  • kcibul, Cambridge, MA (Member, Broadie, Dev)

    Hi -- thanks for the kind words!

    There is no problem running without a dbSNP track. If you do not provide a dbSNP VCF file, then effectively there is no prior for a site being a germline variant. Typically at the Broad, when we are running on organisms without a high-quality SNP database, we do exactly that. Although you are at slightly higher risk of making mistakes at sites of true germline variation, the error rate is still quite small, especially at depths > 20x.
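
    To illustrate what "no prior" means in practice, here is a minimal Python sketch of a per-site cutoff on the evidence from the normal sample. The function names and the two threshold values are hypothetical placeholders, not MuTect's code or its defaults; the only point is that when no dbSNP file is given, no site is treated as a known germline site, so one cutoff applies everywhere.

```python
def normal_lod_cutoff(site_in_dbsnp, normal_lod=2.0, dbsnp_normal_lod=5.0):
    """Return the normal LOD a candidate must reach to be kept as somatic.

    Both default values are illustrative placeholders (assumptions),
    not the tool's real defaults.
    """
    return dbsnp_normal_lod if site_in_dbsnp else normal_lod


def passes_normal_filter(observed_normal_lod, site, dbsnp_sites=None):
    """Apply the stricter cutoff only at sites listed in dbsnp_sites.

    dbsnp_sites=None models running without a dbSNP track: every site
    is then judged against the ordinary cutoff.
    """
    known = dbsnp_sites is not None and site in dbsnp_sites
    return observed_normal_lod >= normal_lod_cutoff(known)
```

    With these placeholder numbers, a candidate with a normal LOD of 3.0 is kept when no dbSNP set is supplied, but would be rejected if the same site were listed in a supplied set.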

    GATK BQSR is recommended, as the quality scores will be more accurate, but if that's not possible and your data is of high quality, skipping it will not have a huge impact.

  • vyellapa (Member)

    Hello,
    I'm curious whether the dbSNP minor allele frequency is used in some way for setting the genotype priors. If not, would it be better to filter the dbSNP VCF for SNPs with MAF less than some arbitrary threshold?

  • kcibulkcibul Cambridge, MAMember, Broadie, Dev ✭✭✭

    We currently do not threshold on dbSNP minor allele frequency, although incorporating that into the prior calculation would make for a more accurate model.
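
    For anyone who wants to try the pre-filtering idea from the question, here is a rough Python sketch that drops records below a chosen frequency before the VCF is passed to the caller. It is an assumption-laden example: the INFO key name ("MAF"), the function name, and the choice to keep records at or above the threshold are all hypothetical, and the key that actually carries allele frequency depends on the dbSNP build you use.

```python
def filter_vcf_by_maf(lines, min_maf, info_key="MAF"):
    """Yield header lines plus records whose INFO[info_key] >= min_maf.

    info_key is a hypothetical name; check which INFO tag holds the
    allele frequency in your own VCF. Records without a usable value
    are dropped.
    """
    for line in lines:
        if line.startswith("#"):
            yield line  # keep meta and header lines unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        info = dict(
            item.split("=", 1) for item in fields[7].split(";") if "=" in item
        )
        try:
            maf = float(info[info_key])
        except (KeyError, ValueError):
            continue  # frequency missing or not a single number
        if maf >= min_maf:
            yield line
```

    The function takes any iterable of lines, so it can be fed an open file handle and its output written straight to a new file.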