SNP chip database for Base Quality Score Recalibration
I want to perform Base Quality Score Recalibration (BQSR) on maize data, and I am deciding how to obtain the following:
A database of known polymorphic sites to mask out
One possibility I am exploring is to use the positions from a SNP chip, since I know that the chip positions come from high-quality SNPs; there are around 600K of them. Do you think it would be a good idea to use this as my database? I expect to obtain around 20 million SNPs from my calling, so I am wondering whether this database is too small.
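
To put the size concern in numbers, here is a minimal sketch using only the two rough figures above. The function name `masked_fraction` is just illustrative, and the result is an upper bound: it assumes every chip position is actually polymorphic in my samples, which may not hold.

```python
# Rough figures taken from the question; both are estimates, not measured values.
chip_sites = 600_000          # positions on the SNP chip
expected_snps = 20_000_000    # SNPs expected from variant calling


def masked_fraction(known_sites: int, expected_variants: int) -> float:
    """Upper bound on the fraction of expected variants covered by the known-sites set."""
    return min(known_sites, expected_variants) / expected_variants


fraction = masked_fraction(chip_sites, expected_snps)
unmasked = expected_snps - min(chip_sites, expected_snps)

print(f"Masked at most: {fraction:.1%}")
print(f"Left unmasked at least: {unmasked:,} sites")
```

So the chip would mask roughly 3% of the variants I expect to call, leaving at least about 19.4 million true variant sites that the recalibration would treat as mismatches.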