Mutect2 with contamination estimates
How many sites does ContEst need to get an accurate answer?
A couple of my samples give me results like this:
name population population_fit contamination confidence_interval_95_width confidence_interval_95_low confidence_interval_95_high sites
META CEU n/a 57.3 0.8 56.9 57.7 83
57% contamination seems very high. Other samples use around 1000 sites and their contamination comes out at around 20%. Could the high estimate be inaccurate because ContEst is only using 83 sites?
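To compare samples side by side, rows like the one above can be screened by site count. This is only a minimal sketch: it assumes the whitespace-separated column layout shown in the header, and `MIN_SITES` is a hypothetical cutoff chosen for illustration, not a documented ContEst threshold.

```python
# Column order copied from the ContEst output header shown in the post.
COLUMNS = [
    "name", "population", "population_fit", "contamination",
    "confidence_interval_95_width", "confidence_interval_95_low",
    "confidence_interval_95_high", "sites",
]

# Hypothetical cutoff for illustration only; not a documented ContEst value.
MIN_SITES = 500


def parse_contest_row(line):
    """Split one whitespace-separated ContEst result row into a dict."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    # Convert the numeric columns; population_fit may be "n/a", so leave it as text.
    for key in ("contamination", "confidence_interval_95_width",
                "confidence_interval_95_low", "confidence_interval_95_high"):
        row[key] = float(row[key])
    row["sites"] = int(row["sites"])
    return row


def is_low_support(row, min_sites=MIN_SITES):
    """Return True when the estimate rests on fewer sites than the cutoff."""
    return row["sites"] < min_sites


row = parse_contest_row("META CEU n/a 57.3 0.8 56.9 57.7 83")
print(row["name"], row["contamination"], row["sites"], is_low_support(row))
```

Flagging a row this way only marks it for a closer look; it does not say whether the contamination estimate itself is wrong.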
How does Mutect2 use the output from ContEst? I would like to run Mutect2 with and without the ContEst results, as I am concerned that very few SNPs will pass if such a high level of contamination is assumed. However, Mutect2 is running very slowly and I don't have the compute resources to run it twice. Is there any way I can filter the output of Mutect2 to take the contamination estimates into account?
Any thoughts would be much appreciated.