MuTect2 downsampling (-dfrac) - numbers don't match?
Hi - I have a pilot set of normal-tumor paired samples sequenced to ~300X. We are now downsampling them to find the minimum coverage we need to capture the SNPs/INDELs found in the original samples. But running with -dfrac 0.25, which is supposed to downsample the original samples to 25% of their depth, gave a higher number of sites: 2659 sites were detected in the original and 3321 with downsampling. Only very few of them overlap.
I also ran the downsampling a second time to get a "replicate" of it. The numbers roughly match, but still only some of the sites overlap.
What might have caused this discrepancy?
java -jar GATK.jar -R REFERENCE.fa -T MuTect2 -nct 8 -L INTERVAL.bed -I:tumor TUMOR.bam -I:normal NORMAL.bam -o OUTPUT.vcf -gt_mode DISCOVERY -stand_call_conf 10 --heterozygosity 0.00001 -dfrac 0.25
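
For anyone who wants to reproduce the comparison, here is a minimal Python sketch of how the site overlap between two call sets can be counted. It is an illustration under assumptions, not the exact method used for the numbers above: the helper names (parse_sites, read_sites, compare_sites) and the file names ORIGINAL.vcf and DOWNSAMPLED.vcf are hypothetical, a site is keyed on chromosome, position, REF and ALT, and the optional pass_only switch assumes the FILTER column is the seventh tab-separated field, as in a standard VCF.

```python
import gzip


def parse_sites(lines, pass_only=False):
    """Collect (chrom, pos, ref, alt) keys from an iterable of VCF lines."""
    sites = set()
    for line in lines:
        # Skip meta-information, the header line and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Optionally keep only records whose FILTER column is PASS.
        if pass_only and (len(fields) < 7 or fields[6] != "PASS"):
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        # Split multi-allelic records so each ALT allele is its own site.
        for allele in alt.split(","):
            sites.add((chrom, int(pos), ref, allele))
    return sites


def read_sites(path, pass_only=False):
    """Read a plain or gzip-compressed VCF file and return its site keys."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return parse_sites(handle, pass_only=pass_only)


def compare_sites(a, b):
    """Summarise how two site sets overlap."""
    shared = a & b
    union = a | b
    return {
        "only_a": len(a - b),
        "only_b": len(b - a),
        "shared": len(shared),
        # Jaccard index: shared sites divided by all distinct sites.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python compare_sites.py ORIGINAL.vcf DOWNSAMPLED.vcf
    if len(sys.argv) == 3:
        original = read_sites(sys.argv[1])
        downsampled = read_sites(sys.argv[2])
        print(compare_sites(original, downsampled))
```

Running the same comparison with pass_only=True on both files would show whether the disagreement is mostly among filtered records or also among the calls that pass all filters.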