MuTect2 output - number of Variants expected to PASS the test

Hi,

I have just run variant calling with MuTect2 on 10 tumor samples from the GDC Data Portal and their respective matched normals, using a PoN built from 12 samples. I'm a little confused about the number of variants I get in my samples:

grep -c PASS WXS_GBM01.vcf
293
grep -c PASS WXS_GBM02.vcf
246
grep -c PASS WXS_GBM03.vcf
181
grep -c PASS WXS_GBM04.vcf
146
grep -c PASS WXS_GBM05.vcf
628
grep -c PASS WXS_GBM06.vcf
112
grep -c PASS WXS_GBM07.vcf
206
grep -c PASS WXS_GBM08.vcf
235
grep -c PASS WXS_GBM09.vcf
37375
grep -c PASS WXS_GBM10.vcf
319
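
One caveat on these numbers: `grep -c PASS` counts every line that contains the string PASS anywhere, which can include header lines or annotation fields. A stricter count could look like the sketch below. It assumes the standard tab-separated VCF layout with FILTER in column 7; `count_pass` is just a helper name made up for this example.

```shell
# Count records whose FILTER column (field 7) is exactly PASS.
# Header lines (starting with '#') are skipped.
count_pass() {
    awk -F'\t' '!/^#/ && $7 == "PASS" { n++ } END { print n + 0 }' "$1"
}

# Print one "file<TAB>count" line per sample VCF in the current directory.
for vcf in WXS_GBM*.vcf; do
    [ -e "$vcf" ] || continue   # skip if the glob matched nothing
    printf '%s\t%s\n' "$vcf" "$(count_pass "$vcf")"
done
```

If these counts match the grep counts, the numbers above are the real PASS record counts.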

What I'm wondering is how the GBM09 sample can have more than 100 times as many PASS variants as my other samples. Is that possible? I double-checked that I used the proper samples, and it seems correct...
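
To see whether GBM09 simply has far more calls overall or just a different mix of filters, the FILTER column can be tallied per file. This is only a rough sketch, again assuming FILTER is column 7 of a tab-separated VCF; `filter_tally` is a hypothetical helper name.

```shell
# Tally how many records carry each FILTER value, most frequent first.
filter_tally() {
    awk -F'\t' '!/^#/ { c[$7]++ } END { for (f in c) print c[f] "\t" f }' "$1" \
        | sort -k1,1nr
}

# Compare the outlier against one of the ordinary samples, if the files exist.
for vcf in WXS_GBM09.vcf WXS_GBM01.vcf; do
    [ -e "$vcf" ] || continue
    echo "== $vcf"
    filter_tally "$vcf"
done
```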

And what about the typical count of about 200-300 variants for the other samples? I was expecting a higher number, since the BAM files are about 20 GB each. I ran the variant calling with the following command:

java -Xms4000m -Xmx4000m \
-jar GenomeAnalysisTK.jar \
-T MuTect2 \
-R GRCh38.d1.vd1.fa \
-I:tumor "$TUMOR" \
-I:normal "$MATCH_NORMAL" \
-PON "$PON_FILE" \
--dbsnp dbsnp_b147_hg38.vcf \
--cosmic updated_Cosmic38 \
-o "${OUT_PFX}.vcf" \
-nct 8

Thank you very much! :)
