MuTect2 output - number of variants expected to PASS the filters

Hi,

I have just run variant calling with MuTect2 on 10 tumor samples from the GDC data portal, each with its matched normal, and with a PoN built from 12 samples. I'm a little confused about the number of variants I get in my samples:

grep -c PASS WXS_GBM01.vcf
293
grep -c PASS WXS_GBM02.vcf
246
grep -c PASS WXS_GBM03.vcf
181
grep -c PASS WXS_GBM04.vcf
146
grep -c PASS WXS_GBM05.vcf
628
grep -c PASS WXS_GBM06.vcf
112
grep -c PASS WXS_GBM07.vcf
206
grep -c PASS WXS_GBM08.vcf
235
grep -c PASS WXS_GBM09.vcf
37375
grep -c PASS WXS_GBM10.vcf
319
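
One caveat with these counts: `grep -c PASS` counts every line that contains the string PASS, including any header line that happens to contain it, so the numbers above may be slightly off. A stricter count looks only at the FILTER column (column 7 of a VCF record). The following is a minimal sketch, assuming standard tab-delimited VCF files; the function name `count_pass` is only illustrative:

```shell
# Count VCF records whose FILTER column (7th field) is exactly "PASS".
# Header lines (starting with '#') are skipped.
count_pass() {
  awk -F'\t' '!/^#/ && $7 == "PASS" { n++ } END { print n + 0 }' "$1"
}
```

For example, `count_pass WXS_GBM09.vcf` prints the number of passing records in that file.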

In particular, I'm wondering how the GBM09 sample can have roughly 100 times more PASS variants than most of my other samples. Is that possible? I double-checked that I used the proper tumor and normal files for it, and they seem correct.
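
To see whether GBM09 stands out only in its PASS calls or across every filter category, one option is to tally the FILTER column per file and compare the breakdown with another sample. This is a rough sketch under the same assumption of standard tab-delimited VCF; `filter_tally` is an illustrative name, not part of any tool:

```shell
# Print "count<TAB>FILTER value" for every distinct FILTER value in a VCF,
# most frequent first. Header lines (starting with '#') are skipped.
filter_tally() {
  awk -F'\t' '!/^#/ { c[$7]++ } END { for (k in c) print c[k] "\t" k }' "$1" \
    | sort -k1,1nr
}
```

Running `filter_tally WXS_GBM09.vcf` and `filter_tally WXS_GBM01.vcf` side by side would show whether the total number of candidate sites is also inflated in GBM09, or only the fraction that passes.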

And what about the typical count of about 200-300 variants per sample? I was expecting a higher number, since my BAM files are about 20 GB each. I ran the variant calling with the following command:

java -Xms4000m -Xmx4000m \
-jar GenomeAnalysisTK.jar \
-T MuTect2 \
-R GRCh38.d1.vd1.fa \
-I:tumor $TUMOR \
-I:normal $MATCH_NORMAL \
-PON $PON_FILE \
--dbsnp dbsnp_b147_hg38.vcf \
--cosmic updated_Cosmic38 \
-o $OUT_PFX.vcf \
-nct 8

Thank you very much! :)
