
Mutect2 output VCF file does not contain any QUAL scores

Hi team GATK,
Greetings from India!

I am running GATK4 for somatic variant calling with the following command:

gatk Mutect2 -R ../../11_reference_genome/ref_gen_chr1_22_X_Y_final.fa --tumor-sample 01T -I 01T_sorted.bam -I 01N_sorted.bam -O out_01_Mutect2.vcf

I am new to GATK and variant analysis, so kindly help me with the following.
The output file shows these results:
1. VCF output from Mutect2:

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 01N 01T

chrY 353035 . A G . . DP=2;ECNT=1;POP_AF=5.000e-08;TLOD=8.14 GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB ./. 0/1:0,2:1.00:2:0,2:0,0:0,39:0,858:40:3:0.99
chrY 353881 . T G . . DP=2;ECNT=1;POP_AF=5.000e-08;TLOD=7.98 GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB ./. 0/1:0,2:1.00:2:0,2:0,0:0,38:0,858:40:12:0.9
chrY 2648923 . G A . . DP=10;ECNT=2;POP_AF=5.000e-08;TLOD=24.79 GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB ./. 0/1:0,6:0.857:6:0,3:0,3:0,38:0,763:

Q1: Why does the QUAL column show "." for all the calls? None of the variants in the 3 GB file has a quality score.
Q2: How should I interpret this file, given that 01N shows "./."? Is 01N treated as the baseline, so that "0/1:0,2:1.00:2:0,2:0,0:0,39:0,858:40:3:0.99" (the first chrY call) describes the changes/variations occurring in the 01T sample?
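To make Q2 concrete, here is a minimal Python sketch of how the FORMAT column can be paired with a sample column, using the first chrY call above. The helper name `parse_sample` is hypothetical, and the tumor value is copied from the pasted output, where it appears to be cut off (it has fewer values than FORMAT has keys), so only the leading fields should be trusted.

```python
def parse_sample(format_keys: str, sample_value: str) -> dict:
    """Pair the colon-separated FORMAT keys with one sample column's values."""
    if sample_value in (".", "./."):
        # Only a missing genotype is recorded for this sample at this site.
        return {"GT": sample_value}
    # zip() stops at the shorter list, so a truncated sample value is tolerated.
    return dict(zip(format_keys.split(":"), sample_value.split(":")))


fmt = "GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB"
tumor = parse_sample(fmt, "0/1:0,2:1.00:2:0,2:0,0:0,39:0,858:40:3:0.99")
normal = parse_sample(fmt, "./.")

# AD holds the read counts for the REF and ALT alleles, in that order.
ref_reads, alt_reads = (int(n) for n in tumor["AD"].split(","))

print("tumor GT:", tumor["GT"], "REF reads:", ref_reads,
      "ALT reads:", alt_reads, "AF:", tumor["AF"])
print("normal GT:", normal["GT"])
```

Read this way, the 01T column reports a 0/1 genotype supported by 0 REF reads and 2 ALT reads, while the 01N column carries no genotype at this site.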

After running Mutect2, I ran FilterMutectCalls:

gatk FilterMutectCalls -V out_01_Mutect2.vcf -O out_01_Mutect2_filtered.vcf

The output is:

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 019N 019T

chrY 353035 . A G . read_position DP=2;ECNT=1;POP_AF=5.000e-08;P_CONTAM=0.00;P_GERMLINE=NaN;TLOD=8.14 GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB ./. 0/1
chrY 353881 . T G . PASS DP=2;ECNT=1;POP_AF=5.000e-08;P_CONTAM=0.00;P_GERMLINE=NaN;TLOD=7.98 GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB ./. 0/1:0,2:1.0
chrY 2648923 . G A . PASS DP=10;ECNT=2;POP_AF=5.000e-08;P_CONTAM=0.00;P_GERMLINE=-2.406e+00;TLOD=24.79 GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB ./. 0/1
chrY 2648954 . G A . PASS DP=6;ECNT=2;POP_AF=5.000e-08;P_CONTAM=0.00;P_GERMLINE=-1.541e+00;TLOD=11.39 GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB ./. 0/1
chrY 2649246 . C A . PASS DP=2;ECNT=1;POP_AF=5.000e-08;P_CONTAM=0.00;P_GERMLINE=NaN;TLOD=7.98 GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB ./. 0/1:0,2:1.0

It adds "PASS", "str_contraction;t_lod", "bad_haplotype;clustered_events", etc. to the FILTER column. My questions here are:

Q3: Since there is no QUAL value, on what basis is a variant marked PASS or not?
Q4: How should I filter potential somatic variants from this huge VCF file, and what should one keep in mind while filtering for somatic variants?

I know there are many research papers online about this. I am just curious about your ideas, and it would help many people like me who are new to the domain to find the answers in one place.

What I have understood so far is that the following parameters should be applied to filter variants:
Depth >= 30, Filter = PASS, Quality >= 30
SnpSift and SnpEff are used for annotating and filtering the results. Kindly correct me if I am wrong!
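As a rough illustration of the depth and FILTER criteria above, here is a minimal Python sketch. It rests on several assumptions: the function names are hypothetical, the input is a standard tab-separated VCF, depth is read from the INFO `DP` field, and the threshold of 30 is simply the value from my list, not a GATK recommendation. The QUAL criterion is left out because Mutect2 writes "." in that column. The example records reuse the fixed columns of two calls from the filtered output.

```python
def info_to_dict(info: str) -> dict:
    """Split an INFO string such as 'DP=10;ECNT=2' into a key/value dict."""
    out = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        out[key] = value
    return out


def keep_record(line: str, min_depth: int = 30) -> bool:
    """Return True when FILTER is PASS and INFO DP is at least min_depth."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:
        return False  # not a complete VCF data line
    if cols[6] != "PASS":
        return False  # column 7 is FILTER
    depth = info_to_dict(cols[7]).get("DP", "")
    return depth.isdigit() and int(depth) >= min_depth


def filter_vcf_lines(lines, min_depth: int = 30):
    """Yield header lines unchanged, plus the data lines that pass both checks."""
    for line in lines:
        if line.startswith("#") or keep_record(line, min_depth):
            yield line


# Fixed columns of two calls from the filtered output, tab-separated.
rec_filtered = "chrY\t353035\t.\tA\tG\t.\tread_position\tDP=2;ECNT=1;TLOD=8.14"
rec_pass = "chrY\t2648923\t.\tG\tA\t.\tPASS\tDP=10;ECNT=2;TLOD=24.79"

for threshold in (30, 10):
    kept = list(filter_vcf_lines([rec_filtered, rec_pass], min_depth=threshold))
    print(f"min_depth={threshold}: kept {len(kept)} record(s)")
```

With the depths shown in my output (DP=2 to DP=10), a cutoff of 30 would discard every call listed here, which is part of why I am asking whether these thresholds make sense for Mutect2 output.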
Any help will be highly appreciated!

Thanks
Ajay Katoch
