Allele frequency and depth VCF produced by MuTect2
From my understanding of the VCF output, the AF[format] field (allele fraction of the event in the tumor) is equal to:
AD[format] / DP[format].
With AD being the depth of coverage of each allele per sample (we use the alt allele when calculating AF),
and DP being the "filtered" depth of coverage for each sample (we use the one computed from the tumor sample when calculating AF).
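As a rough illustration of the formula above (this is only my reading of the fields, not MuTect2's actual code), here is a minimal Python sketch. The FORMAT string and the sample values are made up, and the helper names are hypothetical. It also computes the alternative ratio alt AD / sum(AD), to show that the two can differ whenever DP is not the sum of the AD values.

```python
def parse_sample(format_keys: str, sample_values: str) -> dict:
    """Map FORMAT keys (e.g. 'GT:AD:DP') to one sample's values."""
    return dict(zip(format_keys.split(":"), sample_values.split(":")))


def allele_depths(sample: dict) -> list:
    """Return AD as a list of ints: [ref, alt1, alt2, ...]."""
    return [int(x) for x in sample["AD"].split(",")]


def af_from_ad_over_dp(sample: dict) -> float:
    """Hypothesised AF: depth of the first alt allele divided by sample-level DP."""
    return allele_depths(sample)[1] / int(sample["DP"])


def af_from_ad_only(sample: dict) -> float:
    """Alternative: depth of the first alt allele divided by the sum of all AD values."""
    ad = allele_depths(sample)
    return ad[1] / sum(ad)


# Made-up tumor sample column, for illustration only (not real MuTect2 output).
tumor = parse_sample("GT:AD:DP", "0/1:90,10:110")

print(af_from_ad_over_dp(tumor))  # alt AD / DP
print(af_from_ad_only(tumor))     # alt AD / sum(AD)
```

Comparing both values with the AF reported in a real record should show which denominator, if either, matches.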
After some further reading, I think I figured out that:
AD[format] <=> all sample-reads minus uninformative reads.
AD is computed with GATK DepthPerAlleleBySample.
DP[format] <=> all sample reads minus filtered reads (which are not the same as uninformative reads).
DP[info] <=> all site-level reads (T+N samples), with nothing subtracted.
DP is computed with GATK Coverage.
In the GATK documentation (http://gatkforums.broadinstitute.org/gatk/discussion/4721/using-depth-of-coverage-metrics-for-variant-evaluation), one can read the following:
The key difference is that the AD metric is based on unfiltered read counts while the sample-level DP is based on filtered read counts (see tool documentation for a list of read filters that are applied by default for each tool). As a result, they should be interpreted differently.
If AF is indeed AD[format]/DP[format], isn't it strange to compute AF by dividing an unfiltered read depth by a filtered read depth?
PS: I tried to "verify" the DP[info] depth (computed inside the MuTect2 run) by running GATK DepthOfCoverage on the same input (non-marked_recalibrated T/N BAMs). For a given position, I find a higher depth with GATK DepthOfCoverage (501 vs 434). Is DP[info] really based on unfiltered reads? Or do GATK Coverage and GATK DepthOfCoverage differ in some minor way?