Seeking help debugging a VQSR tranche plot
Dear GATK Team,
I received data for ~35k whole-exome sequences from a collaborator, and while proceeding with variant filtering I noticed that the VQSR tranche plot looks abnormal.
In the tranche plots I usually see, Ti/Tv declines as the more permissive tranches add false positives. Here, Ti/Tv instead rises sharply at the more permissive filtering levels (plot attached).
I thought this might be due to some degraded samples with high singleton counts, so I performed a singleton QC and removed any sample whose singleton count exceeded a cutoff of 3 median absolute deviations (MAD), which set the threshold at ~62 singletons. This removed about 4,500 samples.
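For illustration, a MAD-based singleton filter of this kind could be sketched as follows. This is only a minimal sketch and not the exact code that was run: the function name `mad_outlier_samples`, the one-sided cutoff of median + 3 x MAD (unscaled), and the toy singleton counts are all assumptions made for the example.

```python
from statistics import median

def mad_outlier_samples(singleton_counts, n_mads=3.0):
    """Flag samples whose singleton count is above median + n_mads * MAD.

    singleton_counts: dict mapping sample ID -> number of singletons.
    Returns (threshold, sorted list of flagged sample IDs).
    Assumption: one-sided upper cutoff, MAD not rescaled by 1.4826.
    """
    values = list(singleton_counts.values())
    med = median(values)
    # Median absolute deviation: median of |x - median(x)|
    mad = median([abs(v - med) for v in values])
    threshold = med + n_mads * mad
    flagged = sorted(s for s, c in singleton_counts.items() if c > threshold)
    return threshold, flagged

# Toy data (hypothetical sample IDs and counts)
counts = {"S1": 20, "S2": 25, "S3": 30, "S4": 35, "S5": 40, "S6": 300}
threshold, flagged = mad_outlier_samples(counts)
print(threshold, flagged)
```

A cutoff defined this way depends on whether the MAD is rescaled and whether the filter is one-sided or two-sided, so the number of samples removed can differ noticeably between implementations.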
I re-ran VQSR on the remaining 30.5k samples, but the problem persists and I'm out of ideas on how to resolve it. I'd appreciate any tips on how to debug the issue.