Filtering out variants with high no-call rates
I ran UnifiedGenotyper on my targeted sequencing data and then applied hard filters as recommended in the GATK Best Practices. I've noticed, however, that several variants with very high no-call rates (>90%) still passed variant filtration. I'm pasting part of the VCF file for two such variants below.
I've also noticed that most of the high no-call-rate variants have very low read depths. I read in other discussions that you don't recommend filtering variants by read depth, but I wonder if there is another filtering criterion you can recommend so that such variants wouldn't pass the filtering step (e.g., a more stringent stand_call_conf value)?
I can certainly filter out these variants by call rate before any downstream analysis, but I'm trying to understand the sequencing quality metrics and GATK's behavior here: which properties of these variants allow them to pass filtration?
Thanks a lot,
For the two variants below, genotypes were called for only 2 and 1 (out of 278) people, respectively:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 10134 10215
1 11857410 rs7537955 A G 101.85 PASS AC=6;AF=1.00;AN=6;DB;DP=3;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=6;MLEAF=1.00;MQ=45.96;MQ0=0;QD=33.95; GT:AD:DP:GQ:PL ./. ./.
4 156661872 . C A 53.39 PASS AC=2;AF=1.00;AN=2;DP=2;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=2;MLEAF=1.00;MQ=60.00;MQ0=0;QD=26.70; GT:AD:DP:GQ:PL ./. ./.
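For reference, the call-rate filter I mentioned above could look roughly like the sketch below. This is not a GATK feature, just plain Python run on the filtered VCF afterwards. The function names `call_rate` and `filter_by_call_rate` and the 0.9 threshold are my own placeholders, and it assumes an uncompressed, tab-delimited VCF with sample columns starting at column 10.

```python
def call_rate(vcf_line):
    """Return the fraction of samples with a fully called genotype.

    Assumes a tab-delimited VCF data line whose FORMAT column (9th)
    contains a GT key. A partially missing genotype such as "./1"
    is counted as a no-call.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")
    samples = fields[9:]
    if not samples:
        return 0.0
    called = 0
    for sample in samples:
        gt = sample.split(":")[gt_index]
        # Treat phased ("|") and unphased ("/") separators the same way.
        alleles = gt.replace("|", "/").split("/")
        if all(allele != "." for allele in alleles):
            called += 1
    return called / len(samples)


def filter_by_call_rate(lines, min_call_rate=0.9):
    """Yield header lines unchanged and data lines meeting the threshold.

    min_call_rate=0.9 is an arbitrary illustrative cutoff, not a
    recommended value.
    """
    for line in lines:
        if line.startswith("#") or call_rate(line) >= min_call_rate:
            yield line
```

Applied to the two records above, both would be dropped at any reasonable threshold, since only a handful of the 278 samples have a called genotype.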