Filtering out variants with high no-call rates

Hi,
I ran UnifiedGenotyper on my targeted sequencing data and then applied hard filters as recommended in the GATK Best Practices. However, I've noticed several variants with very high no-call rates (>90%) that still passed variant filtration. I'm pasting below part of the VCF records for two such variants.

I've also noticed that most of the high no-call-rate variants have very low read depths. I read in other discussions that you don't recommend filtering variants by read depth, but is there another filtering criterion you can recommend so that such variants don't pass the filtering step (e.g. a more stringent stand_call_conf value)?

I can certainly filter out these variants by call rate before downstream applications, but I'm trying to understand the sequencing quality metrics and GATK's behavior here: which properties of these variants allow them to pass filtration?

Thanks a lot,

Gulum

For the two variants below, genotypes were called for only 2 and 1 (out of 278) people, respectively:

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 10134 10215

1 11857410 rs7537955 A G 101.85 PASS AC=6;AF=1.00;AN=6;DB;DP=3;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=6;MLEAF=1.00;MQ=45.96;MQ0=0;QD=33.95; GT:AD:DP:GQ:PL ./. ./.
4 156661872 . C A 53.39 PASS AC=2;AF=1.00;AN=2;DP=2;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=2;MLEAF=1.00;MQ=60.00;MQ0=0;QD=26.70; GT:AD:DP:GQ:PL ./. ./.
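
The call-rate filter mentioned above could be applied as a post-processing step along these lines. This is only a minimal sketch, not a GATK feature: the function names (`genotype_called`, `call_rate`, `filter_by_call_rate`) and the 0.9 threshold are hypothetical, the input is assumed to be tab-separated VCF text lines with sample columns starting at column 10, and a genotype is counted as called if at least one of its alleles is not `.`.

```python
def genotype_called(sample_field):
    """Return True if the GT subfield contains at least one called allele.

    GT is the first FORMAT key whenever it is present, so it is the
    first colon-separated item of each sample column.
    """
    gt = sample_field.split(":")[0]
    alleles = gt.replace("|", "/").split("/")
    return any(allele != "." for allele in alleles)


def call_rate(vcf_line):
    """Fraction of samples with a called genotype in one VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    samples = fields[9:]  # columns after FORMAT are per-sample genotypes
    if not samples:
        raise ValueError("record has no sample columns")
    called = sum(genotype_called(s) for s in samples)
    return called / len(samples)


def filter_by_call_rate(lines, min_rate=0.9):
    """Yield header lines unchanged and records with call rate >= min_rate.

    min_rate=0.9 is an arbitrary illustrative threshold.
    """
    for line in lines:
        if line.startswith("#") or call_rate(line) >= min_rate:
            yield line
```

For example, `filter_by_call_rate(open("input.vcf"))` (with a hypothetical file name) would drop both records shown above, since only a handful of the 278 samples have called genotypes.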
