Information about the GATK "Filter" field

Paola_Orsini, University of Bari, Member

Dear doctor,
I'm analyzing VCF files obtained with your GATK variant calling tool. I'm new to this kind of analysis, and I would like to ask for some additional information about filtering the VCF file according to the "FILTER" field. Specifically, I read that variants whose VQSLOD is above the defined threshold pass the filter, so their FILTER field contains PASS, while variants below the threshold are filtered out; however, they are still written to the output file, with the FILTER field set to the name of the tranche they belong to. I also read that the tranches correspond to different levels of sensitivity and accuracy of the variants. In my VCF file, the variants fall into the following groups according to the "FILTER" field:

  • LowQual
  • PASS
  • VQSRTrancheINDEL99.00to99.90
  • VQSRTrancheSNP99.00to99.90
  • VQSRTrancheSNP99.90to100.00

I can't fully understand the meaning of the tranches or the probability of false positives for each one. I would like to ask how I should interpret the three categories labelled "VQSRTranche", and whether you would recommend removing these variants from subsequent analyses and keeping only those with the "PASS" filter.
Thank you in advance for your time and kind attention.
Best regards,

Paola Orsini
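
For reference, the grouping described above can be reproduced outside GATK with a few lines of plain Python. This is only a minimal sketch, not a GATK tool: it assumes an uncompressed, tab-separated VCF in which FILTER is the seventh column, and the function names (`filter_values`, `count_filters`, `keep_pass`) and the example records are hypothetical, made up for illustration. It tallies how many records carry each FILTER value and keeps only the records marked PASS; the tranche names are treated as opaque labels.

```python
from collections import Counter


def filter_values(vcf_line):
    """Return the FILTER column (7th field) of a VCF data line as a list.

    Multiple filters on one record are separated by ';' in the VCF format.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    return fields[6].split(";")


def count_filters(lines):
    """Count how many records carry each FILTER value."""
    counts = Counter()
    for line in lines:
        # Skip header lines ('#...') and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        for value in filter_values(line):
            counts[value] += 1
    return counts


def keep_pass(lines):
    """Yield header lines plus the records whose FILTER is exactly PASS."""
    for line in lines:
        if not line.strip():
            continue
        if line.startswith("#") or filter_values(line) == ["PASS"]:
            yield line


# Hypothetical records for illustration only (not real data).
EXAMPLE_RECORDS = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "\t".join(["chr1", "100", ".", "A", "G", "50.0", "PASS", "."]),
    "\t".join(["chr1", "200", ".", "C", "T", "12.0", "LowQual", "."]),
    "\t".join(["chr1", "300", ".", "G", "A", "40.0",
               "VQSRTrancheSNP99.00to99.90", "."]),
    "\t".join(["chr1", "400", ".", "T", "TA", "35.0",
               "VQSRTrancheINDEL99.00to99.90", "."]),
    "\t".join(["chr1", "500", ".", "A", "C", "60.0", "PASS", "."]),
]

if __name__ == "__main__":
    for name, n in sorted(count_filters(EXAMPLE_RECORDS).items()):
        print(name, n)
    for record in keep_pass(EXAMPLE_RECORDS):
        print(record)
```

To run this on an actual file, the list of example records could be replaced by an open file handle, for example `with open("my_variants.vcf") as handle: print(count_filters(handle))`, where the file name is a placeholder. A gzip-compressed VCF would need `gzip.open(path, "rt")` instead of `open`.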
