Information about the GATK "FILTER" field
I'm analyzing VCF files produced by the GATK variant calling tools. I'm new to this kind of analysis, and I would like to ask for some additional information about filtering the VCF file according to the "FILTER" field. Specifically, I read that variants above the defined VQSLOD threshold pass the filter, so their FILTER field contains PASS, while variants below the threshold are filtered out; however, they are still written to the output file, with the FILTER field set to the name of the tranche they belong to. I also read about the tranches, which correspond to different levels of sensitivity and accuracy of the variants. In my VCF file, I have 4 groups of variants according to the "FILTER" field:
I can't fully understand the meaning of the tranches or the probability of false positives for each one. Could you explain how to interpret the three categories labeled "VQSRTranche", and whether you would recommend removing these variants from subsequent analyses and keeping only those marked "PASS"?
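For reference, here is a minimal Python sketch of how the FILTER column could be tallied and how PASS-only records could be selected. It assumes an uncompressed, tab-separated VCF where FILTER is the seventh column, as in the VCF specification. The function names `count_filters` and `keep_pass` and the example records (including the tranche label and VQSLOD values) are hypothetical and only illustrate the format; they are not taken from my file.

```python
from collections import Counter


def count_filters(vcf_lines):
    """Tally the FILTER column (7th tab-separated field) of VCF records."""
    counts = Counter()
    for line in vcf_lines:
        # Skip header lines (starting with '#') and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        counts[fields[6]] += 1
    return counts


def keep_pass(vcf_lines):
    """Yield header lines plus records whose FILTER value is exactly PASS."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
        elif line.strip() and line.rstrip("\n").split("\t")[6] == "PASS":
            yield line


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\tVQSLOD=5.1\n",
    "chr1\t200\t.\tC\tT\t30\tVQSRTrancheSNP99.00to99.90\tVQSLOD=-1.2\n",
    "chr1\t300\t.\tG\tA\t60\tPASS\tVQSLOD=7.3\n",
]

filter_counts = count_filters(example)
pass_only = list(keep_pass(example))

for name, n in sorted(filter_counts.items()):
    print(name, n)
```

On a real file, the same functions could be fed an open file handle, e.g. `with open("my.vcf") as fh: count_filters(fh)`, where `my.vcf` is a placeholder path.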
Thank you in advance for your time and kind attention.