VQSR with missing annotation fields
I am calling variants in a non-model organism following the Best Practices workflow. After running HaplotypeCaller (in GVCF mode) and GenotypeGVCFs, I want to apply VQSR (separately for SNPs and indels) to the raw VCF file (10 samples) as an alternative to manual filtering. As a resource, I use a subset of very high-quality variants (obtained with hard filtering, coming from different samples). However, I have noticed that not all annotation fields are present at every site in the raw VCF file. For example, not all sites have an MQRankSum, ReadPosRankSum, or MQ field. Since I want to use these annotations for recalibration, I was wondering how the model handles such sites (I have been using GATK 3.5 so far) and how this would affect the FILTER field.
I guess running VariantAnnotator on the raw VCF file will not add the missing fields (for example, ReadPosRankSum cannot be calculated when no reads carrying the reference allele are found).
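To put a number on how many sites are affected, something like the following could be used. This is only a minimal sketch, assuming an uncompressed, tab-delimited VCF with the INFO column in the eighth position; the function name `count_missing_annotations` and the example records are made up for illustration and are not part of GATK.

```python
from collections import Counter

# Annotations intended for recalibration (taken from the question above).
ANNOTATIONS = ("MQ", "MQRankSum", "ReadPosRankSum")


def count_missing_annotations(vcf_lines, annotations=ANNOTATIONS):
    """Return (record count, Counter of records lacking each annotation).

    Hypothetical helper: it only inspects the keys of the INFO column
    and does not validate the rest of the record.
    """
    missing = Counter()
    n_records = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and header lines (## meta lines and #CHROM).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        n_records += 1
        # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
        info_keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        for name in annotations:
            if name not in info_keys:
                missing[name] += 1
    return n_records, missing


# Made-up example records, for illustration only.
EXAMPLE_RECORDS = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50.0\t.\tDP=20;MQ=60.00;MQRankSum=0.5;ReadPosRankSum=-0.2",
    "chr1\t200\t.\tC\tT\t35.0\t.\tDP=8;MQ=58.10",
    "chr1\t300\t.\tG\tA\t20.0\t.\tDP=3",
]

n_records, missing = count_missing_annotations(EXAMPLE_RECORDS)
for name in ANNOTATIONS:
    print(f"{name}: missing in {missing[name]} of {n_records} records")
```

For a real file, the same function could be called on an open file handle (for example `with open(path) as handle: count_missing_annotations(handle)`, where `path` points to the raw VCF).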
Thanks a lot for your help!