Best workflow for VQSR when you eventually want individual sample exome VCFs?
We are running an exome sequencing project with between 30 and 50 exomes in total. For optimal variant quality score recalibration (VQSR), we should use as much data as possible in the VariantRecalibrator step. However, for downstream analysis we want individual exome VCFs, and UnifiedGenotyper has been run separately on each sample.

Our plan was to feed all the data into VariantRecalibrator and then run ApplyRecalibration on the individual raw VCFs. But VariantRecalibrator takes only one VCF as input, right? So what would be the best workflow for this scenario? Could we run UnifiedGenotyper on all samples together to create a common VCF for recalibration purposes only, and then apply the result to the individual VCFs? Or would this somehow produce a RECAL-file that is invalid for the individual VCFs? Or is it better to run both variant calling and recalibration on multi-sample VCFs and split the VCFs sample-wise afterwards?
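
To make the last option concrete, here is a rough sketch of what I have in mind: call all samples jointly, recalibrate the multi-sample VCF, then split it per sample with SelectVariants. It only builds the command lines and does not run anything. The jar path, file names, sample names, and helper function names are all placeholders I made up, and the recalibration step is left as a comment because the resources and annotations depend on the project.

```python
import shlex

# Assumed location of the GATK jar; adjust to the local installation.
GATK = ["java", "-jar", "GenomeAnalysisTK.jar"]


def joint_call_cmd(ref, bams, out_vcf):
    """Build a UnifiedGenotyper command that calls all BAMs into one VCF."""
    cmd = GATK + ["-T", "UnifiedGenotyper", "-R", ref]
    for bam in bams:
        cmd += ["-I", bam]  # one -I per sample BAM
    return cmd + ["-o", out_vcf]


def split_sample_cmd(ref, vcf, sample, out_vcf):
    """Build a SelectVariants command that extracts one sample.

    -sn selects the sample; -env drops sites where that sample
    carries no variant allele.
    """
    return GATK + [
        "-T", "SelectVariants", "-R", ref, "-V", vcf,
        "-sn", sample, "-env", "-o", out_vcf,
    ]


def workflow_cmds(ref, sample_to_bam, raw_vcf, recal_vcf):
    """Return the commands for joint calling and per-sample splitting."""
    cmds = [joint_call_cmd(ref, list(sample_to_bam.values()), raw_vcf)]
    # VariantRecalibrator and ApplyRecalibration would run here on raw_vcf,
    # producing recal_vcf (settings omitted; they are project-specific).
    for sample in sample_to_bam:
        cmds.append(split_sample_cmd(ref, recal_vcf, sample, sample + ".vcf"))
    return cmds


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    samples = {"exome01": "exome01.bam", "exome02": "exome02.bam"}
    for cmd in workflow_cmds("ref.fasta", samples, "raw.vcf", "recal.vcf"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Running the script prints one UnifiedGenotyper command followed by one SelectVariants command per sample, so the commands can be reviewed before anything is executed.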