VariantRecalibrator across multiple VCFs with identical positions but different annotations

tommycarstensen
I have a set of VCFs with identical positions in them:

VCF1: 1 10097 . T . 26 . AN=196;DP=1622;MQ=20.06;MQ0=456 GT:DP

VCF2: 1 10097 . T . 21.34 . AN=198;DP=2338;MQ=19.53;MQ0=633 GT:DP

VCF3: 1 10097 . T . 11.70 . AN=240;DP=3957;MQ=19.74;MQ0=1085 GT:DP

VCF4: 1 10097 . T . 15.56 . AN=134;DP=1348;MQ=18.22;MQ0=442 GT:DP

If I use all of them as input for VariantRecalibrator, which annotations will VariantRecalibrator use for a site that appears in more than one file? Should I instead merge the VCFs with CombineVariants and run VariantAnnotator before I run VariantRecalibrator?

I'm not sure whether this forum is for technical questions only or whether questions about best practices are allowed as well. Feel free to delete my question if it doesn't belong here. Thank you.

  Geraldine_VdAuwera
    I see. Then it depends on how you want to proceed with your analysis. If you want the various sample calls for the same sites to be treated together, with the results output in a single VCF, then you have to use CombineVariants to merge them first. However, if you're happy to have them processed as separate variants, with the outputs in separate VCFs, then you can pass in separate files.
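
To make the distinction concrete, here is a minimal Python sketch that parses the four records quoted in the question and groups them by site. It only illustrates that the same position carries a different set of INFO values in each file; it does not reproduce what CombineVariants or VariantRecalibrator do internally. The helper names `parse_info` and `group_by_site` are made up for this illustration.

```python
# The four records for site 1:10097 as posted in the question, one per VCF.
RECORDS = {
    "VCF1": "1 10097 . T . 26 . AN=196;DP=1622;MQ=20.06;MQ0=456 GT:DP",
    "VCF2": "1 10097 . T . 21.34 . AN=198;DP=2338;MQ=19.53;MQ0=633 GT:DP",
    "VCF3": "1 10097 . T . 11.70 . AN=240;DP=3957;MQ=19.74;MQ0=1085 GT:DP",
    "VCF4": "1 10097 . T . 15.56 . AN=134;DP=1348;MQ=18.22;MQ0=442 GT:DP",
}


def parse_info(record):
    """Return ((chrom, pos), INFO dict) for one whitespace-separated record."""
    fields = record.split()
    chrom, pos, info = fields[0], int(fields[1]), fields[7]
    annotations = {}
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        annotations[key] = float(value)
    return (chrom, pos), annotations


def group_by_site(records):
    """Map each site to {file name: INFO dict} so duplicates become visible."""
    sites = {}
    for name, record in records.items():
        site, annotations = parse_info(record)
        sites.setdefault(site, {})[name] = annotations
    return sites


sites = group_by_site(RECORDS)
for site, per_file in sites.items():
    print(site, "appears in", len(per_file), "files")
    for key in ("AN", "DP", "MQ", "MQ0"):
        values = [per_file[name][key] for name in sorted(per_file)]
        print(f"  {key}: {values}")
```

Passed in separately, these stay four records with four different annotation sets for one position. Merged, the site becomes a single record that needs one value per annotation.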

    Geraldine Van der Auwera, PhD


  Geraldine_VdAuwera

    No worries, your question is fine. We'll take pretty much anything that is related to GATK, and we're more than happy to clarify the Best Practices if it can help people use the tools correctly.

    To actually answer your question -- can you first tell me whether those variants derive from the same data (same sample) or from different ones?


  tommycarstensen
    I should have clarified. The samples in each of the 4 VCFs are unrelated; i.e. they are derived from different BAMs originating from different populations.

    All 4 VCFs contain calls at the same positions because I specified an interval list and used EMIT_ALL_SITES when calling with UnifiedGenotyper. I called the 4 populations separately, thinking that would be the best approach.

    I also checked the VariantRecalibrator.java source code briefly, but I couldn't quite find the answer to my question.

  tommycarstensen

    Thank you, Geraldine. I don't want the identical positions processed as separate sites, so I am going with CombineVariants followed by VariantAnnotator. Thank you for confirming that this is the right approach in this case.
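
For readers who want to try the same route, the sketch below builds the two command lines in Python without running them. The flags follow GATK 3.x-style syntax as I recall it (`-T`, `-R`, `--variant`/`-V`, `-I`, `-A`, `-o`), and the annotation names are the ones I believe produce DP, MQ and MQ0; both should be checked against the documentation for your GATK version. All file names and the jar path are hypothetical placeholders.

```python
import shlex


def combine_variants_cmd(reference, vcfs, output, gatk_jar="GenomeAnalysisTK.jar"):
    """Build a CombineVariants command that merges several VCFs into one."""
    cmd = ["java", "-jar", gatk_jar, "-T", "CombineVariants", "-R", reference]
    for vcf in vcfs:
        cmd += ["--variant", vcf]
    cmd += ["-o", output]
    return cmd


def variant_annotator_cmd(reference, bams, vcf, output, annotations,
                          gatk_jar="GenomeAnalysisTK.jar"):
    """Build a VariantAnnotator command that re-annotates the merged VCF."""
    cmd = ["java", "-jar", gatk_jar, "-T", "VariantAnnotator",
           "-R", reference, "-V", vcf]
    for bam in bams:
        cmd += ["-I", bam]  # one -I per population BAM
    for name in annotations:
        cmd += ["-A", name]  # one -A per requested annotation
    cmd += ["-o", output]
    return cmd


# Hypothetical file names: one VCF and one BAM per population.
vcfs = [f"pop{i}.vcf" for i in range(1, 5)]
bams = [f"pop{i}.bam" for i in range(1, 5)]

combine = combine_variants_cmd("ref.fasta", vcfs, "merged.vcf")
annotate = variant_annotator_cmd(
    "ref.fasta", bams, "merged.vcf", "merged.annotated.vcf",
    # Annotation names assumed from GATK 3.x; verify for your version.
    ["Coverage", "RMSMappingQuality", "MappingQualityZero"],
)

for cmd in (combine, annotate):
    print(" ".join(shlex.quote(part) for part in cmd))
```

The printed lines can be pasted into a shell, or the lists can be handed to `subprocess.run` once the paths and flags have been verified. The annotated output would then be the single input to VariantRecalibrator.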

