# VariantRecalibrator across multiple VCFs with identical positions but different annotations

United Kingdom, Member
edited April 2013

I have a set of VCFs with identical positions in them:

VCF1:
1 10097 . T . 26 . AN=196;DP=1622;MQ=20.06;MQ0=456 GT:DP

VCF2:
1 10097 . T . 21.34 . AN=198;DP=2338;MQ=19.53;MQ0=633 GT:DP

VCF3:
1 10097 . T . 11.70 . AN=240;DP=3957;MQ=19.74;MQ0=1085 GT:DP

VCF4:
1 10097 . T . 15.56 . AN=134;DP=1348;MQ=18.22;MQ0=442 GT:DP

If I use all of them as input for VariantRecalibrator, which annotations will VariantRecalibrator use? Should I instead merge the VCFs with CombineVariants and run VariantAnnotator, before I run VariantRecalibrator?
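To make the situation concrete, here is a minimal Python sketch that groups the four records above by site and lists the annotations each file reports. The helper names (`parse_record`, `annotations_by_site`) are made up for illustration and are not part of GATK; the records are split on whitespace as pasted here, whereas real VCF files are tab-delimited.

```python
# Records copied from the four VCF snippets above (whitespace-separated here;
# real VCF files are tab-delimited).
RECORDS = {
    "VCF1": "1 10097 . T . 26 . AN=196;DP=1622;MQ=20.06;MQ0=456 GT:DP",
    "VCF2": "1 10097 . T . 21.34 . AN=198;DP=2338;MQ=19.53;MQ0=633 GT:DP",
    "VCF3": "1 10097 . T . 11.70 . AN=240;DP=3957;MQ=19.74;MQ0=1085 GT:DP",
    "VCF4": "1 10097 . T . 15.56 . AN=134;DP=1348;MQ=18.22;MQ0=442 GT:DP",
}


def parse_record(line):
    """Split one VCF data line into its site key, QUAL and INFO annotations."""
    chrom, pos, _id, ref, alt, qual, _filter, info, *_rest = line.split()
    # INFO is a semicolon-separated list of KEY=VALUE pairs; values stay strings.
    annotations = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    return (chrom, int(pos), ref, alt), float(qual), annotations


def annotations_by_site(records):
    """Group the per-file QUAL and INFO values under the site they describe."""
    sites = {}
    for name, line in records.items():
        site, qual, annotations = parse_record(line)
        sites.setdefault(site, {})[name] = {"QUAL": qual, **annotations}
    return sites


sites = annotations_by_site(RECORDS)
for site, per_file in sites.items():
    print(site)
    for name, values in per_file.items():
        print("  ", name, values)
```

All four files describe the same site, but each carries its own QUAL, AN, DP, MQ and MQ0 values, which is why it matters which of them VariantRecalibrator would use.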

I'm not sure whether the forum is for technical questions only or whether you are allowed to ask about best practices as well. Feel free to delete my question if it doesn't belong here. Thank you.

I see. Then it depends on how you want to proceed with your analysis. If you want the various sample calls for the same sites to be treated together, with the results output in a single VCF, then you have to merge them with CombineVariants first. However, if you're happy to have them processed as separate variants, with the outputs in separate VCFs, then you can pass in separate files.
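For the merge-first route, a rough sketch of the CombineVariants call might look like the following. It only builds and prints the GATK 3 command line and does not run it. The file names (`reference.fasta`, `pop1.vcf` to `pop4.vcf`, `merged.vcf`), the jar path and the helper `combine_variants_command` are placeholders assumed for illustration; check the argument names against the tool documentation for your GATK version.

```python
import shlex


def combine_variants_command(reference, vcfs, output, jar="GenomeAnalysisTK.jar"):
    """Build a GATK 3 CombineVariants command line as an argument list."""
    if len(vcfs) < 2:
        raise ValueError("need at least two VCFs to merge")
    cmd = ["java", "-jar", jar, "-T", "CombineVariants", "-R", reference]
    for vcf in vcfs:
        # One --variant argument per input VCF.
        cmd += ["--variant", vcf]
    cmd += ["-o", output]
    return cmd


# Placeholder file names; substitute your own reference and per-population VCFs.
cmd = combine_variants_command(
    "reference.fasta",
    ["pop1.vcf", "pop2.vcf", "pop3.vcf", "pop4.vcf"],
    "merged.vcf",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The merged VCF would then be the single input to VariantRecalibrator, rather than the four per-population files.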

No worries, your question is fine. We'll take pretty much anything that is related to GATK, and we're more than happy to clarify the Best Practices if it can help people use the tools correctly.

To actually answer your question -- can you first tell me whether those variants derive from the same data (same sample) or from different ones?

United Kingdom, Member
edited April 2013

I should have clarified. The samples in each of the 4 VCFs are unrelated; i.e. they are derived from different BAMs originating from different populations.

All 4 VCFs contain calls at the same positions because I specified an interval list and used EMIT_ALL_SITES when calling with UnifiedGenotyper. I called the 4 populations separately, thinking that would be the best approach.

I also checked the VariantRecalibrator.java source code briefly, but I couldn't quite find the answer to my question.