# VQSR on single exome

Member Posts: 266 ✭✭

Hi Geraldine,
Thanks for the webinar! You mentioned that VQSR isn't necessary for a single exome. But would there be any drawback to running it on a single exome? I see that it helps to set the PASS filter.

Ah, glad to hear you caught the webinar.

It's not that it's not necessary; it's that VQSR won't work properly on a single exome, because it won't have enough data. The random forests implementation will help with that, though it remains to be seen how small a dataset it will be able to handle.

Geraldine Van der Auwera, PhD

• Member Posts: 266 ✭✭

So do you mean that running VQSR on a single exome would lower the calling quality? I did that for some single exomes. Should I discard those results and use the VCF files produced right after GenotypeGVCFs instead? Could you recommend common criteria for hard filtering?

Sure. The idea here is that typically, a single exome doesn't have enough variants to fully empower the model training. It'll run, but results may not be as good as they could be. Our recommendation for dealing with exomes, if you don't have a large cohort, is to include other exomes in your analysis. For example, you can get exomes from the 1000 Genomes project that match your samples (we try to match them up by ethnicity) to beef up your cohort, up to 30 samples. Or you can group whatever exomes you have in hand even if they're not part of the same project. It's better to do that than try to hard filter your variants.

We're hoping that the new implementation (coming out in 3.2) will bypass this requirement to a large extent.

Does this clarify things a little?

Geraldine Van der Auwera, PhD
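To make the cohort-size advice above concrete, here is a minimal Python sketch that counts the samples in a VCF header and reports how many more exomes would be needed to reach the 30-sample figure mentioned in the reply. The function names, the `target` parameter, and the example header are hypothetical and only for illustration; the sketch assumes a standard tab-delimited VCF whose `#CHROM` line lists the nine fixed columns followed by the sample names.

```python
def sample_names(vcf_header_lines):
    """Return the sample names listed on the #CHROM header line of a VCF."""
    for line in vcf_header_lines:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            # The first 9 columns are fixed:
            # CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
            return fields[9:]
    raise ValueError("no #CHROM header line found")


def samples_to_add(vcf_header_lines, target=30):
    """Number of extra exomes needed to reach the target cohort size."""
    return max(0, target - len(sample_names(vcf_header_lines)))


# Made-up header for a single-exome callset.
header = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\n",
]

print(sample_names(header))
print(samples_to_add(header))
```

With a real file, the header lines could be read until the first line that does not start with `#`. The 30-sample target is taken from the reply above and is a rule of thumb, not a hard threshold.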

• Member Posts: 266 ✭✭
edited April 2014

Hi Geraldine,
Sorry for responding late! I was occupied by a two-day meeting.
Thanks for the clarification! I'd like to confirm that VQSR only changes the variant quality scores and not the calls themselves. Am I right? I mean, it won't change a variant from A to T, or remove or add variants.

No worries, that's correct. Just keep in mind that the second step (ApplyRecalibration) does change the FILTER field.

Geraldine Van der Auwera, PhD
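One way to check this on your own data is to compare the same record before and after recalibration and list which VCF columns differ. The Python sketch below does that for a single pair of lines. The two records are made up for illustration, and the sketch assumes the recalibrated score is written as a `VQSLOD` annotation in the INFO column alongside a `culprit` annotation; the helper names are hypothetical.

```python
FIXED = ("CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT")


def parse_record(line):
    """Split one tab-delimited VCF data line into named columns."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(FIXED, fields[:9]))
    # Everything after FORMAT is per-sample genotype data.
    record["GENOTYPES"] = fields[9:]
    return record


def changed_columns(before, after):
    """Return the names of the columns that differ between two records."""
    a, b = parse_record(before), parse_record(after)
    return [key for key in a if a[key] != b[key]]


# Made-up records: the same site before and after recalibration.
raw = "1\t12345\t.\tA\tT\t812.77\t.\tDP=40;QD=20.3\tGT:DP\t0/1:40"
recal = (
    "1\t12345\t.\tA\tT\t812.77\tPASS\t"
    "DP=40;QD=20.3;VQSLOD=4.21;culprit=MQ\tGT:DP\t0/1:40"
)

print(changed_columns(raw, recal))  # ['FILTER', 'INFO']
```

In this example the alleles (REF, ALT) and the genotype column are identical in both records; only FILTER and INFO differ, which matches the answer above.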