VQSR on specific genomic region

Dear GATK Team,

I have exome data from many individuals (>2000), called with HaplotypeCaller, but only for a specific set of genes from the genome. I would like to recalibrate my variants with the VQSR tool, but (as expected) I get the error 'No data found'.
I know there is an option to 'pad' the data with other exomes, but those exomes would need to be generated with a method comparable to my dataset, and whole-exome sequencing is not.
I was therefore wondering whether there is an option to 'focus' the VQSR tool on specific regions of the genome/exome only.
I am sure that if only my regions were considered in the recalibration, I would have enough variants to build a recalibration model.

Thank you for your help in advance,
Kind regards,

rosannevd

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @rosannevd
    Hi rosannevd,

    I am confused. Do you have exome data and are only interested in specific genes? Or, do you only have data from specific genes? If you only have data from specific genes, you cannot pad with other samples. The tool needs many different sites to make a good model. You will need to hard filter.

    -Sheila
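
For readers unfamiliar with hard filtering, the idea behind the suggestion above can be sketched in a few lines of Python: instead of learning a model, each variant is tested against fixed cutoffs on its annotations. The thresholds below are GATK's generic published starting points for SNPs, which is an assumption here since the thread does not give any values; the filter names and the helper functions `parse_info` and `hard_filter` are hypothetical and exist only for this illustration. In practice this step would normally be done with GATK's own VariantFiltration tool rather than custom code.

```python
import operator

# Assumed thresholds: GATK's generic hard-filter starting points for SNPs.
# The thread itself names no values; tune these for your own data.
# Each entry: (filter name, INFO key, comparison that means "fail", cutoff).
SNP_HARD_FILTERS = [
    ("QD2", "QD", operator.lt, 2.0),
    ("FS60", "FS", operator.gt, 60.0),
    ("MQ40", "MQ", operator.lt, 40.0),
    ("SOR3", "SOR", operator.gt, 3.0),
    ("MQRankSum-12.5", "MQRankSum", operator.lt, -12.5),
    ("ReadPosRankSum-8", "ReadPosRankSum", operator.lt, -8.0),
]


def parse_info(info_field):
    """Parse a VCF INFO string into a dict of single numeric annotations."""
    info = {}
    for entry in info_field.split(";"):
        if "=" not in entry:
            continue  # flag-type annotation such as "DB"
        key, value = entry.split("=", 1)
        try:
            info[key] = float(value)
        except ValueError:
            pass  # skip non-numeric or multi-valued annotations
    return info


def hard_filter(info_field):
    """Return 'PASS' or a ';'-joined list of the filters the variant fails.

    A missing annotation is not counted as a failure in this sketch.
    """
    info = parse_info(info_field)
    failed = [
        name
        for name, key, fails, cutoff in SNP_HARD_FILTERS
        if key in info and fails(info[key], cutoff)
    ]
    return ";".join(failed) if failed else "PASS"


if __name__ == "__main__":
    print(hard_filter("QD=12.3;FS=1.2;MQ=60.0;SOR=0.9"))
    print(hard_filter("QD=1.5;FS=75.0;MQ=60.0"))
```

The returned string mirrors what a VCF FILTER column would hold, so failing variants stay in the file and are flagged rather than dropped.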
