VQSR on specific genomic region

Dear GATK Team,

I have exome data for many individuals (>2000), called with HaplotypeCaller, but only for a specific set of genes rather than the whole exome. I would like to apply VQSR to recalibrate my variants, but (as expected) I get the error 'No data found'.
I know there is an option to 'pad' the data with other exomes, but those would need to be generated with a method comparable to my dataset (which whole-exome sequencing is not).
Alternatively, I was wondering whether there is an option to 'focus' VQSR on specific regions of the genome/exome only?
I am sure that if only my regions were considered in the recalibration, I would have enough variants to build a recalibration model.

Thank you for your help in advance,
Kind regards,

rosannevd

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    @rosannevd
    Hi rosannevd,

    I am confused. Do you have exome data and are only interested in specific genes? Or do you only have data from specific genes? If you only have data from specific genes, you cannot pad with other samples. The tool needs many different sites to build a good model, so you will need to hard filter instead.

    -Sheila
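
For readers unfamiliar with hard filtering: it means flagging variants whose annotation values cross fixed cutoffs, instead of fitting a model as VQSR does. In GATK this is normally done with the VariantFiltration tool; the snippet below is only a minimal Python sketch of the idea. The thresholds are the commonly cited generic starting points for SNPs and are an assumption here (they are not from this thread and should be tuned for a targeted gene panel). The names `SNP_FILTERS`, `parse_info`, and `failed_filters` are hypothetical helpers made up for illustration.

```python
# Illustrative SNP hard-filter cutoffs (assumed starting points, not from this thread).
# Each entry maps an INFO annotation to a test that returns True when the variant FAILS.
SNP_FILTERS = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 60.0,
    "MQ": lambda v: v < 40.0,
    "SOR": lambda v: v > 3.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}


def parse_info(info):
    """Parse a VCF INFO string into {annotation: float}.

    Flags without a value and non-numeric or multi-valued entries are skipped.
    """
    annotations = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        if not sep:
            continue  # flag such as "DB" with no value
        try:
            annotations[key] = float(value)
        except ValueError:
            continue  # e.g. comma-separated lists or strings
    return annotations


def failed_filters(info):
    """Return the names of the annotations on which this variant fails.

    Annotations missing from the record are not evaluated, so a variant
    lacking an annotation is never failed on it.
    """
    annotations = parse_info(info)
    return [
        name
        for name, fails in SNP_FILTERS.items()
        if name in annotations and fails(annotations[name])
    ]


if __name__ == "__main__":
    print(failed_filters("QD=1.5;FS=70.0;MQ=60.0"))
    print(failed_filters("QD=20.0;FS=1.0;MQ=60.0;DB"))
```

An empty list means the variant passes every cutoff that could be evaluated. Unlike VQSR, this needs no minimum number of sites, which is why it remains usable when only a handful of genes were sequenced.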
