VQSR or hard filters

A.Iris Uppsala Member

Hi GATK team!
I was reading through the VQSR documentation. At one point it states that "Whole exome call sets work well, but anything smaller than that scale might run into difficulties." We used VQSR successfully for a 32 Mb array, and now I have an array of 20 Mb.
I am wondering whether this array is too small for the model. Should I use hard filters instead?

Answers

  • Sheila Broad Institute Member, Broadie, Moderator admin

    @A.Iris
    Hi,

    We recommend using at least 30 whole exome samples or 1 whole genome in VQSR. How many samples do you have in your dataset?

    -Sheila

  • A.Iris Uppsala Member

    Hi,
    Thanks for your response.
    I have 315 samples in total (cases + controls).

  • Sheila Broad Institute Member, Broadie, Moderator admin

    @A.Iris
    Hi,

    You should be able to use VQSR with 315 samples.

    Good luck!

    -Sheila

  • pdexheimer Member ✭✭✭✭

    @A.Iris -

    It's all about getting overlap with your training sites. I would think that 20 Mb would be large enough, though it's in the range where I would start getting concerned. But the more important metric is the total number of variants and how well they overlap the training sites - it would certainly be possible to come up with 20 Mb of capture that misses the training sites completely (iirc, they're concentrated in coding regions). Similarly, you could have whole exomes but not have enough variation in your data to get sufficient overlap - that's the basis of the '30 individuals' guideline Sheila mentioned.

    In the end, all you can do is try. My intuition is that 20 Mb x 315 people should be fine, but it's definitely worth checking the output plots and running VariantEval over the result to make sure everything looks good.
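
    As a rough sanity check before running VQSR, the overlap described above can be counted directly. The following is a minimal sketch, not part of GATK: the function names `vcf_sites` and `training_overlap` are hypothetical, it assumes uncompressed VCF text on the same reference build with consistently normalized alleles, and it does not tell you how much overlap is "enough".

```python
from typing import Iterable, Set, Tuple

# A site is identified by chromosome, position, reference and alternate allele.
Site = Tuple[str, int, str, str]


def vcf_sites(lines: Iterable[str]) -> Set[Site]:
    """Collect (CHROM, POS, REF, ALT) keys from uncompressed VCF text lines."""
    sites: Set[Site] = set()
    for line in lines:
        # Skip blank lines and both kinds of header lines ("##..." and "#CHROM...").
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        # Multi-allelic records list ALT alleles comma-separated; count each one.
        for alt in alts.split(","):
            sites.add((chrom, int(pos), ref, alt))
    return sites


def training_overlap(calls: Iterable[str], training: Iterable[str]) -> Tuple[int, float]:
    """Return (number of called sites also in the training set, fraction of called sites)."""
    call_sites = vcf_sites(calls)
    shared = call_sites & vcf_sites(training)
    fraction = len(shared) / len(call_sites) if call_sites else 0.0
    return len(shared), fraction
```

    With real files (the names here are placeholders) this could be called as `training_overlap(open("calls.vcf"), open("training_resource.vcf"))`; gzipped VCFs would need `gzip.open(path, "rt")` instead. A very small shared count suggests the model may not have enough training variants, which is the situation where hard filters become the fallback.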

  • A.Iris Uppsala Member

    Thank you both for your responses!
    I will try running it and see what happens. I am not sure I can interpret the output plots, but I will give it a go.
