Hi
We ran 100 samples through the GATK UnifiedGenotyper and then merged all the VCF files (using VCFtools) so that we could run multi-sample VQSR. Which annotations should we use in this case?
For multi-sample called VCFs we use these parameters:
-an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ -an DP -nt 2 --maxGaussians 4 --percentBadVariants 0.05
Any help is deeply appreciated.
Thanks
Saurabh
ebanks
Posts: 475 mod
Hi Saurabh,
You shouldn't run VQSR on the merged files; it needs to be run on the original 100-sample batches. Note, however, that merging filtered files is an extremely difficult problem. It was actually one of the biggest motivators for our creating the Reduce Reads tool (so that we no longer needed to run in batches).
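To make the "per batch, not merged" point concrete, here is a rough sketch that assembles one VariantRecalibrator argument list per original call set, reusing the annotation and tuning flags from the question. The batch file names and the `vqsr_args` helper are hypothetical, and the list is deliberately incomplete: the reference, training resources, mode, and output file arguments are left out and would need to be added for your own setup.

```python
# Annotation and tuning flags copied from the question above.
ANNOTATIONS = ["QD", "HaplotypeScore", "MQRankSum", "ReadPosRankSum", "FS", "MQ", "DP"]
TUNING = ["-nt", "2", "--maxGaussians", "4", "--percentBadVariants", "0.05"]


def vqsr_args(batch_vcf):
    """Return the per-batch part of a VariantRecalibrator argument list.

    Incomplete on purpose: reference, resource, mode and output
    arguments are not included here.
    """
    args = ["-T", "VariantRecalibrator", "-input", batch_vcf]
    for name in ANNOTATIONS:
        args += ["-an", name]
    return args + TUNING


if __name__ == "__main__":
    # Hypothetical file names: one entry per original call set,
    # rather than the single VCFtools-merged file.
    batches = ["batch_001.vcf", "batch_002.vcf"]
    for vcf in batches:
        print(" ".join(vqsr_args(vcf)))
```

Each printed line is only the part of the command that varies per batch; the idea is simply that recalibration is driven once per original VCF and never once on the merged one.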