Low base quality score after base recalibration...

Hi,

I have just run base recalibration following the GATK Best Practices. Because I'm working on a non-model organism, I had to run a first round of HaplotypeCaller and use the resulting variants (after filtering) as the known sites for base recalibration, as recommended by the GATK Best Practices.
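For readers following along, here is a minimal sketch of the bootstrap workflow described above, written as a Python helper that only builds the command lines. It assumes GATK4-style tool names and arguments. The function name, all file names and the filter expression are hypothetical placeholders, not the values used on my data.

```python
import shlex

def bootstrap_bqsr_commands(ref, bam, prefix):
    """Build GATK4-style command lines for one bootstrap round of BQSR.

    Assumption: GATK4 argument names. File names derived from `prefix`
    and the filter expression are illustrative placeholders only.
    """
    raw_vcf = f"{prefix}.raw.vcf.gz"
    marked_vcf = f"{prefix}.marked.vcf.gz"
    pass_vcf = f"{prefix}.pass.vcf.gz"
    before = f"{prefix}.before.table"
    after = f"{prefix}.after.table"
    recal_bam = f"{prefix}.recal.bam"
    return [
        # 1. First-round variant calling on the unrecalibrated reads.
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam, "-O", raw_vcf],
        # 2. Hard-filter the calls (placeholder expression, tune for your data).
        ["gatk", "VariantFiltration", "-R", ref, "-V", raw_vcf,
         "--filter-name", "placeholder_filter",
         "--filter-expression", "QD < 2.0",
         "-O", marked_vcf],
        # 3. Keep only the sites that passed the filters.
        ["gatk", "SelectVariants", "-R", ref, "-V", marked_vcf,
         "--exclude-filtered", "-O", pass_vcf],
        # 4. Model the error covariates, using the passing sites as known sites.
        ["gatk", "BaseRecalibrator", "-R", ref, "-I", bam,
         "--known-sites", pass_vcf, "-O", before],
        # 5. Apply the recalibration to the reads.
        ["gatk", "ApplyBQSR", "-R", ref, "-I", bam,
         "--bqsr-recal-file", before, "-O", recal_bam],
        # 6. Second BaseRecalibrator pass on the recalibrated reads.
        ["gatk", "BaseRecalibrator", "-R", ref, "-I", recal_bam,
         "--known-sites", pass_vcf, "-O", after],
        # 7. Before/after plots to check for convergence.
        ["gatk", "AnalyzeCovariates", "-before", before, "-after", after,
         "-plots", f"{prefix}.plots.pdf"],
    ]

if __name__ == "__main__":
    for cmd in bootstrap_bqsr_commands("ref.fasta", "sample.bam", "sample"):
        print(shlex.join(cmd))
```

The helper only prints the commands, so they can be reviewed before anything is run.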

Everything seemed fine: the pipeline ran on my data without errors. However, when I checked for convergence after recalibration (I ran a second round of BaseRecalibrator and then generated plots with AnalyzeCovariates), the reported base qualities had dropped sharply. Before recalibration most of my bases had a quality score above 20, but after recalibration most of them are below 10!
The attached plots were generated by AnalyzeCovariates. The reported Q score after recalibration for substitutions is very low.
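To put those numbers in perspective, here is a small sketch of the standard Phred conversion, Q = -10 * log10(p), where p is the probability that a base call is wrong. The function names are my own, chosen for illustration.

```python
import math

def phred_to_error_prob(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Convert an error probability (0 < p <= 1) to a Phred quality score."""
    return -10 * math.log10(p)

if __name__ == "__main__":
    for q in (10, 20, 30):
        print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
```

Q20 corresponds to a 1% chance of error per base and Q10 to a 10% chance, so a shift from above 20 to below 10 means the recalibration model is estimating at least a tenfold higher error rate than the sequencer originally reported.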

How could this happen?
Does it just mean that I haven't reached convergence yet and need to run additional rounds of recalibration?
Could this be due to the data?
Or were the variants I used not filtered stringently enough, and is that what is throwing off the recalibration?

I would appreciate your thoughts on this issue.

Cheers.

[Attached images: AnalyzeCovariates plots]
