# BAQ tag error

Posts: 10Member

Hi,

I ran PrintReads with the -baq option set to RECALCULATE. When I then run UnifiedGenotyper, I get the following error:

"BAQ tag error: the BAQ value is larger than the base quality"

I am running UnifiedGenotyper with the -baq option set to CALCULATE_AS_NECESSARY.

Could you please let me know the reason for this error?

Regards

Gaurav

## Comments

• Posts: 9,962Administrator, Dev admin

What version of GATK are you using?

Geraldine Van der Auwera, PhD

• Posts: 10Member

I am using version GenomeAnalysisTK-2.3-9-ge5ebf34.

• Posts: 9,962Administrator, Dev admin

Can you please upgrade to the latest version (2.4) and try again? Also, before you do that you should validate your bam file (using Picard tools) to make sure there's nothing wrong with it.

Geraldine Van der Auwera, PhD
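
For readers following along, validating a BAM file with Picard might look like the sketch below. It assumes a Picard build that ships a single `picard.jar` (older releases shipped a separate `ValidateSamFile.jar` instead); the jar location and `input.bam` are placeholders, not values from this thread. The function only prints the command so it can be reviewed before it is run.

```shell
# Location of the Picard jar; placeholder, override via the environment.
PICARD_JAR="${PICARD_JAR:-picard.jar}"

# validate_bam_cmd BAM
# Prints a Picard ValidateSamFile command line for the given BAM file.
validate_bam_cmd() {
    bam="$1"
    # MODE=SUMMARY reports a count per problem type instead of every record.
    echo "java -jar ${PICARD_JAR} ValidateSamFile INPUT=${bam} MODE=SUMMARY"
}

# Show the command for a placeholder input file.
validate_bam_cmd input.bam
```

To execute the command rather than print it, pass the printed line to the shell, for example with `eval "$(validate_bam_cmd input.bam)"`.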

• Posts: 10Member

I was looking at the commit "Fixing BQSR/BAQ bug". Could this be the cause of the above error? Also, can you please elaborate on the fix?
(https://github.com/broadgsa/gatk/commit/46b6d3214381e80ba7ee0f5df6432511a995216c)

• Posts: 10Member

I am getting the same error with the latest version of GATK too.

• Posts: 10Member

To narrow down the problem further: the intermediate BAM file produced before PrintReads (after realignment around indels) runs fine with UnifiedGenotyper. I am using the following command for PrintReads:

java -Xmx4g -Djava.io.tmpdir=temp -jar GenomeAnalysisTK-2.4-3-g2a7af43/GenomeAnalysisTK.jar -T PrintReads -baq RECALCULATE -BQSR $recal -o $o -R Human_Genome/ucsc.hg19.fasta -I $in 2>>$tid 1>>$tid

Hope this helps.

• Posts: 9,962Administrator, Dev admin

I see -- that looks like a bug. Can you upload a snippet of your bam file so that we can reproduce the error locally? Instructions here:

http://www.broadinstitute.org/gatk/guide/article?id=1894

Geraldine Van der Auwera, PhD

• Posts: 10Member

I have uploaded the folder to the FTP server (BAQ_tag_error.tar.gz). Hope this helps.

• Posts: 94Member ✭✭✭

Just pitching in that I'm seeing the same problem. I would of course be very happy to see a fix, but if it will not go straight into the repository, I would also like to know which commit caused this so that I can temporarily revert to an earlier state.

• Posts: 9,962Administrator, Dev admin

Thanks @Gaurav1983, I'll let you know when we have a fix.

Johan, unfortunately this was probably introduced during the transition to 2.4, which represents so many commits that it would be difficult to track down. I expect it will be easier to just go ahead and fix it quickly. As soon as we have the fix we'll patch the repo, so the wait should be minimal.

Geraldine Van der Auwera, PhD

• Broad InstitutePosts: 698Member, Administrator, Broadie, Moderator, Dev admin

Okay, I know what the problem is and can offer a non-programmatic solution until we figure out how best to handle it directly in the GATK. You are using both -baq and -BQSR, and the BAQ'ing is occurring first in the engine (on the original base qualities), whereas it should occur after recalibration with the BQSR. For now, I'd suggest you use only -BQSR in your PrintReads step and then use -baq in your UnifiedGenotyper step; that should solve the issue. Please let me know if that works.

Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT
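
One plausible way to read the error message, given the ordering problem described above, is through the SAM specification's definition of the BQ tag, which stores a per-base offset such that BAQ = Q - (BQ - 64). The sketch below is only an illustration of that arithmetic, not GATK's actual code: the function name and the quality values are hypothetical. It shows how an offset computed from the original quality can exceed the quality once recalibration has lowered it.

```shell
# apply_bq_tag QUAL BQ_CODE
# Applies the SAM-spec relation BAQ = Q - (BQ - 64) to one base.
# QUAL is the base quality currently stored in the BAM record;
# BQ_CODE is the ASCII code of that base's character in the BQ tag.
apply_bq_tag() {
    qual="$1"
    bq_code="$2"
    baq=$(( qual - (bq_code - 64) ))
    if [ "$baq" -lt 0 ]; then
        # The stored offset is larger than the quality it is applied to.
        echo "error: BAQ offset exceeds base quality (would give $baq)"
        return 1
    fi
    echo "$baq"
}

# Hypothetical base: original Q=30, BAQ=10, so the offset is 20 and the tag code is 84.
apply_bq_tag 30 84          # prints 10

# Same tag, but recalibration has since lowered the stored quality to 15.
apply_bq_tag 15 84 || true  # prints the error line: 15 - 20 is negative
```

Under this reading, computing the tag before recalibration and applying it afterwards is what makes the mismatch possible, which is consistent with the suggestion to run BAQ only in the UnifiedGenotyper step.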

• Broad InstitutePosts: 698Member, Administrator, Broadie, Moderator, Dev admin

I think we do have a programmatic fix worked out too for this, but it's slightly complex so won't be available until the next release (2.5).

Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT

• Posts: 10Member

Thanks for the quick fix. I have included one more PrintReads step in my pipeline and it is working fine on test data.
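
The workaround suggested earlier in the thread (recalibrate in PrintReads without -baq, then apply BAQ only in UnifiedGenotyper) could be sketched roughly as follows. This is not the poster's actual pipeline: the file names `realigned.bam`, `recal.grp`, `recalibrated.bam` and `calls.vcf` are placeholders, and the functions only print the GATK 2.x command lines so they can be checked before running.

```shell
# Placeholders; override via the environment to match a real setup.
GATK_JAR="${GATK_JAR:-GenomeAnalysisTK.jar}"
REF="${REF:-Human_Genome/ucsc.hg19.fasta}"

# recalibrate_cmd IN_BAM RECAL_TABLE OUT_BAM
# Step 1: apply the recalibration only; note there is no -baq argument here.
recalibrate_cmd() {
    echo "java -Xmx4g -jar $GATK_JAR -T PrintReads -R $REF -I $1 -BQSR $2 -o $3"
}

# genotype_cmd IN_BAM OUT_VCF
# Step 2: let UnifiedGenotyper compute BAQ on the recalibrated qualities.
genotype_cmd() {
    echo "java -Xmx4g -jar $GATK_JAR -T UnifiedGenotyper -R $REF -I $1 -baq CALCULATE_AS_NECESSARY -o $2"
}

recalibrate_cmd realigned.bam recal.grp recalibrated.bam
genotype_cmd recalibrated.bam calls.vcf
```

Keeping the two steps as separate functions makes it easy to confirm that -baq appears only in the genotyping command.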
