# BAQ tag error

Posts: 10Member

Hi,

I ran the PrintReads command with the -baq option set to RECALCULATE. But when I run UnifiedGenotyper on the output, I get the following error:

"BAQ tag error: the BAQ value is larger than the base quality"

I am running UnifiedGenotyper with the -baq option set to CALCULATE_AS_NECESSARY.

Can you please let me know the reason for the above error?

Regards

Gaurav

What version of GATK are you using?

Geraldine Van der Auwera, PhD

• Posts: 10Member

I am using GenomeAnalysisTK-2.3-9-ge5ebf34 version.

Can you please upgrade to the latest version (2.4) and try again? Also, before you do that, you should validate your BAM file (using Picard tools) to make sure there's nothing wrong with it.

Geraldine Van der Auwera, PhD
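
For anyone following along, the validation step suggested above could look roughly like the sketch below. It assumes Picard's ValidateSamFile tool is available as a standalone jar (newer Picard releases bundle it as a subcommand of a single picard.jar instead); the jar path and BAM file name are placeholders, not values from this thread. The script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Placeholder paths (assumptions): point these at your own files.
PICARD_JAR="ValidateSamFile.jar"
BAM="realigned.bam"

# MODE=SUMMARY prints a count of each kind of error or warning found
# instead of listing every offending record.
validate_cmd="java -jar $PICARD_JAR INPUT=$BAM MODE=SUMMARY"

# Dry run by default; set RUN=1 to actually invoke Picard.
if [ "${RUN:-0}" = "1" ]; then
    sh -c "$validate_cmd"
else
    echo "$validate_cmd"
fi
```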

• Posts: 10Member

I was looking at a commit titled "Fixing BQSR/BAQ bug". Could this be the cause of the above error? Also, can you please elaborate on the fix?

• Posts: 10Member

To narrow down the problem further: the intermediate BAM file produced before running PrintReads (after realignment around indels) runs fine with UnifiedGenotyper. I am using the following command to run PrintReads:

java -Xmx4g -Djava.io.tmpdir=temp -jar GenomeAnalysisTK-2.4-3-g2a7af43/GenomeAnalysisTK.jar -T PrintReads -baq RECALCULATE -BQSR $recal -o $o -R Human_Genome/ucsc.hg19.fasta -I $in 2>>$tid 1>>$tid

Hope this helps.

I see -- that looks like a bug. Can you upload a snippet of your bam file so that we can reproduce the error locally? Instructions here:

Geraldine Van der Auwera, PhD

• Posts: 10Member

I have uploaded the folder to the FTP server (BAQ_tag_error.tar.gz). Hope this helps.

• Posts: 85Member ✭✭✭

Just pitching in that I'm seeing the same problem. I would of course be very happy to see a solution, but if a fix will not go straight into the repository, I would also be very happy to know which commit caused this so that I can temporarily revert to an earlier state.

Thanks @Gaurav1983, I'll let you know when we have a fix.

Johan, unfortunately this was probably introduced during the transition to 2.4, and that represents so many commits that it would be difficult to track down. I expect it will be easier to just go ahead and fix quickly... And as soon as we get the fix we'll patch the repo, so the wait should be minimal.

Geraldine Van der Auwera, PhD

• Posts: 684GATK Developer mod

Okay, I know what the problem is and can offer a non-programmatic solution until we figure out how best to handle it directly in the GATK. You are using both -baq and -BQSR, and the BAQ'ing is occurring first in the engine (on the original bases), whereas it should occur after recalibration with the BQSR. For now, I'd suggest you use only -BQSR in your PrintReads step and then use -baq in your UnifiedGenotyper step; that should solve the issue. Please let me know if that works.

Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT
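
For readers who want to try the workaround described above, here is a minimal sketch of the two steps as a shell script. The file names (realigned.bam, recal.grp, recalibrated.bam, calls.vcf) and the jar path are placeholders rather than values from this thread, and the helper function names are made up for illustration. The script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Placeholder paths (assumptions): substitute your own jar and reference.
GATK_JAR="GenomeAnalysisTK.jar"
REF="Human_Genome/ucsc.hg19.fasta"

# Step 1: PrintReads applies the BQSR recalibration only.
# No BAQ option is passed here, so BAQ is not computed on the
# pre-recalibration qualities.
# Arguments: input BAM, recalibration table, output BAM.
print_reads_cmd() {
    echo "java -Xmx4g -jar $GATK_JAR -T PrintReads -R $REF -I $1 -BQSR $2 -o $3"
}

# Step 2: UnifiedGenotyper computes BAQ on the already recalibrated reads.
# Arguments: input BAM, output VCF.
unified_genotyper_cmd() {
    echo "java -Xmx4g -jar $GATK_JAR -T UnifiedGenotyper -R $REF -I $1 -baq CALCULATE_AS_NECESSARY -o $2"
}

step1=$(print_reads_cmd realigned.bam recal.grp recalibrated.bam)
step2=$(unified_genotyper_cmd recalibrated.bam calls.vcf)

# Dry run by default; set RUN=1 to actually invoke the GATK.
if [ "${RUN:-0}" = "1" ]; then
    sh -c "$step1" && sh -c "$step2"
else
    echo "$step1"
    echo "$step2"
fi
```

The point of splitting it this way is simply that BAQ is only ever requested in the second command, after the recalibrated qualities have been written to the BAM.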

• Posts: 684GATK Developer mod

I think we do have a programmatic fix worked out for this too, but it's slightly complex, so it won't be available until the next release (2.5).

Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

• Posts: 10Member

Thanks for the quick fix. I have included one more PrintReads step in my pipeline, and it is working fine on test data.