
BAQ tag error

Gaurav1983 Posts: 10 Member

Hi,

I ran the PrintReads command with the -baq option set to RECALCULATE. But when running UnifiedGenotyper, I get the following error:

"BAQ tag error: the BAQ value is larger than the base quality"

I am running UnifiedGenotyper with the -baq option set to CALCULATE_AS_NECESSARY.

Can you please let me know the reason for the above error?

Regards

Gaurav

Comments

  • Geraldine_VdAuwera Posts: 7,781 Administrator, GATK Dev

    What version of GATK are you using?

    Geraldine Van der Auwera, PhD

  • Gaurav1983 Posts: 10 Member

    I am using version GenomeAnalysisTK-2.3-9-ge5ebf34.

  • Geraldine_VdAuwera Posts: 7,781 Administrator, GATK Dev

    Can you please upgrade to the latest version (2.4) and try again? Also, before you do that you should validate your bam file (using Picard tools) to make sure there's nothing wrong with it.

    Geraldine Van der Auwera, PhD

  • Gaurav1983 Posts: 10 Member

    I was looking at a commit titled "Fixing BQSR/BAQ bug". Could this be the cause of the above error? Also, can you please elaborate on the fix?
    (https://github.com/broadgsa/gatk/commit/46b6d3214381e80ba7ee0f5df6432511a995216c)

  • Gaurav1983 Posts: 10 Member

    I am getting the above error with the latest version of GATK too.

  • Gaurav1983 Posts: 10 Member

    To narrow down the problem further: the intermediate BAM file produced after realignment around indels, before running PrintReads, works fine with UnifiedGenotyper. I am using the following command to run PrintReads:

    java -Xmx4g -Djava.io.tmpdir=temp -jar GenomeAnalysisTK-2.4-3-g2a7af43/GenomeAnalysisTK.jar -T PrintReads -baq RECALCULATE -BQSR $recal -o $o -R Human_Genome/ucsc.hg19.fasta -I $in 2>>$tid 1>>$tid

    Hope this helps.

  • Geraldine_VdAuwera Posts: 7,781 Administrator, GATK Dev

    I see -- that looks like a bug. Can you upload a snippet of your bam file so that we can reproduce the error locally? Instructions here:

    http://www.broadinstitute.org/gatk/guide/article?id=1894

    Geraldine Van der Auwera, PhD

  • Gaurav1983 Posts: 10 Member

    I have uploaded the folder to the FTP server (BAQ_tag_error.tar.gz). Hope this helps.

  • Johan_Dahlberg Posts: 89 Member ✭✭✭

    Just chiming in that I'm seeing the same problem. I would of course be very happy to see a solution, but if the fix will not go straight into the repository I would also like to know which commit caused this, so that I can temporarily revert to an earlier state.

  • Geraldine_VdAuwera Posts: 7,781 Administrator, GATK Dev

    Thanks @Gaurav1983, I'll let you know when we have a fix.

    Johan, unfortunately this was probably introduced during the transition to 2.4, and that represents so many commits that it would be difficult to track down. I expect it will be easier to just go ahead and fix it quickly... And as soon as we get the fix we'll patch the repo, so the wait should be minimal.

    Geraldine Van der Auwera, PhD

  • ebanks Broad Institute Posts: 684 Member, Administrator, GATK Dev, Broadie, Moderator, DSDE Member, GP Member

    Okay, I know what the problem is and can offer a non-programmatic solution until we figure out how best to handle it directly in the GATK. You are using both -baq and -BQSR, and the BAQ calculation occurs first in the engine (on the original bases), whereas it should occur after recalibration with BQSR. For now, I'd suggest you use only -BQSR in your PrintReads step and then use -baq in your UnifiedGenotyper step; that should solve the issue. Please let me know if that works.

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

  • ebanks Broad Institute Posts: 684 Member, Administrator, GATK Dev, Broadie, Moderator, DSDE Member, GP Member

    I think we do have a programmatic fix worked out too for this, but it's slightly complex so won't be available until the next release (2.5).

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

  • Gaurav1983 Posts: 10 Member

    Thanks for the quick fix. I have included one more PrintReads step in my pipeline and it is working fine on test data.
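
The workaround suggested in this thread (recalibrate with -BQSR only in PrintReads, then apply -baq at the UnifiedGenotyper step) might look roughly like the dry-run sketch below. The file names (realigned.bam, recal.grp, recalibrated.bam, calls.vcf) and the jar location are hypothetical placeholders, not taken from the original posts; the flags are the ones already used earlier in the thread. The script only prints the two commands so they can be checked before anything is run.

```shell
#!/bin/sh
# Dry-run sketch of the workaround: prints the commands, does not execute GATK.
# All paths below are hypothetical placeholders; adjust them to your pipeline.
GATK_JAR="GenomeAnalysisTK.jar"
REF="Human_Genome/ucsc.hg19.fasta"
IN_BAM="realigned.bam"          # BAM after realignment around indels
RECAL="recal.grp"               # recalibration table from the BQSR step
RECAL_BAM="recalibrated.bam"    # output of PrintReads
OUT_VCF="calls.vcf"             # output of UnifiedGenotyper

# Step 1: apply recalibration only. The BAQ option is deliberately left out
# here, so BAQ is not computed on the pre-recalibration qualities.
PRINT_READS_CMD="java -Xmx4g -jar $GATK_JAR -T PrintReads -BQSR $RECAL -R $REF -I $IN_BAM -o $RECAL_BAM"

# Step 2: let UnifiedGenotyper compute BAQ on the already recalibrated reads.
UG_CMD="java -Xmx4g -jar $GATK_JAR -T UnifiedGenotyper -baq CALCULATE_AS_NECESSARY -R $REF -I $RECAL_BAM -o $OUT_VCF"

echo "$PRINT_READS_CMD"
echo "$UG_CMD"
```

Once the printed commands look right, they can be run one after the other, with the second step reading the BAM written by the first.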
