BAQ tag error

Gaurav1983 Member Posts: 12

Hi,

I ran PrintReads with the -baq option set to RECALCULATE. But when I run UnifiedGenotyper, I get the following error:

"BAQ tag error: the BAQ value is larger than the base quality"

I am running UnifiedGenotyper with the -baq option set to CALCULATE_AS_NECESSARY.

Can you please let me know the reason for the above error?

Regards

Gaurav

Comments

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,413 admin

    What version of GATK are you using?

    Geraldine Van der Auwera, PhD

  • Gaurav1983 Member Posts: 12

    I am using version GenomeAnalysisTK-2.3-9-ge5ebf34.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,413 admin

    Can you please upgrade to the latest version (2.4) and try again? Also, before you do that, you should validate your BAM file (using Picard tools) to make sure there's nothing wrong with it.

    Geraldine Van der Auwera, PhD
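
A minimal sketch of the validation step suggested above, assuming a Picard release that ships a single `picard.jar` (older releases provided a separate `ValidateSamFile.jar` instead). The jar location and BAM file name are placeholders, not taken from the thread. By default the script only prints the command; set `DRY_RUN=0` to actually run it.

```shell
#!/bin/sh
# Sketch: validate a BAM file with Picard's ValidateSamFile before re-running GATK.
# PICARD_JAR and BAM are hypothetical placeholders; adjust them to your setup.
set -eu

PICARD_JAR="${PICARD_JAR:-picard.jar}"
BAM="${BAM:-input.bam}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = execute it

validate_bam() {
  # MODE=SUMMARY prints a count of each error/warning type instead of every record.
  set -- java -jar "$PICARD_JAR" ValidateSamFile I="$BAM" MODE=SUMMARY
  printf '%s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

validate_bam
```

If the summary reports errors, they are worth fixing before looking for a GATK-side cause.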

  • Gaurav1983 Member Posts: 12

    I was looking at a commit titled "Fixing BQSR/BAQ bug". Could this bug be the cause of the above error? Also, can you please elaborate on the fix?
    (https://github.com/broadgsa/gatk/commit/46b6d3214381e80ba7ee0f5df6432511a995216c)

  • Gaurav1983 Member Posts: 12

    I am getting the above error with the latest version of GATK too.

  • Gaurav1983 Member Posts: 12

    To narrow down the problem further: the intermediate BAM file produced after realignment around indels, before running PrintReads, works fine with UnifiedGenotyper. I am using the following command to run PrintReads:

    java -Xmx4g -Djava.io.tmpdir=temp -jar GenomeAnalysisTK-2.4-3-g2a7af43/GenomeAnalysisTK.jar -T PrintReads -baq RECALCULATE -BQSR $recal -o $o -R Human_Genome/ucsc.hg19.fasta -I $in 2>>$tid 1>>$tid

    Hope this helps.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,413 admin

    I see. That looks like a bug. Can you upload a snippet of your BAM file so that we can reproduce the error locally? Instructions here:

    http://www.broadinstitute.org/gatk/guide/article?id=1894

    Geraldine Van der Auwera, PhD

  • Gaurav1983 Member Posts: 12

    I have uploaded the folder to the FTP server (BAQ_tag_error.tar.gz). I hope this helps.

  • Johan_Dahlberg Member Posts: 96 ✭✭✭

    Just pitching in that I'm seeing the same problem. I would of course be very happy to see a fix, but if it will not go straight into the repository, I would also like to know which commit caused this so that I can temporarily revert to an earlier state.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,413 admin

    Thanks @Gaurav1983, I'll let you know when we have a fix.

    Johan, unfortunately this was probably introduced during the transition to 2.4, and that represents so many commits that it would be difficult to track down. I expect it will be easier to just go ahead and fix quickly... And as soon as we get the fix we'll patch the repo, so the wait should be minimal.

    Geraldine Van der Auwera, PhD

  • ebanks Broad Institute Member, Broadie, Dev Posts: 692 admin

    Okay, I know what the problem is and can offer a non-programmatic solution until we figure out how best to handle it directly in the GATK. You are using both -baq and -BQSR, and the engine applies BAQ first (on the original bases), whereas it should be applied after recalibration with BQSR. For now, I'd suggest you use only -BQSR in your PrintReads step and then use -baq in your UnifiedGenotyper step; that should solve the issue. Please let me know if that works.

    Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT
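
The workaround described above can be sketched as two separate commands: PrintReads with -BQSR only, then UnifiedGenotyper with -baq on the recalibrated BAM. This is an illustrative sketch, not the poster's actual pipeline. The reference name is borrowed from the PrintReads command earlier in the thread; every other file name, and the `DRY_RUN` switch, are placeholders introduced here. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch of the workaround: recalibrate first, apply BAQ only at calling time.
# All variables below are hypothetical placeholders; adjust them to your setup.
set -eu

GATK_JAR="${GATK_JAR:-GenomeAnalysisTK.jar}"
REF="${REF:-ucsc.hg19.fasta}"
IN_BAM="${IN_BAM:-realigned.bam}"
RECAL="${RECAL:-recal.grp}"
RECAL_BAM="${RECAL_BAM:-recalibrated.bam}"
OUT_VCF="${OUT_VCF:-calls.vcf}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
  # Print the command line, then execute it unless this is a dry run.
  printf '%s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

print_reads() {
  # Step 1: apply the recalibration table only; note there is no -baq here.
  run java -Xmx4g -jar "$GATK_JAR" -T PrintReads \
    -R "$REF" -I "$IN_BAM" -BQSR "$RECAL" -o "$RECAL_BAM"
}

call_variants() {
  # Step 2: BAQ is computed at calling time, on the recalibrated qualities.
  run java -Xmx4g -jar "$GATK_JAR" -T UnifiedGenotyper \
    -R "$REF" -I "$RECAL_BAM" -baq CALCULATE_AS_NECESSARY -o "$OUT_VCF"
}

print_reads
call_variants
```

Keeping the two steps separate means BAQ never sees the unrecalibrated qualities, which is the ordering problem identified in the comment above.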

  • ebanks Broad Institute Member, Broadie, Dev Posts: 692 admin

    I think we do have a programmatic fix worked out too for this, but it's slightly complex so won't be available until the next release (2.5).

    Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT

  • Gaurav1983 Member Posts: 12

    Thanks for the quick fix. I have included one more PrintReads step in my pipeline and it is working fine on test data.
