Run time error during variant calling

Kelly Member
edited March 2013 in Ask the GATK team

Hi, I'm using the latest version of GATK to analyze paired-end exome sequencing data. I'd like to see SNPs, indels, and also SVs. I have followed the GATK workflow, from marking duplicates through the read reduction step. Everything went fine until I started using the HaplotypeCaller walker for variant calling.
The command line I used:

java -jar $GATK/GenomeAnalysisTK.jar -T HaplotypeCaller -R human_g1k_v37.fa -I sample_reduced.bam -o sample_variant.vcf

At the beginning it worked well, but then I got the error message "Reads are too small for use in assembly."
I also tried the UnifiedGenotyper walker, with this command line:

java -jar $GATK/GenomeAnalysisTK.jar -T UnifiedGenotyper -R human_g1k_v37.fa -I sample_reduced.bam -glm BOTH -o sample_variant.vcf

I got the error message "Read bases and read insertion quals aren't the same, size 46 vs. 49".
I have googled the error message but found no related results. Has anyone run into the same problem? I'm eager to know how to solve this.
Thanks!
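
The second error message reports a length mismatch between a read's bases and its insertion qualities. As a rough diagnostic, the sketch below lists SAM records where that mismatch is visible in the file itself. It assumes the indel qualities are stored in `BI:Z` (insertion) and `BD:Z` (deletion) tags; the function name `check_indel_qual_lengths` is made up for this illustration, and it is not part of GATK or Picard.

```shell
# Sketch: report SAM records whose BI:Z / BD:Z tag length differs from the SEQ length.
# Assumption: indel qualities live in BI:Z (insertion) and BD:Z (deletion) tags.
# Reads SAM text on stdin; prints: read name, tag, base count, qual count.
check_indel_qual_lengths() {
    awk -F'\t' '
        /^@/ { next }                       # skip header lines
        $10 == "*" { next }                 # skip records with no stored sequence
        {
            seqlen = length($10)            # column 10 is SEQ
            for (i = 12; i <= NF; i++) {    # optional tags start at column 12
                if ($i ~ /^B[ID]:Z:/) {
                    tag = substr($i, 1, 2)
                    qlen = length($i) - 5   # drop the 5-character "BI:Z:" prefix
                    if (qlen != seqlen)
                        printf "%s\t%s\tbases=%d\tquals=%d\n", $1, tag, seqlen, qlen
                }
            }
        }'
}

# Usage (requires samtools; file name taken from the post above):
#   samtools view sample_reduced.bam | check_indel_qual_lengths
```

An empty result only means no mismatch is stored in the tags; it does not rule out a problem introduced while the tool processes the reads.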

Answers

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    Your first error is a bug we are aware of and are working to fix. The second sounds like a BAM file issue. Have you tried validating your BAM?

  • Kelly Member

    Thank you for the quick response.
    I tried to call SNPs and indels separately, and the SNP calling works. I'm now waiting for the result of the indel calling.
    java -jar $GATK/GenomeAnalysisTK.jar -T UnifiedGenotyper -R human_g1k_v37.fa -I sample_reduced.bam -o sample_variant.vcf
    java -jar $GATK/GenomeAnalysisTK.jar -T UnifiedGenotyper -R human_g1k_v37.fa -I sample_reduced.bam -glm INDEL -o sample_variant.vcf
    To answer your question: yes, I have followed the suggestions from the community and validated the BAM file at every step using Picard ValidateSamFile. The SAM file is OK, but there is a warning, "NM tag in the file does not match the reality", after cleaning with Picard CleanSam, converting from SAM to BAM with samtools, and fixing the mates.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    OK, sounds like you're doing all the right things. The NM tag warning is probably not worth worrying about.

    Let me know if the second error persists, and if so we'll look into it.

  • Kelly Member

    The second error persists.

  • reg Member

    I've started to get that second error, "Read bases and read insertion quals aren't the same", on a set of my BAMs as well, using the 2.4 release.

  • reg Member

    I should perhaps add that my BAMs were processed following the Best Practices (align, dupe marking, indel realign, BQSR, reduce reads, all with 2.4). Using the UnifiedGenotyper then gives the bases and quals size error. I'm now trying to call the same BAMs with 2.3-4-g57ea19f, and it seems to be running without error.

  • I too encountered the second error, i.e., "Read bases and read insertion quals aren't the same, size A vs. B". It happened only when I used the latest GATK's UnifiedGenotyper with -glm INDEL or -glm BOTH; earlier versions didn't produce this error.
    After some experimenting I found that it disappears when I use the -DIQ option with PrintReads during the base recalibration step.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    Thanks Andrei, that is helpful.

    OK folks, it looks like we have a bug here. Can one of you upload a snippet of your BAM file so that we can reproduce the error locally? Please see the instructions here:

    http://www.broadinstitute.org/gatk/guide/article?id=1894

  • just uploaded the snippet etc.:

    bugreport_a1ef69e011.tar.gz

  • ebanks Broad Institute Member, Broadie, Dev

    Thanks for the bug report. I've implemented a patch that should hopefully roll out to the public later today.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    FYI this is fixed as of version 2.4-7.
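
For anyone who cannot yet move to 2.4-7 or later, the workaround reported earlier in the thread (passing -DIQ to PrintReads during the base recalibration step) might look roughly like the sketch below. The recalibration table name `recal.grp`, the BAM file names in the usage comment, and the helper name `print_reads_no_indel_quals` are placeholders assumed for illustration; only the -DIQ option, the $GATK path, and the reference file name come from the posts above. The function only prints the command so it can be reviewed before running.

```shell
# Sketch of the -DIQ workaround: build a PrintReads command that applies the
# recalibration table but does not write indel qualities to the output BAM.
# Assumptions: GATK 2.x command-line layout; all file names are placeholders.
print_reads_no_indel_quals() {
    # $1 = input BAM, $2 = recalibration table, $3 = output BAM
    echo java -jar "$GATK/GenomeAnalysisTK.jar" \
        -T PrintReads \
        -R human_g1k_v37.fa \
        -I "$1" \
        -BQSR "$2" \
        -DIQ \
        -o "$3"
}

# Usage: print the command, inspect it, then pipe it to sh to run it.
#   print_reads_no_indel_quals sample_realigned.bam recal.grp sample_recal.bam
#   print_reads_no_indel_quals sample_realigned.bam recal.grp sample_recal.bam | sh
```

Note that dropping the indel qualities changes what the downstream indel caller sees, so this is a stopgap for affected releases and not a replacement for the fix.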
