The current GATK version is 3.6-0

# Run time error during variant calling

Member Posts: 3
edited March 2013

Hi, I'm using the latest version of GATK to analyze paired-end exome sequencing data. I'd like to call SNPs, indels and also SVs. I have followed the GATK workflow, from marking duplicates through the read reduction step. Everything went fine until I started using the HaplotypeCaller walker for variant calling.
Command line I used:

java -jar $GATK/GenomeAnalysisTK.jar -T HaplotypeCaller -R human_g1k_v37.fa -I sample_reduced.bam -o sample_variant.vcf

At the beginning it worked well, then I got the error message "Reads are too small for use in assembly." I also tried the UnifiedGenotyper walker, with this command line:

java -jar $GATK/GenomeAnalysisTK.jar -T UnifiedGenotyper -R human_g1k_v37.fa -I sample_reduced.bam -glm BOTH -o sample_variant.vcf


This time I got the error message "Read bases and read insertion quals aren't the same, size 46 vs. 49".
I have googled the error message but found no related results. Has anyone run into the same problem? I'm eager to know how to solve this.
Thanks!

Your first error is a bug we are aware of and are working to fix. The second sounds like a bam file issue -- have you tried validating your bam?

Geraldine Van der Auwera, PhD

• Member Posts: 3

Thank you for the quick response.
I tried to call SNPs and indels separately, and the SNP calling works. I'm now waiting for the result of the indel calling.

java -jar $GATK/GenomeAnalysisTK.jar -T UnifiedGenotyper -R human_g1k_v37.fa -I sample_reduced.bam -o sample_variant.vcf

java -jar $GATK/GenomeAnalysisTK.jar -T UnifiedGenotyper -R human_g1k_v37.fa -I sample_reduced.bam -glm INDEL -o sample_variant.vcf

To answer your question: yes, I have followed the community's suggestion to validate the BAM file at every step using Picard ValidateSamFile. The SAM file is OK, but after cleaning it with Picard CleanSam, converting it from SAM to BAM with samtools, and fixing the mates, there is a warning: "NM tag in the file does not match the reality".
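
For readers who want to repeat this check, a validation pass of this kind might look like the sketch below. It assumes a Picard release that ships one jar per tool and a `$PICARD` variable pointing at that directory; neither detail comes from this thread. `MODE=SUMMARY` and `IGNORE=INVALID_TAG_NM` are ValidateSamFile options; the `validate_bam` wrapper and its `DRY_RUN` switch are hypothetical helpers added for illustration.

```shell
#!/bin/sh
# Sketch: validate a BAM with Picard ValidateSamFile, leaving out NM-tag reports.
# Assumption: $PICARD is the directory that contains ValidateSamFile.jar.
validate_bam() {
    # MODE=SUMMARY prints a count per problem type instead of one line per read.
    # IGNORE=INVALID_TAG_NM suppresses the NM-tag mismatch reports.
    set -- java -jar "${PICARD:-.}/ValidateSamFile.jar" \
        "INPUT=$1" MODE=SUMMARY IGNORE=INVALID_TAG_NM
    # By default only print the command; set DRY_RUN=0 to actually run it.
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$@"; else "$@"; fi
}
```

Calling `DRY_RUN=0 validate_bam sample_reduced.bam` would run the check on the reduced BAM used in the commands above.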

OK, sounds like you're doing all the right things. The NM tag warning is probably not worth worrying about.

Let me know if the second error persists, and if so we'll look into it.

Geraldine Van der Auwera, PhD

• Member Posts: 3

The second error persists.
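
One way to narrow down which reads might trigger the size mismatch is to compare each read's sequence length with the length of its insertion-quality tag. The sketch below assumes the insertion qualities are stored in a `BI:Z:` tag on each record, and that the mismatch is visible in the stored record rather than arising inside GATK after clipping; both are assumptions, not something confirmed in this thread. The function name `find_bi_mismatch` is hypothetical.

```shell
#!/bin/sh
# Sketch: list reads whose BI:Z (insertion quality) tag length differs from SEQ.
# Reads SAM text on stdin; header lines (starting with @) are skipped.
# Output columns: read name, SEQ length, BI tag length.
find_bi_mismatch() {
    awk -F '\t' '
        /^@/ { next }
        {
            # Optional tags start at column 12 in a SAM record.
            for (i = 12; i <= NF; i++) {
                if (substr($i, 1, 5) == "BI:Z:") {
                    n = length($i) - 5
                    if (n != length($10)) print $1 "\t" length($10) "\t" n
                }
            }
        }'
}
```

A typical invocation would be `samtools view sample_reduced.bam | find_bi_mismatch`. An empty result does not rule out the bug; it only means the stored tags and sequences have matching lengths.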

• Member Posts: 2

I've started to receive that second error, "Read bases and read insertion quals aren't the same", on a set of my BAMs as well, using the 2.4 release.

• Member Posts: 2

I should perhaps add that my BAMs were processed following the best practices (alignment, duplicate marking, indel realignment, BQSR, ReduceReads, all with 2.4). Using the UnifiedGenotyper then gives the bases and quals size error. I'm now trying to call the same BAMs with 2.3-4-g57ea19f, and it seems to be running without error.

• Member Posts: 6

I too encountered the second error, i.e., "Read bases and read insertion quals aren't the same, size A vs. B"; it happened only when I used the latest GATK's UnifiedGenotyper with -glm INDEL or -glm BOTH; earlier versions didn't produce this error.
After some experimenting I found that it disappears when I use the -DIQ option with PrintReads during the base recalibration step.
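
As a rough illustration of that workaround, the sketch below builds a PrintReads command that applies a recalibration table with `-DIQ` (`--disable_indel_quals`), so the output BAM carries no recalibrated indel qualities. The input BAM, recalibration table and output names are placeholders, and the `print_reads_no_indel_quals` wrapper with its `DRY_RUN` switch is a hypothetical helper; only `$GATK` and the reference name are taken from the commands earlier in the thread.

```shell
#!/bin/sh
# Sketch: apply BQSR with PrintReads while leaving out indel qualities (-DIQ).
# Assumptions: $GATK is the directory holding GenomeAnalysisTK.jar; the BAM and
# recalibration table names passed in are placeholders chosen by the caller.
print_reads_no_indel_quals() {
    in_bam="$1"; recal_table="$2"; out_bam="$3"
    set -- java -jar "${GATK:-.}/GenomeAnalysisTK.jar" -T PrintReads \
        -R human_g1k_v37.fa -I "$in_bam" -BQSR "$recal_table" -DIQ -o "$out_bam"
    # By default only print the command; set DRY_RUN=0 to actually run it.
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$@"; else "$@"; fi
}
```

For example, `print_reads_no_indel_quals sample_realigned.bam recal_data.grp sample_recal.bam` prints the command for inspection. This sidesteps the error rather than fixing it, and the BAM would need to be re-created from the recalibration step onward.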

OK folks, it looks like we have a bug here. Can one of you upload a snippet of your bam file so that we can reproduce the error locally? Please see instructions here:

Geraldine Van der Auwera, PhD

• Member Posts: 6

bugreport_a1ef69e011.tar.gz