Inconsistent handling of high base quality scores

brofallon (Salt Lake City, UT; Member)

It seems that the handling of SAM/BAM/CRAM files with higher-than-usual base quality scores is inconsistent. In particular, a base quality score of 93 causes HaplotypeCaller to throw an exception if a BED file is supplied, but not if the target region is expressed in chr:start-end format. For instance, for the same BAM file, invoking HaplotypeCaller like this:

-T HaplotypeCaller -R my-ref.fa -I alignment.bam -L my-bed-file.bed

returns an error regarding higher-than-expected base quality scores, but

-T HaplotypeCaller -R my-ref.fa -I alignment.bam -L chr1:500-1000

works just fine, even though the region in the BED file is identical to the region given as a command-line argument.

Confusing and a little frustrating.
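When checking that a BED region and a chr:start-end region really are identical, it helps to remember that BED coordinates are 0-based and half-open, while the chr:start-end form is 1-based and inclusive. The following is only a minimal sketch of that conversion; the function names `bed_to_one_based` and `one_based_to_bed` are made up for illustration and are not part of GATK.

```python
def bed_to_one_based(chrom, bed_start, bed_end):
    """Convert a 0-based, half-open BED interval to a 1-based, inclusive region string."""
    if bed_start < 0 or bed_end <= bed_start:
        raise ValueError(f"invalid BED interval: {chrom}\t{bed_start}\t{bed_end}")
    # Only the start shifts by one; the half-open end equals the inclusive end.
    return f"{chrom}:{bed_start + 1}-{bed_end}"


def one_based_to_bed(region):
    """Convert a 1-based, inclusive 'chrom:start-end' string to a BED-style tuple."""
    chrom, span = region.rsplit(":", 1)
    start, end = span.split("-")
    return chrom, int(start) - 1, int(end)


if __name__ == "__main__":
    # A BED line "chr1 499 1000" covers the same bases as chr1:500-1000.
    print(bed_to_one_based("chr1", 499, 1000))
    print(one_based_to_bed("chr1:500-1000"))
```

So a BED line reading `chr1 500 1000` would not cover exactly the same bases as `-L chr1:500-1000`; the equivalent BED start is 499.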



  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    This might be a side effect of downsampling. When you use a different intervals list (or the same interval coordinates provided in a 0-indexed format vs. a 1-indexed format), the random seed for downsampling is different. If there is a lot of depth at that position, it is possible that one of your runs is using the offending read and the other is not. If the read is not being used, its bases won't be loaded and checked, hence the error won't be thrown.

    In any case this should already have been caught in pre-processing if you followed the Best Practices recommendations.
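As a rough illustration of the kind of check that would catch such reads before variant calling, the sketch below decodes a SAM-style quality string (Phred+33, where a score of 93 is the character `~`) and reports positions above a threshold. The helper names and the `max_expected=60` default are assumptions chosen for the example, not GATK's actual logic or cutoff.

```python
PHRED_OFFSET = 33  # SAM stores base qualities as ASCII(Phred score + 33)


def decode_quals(qual_string):
    """Turn a SAM QUAL string into a list of integer Phred scores."""
    return [ord(ch) - PHRED_OFFSET for ch in qual_string]


def find_high_quals(qual_string, max_expected=60):
    """Return (position, score) pairs for scores above max_expected.

    max_expected is a hypothetical threshold for this example only.
    """
    return [
        (pos, score)
        for pos, score in enumerate(decode_quals(qual_string))
        if score > max_expected
    ]


if __name__ == "__main__":
    # 'I' encodes Q40; '~' encodes Q93, the highest value Phred+33 can represent.
    print(find_high_quals("II~I"))
```

Running a check like this over every read in the file, rather than only the reads that survive downsampling, gives the same answer regardless of how the interval is specified.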
