Options -S LENIENT --fix_misencoded_quality_scores in RealignerTargetCreator and IndelRealigner

I got errors when I ran GATK RealignerTargetCreator and IndelRealigner in v2.4.9. I've checked many related discussions and comments. First, with the option -S LENIENT, I got an error like "we encountered an extremely high quality score of 69" and the GATK program stalled. So I added "--fix_misencoded_quality_scores", and now I get a different error message: "ERROR MESSAGE: Bad input: We encountered a non-standard non-IUPAC base in the provided reference: '0'". I tried older versions of GATK and both Java 1.6 and 1.7. I'm hoping you can help with this. Please let me know if I'm missing something. Thanks!
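
For readers hitting the same first error: a quality score of 69 is what a Phred+64 (older Illumina) quality character such as 'f' decodes to when it is read with the Phred+33 offset that GATK expects. The sketch below illustrates that arithmetic and the shift of 31 that, as documented for --fix_misencoded_quality_scores, is subtracted from every score. The function names and the threshold of 60 are assumptions made for illustration only; this is not GATK's own code.

```python
def phred33_scores(qual):
    """Decode a quality string using the Phred+33 offset."""
    return [ord(c) - 33 for c in qual]


def looks_phred64(qual, max_reasonable=60):
    """Heuristic: a Phred+33 score above max_reasonable hints at Phred+64 input.

    The default of 60 is an assumed cutoff, not a value taken from GATK.
    """
    return max(phred33_scores(qual), default=0) > max_reasonable


def fix_misencoded(qual):
    """Shift every character down by 31 (64 - 33), turning Phred+64 into Phred+33."""
    return "".join(chr(ord(c) - 31) for c in qual)


if __name__ == "__main__":
    qual = "hhffdb"  # made-up Phred+64 quality string
    print(phred33_scores(qual))                  # scores as a Phred+33 reader sees them
    print(looks_phred64(qual))                   # whether the string looks mis-encoded
    print(phred33_scores(fix_misencoded(qual)))  # scores after the shift of 31
```

Applying the shift to data that is already Phred+33 would produce wrong scores, so it is worth checking a few reads this way before adding the flag.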


  • gsong Member

    Thanks for your note. I'm using sacCer3 from the UCSC Genome Browser in FASTA format. It's the yeast reference genome, and the file is not compressed.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    We've seen similar errors when people run GATK on Windows. Any chance you're on a Windows platform, or your reference file comes from a Windows filesystem?

  • gsong Member

    I'm running on a Linux machine with kernel version 2.6.32-279.5.1.el6.x86_64. I downloaded the reference file from the UCSC Genome Browser, and it is an ordinary plain-text FASTA file. When I open it in 'vi', it looks fine.

    When I ran an older version of GATK (2.1.8) on other data, without --fix_misencoded_quality_scores and using the same reference file, it completed successfully.
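
A file can look fine in an editor and still contain a stray character, so one way to check the reference independently of GATK is to scan it for anything outside the IUPAC nucleotide alphabet. The following is a minimal sketch under stated assumptions: the allowed set is the standard IUPAC nucleotide codes in either case, which may not match GATK's internal check exactly, and the function name is hypothetical.

```python
import io

# Standard IUPAC nucleotide codes, accepted in upper or lower case (assumed set).
IUPAC = set("ACGTURYSWKMBDHVN")
ALLOWED = IUPAC | {c.lower() for c in IUPAC}


def find_bad_bases(handle):
    """Yield (contig, position, character) for each non-IUPAC character.

    Positions are 1-based and count every character on the sequence lines
    of the current contig, excluding the trailing newline.
    """
    contig, pos = None, 0
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # The contig name is the first whitespace-delimited token of the header.
            parts = line[1:].split()
            contig, pos = (parts[0] if parts else ""), 0
            continue
        for ch in line:
            pos += 1
            if ch not in ALLOWED:
                yield contig, pos, ch


if __name__ == "__main__":
    # Tiny made-up FASTA with one bad character, standing in for the real reference.
    demo = io.StringIO(">chrI\nACGT\nAC0T\n")
    for contig, pos, ch in find_bad_bases(demo):
        # Printed in contig:start-stop form, one line per offending position.
        print(f"{contig}:{pos}-{pos}\t{ch!r}")
```

To run it on a real file, pass an open text handle instead of the demo string; opening with newline="" keeps any carriage returns visible, so Windows-style line endings would show up as '\r' hits. The printed contig:start-stop strings are meant as a starting point for an exclusion list, on the assumption that the reported positions line up with the coordinates GATK uses.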

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    OK, we'll look into this. In the meantime, you can use the -XL argument to skip any positions where this occurs. Let me know if it happens at more than one position.

