Error: mismatched read pair insert sizes

akemde NYC Member Posts: 6

Hi Bob,

Thanks for your email. Here is the error message I've been getting when running GenomeSTRiP on 50 genomes (bwa-aligned, GATK indel-realigned and quality-recalibrated):

ERROR MESSAGE: Mismatched read pair insert sizes for sample 6837: [ {HS2000-910_287:1:1309:11124:98404 97 17 22252630 0 100M = 22251548 -983 CTTTGAAGATTTCGTTGGAAACGGGATAATCTTCACAGAAAAGCTAAACAGAAGCATTCTCAGAAACTTCTTTGTGATGTTTGCTTTCAACTCACAGAGT >@??>>?=>=??=6=?==???<6==>>>?>=??=>;><=>??==?>??<>=>9==>=?=><=>>>?;>><>@>;=?=?=?=>?<>=>?>?==>;=><>>>>><=><?=><=<=?>;======>>==<>====>;>=?=;??>;>=;===????==??=>=>=>>=<===@7 X0:i:1 X1:i:0 OC:Z:100M BD:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN RG:Z:6837 XG:i:0 BI:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN AM:i:0 NM:i:4 SM:i:37 XM:i:4 XO:i:0 OP:i:22251549 XT:A:U} ]
ERROR ------------------------------------------------------------------------------------------

So it complains that the insert sizes recorded for the left and right mates don't agree. This happens only very rarely: across 50 high-coverage whole genomes the error was reported 4 times. Is there a way to prevent GenomeSTRiP from crashing in those cases, e.g. by printing a warning instead of an error?
Also, if you have seen this error before, do you happen to know whether the mismatch in insert size might be introduced during GATK realignment?
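
In case it helps anyone locate the affected pairs up front, here is a minimal Python sketch that scans SAM text for pairs whose two primary records do not carry opposite TLEN values (column 9), which is what the SAM specification expects of properly recorded mates. This is only my assumption about what "mismatched" means here, not GenomeSTRiP's actual check, and the function name `find_tlen_mismatches` is made up for illustration.

```python
from collections import defaultdict


def find_tlen_mismatches(sam_lines):
    """Return names of read pairs whose two primary records have TLENs that do not sum to zero.

    Assumption: a consistent pair has TLEN values of equal magnitude and
    opposite sign. This is an illustrative check, not GenomeSTRiP's own logic.
    """
    tlens = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        name, flag, tlen = fields[0], int(fields[1]), int(fields[8])
        # Keep paired reads only; skip secondary (0x100) and supplementary (0x800) records.
        if not flag & 0x1 or flag & 0x900:
            continue
        tlens[name].append(tlen)

    mismatched = []
    for name, values in tlens.items():
        # Only judge pairs where both mates were seen in the input.
        if len(values) == 2 and values[0] + values[1] != 0:
            mismatched.append(name)
    return mismatched
```

The function takes any iterable of SAM lines, so it can be fed the text output of `samtools view` (for example via `sys.stdin`). It keeps every read name in memory, so it is better suited to a region around the reported position than to a whole genome.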

Thanks a lot!



Best Answer


  • berutti Member Posts: 2

    I have the same problem. I'm using svtoolkit 1.03.619, and the suggested solution seems to have no effect. Any suggestions?

  • bhandsaker Member, Collaborator Posts: 378

    You need to move up to the latest interim release.

    The later releases are much better than the old 1.03 release.

    Bob Handsaker, Broad Institute / Harvard Medical School Dept of Genetics

