
Error: mismatched read pair insert sizes

akemde, NYC | Posts: 4 | Member

Hi Bob,

Thanks for your email. Here is the error message I get when running GenomeSTRiP on 50 genomes (bwa-aligned, GATK indel-realigned, and quality-recalibrated):

ERROR MESSAGE: Mismatched read pair insert sizes for sample 6837: [ {HS2000-910_287:1:1309:11124:98404 97 17 22252630 0 100M = 22251548 -983 CTTTGAAGATTTCGTTGGAAACGGGATAATCTTCACAGAAAAGCTAAACAGAAGCATTCTCAGAAACTTCTTTGTGATGTTTGCTTTCAACTCACAGAGT >@??>>?=>=??=6=?==???<6==>>>?>=??=>;><=>??==?>??<>=>9==>=?=><=>>>?;>><>@?>=?>=<>>?><=?><=5;?=><>>>>= X0:i:5 X1:i:0 XA:Z:17,+22255009,100M,0;17,+22259766,100M,0;17,+22247875,100M,0;17,+22245496,100M,0; BD:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN MD:Z:100 RG:Z:6837 XG:i:0 BI:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:47 XT:A:R}, {HS2000-910_287:1:1309:11124:98404 145 17 22251548 47 2M1D98M = 22252630 981 TTTGAGAGAGAAGCTTTGAAACACTCTTTTTCTAGAATCTGCAAGTGGACATTGGGAGGGCTGTGAGGTTTGTGGTGGAAAAGGAAATATCTCCACATAA @@>=?=?=?=>?<>=>?>?==>;=><>>>>><=><?=><=<=?>;======>>==<>====>;>=?=;??>;>=;===????==??=>=>=>>=<===@7 X0:i:1 X1:i:0 OC:Z:100M BD:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN RG:Z:6837 XG:i:0 BI:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN AM:i:0 NM:i:4 SM:i:37 XM:i:4 XO:i:0 OP:i:22251549 XT:A:U} ]
ERROR ------------------------------------------------------------------------------------------

So it complains that the insert sizes of the two mates don't agree: the two records above report -983 and 981. This happens only very rarely; across 50 high-coverage whole genomes the error was reported 4 times. Is there a way to prevent GenomeSTRiP from crashing in those cases, e.g. by printing a warning instead of an error?
Also, if you have seen this error before, do you happen to know whether the insert-size mismatch might be introduced during GATK realignment?
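
To make the mismatch concrete, here is a minimal sketch of the kind of consistency check involved. It assumes that the two mates of a pair should carry insert sizes (the ninth SAM column, TLEN) of equal magnitude and opposite sign. The function name `find_tlen_mismatches` is hypothetical and this is not GenomeSTRiP's actual implementation. The example records are the first nine columns of the two reads from the error message above.

```python
from collections import defaultdict

SECONDARY = 0x100       # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800   # SAM flag bit: supplementary alignment


def find_tlen_mismatches(sam_lines):
    """Return (qname, tlen_a, tlen_b) for read pairs whose TLENs do not cancel.

    Hypothetical helper for illustration only. It keeps every read name in
    memory, so it is meant for small extracts rather than whole BAM files.
    """
    tlens = defaultdict(list)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.split()
        qname, flag, tlen = fields[0], int(fields[1]), int(fields[8])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # compare primary alignments only
        tlens[qname].append(tlen)

    mismatches = []
    for qname, values in tlens.items():
        # Assumption: a consistent pair has TLENs that sum to zero.
        if len(values) == 2 and values[0] + values[1] != 0:
            mismatches.append((qname, values[0], values[1]))
    return mismatches


# First nine SAM columns of the two records from the error message.
records = [
    "HS2000-910_287:1:1309:11124:98404 97 17 22252630 0 100M = 22251548 -983",
    "HS2000-910_287:1:1309:11124:98404 145 17 22251548 47 2M1D98M = 22252630 981",
]

mismatches = find_tlen_mismatches(records)
for qname, tlen_a, tlen_b in mismatches:
    print(f"{qname}: TLEN {tlen_a} vs {tlen_b}")
```

Running something like this over the output of `samtools view` for a small region would list the affected pairs, which could then be inspected or filtered before running GenomeSTRiP.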

Thanks a lot!

Anne-Katrin
