Errors on the road to GATK
I am running the best practices RNA-Seq pipeline on 6 libraries. Four completed without any problems, but two (from the same project) have gotten snagged. The errors occur at the "add or replace read groups" and "mark duplicates" steps. The errors:
Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Read name HWI-D00273:94:C6GFHANXX:8:1312:12804:32959, CIGAR M operator maps off end of reference
Exception in thread "main" net.sf.samtools.SAMFormatException: Did not inflate expected amount
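To narrow down the first error, one option is to list every read whose alignment runs past the end of its reference sequence. The sketch below is only a rough illustration, not Picard's own validation code: it assumes the BAM has been converted to SAM text with its header (for example with `samtools view -h`), and the function names `reference_end` and `reads_off_reference_end` are hypothetical helpers made up for this example.

```python
import re

# One CIGAR element: a length followed by an operator.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# Operators that advance the position on the reference (per the SAM spec).
REF_CONSUMING = set("MDN=X")


def reference_end(pos, cigar):
    """Return the 1-based, inclusive end coordinate of an alignment."""
    span = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)
    return pos + span - 1


def reads_off_reference_end(sam_lines):
    """Yield (name, rname, pos, cigar, end, ref_len) for reads that overrun.

    Reference lengths are taken from the @SQ header lines, so the header
    must be present in the input.
    """
    ref_lengths = {}
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            if line.startswith("@SQ"):
                tags = dict(f.split(":", 1) for f in line.split("\t")[1:])
                ref_lengths[tags["SN"]] = int(tags["LN"])
            continue
        fields = line.split("\t")
        name, rname, pos, cigar = fields[0], fields[2], int(fields[3]), fields[5]
        if rname == "*" or cigar == "*":
            continue  # unmapped read, nothing to check
        end = reference_end(pos, cigar)
        if rname in ref_lengths and end > ref_lengths[rname]:
            yield name, rname, pos, cigar, end, ref_lengths[rname]


# Made-up records for illustration only (not taken from the real libraries).
example = [
    "@SQ\tSN:chr1\tLN:100",
    "r1\t0\tchr1\t95\t60\t10M\t*\t0\t0\t*\t*",  # ends at 104, past LN:100
    "r2\t0\tchr1\t50\t60\t10M\t*\t0\t0\t*\t*",  # ends at 59, fine
]
offenders = list(reads_off_reference_end(example))
```

On real data, the same generator could be fed `sys.stdin` with the SAM text piped in. If only a handful of reads are reported, comparing their reference names and lengths against the reference used for alignment may show whether the header and the aligner disagree.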
I know Picard tools is not part of GATK, but I wondered if anyone has thoughts about what's going on. I have tried starting from scratch with trimmed reads, running CleanSam, and checking that all pairs are intact; nothing helps. I'm especially puzzled that the other libraries have no issues.
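The second error ("Did not inflate expected amount") is raised while decompressing the file, so it may be worth checking whether the BAM itself is truncated or corrupt before looking at the reads. A BAM file is a series of concatenated gzip-compatible (BGZF) blocks, so Python's standard `gzip` module can read it end to end. The sketch below is a minimal check under that assumption; `bam_decompresses_cleanly` is a hypothetical helper name, and a clean pass only shows that the compression layer is intact, not that the records are valid.

```python
import gzip
import zlib


def bam_decompresses_cleanly(path, chunk_size=1 << 20):
    """Try to decompress the whole file; return (ok, message).

    A truncated file typically raises EOFError, and a damaged block
    raises zlib.error or an OSError subclass such as gzip.BadGzipFile.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data in chunks to keep memory use low.
            while handle.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error) as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"
```

Running this on each intermediate BAM in order (aligner output, then each later step) could show at which step the file first fails to decompress.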