Hello,
I'm trying to call variants using UnifiedGenotyper on about 450 reduced BAMs in 100,000 bp chunks. It works fine for some of the chunks, but for others I get the following error message:
Can anyone explain why there is a problem with a specific BAM file when I call on, for example, chunk chr20:25400000-25500000, but not when I call on chunk chr20:10000000-10100000?
Thank you, Tota
Carneiro
Posts: 159 admin
BAMs are block-compressed, and the index takes you directly to the blocks that cover a given genomic location. A corrupted block is only read when a chunk overlaps it, so it is possible for some chunks to fail while others work fine.
Did you generate this BAM in the GATK?
I'd recommend regenerating the index: remove the current index file and run samtools index file.bam. If that doesn't work, test your original BAM (if you still have it). If the original is not corrupted, try re-reducing the problematic one. If it comes out corrupt again, please send it to us for debugging.
Is this helpful?
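To see why only some chunks fail, it can help to walk the BAM's compressed blocks one by one and find the first one that does not decompress cleanly. The following is a minimal Python sketch, assuming the standard BGZF layout (a gzip header carrying a "BC" extra subfield with the block size, followed by a CRC32 and the uncompressed length). The names scan_bgzf and make_bgzf_block are hypothetical helpers written for this illustration; they are not part of GATK or samtools.

```python
import struct
import zlib

BGZF_MAGIC = b"\x1f\x8b\x08\x04"  # gzip ID1, ID2, CM=deflate, FLG=FEXTRA


def make_bgzf_block(payload: bytes) -> bytes:
    """Build one BGZF block (hypothetical helper, used only for the demo)."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # raw deflate stream
    cdata = comp.compress(payload) + comp.flush()
    bsize = 12 + 6 + len(cdata) + 8 - 1  # total block size minus 1
    header = struct.pack("<4BI2BH", 31, 139, 8, 4, 0, 0, 255, 6)
    extra = struct.pack("<2BHH", 66, 67, 2, bsize)  # 'B', 'C', SLEN, BSIZE
    trailer = struct.pack("<II", zlib.crc32(payload), len(payload))
    return header + extra + cdata + trailer


def scan_bgzf(data: bytes):
    """Return (good_block_count, offset_of_first_bad_block or None)."""
    offset = 0
    good = 0
    while offset < len(data):
        header = data[offset:offset + 12]
        if len(header) < 12 or header[:4] != BGZF_MAGIC:
            return good, offset
        xlen = struct.unpack("<H", header[10:12])[0]
        extra = data[offset + 12:offset + 12 + xlen]
        if len(extra) < xlen:
            return good, offset

        # Look for the 'BC' subfield that stores the block size.
        bsize = None
        pos = 0
        while pos + 4 <= len(extra):
            si1, si2, slen = struct.unpack("<BBH", extra[pos:pos + 4])
            if si1 == 66 and si2 == 67 and slen == 2 and pos + 6 <= len(extra):
                bsize = struct.unpack("<H", extra[pos + 4:pos + 6])[0]
                break
            pos += 4 + slen
        if bsize is None:
            return good, offset

        block = data[offset:offset + bsize + 1]
        if len(block) < bsize + 1:
            return good, offset  # file ends in the middle of a block
        cdata = block[12 + xlen:-8]
        crc, isize = struct.unpack("<II", block[-8:])
        try:
            payload = zlib.decompress(cdata, -15)
        except zlib.error:
            return good, offset
        if zlib.crc32(payload) != crc or len(payload) != isize:
            return good, offset

        good += 1
        offset += bsize + 1
    return good, None
```

For a real file, something like scan_bgzf(open("reduced.bam", "rb").read()) would report the byte offset of the first bad block, though this sketch reads the whole file into memory, so it is only practical for files that fit in RAM. A bad block deep in the file would be consistent with some chunks working and others failing.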
Answers
Thanks for your reply.
I did generate the BAM using GATK -T ReduceReads. When I try to regenerate the index file using "samtools index", I get this message before the program terminates:
$ samtools index reduced.bam
[bam_index_core] truncated file? Continue anyway. (-4)
I'm surprised by this error message, since the BAM looks complete according to its log, which ends with these 4 lines:
INFO 02:20:06,640 TraversalEngine - Total runtime 36955.17 secs, 615.92 min, 10.27 hours
INFO 02:20:06,640 TraversalEngine - 4801933 reads were filtered out during traversal out of 71057279 total (6.76%)
INFO 02:20:06,640 TraversalEngine - -> 3830446 reads (5.39% of total) failing DuplicateReadFilter
INFO 02:20:06,640 TraversalEngine - -> 971487 reads (1.37% of total) failing UnmappedReadFilter
Also, the BAM contains the EOF marker, indicating it's not truncated. Any suggestions on how to solve this?
Thanks, Tota
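The EOF check mentioned above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the standard 28-byte BGZF end-of-file marker (an empty compressed block); has_bgzf_eof is a hypothetical helper name, not a samtools or GATK function. Note that a present marker only shows that the file ends properly. It does not rule out a damaged block somewhere in the middle of the file.

```python
import os

# The 28-byte empty BGZF block that a complete BAM file ends with.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600"  # gzip header, XLEN = 6
    "42430200" "1b00"                    # 'BC' subfield, BSIZE = 27
    "0300"                               # empty raw deflate stream
    "00000000" "00000000"                # CRC32 = 0, ISIZE = 0
)


def has_bgzf_eof(path: str) -> bool:
    """Return True if the file at `path` ends with the BGZF EOF marker."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        if fh.tell() < len(BGZF_EOF):
            return False  # too short to contain the marker at all
        fh.seek(-len(BGZF_EOF), os.SEEK_END)
        return fh.read() == BGZF_EOF
```

Calling has_bgzf_eof("reduced.bam") returning True would match what was observed here: the file is terminated correctly, yet indexing can still fail if an earlier block is corrupt.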