about validation of SAM files per chromosome

Bogdan (Palo Alto, CA) Member ✭✭

Dear Geraldine et al., happy new week! I am writing to ask for some help:

Again, I have split a BAM file per chromosome using NGSUtils (http://ngsutils.org/). However, when I try to validate the resulting per-chromosome BAM files, ValidateSamFile gives the messages below. Could you please advise whether there is a way to fix these files (would the FixMateInformation tool do this)? Thanks!

The errors were:

HISTOGRAM java.lang.String

Error Type Count


  • Geraldine_VdAuwera (Cambridge, MA) Member, Administrator, Broadie admin

    Hi Bogdan, if those errors were produced by splitting the BAMs by chromosome, the most likely explanation is that they come from so-called chimeric read pairs, where the two mates map to different chromosomes. Splitting by chromosome puts each mate in a different file, so the validator can no longer find the mate. If so, there is nothing that FixMateInformation (FMI) can do. Depending on what data you're working with and what you're looking for, you may not care and be willing to throw those reads out, or you may actually want to preserve them (e.g. if you're looking for structural variants, where chimeric read pairs are a sign of chromosomal rearrangement).

  • Bogdan (Palo Alto, CA) Member ✭✭

    Dear Geraldine, thank you! Yes, I could potentially "mask" the reads that are part of chimeric read pairs. We may need those reads later when running BreakDancer.

    Could I ask whether there is any way to "mask" the chimeric reads so that ValidateSamFile reports the files as OK for downstream analyses?

    I have another question, but will post it in a new thread. Many thanks again for the helpful suggestions!

  • Sheila (Broad Institute) Member, Broadie ✭✭✭✭✭

    Hi Bogdan,

    I suspect you are asking about masking the error-causing reads so GATK will not crash. You don't have to worry about the MATE_NOT_FOUND errors from ValidateSamFile, as GATK will not crash on those.


  • Bogdan (Palo Alto, CA) Member ✭✭

    Great, thanks Sheila for the reassurance ;)
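
For anyone who wants to see how many reads in a file belong to the chimeric pairs discussed above, here is a minimal Python sketch. It is not from the posters in this thread and is not part of GATK, Picard, or NGSUtils. It assumes plain-text SAM input (for example, a BAM converted to SAM) and relies only on the standard SAM columns: FLAG (column 2), RNAME (column 3), and RNEXT (column 7). The function names and the example reads are hypothetical.

```python
from typing import Iterable

FLAG_PAIRED = 0x1         # read is part of a pair
FLAG_UNMAPPED = 0x4       # this read is unmapped
FLAG_MATE_UNMAPPED = 0x8  # the mate is unmapped


def is_cross_chromosome_pair(sam_line: str) -> bool:
    """True if the read is paired, both mates are mapped, and the mate's
    reference (RNEXT) differs from this read's reference (RNAME)."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname = fields[2]
    rnext = fields[6]
    if not flag & FLAG_PAIRED:
        return False
    if flag & (FLAG_UNMAPPED | FLAG_MATE_UNMAPPED):
        return False
    # "=" means the mate is on the same reference; "*" means unknown.
    return rnext not in ("=", "*") and rnext != rname


def count_cross_chromosome(lines: Iterable[str]) -> int:
    """Count alignment lines (skipping '@' header lines) whose mate
    maps to a different reference sequence."""
    return sum(
        1
        for line in lines
        if line.strip() and not line.startswith("@")
        and is_cross_chromosome_pair(line)
    )


# Hypothetical example records, made up for illustration only.
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t99\tchr1\t100\t60\t4M\t=\t300\t204\tACGT\tIIII",   # mate on same chromosome
    "r2\t65\tchr1\t500\t60\t4M\tchr5\t900\t0\tACGT\tIIII",  # mate on chr5
    "r3\t73\tchr1\t700\t60\t4M\t=\t700\t0\tACGT\tIIII",     # mate unmapped
]

n_chimeric = count_cross_chromosome(example_sam)
```

Reads counted this way are the ones whose mates end up in a different per-chromosome file after splitting, which is consistent with the MATE_NOT_FOUND explanation given above. The check is deliberately simple and does not look at supplementary or secondary alignments.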
