Is there a way to remove the "|" from the SQ line in a .bam file?

grtavera (Case Western Reserve University; Member)

We have several unique H. pylori genomes. We have aligned each of them to a reference genome, which had "|" in the FASTA (.fa) file name. Now, when running through GATK, we cannot create our final .vcf file. Do we have to remove the "|" from the .fa file and rerun everything from the beginning, or is there another approach?



  • valentin (Cambridge, MA; Member, Dev) ✭✭

    Could you post the exception you are getting and an extract of the header that is causing the issue? It sounds like a bug, or perhaps an obscure violation of the SAM format, but I guess it is the former.

  • valentin (Cambridge, MA; Member, Dev) ✭✭

    It seems that a similar issue has been reported here. There you will find a workaround that you should be able to adapt to your situation.

  • valentin (Cambridge, MA; Member, Dev) ✭✭

    Also could you confirm what version of GATK you are using?

  • valentin (Cambridge, MA; Member, Dev) ✭✭

    It seems that the link I posted above does not actually work, so here is the workaround:

    It seems that something similar has been reported here.

    That might be a bug in GATK, so thanks for reporting. I guess the workaround with samtools would be:

    samtools view -h input.bam | sed 's/SN:gi|[0-9]*|gb|\([^|]*\)|/SN:\1/' | samtools view -b - > output.bam

    You may need to add more 'sed' commands if there are SNs that follow a different pattern. You can check whether
    'sed' is doing the right thing like so:

    samtools view -H input.bam | sed 's/SN:gi|[0-9]*|gb|\([^|]*\)|/SN:\1/'
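
For anyone adapting this, here is a minimal sketch of the same idea that can be dry-run on a single header line before touching a BAM. The function names `clean_sq_names` and `reheader_bam` and the sample accession are made up for illustration. The sketch also swaps in `samtools reheader` for the final step, which is an assumption on my part rather than what the thread used: the `SN:` pattern only touches header lines, so read records in a text stream would still carry the old names in their RNAME column, whereas `reheader` replaces the header and leaves the alignment records alone.

```shell
#!/bin/sh
# Hypothetical helper: strip a "gi|<number>|gb|<accession>|" wrapper from the
# SN: field of @SQ header lines, reading SAM header text on stdin.
# In POSIX basic regular expressions a bare "|" is a literal pipe, so it is
# left unescaped; [^|]* stops the capture at the pipe that closes the accession.
clean_sq_names() {
    sed '/^@SQ/ s/SN:gi|[0-9]*|gb|\([^|]*\)|/SN:\1/'
}

# Dry run on a made-up header line (tab-separated, as in a real SAM header).
printf '@SQ\tSN:gi|12345|gb|AB000001.1|\tLN:1000\n' | clean_sq_names

# Hypothetical wrapper: write the cleaned header to a temporary file, then let
# samtools reheader swap it in without rewriting the alignment records.
reheader_bam() {
    in_bam=$1
    out_bam=$2
    tmp_header=$(mktemp) || return 1
    samtools view -H "$in_bam" | clean_sq_names > "$tmp_header" &&
        samtools reheader "$tmp_header" "$in_bam" > "$out_bam"
    status=$?
    rm -f "$tmp_header"
    return $status
}
```

Usage would be something like `reheader_bam input.bam output.bam`. Keep in mind that GATK expects the contig names in the BAM to match those in the reference, so the reference FASTA, its index and its sequence dictionary would presumably need the same renaming.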

  • Sheila (Broad Institute; Member, Broadie) ✭✭✭✭✭


    Please do have a look at the threads Valentin pointed to above. Also, why do you say you cannot create the final VCF? Are you getting an error message? If so, please post it.

