How to split a paired-end FASTQ file into two separate FASTQ files (forward and reverse)?

I have to analyse paired-end DNA-seq reads that are in an unusual format: both reads of each pair are stored together in one (interleaved) FASTQ file. I obtained this file by converting a BAM file back to FASTQ, so I need to split it into two separate paired-end FASTQ files. Any comment or suggestion would be appreciated. I know that there is a Galaxy tool named FASTQ splitter that can do this for RNA-seq reads, but I am not sure whether it would work for DNA-seq reads as well. Thanks

Best Answer


  • shlee, Cambridge (Member, Broadie)

    Hi @RRafiee,

    See if SamToFastq addresses your need. You will have to invoke the SECOND_END_FASTQ option. I hav

    In general, I suggest you try what you think should be possible with a tool and if it works, great. If it errors, hopefully the error message is informative enough to help you move forward. You can test using a small snippet of your data, e.g. a small chromosome's worth of data.
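
    If the interleaved FASTQ is the only input at hand, the split itself can also be done with a few lines of Python. The following is only a minimal sketch, not a replacement for SamToFastq: it assumes uncompressed input, strict 4-line records, and reads that strictly alternate read 1, read 2. The function name and file paths are made up for illustration.

```python
from itertools import islice


def split_interleaved_fastq(interleaved_path, r1_path, r2_path):
    """Split an interleaved FASTQ into forward and reverse files.

    Assumes 4-line records in strict read 1 / read 2 order.
    Returns the number of pairs written.
    """
    pairs = 0
    with open(interleaved_path) as src, \
            open(r1_path, "w") as r1, \
            open(r2_path, "w") as r2:
        while True:
            # One pair = two 4-line records = 8 lines.
            block = list(islice(src, 8))
            if not block:
                break
            if len(block) != 8:
                raise ValueError(
                    f"incomplete pair after {pairs} pairs: "
                    f"got {len(block)} lines, expected 8"
                )
            r1.writelines(block[:4])   # first record of the pair
            r2.writelines(block[4:])   # second record of the pair
            pairs += 1
    return pairs
```

    Calling split_interleaved_fastq("interleaved.fq", "reads_1.fq", "reads_2.fq") would write the two files and return the pair count. The sketch does not check read names, so it only gives correct output when the mates really are adjacent in the input.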

  • RRafiee, UK (Member)

    Hi Shlee, thanks for your advice.
    My aim is to align these reads (interleaved FASTQ) to the reference genome following the GATK procedure, and I just realised that I can use BWA mem with the -p flag for this case.

  • RRafiee, UK (Member)

    Using BWA mem with the -p flag, I obtained results (SAM files) for some inputs, but it did not work for some of the interleaved FASTQ files. Attached is a screenshot of one of the log files from a case where I did not get an output SAM file. I think something is wrong with some of the interleaved FASTQ files. Am I right?
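
    One way to test that suspicion independently of the log is to check whether each file really is a well-formed interleaved FASTQ. The sketch below is an assumption-laden illustration, not a BWA or GATK utility: it assumes 4-line records, and it treats two reads as mates when their names match after dropping any comment field and a trailing /1 or /2. The function names are hypothetical.

```python
from itertools import islice


def read_name(header):
    """Read name from a FASTQ header, without '@', comment, or /1 /2 suffix."""
    parts = header.rstrip("\n")[1:].split()
    name = parts[0] if parts else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def check_interleaved(lines):
    """Return a list of (line_number, message) problems, empty if none found."""
    problems = []
    it = iter(lines)
    line_no = 0        # last line consumed so far (1-based)
    records = 0        # complete records seen so far
    prev_name = None
    while True:
        rec = list(islice(it, 4))
        if not rec:
            break
        start = line_no + 1
        line_no += len(rec)
        if len(rec) < 4:
            problems.append((start, "truncated record"))
            break
        header, seq, plus, qual = (s.rstrip("\n") for s in rec)
        if not header.startswith("@") or not plus.startswith("+"):
            problems.append((start, "malformed record"))
        if len(seq) != len(qual):
            problems.append((start, "sequence and quality lengths differ"))
        name = read_name(header)
        # Every second record should be the mate of the one before it.
        if records % 2 == 1 and name != prev_name:
            problems.append(
                (start, f"mate name {name!r} does not match {prev_name!r}")
            )
        prev_name = name
        records += 1
    if records % 2 == 1:
        problems.append((line_no, "odd number of records: last read has no mate"))
    return problems
```

    Usage would look like check_interleaved(open("interleaved.fq")). An empty list means no problem was detected under these assumptions. Once one read is missing its mate, every later pair is out of phase, so the first reported problem is usually the informative one.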

