How to split a paired-end FASTQ file into two separate FASTQ files (forward and reverse)?

I have to analyse paired-end DNA-seq reads that are in an unusual format: both reads of each pair are joined in one FASTQ file. I obtained this file by reverting a BAM file to FASTQ, so I now need to split it into two separate paired-end FASTQ files (forward and reverse). Any comment or suggestion would be appreciated. I know there is a Galaxy tool named FASTQ splitter that can do this for RNA-seq reads, but I am not sure whether it would work for DNA-seq reads as well. Thanks

Answers

  • shlee, Cambridge (Member, Broadie)

    Hi @RRafiee,

    See if SamToFastq addresses your need. You will have to invoke the SECOND_END_FASTQ option. I hav

    In general, I suggest you try what you think should be possible with a tool and if it works, great. If it errors, hopefully the error message is informative enough to help you move forward. You can test using a small snippet of your data, e.g. a small chromosome's worth of data.
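For readers who want to see what the split itself involves, here is a minimal sketch in plain Python. It is not part of SamToFastq or any GATK tool. It assumes every record is exactly four lines and that mates strictly alternate (read 1, then read 2); the function name `split_interleaved` and the file names in the usage note are hypothetical.

```python
from itertools import islice


def split_interleaved(src, out1, out2):
    """Copy alternating 4-line FASTQ records from src into out1 and out2.

    src, out1 and out2 are open text file objects.
    Returns the number of read pairs written.
    Assumes 4 lines per record and strictly alternating mates.
    """
    pairs = 0
    while True:
        rec1 = list(islice(src, 4))  # first mate of the pair
        if not rec1:
            break  # clean end of input
        rec2 = list(islice(src, 4))  # second mate of the pair
        if len(rec1) != 4 or len(rec2) != 4:
            raise ValueError(f"incomplete record after {pairs} complete pairs")
        out1.writelines(rec1)
        out2.writelines(rec2)
        pairs += 1
    return pairs
```

It could be used along these lines: `with open("interleaved.fastq") as src, open("reads_1.fastq", "w") as r1, open("reads_2.fastq", "w") as r2: split_interleaved(src, r1, r2)`. Trying it first on the first few thousand lines of the file is a cheap way to follow the small-snippet advice above.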

  • RRafiee, UK (Member)

    Hi Shlee, thanks for your advice.
    My aim is to align these interleaved FASTQ reads to the reference genome following the GATK procedure, and I just realised that I can use BWA mem with the -p flag for this case.

  • RRafiee, UK (Member)

    Using BWA mem with the -p flag, I obtained results (SAM files) for some inputs, but it did not work for some of the interleaved FASTQ files. Attached is a screenshot of one of the log files from a case where I did not get an output SAM file. I think something is wrong with some of the interleaved FASTQ files. Am I right?
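One way to test that suspicion independently of the aligner is to scan the interleaved file and report the first pair that looks wrong. The sketch below is only an illustration under stated assumptions: four-line records, mates adjacent to each other, and read names that match once an optional `/1` or `/2` suffix and any text after the first whitespace are dropped. The helper names `read_name` and `first_bad_pair` are hypothetical, and the checks say nothing about what the log in the screenshot actually reports.

```python
from itertools import islice


def read_name(header):
    """Return the read name from a FASTQ header line, without '@' or a /1, /2 suffix."""
    fields = header.strip().split()
    name = fields[0] if fields else ""
    if name.startswith("@"):
        name = name[1:]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def first_bad_pair(src):
    """Scan an interleaved FASTQ file object.

    Returns (pair_number, reason) for the first problem found,
    or None if every pair passes these basic checks.
    """
    pair_number = 0
    while True:
        rec1 = list(islice(src, 4))
        if not rec1:
            return None  # reached end of input without finding a problem
        rec2 = list(islice(src, 4))
        pair_number += 1
        for rec in (rec1, rec2):
            if len(rec) != 4:
                return pair_number, "truncated or missing record"
            if not rec[0].startswith("@") or not rec[2].startswith("+"):
                return pair_number, "malformed record"
            if len(rec[1].strip()) != len(rec[3].strip()):
                return pair_number, "sequence/quality length mismatch"
        if read_name(rec1[0]) != read_name(rec2[0]):
            return pair_number, "mate names differ"
```

If this reports "mate names differ" partway through a file, one possible cause is a read whose mate is missing, which would shift every later pair out of step.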
