Is the bwa used by the BwaSpark tool the latest version, 0.7.15, or not?
The error message does sound like you'll need to supply an interleaved paired reads file. If this is the case, then FastqToSam will do this for you and in the process convert your FASTQ reads to BAM format. This can then be piped from SamToFastq to BWA.
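To illustrate what "interleaved" means here, below is a minimal Python sketch that writes mate 1 followed by mate 2 for each pair. It is not a replacement for FastqToSam. The function names and file paths are hypothetical, and it assumes uncompressed, well-formed FASTQ files with exactly four lines per record, a trailing newline, and the mates in the same order in both files.

```python
from itertools import islice


def fastq_records(handle):
    """Yield successive 4-line FASTQ records as lists of lines."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        yield record


def interleave(fastq1, fastq2, out_path):
    """Write mate 1 then mate 2 of each pair; return the number of pairs written.

    Assumption: both inputs list the mates in identical order. No read-name
    check is done here, and zip() stops at the end of the shorter file.
    """
    pairs = 0
    with open(fastq1) as h1, open(fastq2) as h2, open(out_path, "w") as out:
        for rec1, rec2 in zip(fastq_records(h1), fastq_records(h2)):
            out.writelines(rec1)
            out.writelines(rec2)
            pairs += 1
    return pairs
```

For example, interleave("R1.fq", "R2.fq", "interleaved.fq") would produce a single file in which each R1 record is immediately followed by its R2 mate.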
Also, I got the error "A USER ERROR has occurred: Sorry, we only support a single reads input for spark tools for now" when I ran BwaSpark with "-I R1.fq -I R2.fq" as parameters. Does that mean I need to merge my paired-end reads first?
Hi @shlee, I tried to convert my FASTQs to BAM.
Here is the command I used:
./gatk_launch FastqToSam -SM "test" -F1 test.R1.fastq -F2 test.R2.fastq -O test.spark.sam -SO coordinate -R $ref --STRIP_UNPAIRED_MATE_NUMBER true --VALIDATION_STRINGENCY LENIENT -PL ILLUMINA --CREATE_INDEX true
However, I got the error "In paired mode, read name 1 (HWI-D00377:30:H8EJDADXX:1:2209:8491:93586) does not match read name 2 (HWI-D00377:30:H8EJDADXX:1:2104:20024:30303)". It seems the tool is picky about its inputs. Is there any way to fix this error? Thanks
The tool assumes each of the FASTQ files contains the paired reads in identical order, and the error sounds like something is amiss in this regard. However, from your command's ./gatk_launch, I see you're using GATK4. This is under alpha release, so it's possible the error messaging is off. Just looking at your command, I'd expect the tool to error. The following options would only apply to an aligned BAM: -SO coordinate and --CREATE_INDEX true. That is, these options cannot apply to the queryname-sorted unaligned BAM file that your output would be.
I would recommend you stick to using the latest stable release of Picard. Take a look at the Picard documentation for this tool, and use only the options that make sense for your use. If you get the same error, then take a look at the reads that the tool is saying do not match in each file and make sure there isn't a read missing, e.g. at the end of the file. If you can get the command to work with the stable release, and if it's important that you use GATK4, then at this point you can try your working command in the GATK4 format.
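If you want to check the read order yourself, here is a minimal Python sketch that reports the first record at which the read names in the two files disagree. This is not part of Picard or GATK, and the function names are hypothetical. It assumes uncompressed FASTQ files with exactly four lines per record, and it treats a trailing /1 or /2 on a read name as a mate suffix to ignore.

```python
import sys
from itertools import zip_longest


def read_name(header):
    """Return the read name from a FASTQ header line, without '@' or a /1, /2 suffix."""
    fields = header[1:].split()
    name = fields[0] if fields else ""
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def iter_names(path):
    """Yield the read name of every record, assuming four lines per record."""
    with open(path) as handle:
        for index, line in enumerate(handle):
            if index % 4 == 0:
                yield read_name(line)


def first_mismatch(fastq1, fastq2):
    """Return (record_number, name1, name2) for the first disagreement, else None.

    If one file runs out of reads first, the missing name is reported as None.
    """
    pairs = zip_longest(iter_names(fastq1), iter_names(fastq2))
    for record_number, (name1, name2) in enumerate(pairs, start=1):
        if name1 != name2:
            return record_number, name1, name2
    return None


if __name__ == "__main__" and len(sys.argv) == 3:
    result = first_mismatch(sys.argv[1], sys.argv[2])
    if result is None:
        print("read names agree in both files")
    else:
        print("record %d: %s vs %s" % result)
```

A mismatch reported partway through the files suggests the reads are ordered differently or a read was dropped, while a None on one side means that file ended early.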
Thanks @shlee! I agree with you, and I think my input FASTQs caused the problem, because when I switched to another pair of FASTQs the problem went away.
This question was never completely answered. What version of bwa is being used by the BwaSpark tool?
Hi @davidwb, I'll get back to you on how you can find this out on your own.
Our developer says the commit tag is ec85a56 in the Apache2 branch of the bwa repo https://github.com/lh3/bwa/tree/Apache2. I'm told this corresponds roughly to BWA v0.7.13. This version has not changed in GATK4's development. In the future you will be able to check the BWA version easily using a given jar.
@davidwb, remember that GATK4 is in alpha release. This particular tool in its current state, well, let me just say you should wait for it to come out of experimental status before using it for any production analyses. That being said, if you want to play around with the tool and let us know how it goes, you are of course more than welcome to.