MappingQualityZeroFilter filters a large proportion of reads
I browsed through the forum and found that some other users have had the same problem as me.
I am running the BaseRecalibrator walker. My commands work fine for generating the -grp file and for running PrintReads to produce the new BAM files. However, in the standard error output, I found that a large proportion of reads fail the MappingQualityZeroFilter.
INFO 05:37:33,854 ProgressMeter - Total runtime 51462.91 secs, 857.72 min, 14.30 hours
INFO 05:37:33,855 MicroScheduler - 263828269 reads were filtered out during the traversal out of approximately 660528125 total reads (39.94%)
INFO 05:37:33,855 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
INFO 05:37:33,856 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 05:37:33,856 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO 05:37:33,856 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
INFO 05:37:33,857 MicroScheduler - -> 262748250 reads (39.78% of total) failing MappingQualityZeroFilter
INFO 05:37:33,857 MicroScheduler - -> 1080019 reads (0.16% of total) failing NotPrimaryAlignmentFilter
INFO 05:37:33,857 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
INFO 05:37:35,741 GATKRunReport - Uploaded run statistics report to AWS S3
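The percentages in this summary can be re-derived from the raw counts with a short script. The following is only a sketch in Python, assuming the log lines keep exactly the wording shown above; `summarize_filters` and the two regular expressions are illustrative names of my own, not part of GATK.

```python
import re

# Matches per-filter lines such as:
# "... MicroScheduler - -> 262748250 reads (39.78% of total) failing MappingQualityZeroFilter"
FILTER_LINE = re.compile(r"->\s+(\d+) reads \([\d.]+% of total\) failing (\w+)")
# Matches the overall line that reports filtered and total read counts.
TOTAL_LINE = re.compile(
    r"(\d+) reads were filtered out during the traversal "
    r"out of approximately (\d+) total reads"
)


def summarize_filters(log_text):
    """Return (total_reads, {filter_name: (count, percent_of_total)})."""
    total = None
    counts = {}
    for line in log_text.splitlines():
        m = TOTAL_LINE.search(line)
        if m:
            total = int(m.group(2))
            continue
        m = FILTER_LINE.search(line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    if total is None:
        raise ValueError("no MicroScheduler total line found")
    # Recompute each percentage from the raw counts instead of trusting the log.
    return total, {
        name: (n, round(100.0 * n / total, 2)) for name, n in counts.items()
    }


# Three lines copied from the log above.
log = """\
INFO 05:37:33,855 MicroScheduler - 263828269 reads were filtered out during the traversal out of approximately 660528125 total reads (39.94%)
INFO 05:37:33,857 MicroScheduler - -> 262748250 reads (39.78% of total) failing MappingQualityZeroFilter
INFO 05:37:33,857 MicroScheduler - -> 1080019 reads (0.16% of total) failing NotPrimaryAlignmentFilter
"""

total, stats = summarize_filters(log)
for name, (count, percent) in stats.items():
    print(f"{name}: {count} of {total} reads ({percent}%)")
```

The two non-zero filters account for all 263828269 filtered reads, so nearly all of the loss comes from MappingQualityZeroFilter.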
The high proportion of reads failing MappingQualityZeroFilter makes me hesitant to move on.
My pipeline strictly follows the GATK Best Practices, using BWA-MEM for alignment, and samtools showed that over 90% of reads were mapped to the reference genome. I understand that this may be beyond the scope of support, but I really do not know how to deal with this problem. How should I tackle it? Can I move on with this result? If not, where should I start troubleshooting?
I would like to hear your suggestions.
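A read can count as mapped in the samtools statistics and still carry a mapping quality of 0, so the two numbers are not directly comparable. One way to check this independently of GATK would be to count MAPQ 0 records among the mapped reads. Below is a minimal sketch in Python, assuming tab-separated SAM text records (for example the output of `samtools view` without the header); `count_mapq_zero` and the four example records are hypothetical and only for illustration.

```python
UNMAPPED = 0x4  # SAM FLAG bit set when the read is unmapped


def count_mapq_zero(sam_lines):
    """Return (mapped_reads, mapped_reads_with_mapq_zero) for SAM text lines."""
    mapped = 0
    mapq_zero = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: bitwise FLAG
        if flag & UNMAPPED:
            continue
        mapped += 1
        if int(fields[4]) == 0:  # column 5: MAPQ
            mapq_zero += 1
    return mapped, mapq_zero


# Hypothetical records, not taken from my data.
records = [
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t200\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r4\t0\tchr2\t300\t0\t4M\t*\t0\t0\tACGT\tIIII",
]

mapped, zero = count_mapq_zero(records)
print(f"{zero} of {mapped} mapped reads have MAPQ 0")
```

If the fraction computed this way on the real BAM file is close to the 39.78% reported by GATK, the filter is behaving as expected and the question becomes why the aligner assigned MAPQ 0 to so many reads.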