UnifiedGenotyper and single-end RADtags


I have a RADseq dataset that consists of single-end reads only. I have been using UnifiedGenotyper to analyze these data, but I am seeing something strange: invariant reference sites get a very low quality score (usually between 20 and 30), whereas polymorphic sites get a high quality score. I suspect this is related to a very high proportion of the data failing the 'DuplicateReadFilter': about 95%! These are single-end reads and they are very redundant, but the redundancy is not necessarily (or not only) from PCR. I am concerned that GATK is treating the redundant reads as PCR duplicates when many of them probably are not, and that this causes homozygous reference sites to receive an inappropriately low quality score.
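In case it helps anyone reproduce the number, here is a minimal sketch of how the duplicate fraction could be checked independently of GATK. As far as I understand, the duplicate filter acts on reads that carry the SAM duplicate flag (0x400), so the sketch just counts that flag in SAM text (for example, the output of `samtools view`). The function name `duplicate_fraction` and the example records are made up for illustration.

```python
DUPLICATE_FLAG = 0x400  # SAM flag bit: read marked as a PCR or optical duplicate


def duplicate_fraction(sam_lines):
    """Return (duplicates, total, fraction) for an iterable of SAM text lines."""
    total = 0
    dups = 0
    for line in sam_lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # second SAM column is the bitwise FLAG
        total += 1
        if flag & DUPLICATE_FLAG:
            dups += 1
    fraction = dups / total if total else 0.0
    return dups, total, fraction


# Hypothetical records: two of the four reads carry the duplicate bit.
example = [
    "@HD\tVN:1.4\tSO:coordinate",
    "r1\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t1024\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r3\t1040\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r4\t16\tchr1\t200\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
]

dups, total, fraction = duplicate_fraction(example)
print(f"{dups} of {total} reads flagged as duplicates ({fraction:.0%})")
```

On a real file the lines would come from a SAM stream rather than a hard-coded list; a fraction near 0.95 would confirm that the reads were flagged upstream, before genotyping.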

So my question is: does anyone know a solution to this? Perhaps turning off the 'DuplicateReadFilter' (if that is possible)?

I have attached a screenshot of my VCF file, which includes one polymorphic position; you can see how its quality score differs from those of the adjacent invariant sites.

Thank you in advance for your help!

