If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We appreciate your help!
Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Is it possible to disable DuplicateReadFilter in UnifiedGenotyper?
My understanding of DuplicateReadFilter is that is removes PCR duplicates (ie: identical reads that map to the same location in the genome). Is there a way to prevent UnifiedGenotyper from using this filter?
Many of my 'duplicate' reads are not really duplicates. I have 50bp single ended yeast RNA-seq data, and 10 samples/lane results in highly over-sampled data. Removing duplicates results in an undercounting of reads at the most highly expressed genes, and therefore an increased number of sub-clonal variants at these genes (because the reads from the major clonal population are discarded, but reads of sub-clonal populations are kept because they have a different sequence). At least I think that is what is happening.