ReduceReads and scatter/gather

evakoe Member Posts: 33
edited September 2012 in Ask the GATK team

Hello everyone,
I have a question about ReduceReads when using scatter/gather. In the argument details of ReduceReads you write for the parameter -nocmp_names:
"... If you scatter/gather there is no guarantee that read name uniqueness will be maintained -- in this case we recommend not compressing."

Do you mean that if I use scatter/gather, I should run ReduceReads with the -nocmp_names option so that the read names are not compressed, or do you mean that I should not use ReduceReads at all when scatter/gathering?

I assume the first is meant, but I wanted to make sure. Thank you for your time and effort.


Best Answer

  • Mark_DePristo Administrator, Dev Posts: 153 admin
    Accepted Answer

    Yes, don't compress read names when running ReduceReads scatter/gathered across chromosomes, or you'll end up with many reads with the same read name (bad for the SAM spec).

    Mark A. DePristo, Ph.D.
    Co-Director, Medical and Population Genetics
    Broad Institute of MIT and Harvard
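
To see why independent compression can collide after the gather step, here is a toy sketch in Python. It is not ReduceReads code: the thread does not describe the actual compression scheme, so `compress_names` simply assumes each scatter job renames its reads with its own counter starting at 1, and the read names are made up for illustration.

```python
from collections import Counter


def compress_names(read_names):
    """Toy stand-in for name compression (an assumption, not the GATK scheme):
    each distinct name is replaced by a counter local to this scatter job."""
    mapping = {}
    for name in read_names:
        mapping.setdefault(name, str(len(mapping) + 1))
    return [mapping[name] for name in read_names]


def duplicated_names(names):
    """Return the sorted list of names that occur more than once."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Two hypothetical scatter jobs, e.g. one per chromosome.
chr1_reads = ["HWI-1:100", "HWI-1:101", "HWI-1:102"]
chr2_reads = ["HWI-1:200", "HWI-1:201"]

# Gather step: concatenate the per-job outputs.
gathered_compressed = compress_names(chr1_reads) + compress_names(chr2_reads)
gathered_original = chr1_reads + chr2_reads

print(duplicated_names(gathered_compressed))  # ['1', '2']
print(duplicated_names(gathered_original))    # []
```

Under that assumption, each job's output is unique on its own, but the gathered file reuses the names "1" and "2" for unrelated reads. Leaving the names uncompressed (the -nocmp_names case) keeps them distinct.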


