Reduced Representation

rdubin

Is it possible to use Genome STRiP (preprocess, discovery, genotyper) on reduced representation data? In this case, genomic DNA was digested with PacI, a rare cutter. For each of 16 samples, the digested DNA was run on a gel, and a specific size range was excised and purified; this gel-excised DNA was then used for library construction and sequencing.

If it is possible to use Genome STRiP on such data, could you please tell me how to set -P input.genomeSize, -P input.genomeSizeMale, and -P input.genomeSizeFemale, and how to specify the regions that we wish to examine? We have over 3000 specific regions that were selected, spread across all of the major chromosomes, and I know their total size. Should I use the sum of these regions for the input.genomeSize parameters?

When I provide these 3000 regions to discovery (using the -L parameter pointing to a file containing the 3000 regions), the module fails during MergeDiscoveryOutput, apparently because it runs out of memory (I imagine discovery cannot open 3000 VCF files at once). Note that I also tried using -L with multiple regions during preprocessing, and this fails too.

Any assistance would be greatly appreciated.
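For reference, the summed size of the regions mentioned above can be computed roughly as follows. This is only a minimal sketch: it assumes the regions file lists one interval per line as chrom:start-end with 1-based inclusive coordinates, and the function names are hypothetical, not part of Genome STRiP. Whether this sum is the right value for the input.genomeSize parameters is exactly the open question.

```python
import re

# Assumed interval syntax: chrom:start-end, 1-based, inclusive on both ends.
_INTERVAL = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_interval(text):
    """Parse one 'chrom:start-end' string into (chrom, start, end)."""
    match = _INTERVAL.match(text.strip())
    if match is None:
        raise ValueError(f"Unrecognized interval: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if end < start:
        raise ValueError(f"Interval end precedes start: {text!r}")
    return match["chrom"], start, end


def total_region_size(lines):
    """Return the number of bases covered by the intervals.

    Overlapping or directly adjacent intervals on the same chromosome
    are merged first, so no base is counted twice.
    """
    by_chrom = {}
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end = parse_interval(stripped)
        by_chrom.setdefault(chrom, []).append((start, end))

    total = 0
    for intervals in by_chrom.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + 1:
                # Overlaps or abuts the current run: extend it.
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
    return total
```

To use it on a file, pass the open file handle, e.g. `total_region_size(open("regions.list"))`, where `regions.list` stands in for whatever file is given to -L.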

