
IndelRealign and baseRecalibration on a large whole genome sample

elondin Member Posts: 5

Hi,
I am doing whole-genome sequencing on 2 samples, each of which was run on 12 lanes of a SOLiD 5500 machine. Both are at fairly high coverage, ~40x each. My plan was to align each lane independently, merge all 12 lanes for each sample into 1 large BAM file, and then do the post-processing. I did this and was able to run indel realignment on both samples, but I have been having trouble with the base recalibration step: when I apply the recalibration, PrintReads crashes with an error saying that there is not enough memory available. I have tried changing the tmp directory that is used and every other trick I have been able to find on the forum.

I was wondering whether a suitable alternative would be to perform all of the post-processing on each individual lane first and then merge the lanes together afterwards. Would doing that have any adverse effect on the downstream analysis, e.g. SNPs, CNVs, translocations?
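
As a rough illustration of the per-lane alternative described above, the sketch below only builds and prints the command lines; it does not run anything. It assumes GATK 3.x-style invocations (`-T BaseRecalibrator`, then `-T PrintReads -BQSR`) and a Picard `MergeSamFiles` merge at the end. All file names, jar paths, the known-sites file, and the heap size are hypothetical placeholders, and indel realignment is left out to keep the example short.

```python
import os
from typing import List

# All of these paths are hypothetical placeholders, not values from the post.
GATK_JAR = "GenomeAnalysisTK.jar"
PICARD_JAR = "picard.jar"
REFERENCE = "reference.fasta"
KNOWN_SITES = "known_sites.vcf"


def gatk(tool: str, heap: str = "8g") -> List[str]:
    """Common prefix of a GATK 3.x command; -Xmx sets the JVM heap limit."""
    return ["java", f"-Xmx{heap}", "-jar", GATK_JAR, "-T", tool, "-R", REFERENCE]


def recalibrate_lane(lane_bam: str) -> List[List[str]]:
    """Two commands for one lane: build the recalibration table, then apply it."""
    stem, _ = os.path.splitext(lane_bam)
    table = f"{stem}.recal.table"
    out_bam = f"{stem}.recal.bam"
    return [
        gatk("BaseRecalibrator") + ["-I", lane_bam, "-knownSites", KNOWN_SITES, "-o", table],
        gatk("PrintReads") + ["-I", lane_bam, "-BQSR", table, "-o", out_bam],
    ]


def merge_lanes(recal_bams: List[str], merged_bam: str) -> List[str]:
    """Picard MergeSamFiles command joining the per-lane recalibrated BAMs."""
    cmd = ["java", "-jar", PICARD_JAR, "MergeSamFiles"]
    cmd += [f"I={bam}" for bam in recal_bams]
    cmd.append(f"O={merged_bam}")
    return cmd


def plan_sample(sample: str, n_lanes: int = 12) -> List[List[str]]:
    """Full per-lane plan for one sample: recalibrate every lane, then merge."""
    commands: List[List[str]] = []
    recal_bams: List[str] = []
    for lane in range(1, n_lanes + 1):
        lane_bam = f"{sample}.lane{lane}.bam"  # assumed naming scheme
        commands.extend(recalibrate_lane(lane_bam))
        recal_bams.append(f"{sample}.lane{lane}.recal.bam")
    commands.append(merge_lanes(recal_bams, f"{sample}.merged.recal.bam"))
    return commands


if __name__ == "__main__":
    for command in plan_sample("sampleA"):
        print(" ".join(command))
```

Each PrintReads run in this layout handles a single lane's BAM rather than the merged 12-lane file. Whether that actually avoids the memory error, and whether it affects downstream calls, is exactly the open question in this post.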

Thanks
