GenotypeGVCFs: running out of memory with >50 samples
I ran HaplotypeCaller with "-dcov 500 --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000" to produce 60 GVCF files, and that worked fine.
However, GenotypeGVCFs gets stuck on a position and runs out of memory after about 24 hours, even when I allocate 240 GB. Restricting the run to a short 60 kb region does not help. Here is my command line:
software/jre1.7.0_25/bin/java -Xmx240g -jar GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar -T GenotypeGVCFs -R Reference.fasta -L chrom14:2240000-2300000 --variant 60samples_gvcf.list -o output.vcf
If I split my list of 60 GVCF files into two lists of 30 samples each, GenotypeGVCFs finishes within 15 minutes for each batch (~10 GB of memory).
With 47 samples, the same 60 kb region took 8 hours (31 GB of memory). Once I use more than ~55 samples, the run does not finish and eventually crashes.
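The split into batches described above could be done along these lines. This is only a sketch: the function name `split_gvcf_list` and the `batch_` output prefix are illustrative and not taken from the original commands, and it assumes the list file has one GVCF path per line. As far as I know, GATK 3 only reads a --variant argument as a list of files when the name ends in .list, which is why the pieces are renamed.

```shell
# Split a GVCF list file into batches of at most N lines each.
# Hypothetical helper; the output names batch_aa.list, batch_ab.list, ...
# are chosen for illustration only.
split_gvcf_list() {
    list_file="$1"    # e.g. 60samples_gvcf.list, one GVCF path per line
    batch_size="$2"   # e.g. 30

    # split writes batch_aa, batch_ab, ... into the current directory
    split -l "$batch_size" "$list_file" batch_ || return 1

    # Rename each piece so it carries the .list extension
    for f in batch_??; do
        mv "$f" "$f.list"
    done
}

# Example call (run in the directory that holds the list file):
#   split_gvcf_list 60samples_gvcf.list 30
# Each batch can then be passed to the same GenotypeGVCFs command,
# e.g. with "--variant batch_aa.list -o output_batch_aa.vcf".
```

With a 60-line list and a batch size of 30, this leaves two files, batch_aa.list and batch_ab.list, each naming 30 GVCFs.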
Any help will be much appreciated!