# RUNTIME ERROR(version 1.5-16-g58245bf) - MESSAGE:An error occurred during the traversal.

I'm running the UnifiedGenotyper (UG) on a large number of metagenomic samples (~110) that were mapped with BWA against a database of 671 reference genomes. I know the UG is designed only for diploid data, but I was hoping to use its VCF output to get base frequencies at specific loci, rather than the probability of whatever diploid genotype it predicts.

I'm running the UG with a 63 GB memory allocation and 12 threads. Here is the error message I received:

##### ERROR stack trace

org.broadinstitute.sting.utils.exceptions.ReviewedStingException: An error occurred during the traversal.
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.regex.Pattern$Start.<init>(Pattern.java:3043)
    at java.util.regex.Pattern.compile(Pattern.java:1480)
    at java.util.regex.Pattern.<init>(Pattern.java:1133)
    at java.util.regex.Pattern.compile(Pattern.java:823)
    at java.lang.String.split(String.java:2292)
    at net.sf.samtools.SAMTextHeaderCodec$ParsedHeaderLine.<init>(SAMTextHeaderCodec.java:272)
    at org.broadinstitute.sting.gatk.datasources.reads.SAMDataSource$ReaderInitializer.call(SAMDataSource.java:933)
    at org.broadinstitute.sting.gatk.datasources.reads.SAMDataSource$SAMReaders.<init>(SAMDataSource.java:794)
    at org.broadinstitute.sting.gatk.datasources.reads.SAMDataSource$SAMResourcePool.createNewResource(SAMDataSource.java:753)
    at org.broadinstitute.sting.gatk.datasources.reads.SAMDataSource$SAMResourcePool.getAvailableReaders(SAMDataSource.java:724)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

I'm worried that this error is caused by my using the UG for such a non-traditional experiment (100+ metagenomic samples mapped to 671 reference genomes). I wanted to check here in case it is a more mundane bug and the UG should otherwise work on my input, albeit with genotypes and probabilities called under the false assumption that my data is diploid. All I want is the base frequencies at SNP loci.
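To make concrete what I mean by base frequencies, here is a minimal Python sketch of the post-processing I have in mind. It assumes the VCF records carry a per-sample `AD` (allelic depth) entry in the FORMAT column. The function name `allele_frequencies`, the contig name `genome_001`, and the sample names `S1` and `S2` are all made up for illustration. Since `AD` only covers the REF and ALT alleles emitted for a site, this would give REF/ALT fractions rather than counts for all four bases.

```python
from typing import Dict, List, Optional


def allele_frequencies(
    vcf_line: str, sample_names: List[str]
) -> Dict[str, Optional[List[float]]]:
    """Return per-sample allele fractions (REF first, then ALTs) from the AD field.

    A sample maps to None when it has no usable AD value (no-call, zero depth).
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")  # column 9 is FORMAT
    if "AD" not in format_keys:
        raise ValueError("record has no AD (allelic depth) field")
    ad_index = format_keys.index("AD")

    result: Dict[str, Optional[List[float]]] = {}
    for name, sample in zip(sample_names, fields[9:]):
        values = sample.split(":")
        # No-call samples are often written as "./." with trailing fields dropped.
        if ad_index >= len(values) or values[ad_index] in (".", ""):
            result[name] = None
            continue
        try:
            depths = [int(d) for d in values[ad_index].split(",")]
        except ValueError:
            result[name] = None  # e.g. a "." in place of one allele's depth
            continue
        total = sum(depths)
        result[name] = [d / total for d in depths] if total else None
    return result


# Hypothetical record: REF=A, ALT=G, two samples, the second one a no-call.
line = "genome_001\t1234\t.\tA\tG\t50.0\t.\t.\tGT:AD:DP\t0/1:30,10:40\t./."
freqs = allele_frequencies(line, ["S1", "S2"])
print(freqs)
```

In this example `S1` has 30 reads supporting A and 10 supporting G, so the diploid genotype call would simply be ignored and only the depth fractions kept.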

Hi there,

Geraldine Van der Auwera, PhD