Split'N'Trim error when handling a huge BAM file

nancySEE malaysia Member
edited October 2014 in Ask the GATK team

Hi GATK team, I encountered a problem running Split'N'Trim on a BAM file (file size up to 27G).
It prompts the following error message:

## ERROR MESSAGE:
An error occurred when trying to write the BAM file. Usually this happens when there is not enough space in the directory to which data is being written (generally the temp directory) or when your system's open file handle limit is too small. To tell Java to use a bigger/better file system, use -Djava.io.tmpdir=X on the command line. The exact error was java.io.FileNotFoundException: _tmp/sortingcollection.8357100622428529694.tmp (Too many open files)

I believe the file size is too big to be handled. I've set -Djava.io.tmpdir=Xmx4g, and it gave the same error message. What would you recommend for handling such a large file?


Answers

  • Sheila Broad Institute Member, Broadie, Moderator admin

    @nancySEE

    Hi,

    Right now you are not setting -Djava.io.tmpdir to a proper location. If you want to specify a temp directory on a different filesystem, you should specify a location that exists. The X in your example refers to the location.

    If you want to increase the heap size, you can try java -Xmx4g -jar.

    I hope this makes sense.

    -Sheila
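    Putting Sheila's two points together, a minimal sketch of what the corrected command line could look like. The jar name, walker name, and file names here (GenomeAnalysisTK.jar, SplitNCigarReads, ref.fasta, input.bam) are illustrative placeholders for a GATK 3.x-style run, not taken from the thread:

    ```shell
    # Hypothetical invocation; jar and file names are placeholders.
    tmpdir="$PWD/gatk_tmp"
    mkdir -p "$tmpdir"   # -Djava.io.tmpdir must point at a directory that exists

    # -Xmx4g sets the Java heap; -Djava.io.tmpdir sets the temp location.
    # They are separate flags and cannot be combined into one.
    cmd="java -Xmx4g -Djava.io.tmpdir=$tmpdir \
    -jar GenomeAnalysisTK.jar -T SplitNCigarReads \
    -R ref.fasta -I input.bam -o split.bam"
    echo "$cmd"   # print the assembled command; run it once the paths are real
    ```

    Note that the temp directory should sit on a filesystem with enough free space for the sorting spill files, which can be large for a 27G BAM.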

  • nancySEE malaysia Member

    Thanks Sheila, I apologize for the mistake in my question. What I meant is that I've set the heap size with -Xmx4g and pointed tmpdir to "[working_dir]/_tmp", and it gave the error above. It seems to get stuck due to "too many open files" in the _tmp folder.
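    The "(Too many open files)" part of the exception points at the per-process open file handle limit rather than disk space, which the GATK error message itself mentions as a possible cause. A general POSIX-shell sketch (not a GATK-specific fix) for inspecting and, where the hard limit allows, raising that limit for the current session:

    ```shell
    # Show the current soft limit on open file descriptors.
    ulimit -n
    # Try to raise it for this session; this fails silently if the
    # hard limit is lower, so the failure is ignored here.
    ulimit -n 4096 2>/dev/null || true
    ulimit -n   # show the (possibly raised) limit
    ```

    Raising the limit only affects the current shell and its children, so it must be done in the same session (or job script) that launches the GATK run.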
