
Is GATK DMF-compatible?

Our cluster migrates files to tape on a schedule based on file size, so a file can go offline at any time before or during a GATK program call (e.g. when the file has not been touched during a long program run). It seems to me that GATK (UnifiedGenotyper, v1.5-30-g27e7e17) does not check whether a file is only partially online: when it tries to read from the offline file, it continues with binary zeroes instead of valid data. (If it were a C program, one would look for "open" system calls that use either the O_NONBLOCK or the O_NDELAY flag.)

GATK does not seem to throw a warning or error, and the resulting VCF file looks OK at first glance. However, the DMF system seems to flag the (offline) input file as having been accessed suspiciously, and the overall runtime is inflated.
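To make the concern concrete, here is a rough sketch of a pre-flight check that could run before GATK is started. It is not part of GATK. It assumes that a migrated file keeps its logical size while having fewer blocks allocated on disk, which is a common heuristic for HSM-managed filesystems but is not confirmed here for DMF, and it will also flag ordinary sparse files. The function names are made up for illustration.

```python
import os

def is_under_allocated(size_bytes, blocks, block_unit=512):
    """True if fewer bytes are allocated on disk than the logical file size.

    Assumption: st_blocks counts 512-byte units, as it does on Linux.
    """
    return size_bytes > 0 and blocks * block_unit < size_bytes

def looks_offline(path):
    """Heuristic only: the file may be offline, partially recalled, or sparse."""
    st = os.stat(path)
    return is_under_allocated(st.st_size, st.st_blocks)

def suspicious_inputs(paths):
    """Return the subset of input files (BAM, reference, etc.) that look offline."""
    return [p for p in paths if looks_offline(p)]
```

A wrapper script could call suspicious_inputs() on the BAM and reference files and refuse to launch GATK until the listed files have been recalled with the site's HSM recall command. This only narrows the window: a file can still be migrated after the check during a long run.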

Thanks for your comment.
