Random java.io.FileNotFoundException version 2.7-2-g6bda569

Hi All
We are running into some random weirdness when running jobs under SGE with GATK version 2.7-2-g6bda569. It affects pretty much all GATK tools, but mostly IndelRealigner and UnifiedGenotyper. We often get the following error:

ERROR MESSAGE: Couldn't read file /scratch/project/pipelines/novorecal.bam because java.io.FileNotFoundException: /scratch/project/pipelines/novorecal.bam (No such file or directory)

This also happens for supplied reference genomes and VCF files: the GATK tool can't find them.

These "missing" files do exist, and have often even been created by the previous tool/step in the pipeline.

When we re-run the pipeline on a failed sample, it works without a hitch. So we end up having to re-run our pipeline on the same set of samples multiple times, and we are beginning to find this very frustrating. The errors seem to be random; I can't find any pattern.

Has anyone experienced this? And if so, any recommendations?

Please help



  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Hi Steve,

    This sounds like a quirk of your platform. There are a few things you can check to troubleshoot this. For example, confirm that the "/scratch" directories are actually available across NFS, assuming you're using network mounts. If they are NFS mounts, when do they appear? You can also try running ls on each input file's path before executing your GATK jobs at each step. Queue has an option, at some point called "wait for parts before gather", that exists because newly created files sometimes take a few seconds to become available on the filesystem. Since you said that re-running always works right off the bat, that may well be what's happening here.
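
    The "check that the file is visible before each step" idea could be sketched roughly as below. This is only an illustration, not part of GATK or Queue: the function name wait_for_file and its timeout and interval parameters are made up for this example, and it assumes the problem really is a short delay before new files show up on the shared filesystem.

```python
import os
import time


def wait_for_file(path, timeout=60.0, interval=2.0):
    """Poll until `path` exists and is non-empty, or `timeout` seconds pass.

    Hypothetical helper for illustration; the defaults are arbitrary guesses.
    Returns True if the file became visible, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Rough equivalent of running `ls` on the parent directory first;
            # on some network mounts this may help refresh cached entries.
            os.listdir(os.path.dirname(path) or ".")
            if os.path.getsize(path) > 0:
                return True
        except OSError:
            # Directory or file not visible yet on this node; keep polling.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

    A pipeline wrapper might call wait_for_file("/scratch/project/pipelines/novorecal.bam") just before launching the next GATK step and abort with a clear message if it returns False. If the wait regularly times out on some nodes but not others, that would point at the mounts rather than at GATK.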
