Adjust ulimit open file handle limit on Mac OS X Sierra for GATK VariantsToBinaryPed

gaelgarcia Member
edited May 2018 in Ask the GATK team

Hi,

I'm trying to run the VariantsToBinaryPed tool from GATK3, but it seems that my open file handle limit is too small for it to successfully run.

I've tried increasing the limit, as shown below, but the command still fails.

Is there anything else I can do to avoid this error?

Thank you.

The command:

    java -jar GenomeAnalysisTK.jar \
        -T VariantsToBinaryPed \
        -R Homo_sapiens_assembly38.fasta \
        -V ~/vcf/snp.indel.recal.splitMA_norm.vcf.bgz \
        -m ~/03_IdentityCheck/KING/targeted_seq_ped_clean.fam \
        -bed output.bed \
        -bim output.bim \
        -fam output.fam \
        --minGenotypeQuality 0

returns this error:

    ERROR MESSAGE: An error occurred because there were too many files 
    open concurrently; your system's open file handle limit is probably too small.  
    See the unix ulimit command to adjust this limit or 
    ask your system administrator for help.

Following the advice given here, I ran:

    echo kern.maxfiles=65536 | sudo tee -a /etc/sysctl.conf
    echo kern.maxfilesperproc=65536 | sudo tee -a /etc/sysctl.conf
    sudo sysctl -w kern.maxfiles=65536
    sudo sysctl -w kern.maxfilesperproc=65536
    sudo ulimit -n 65536 65536

and added this line to my .bash_profile and sourced it:

    ulimit -n 65536 65536

So that now, when I run ulimit -n, I get:

    65536

However, I still get the same error from GATK:

    ERROR MESSAGE: An error occurred because there were too many files 
    open concurrently; your system's open file handle limit is probably too small.  
    See the unix ulimit command to adjust this limit or 
    ask your system administrator for help.
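
A quick way to confirm which limits the shell that launches java actually has is sketched below. It assumes a POSIX-style shell such as bash. Since ulimit is a shell builtin, a limit set through `sudo ulimit` applies to a separate process and does not change the current shell. The two sysctl keys are the ones set above and exist on macOS only, so the last command is allowed to fail elsewhere.

```shell
# Soft and hard open-file limits of this shell; child processes such as java inherit them.
soft_limit=$(ulimit -Sn)
hard_limit=$(ulimit -Hn)
echo "soft open-file limit: ${soft_limit}"
echo "hard open-file limit: ${hard_limit}"

# Kernel-wide ceilings on macOS. These keys do not exist on Linux, so errors are ignored.
sysctl kern.maxfiles kern.maxfilesperproc 2>/dev/null || true
```

If the soft limit printed here is lower than expected, the change made in .bash_profile has not reached the shell that runs GATK.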

Answers

  • SkyWarrior Turkey Member ✭✭✭

    su yourself. Then ulimit will be unlimited. An alternative would be to use Docker to run GATK.

  • Thanks @SkyWarrior - what do you mean by "su yourself" ?

  • SkyWarrior Turkey Member ✭✭✭

    Type su username and press Enter. It will ask for your password. Then you will be using the system with elevated privileges. Be careful: great elevation comes with great responsibility.

  • gaelgarcia Member
    edited May 2018

    Thanks - still get the same error after su gaelgarcia.

    It looks like ulimit open files is not affected by this...

    bash-3.2$ ulimit -a
    
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    file size               (blocks, -f) unlimited
    max locked memory       (kbytes, -l) unlimited
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 200000
    pipe size            (512 bytes, -p) 1
    stack size              (kbytes, -s) 8192
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 2048
    virtual memory          (kbytes, -v) unlimited
    
  • SkyWarrior Turkey Member ✭✭✭

    You can try the Docker image.

  • gaelgarcia Member
    edited May 2018

    Hi @SkyWarrior,

    Apologies for the late reply --

    I was able to run VariantsToBinaryPed with the Docker image of GATK3 as you suggested, thank you!

    However, I'm still not getting the desired output from the tool: the .bim and .bed files are empty, although the .fam file is being read in, subset, and ordered correctly according to the sample IDs of the VCF.

    I get the following messages:

    INFO 22:50:32,289 ProgressMeter - done 0.0 10.0 s 18.0 w 100.0% 10.0 s 0.0 s

    INFO 22:50:32,289 ProgressMeter - Total runtime 10.92 secs, 0.18 min, 0.00 hours

    Done. There were 1 WARN messages, the first 1 are repeated below.

    WARN 22:50:19,495 IndexDictionaryUtils - Track variant doesn't have a sequence dictionary built in, skipping dictionary validation

    I am wondering if this issue is related to this - the thread is from a few years ago but perhaps the issue has not been solved?

    Thanks again.

  • Sheila Broad Institute Member, Broadie, Moderator admin

    @gaelgarcia
    Hi,

    Let's see if we can help you in this thread.

    -Sheila
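
On the original "too many files" error, one possibility that this thread does not confirm: the HotSpot JVM on macOS adjusts its own open-file limit at startup and caps it at OPEN_MAX (10240), whatever ulimit -n reports in the shell. The -XX:-MaxFDLimit option turns that adjustment off, so the JVM keeps the limit it inherited. The sketch below assumes the same file paths as the command in the question. It only builds and prints the command; uncomment the last line to run it.

```shell
# Assumption: the JVM's startup adjustment of the descriptor limit, not the
# shell's ulimit, is what caps the number of open files here.
# The command is stored in the positional parameters so each argument stays intact.
set -- java -XX:-MaxFDLimit -jar GenomeAnalysisTK.jar \
    -T VariantsToBinaryPed \
    -R Homo_sapiens_assembly38.fasta \
    -V "$HOME/vcf/snp.indel.recal.splitMA_norm.vcf.bgz" \
    -m "$HOME/03_IdentityCheck/KING/targeted_seq_ped_clean.fam" \
    -bed output.bed \
    -bim output.bim \
    -fam output.fam \
    --minGenotypeQuality 0

# Print the command for inspection.
printf '%s ' "$@"
echo

# "$@"   # uncomment to run the command
```

Whether this resolves the error depends on how many files the tool opens at once, which the thread does not state.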
