What is the Alert "vovlsfd: not enough file descriptors" on FlowTracer?

System Administrator
Altair Employee
edited February 2023 in Altair HPCWorks

When you run many parallel jobs with vovlsfd, several default settings can limit your system's performance. One symptom of such a limit is the alert "vovlsfd: not enough file descriptors". This article explains what is happening when this alert is shown and how to adjust your settings to resolve it.

Note: You need Admin permissions to perform this task.

  1. Open a command prompt by right-clicking on the Start menu and choosing "Command Prompt (Admin)". Enter the following command:
    vovshow -alerts
    This might return the following results:
    sfd > vovshow -alerts
    ID        COUNT LEVEL   FIRST  LAST MODULE
    000237935 625   WARNING 25m42s 3s   vovlsfd vovlsfd: not enough file descriptors
    This alert means that the vovserver does not have enough file descriptors for the current VOVLSFD(slave,max) setting. A second alert may appear alongside it:
    000237936 625   WARNING 25m42s 3s   vovlsfd vovlsfd: not enough client connections
    This alert means that the vovserver does not have enough client connections (maxNormalClients) for the current VOVLSFD(slave,max) setting. The vovlsfd log file has more information on both alerts. Resolving them requires several changes, described in the following steps.
  2. First, locate the vovlsfd log file. To do so, perform the following steps:
    1. Open an xterm from vovconsole (Tools -> Xterm) in the directory printed by vovserverdir -p vovlsfd
    2. There is one log file per date, named vovlsfd.<date>.log
    3. Check the latest file.
    Here is an example of the vovlsfd log:
    vovlsfd 07/28/2017 17:39:02: msg-2: Warning: VOVLSFD(slave,max): 4000
    vovlsfd 07/28/2017 17:39:02: msg-2: Warning: Derated available file descriptors + existing slaves: 881 + 225 = 1106
    vovlsfd 07/28/2017 17:39:02: msg-2: Warning: The vovserver does not have enough client connections (maxNormalClients) for the current VOVLSFD(slave,max) setting
    vovlsfd 07/28/2017 17:39:02: msg-2: Warning: VOVLSFD(slave,max): 4000
    vovlsfd 07/28/2017 17:39:02: msg-2: Warning: Derated available client connections + existing slaves: 1652 + 225 = 1877
    vovlsfd 07/28/2017 17:39:03: msg-2: Warning: The vovserver does not have enough file descriptors for the current VOVLSFD(slave,max) setting.
    vovlsfd 07/28/2017 17:39:03: msg-2: Warning: VOVLSFD(slave,max): 4000
    vovlsfd 07/28/2017 17:39:03: msg-2: Warning: Derated available file descriptors + existing slaves: 881 + 66 = 947
    vovlsfd 07/28/2017 17:39:03: msg-2: Warning: The vovserver does not have enough client connections (maxNormalClients) for the current VOVLSFD(slave,max) setting
  3. If the numbers in the log do not add up to the capacity you need, open vovlsfd/config.tcl and set VOVLSFD(slave,max) to 4000,
    which allows for 4000 simultaneous jobs. This alone will not resolve the problem, however, because other limits are also in place.
  4. Allow the required number of clients. Each slave requires two clients, so 4000 slaves need 2 x 4000 = 8000 clients. In policy.tcl, config(maxNormalClients) is set to 2000, but it should be at least 8000. Also add spare clients for the vovconsole and the vovlsfd daemons, making it 8100.
    Locate policy.tcl with the following command, then edit the file:
    vovserverdir -p policy.tcl
    Then run the following command so that the change is reread:
    vovproject reread
  5. You will still see the file descriptor warning, because the descriptor limit is currently set to 1024. Remove that limit so that FlowTracer can open all the sockets it needs to communicate with 4000 jobs (csh/tcsh syntax):
    unlimit descriptors
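
The sizing arithmetic in the steps above can be checked with a short script. The following is a minimal Python sketch, not part of FlowTracer: the function names required_clients and capacity_gaps are hypothetical, the log format is taken from the example in step 2, and the comparison of each logged total against VOVLSFD(slave,max) is an assumption about how vovlsfd decides to raise the alert.

```python
import re

CLIENTS_PER_SLAVE = 2  # from step 4: each slave requires two clients
SPARE_CLIENTS = 100    # headroom used in this article: 8000 -> 8100


def required_clients(slave_max, spare=SPARE_CLIENTS):
    """Smallest maxNormalClients covering slave_max slaves plus spare clients."""
    return slave_max * CLIENTS_PER_SLAVE + spare


# Matches log lines such as:
#   ... Derated available file descriptors + existing slaves: 881 + 225 = 1106
DERATED = re.compile(
    r"Derated available (file descriptors|client connections)"
    r" \+ existing slaves: (\d+) \+ (\d+) = (\d+)"
)


def capacity_gaps(log_text, slave_max):
    """Map each resource to the shortfall of its lowest logged total.

    Assumption: a resource is insufficient when the logged total is
    below VOVLSFD(slave,max). Resources with enough capacity are omitted.
    """
    lowest = {}
    for kind, _available, _slaves, total in DERATED.findall(log_text):
        total = int(total)
        lowest[kind] = min(total, lowest.get(kind, total))
    return {k: slave_max - cap for k, cap in lowest.items() if cap < slave_max}


if __name__ == "__main__":
    sample = (
        "Derated available file descriptors + existing slaves: 881 + 225 = 1106\n"
        "Derated available client connections + existing slaves: 1652 + 225 = 1877\n"
        "Derated available file descriptors + existing slaves: 881 + 66 = 947\n"
    )
    print(required_clients(4000))
    print(capacity_gaps(sample, 4000))
```

Run against the example log, this reports a gap for both file descriptors and client connections, which is why both maxNormalClients and the descriptor limit have to be raised in addition to VOVLSFD(slave,max).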

You're done. The alert should no longer appear.