Troubleshoot the Database Startup (NetworkComputer & LicenseMonitor)

AlanB_22262
edited February 2023 in Altair HPCWorks

When something goes wrong with the database startup, review the following logs for information.

These files are found in the directory $SWD/vovdbd.

  • vovdbd.*.log - The daemon log, which will show if the request to start failed.
  • vovdb.*.log - The database utility log, which will show if the start itself failed.
  • startdb.*.log - A condensed version of the start output.
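
As a rough sketch of the log review above, the following shell helper prints the newest file for each of the three log patterns and then shows the tail of each one. The function name newest_db_logs and the 40-line tail are illustrative choices, not part of the product, and the sketch assumes SWD is already set by a project-enabled shell.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified file for each
# database log pattern found in the given directory.
# Usage: newest_db_logs DIR
newest_db_logs() {
    dir=$1
    for pattern in 'vovdbd.*.log' 'vovdb.*.log' 'startdb.*.log'; do
        # ls -t sorts newest first; patterns with no match print nothing.
        newest=$(ls -t "$dir"/$pattern 2>/dev/null | head -n 1)
        if [ -n "$newest" ]; then
            echo "$newest"
        fi
    done
}

# Assumption: SWD is set by the project-enabled shell.
# When it is, show the last 40 lines of each newest log.
if [ -n "${SWD:-}" ] && [ -d "$SWD/vovdbd" ]; then
    newest_db_logs "$SWD/vovdbd" | while IFS= read -r f; do
        echo "==== $f ===="
        tail -n 40 "$f"
    done
fi
```

Reading the daemon log first and the utility log second follows the order of the list: the first shows whether the start request failed, the second whether the start itself failed.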

If the logs do not yield anything useful, try the following actions:

  1. Increase the verbosity of database-related output by adding the following line to the vovdbd/config.tcl file: set ::vovutils(verbose) 5
  2. Run the vovdb_util startdb -v -v -v command manually, after enabling your shell for the project.
    Check for 'postgres' processes using the ps and grep commands.
  3. Run the low-level database wrapper from a project-enabled shell: vovdb -v -v -v.
    The system will attempt (at the lowest possible level) to start the database. If you get to this point and it still does not work, there may be a problem on the Postgres side.
  4. Review dmesg, /var/log/messages, and other system messages for clues.
    In one customer case, the database startup hung at 'Starting...'. It updated the licmon.swd/db/config.tcl file and then stalled.

The issue was eventually traced to an NFS mount from a machine that had just been decommissioned. The same machine was also a Samba server, which left smbd processes unable to terminate properly. This eventually put the LicenseMonitor (LM) 'parser' vovslave into the OVRLD state, which stalled license data collection by sampling.
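
The process check from step 2 and the NFS lesson from this case can be sketched in shell as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the mounts table is assumed to be /proc/mounts on Linux, the 5-second limit is arbitrary, and the probe relies on the GNU coreutils timeout command.

```shell
#!/bin/sh
# List processes whose command line mentions postgres.
# The [p] bracket keeps grep from matching its own process entry.
list_postgres() {
    ps -ef | grep '[p]ostgres'
}

# Print the mount points of NFS file systems listed in a mounts table.
# Usage: nfs_mount_points [FILE]   (FILE defaults to /proc/mounts)
nfs_mount_points() {
    # Field 2 is the mount point, field 3 the file system type.
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# Probe one mount point. It is flagged when stat fails or does not
# return within 5 seconds, which can indicate an unreachable server.
probe_mount() {
    if timeout -s KILL 5 stat "$1" >/dev/null 2>&1; then
        echo "ok      $1"
    else
        echo "SUSPECT $1"
    fi
}

# Example use from a project-enabled shell:
#   list_postgres || echo "no postgres processes found"
#   nfs_mount_points | while IFS= read -r m; do probe_mount "$m"; done
```

A mount flagged as SUSPECT is a hint, not a diagnosis. Cross-check it against dmesg and /var/log/messages, as described in step 4.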