GPU integration with UGE

Sylvain Korzennik New Altair Community Member
edited March 18 in Community Q&A

Is there a comprehensive document that explains how to integrate compute nodes that have GPUs with GE 2023.1 (8.8.1)?

I have DCGM (3.1.8.1) running and have adjusted the complex and host configuration based on pp. 215-216 of the admin guide, so qconf -se, qstat and qacct give me cuda/gpu info, but the gpu_usage field remains NONE (in both qstat and qacct). I did run dcgmi stats -v -e, but that was pure guesswork, and it did not enable gpu_usage.
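For anyone trying to reproduce this, the checks I describe above amount to roughly the following. This is only a sketch of how I am inspecting things, not a known-good procedure: the function name gpu_report and its two arguments (an execution host name and a job ID) are placeholders I made up, and the grep patterns assume the relevant entries contain "gpu" or "cuda" in their names.

```shell
#!/bin/sh
# Sketch only: gather the GPU-related state from Grid Engine and DCGM in one go.
# gpu_report, HOST and JOB are hypothetical names, not part of UGE or DCGM.
gpu_report() {
    host="$1"   # execution host to inspect (placeholder)
    job="$2"    # ID of a running job that requested a GPU (placeholder)

    # Stop early if any of the tools is not on PATH.
    for tool in qconf qstat dcgmi; do
        command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool"; return 1; }
    done

    qconf -sc | grep -i -e gpu -e cuda   # complex entries that mention gpu/cuda
    qconf -se "$host"                    # host configuration, incl. complex_values
    dcgmi discovery -l                   # GPUs that DCGM itself can see
    qstat -j "$job" | grep -i gpu        # GPU-related lines for the job
}
```

I call it as gpu_report node01 12345, where node01 and 12345 stand in for one of my GPU hosts and a running job.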

I've come across the three pages https://altair.com/newsroom/articles/GPU-Sharing-with-Altair-Grid-Engine---Part-[I|II|III], but some information appears to be missing. For example, in Part I:

The qconf -mc command (modify complex) enables us to review and edit existing complex resource definitions cluster-wide, and add a new gpu entry as shown:

[something here must be missing]

If the gpu complex resource entry already exists, you can also review and edit it using the qconf -mce command.

The links to the different parts are also wrong or broken!

I'm looking for real, comprehensive documentation on what to do to get all the GPU-relevant configuration and info.