How do I get multiple pbs_tmrsh commands to run in parallel?

I have "pbs_tmrsh" commands executing in a loop. The problem I'm facing is that one "pbs_tmrsh" command has to finish before the next one is executed. I need all of them to run in parallel, as they are part of the same job and the job needs all of them to be running concurrently.
Here's my script that simply runs the hostname and date commands on the allocated nodes.
#PBS -l select="6:ngpus=1"
# Show the allocated nodes in the ${PBS_NODEFILE}.
# Remove the duplicate nodes to avoid running multiple tasks on each node,
for host in $(sort -u "${PBS_NODEFILE}") |
You can see from the output that the second "pbs_tmrsh" command ran only after the 60-second sleep from the first "pbs_tmrsh" command had completed.
[corujor@node003 ~]$ cat pbs_batch10.sh.o822 hostname=node002 : date=Tue Sep 13 14:59:06 CDT 2022 |
I tried adding an ampersand to the end of the line to run the "pbs_tmrsh" command in the background, so that the next "pbs_tmrsh" could run in parallel, as shown below.
pbs_tmrsh $host /bin/bash -c 'echo "hostname=$(hostname) : date=$(date)"; sleep 60' &
However, the "pbs_mom" daemon on the compute node kills the job immediately when the ampersand is used.
09/13/2022 15:18:29;0008;pbs_mom;Job;823.node003;JOIN_JOB as node 1 |
Is there a way to execute multiple pbs_tmrsh commands that are part of the same job concurrently?
Thank you.
Rigoberto
Answers
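I resolved this issue. Two changes together solve the problem: putting an "&" at the end of the "pbs_tmrsh" command so that each one runs as a background task, and adding a "wait" after the for-loop so that the batch script does not exit until the background tasks complete. With the "&" alone, the batch script reaches its end as soon as the loop finishes, while the background tasks are still running.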
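For anyone who finds this later, here is a minimal sketch of the loop with both changes applied. It is not my exact production script. The two "if" blocks at the top are hypothetical stand-ins that I added only so the sketch can be tried outside a PBS job (they replace "pbs_tmrsh" with a local function and build a fake node file with made-up host names); inside a real job they do nothing. The sleep is shortened from 60 to 2 seconds to keep the trial quick, and the "launched" and "elapsed" variables are just for illustration.

```shell
#!/bin/bash
#PBS -l select="6:ngpus=1"

# Assumption: outside a PBS job, pbs_tmrsh and PBS_NODEFILE do not exist,
# so hypothetical stand-ins are defined. Inside a real job these are skipped.
if ! command -v pbs_tmrsh >/dev/null 2>&1; then
    pbs_tmrsh() { shift; "$@"; }    # stand-in: drop the host name, run locally
fi
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    printf '%s\n' hostA hostB hostA > "${PBS_NODEFILE}"    # made-up host names
fi

start=$(date +%s)
launched=0

# Start one background task per unique node; the trailing "&" lets the loop
# move on to the next host without waiting for the current one to finish.
for host in $(sort -u "${PBS_NODEFILE}"); do
    pbs_tmrsh "${host}" /bin/bash -c 'echo "hostname=$(hostname) : date=$(date)"; sleep 2' &
    launched=$((launched + 1))
done

# Block here until every background task has finished, so the batch script
# does not exit while the tasks are still running.
wait

elapsed=$(( $(date +%s) - start ))
echo "launched=${launched} elapsed=${elapsed}s"
```

If the tasks really run concurrently, the reported elapsed time should be close to the length of a single sleep rather than the sum of all of them.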