How to run the job script on the Singularity container?

STa
STa New Altair Community Member
edited March 1 in Community Q&A

Hi all, I deployed a PBS Pro (free demo) cluster with 2 hosts.

  • node138: Altair license manager, server host, execution host
  • node139: execution host

I installed Singularity on both nodes and confirmed that it works.

[pbsdata@node139 ~]$ singularity exec /home/pbsdata/singularity_img/lolcow.sif cowsay moo
 _____
< moo >
 -----
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

 

My goal is to have the commands in the job script run inside the Singularity container without calling singularity explicitly (as in "singularity exec ***.sif command"), so that container management is left to PBS.

The Administrator's Guide gives a sample command for Singularity:

qsub -v CONTAINER_IMAGE=pbsuser/test-image 

So I prepared a Singularity image file and a job script.

#!/bin/sh
#PBS -N lolcowJob
#PBS -l select=1:ncpus=1:host=node139:container_engine=singularity,walltime=00:01:00
#PBS -o output.txt
#PBS -e error.txt
#PBS -v CONTAINER_IMAGE="/home/pbsdata/singularity_img/lolcow.sif"
cowsay moo

But it seems that the container does not start, and the command that exists only in the container is not found.

/var/spool/pbs/mom_priv/jobs/554.node138.SC: line 1: cowsay: command not found

Could you tell me what I am misunderstanding?

Thank you.

Answers

  • STa
    STa New Altair Community Member
    edited February 22

    I would like to add some information about my environment.

    • OS is Rocky Linux 9.3.
    • pbs_version = 2022.1.4.20231010124201
    • singularity-ce version 4.1.0
    • The Singularity image I am using for testing was created with the following command:

    singularity build lolcow.sif library://sylabs-jms/testing/lolcow

    If any information is missing, please let me know.

    Thank you.

  • STa
    STa New Altair Community Member
    edited March 1

    I solved the problem.

    I had forgotten to add the container engine to the node's resources_available:

       qmgr -c "s n node139 resources_available.container_engine += singularity"

     

    There are two cases to consider.

    1. Use a public repository

    Image URI: library://sylabs-jms/testing/lolcow

    Configure container_image_source in the container hook configuration, and specify CONTAINER_IMAGE without the container_image_source prefix.

       # grep container_image_source    /var/spool/pbs/server_priv/hooks/PBS_hpc_container.CF
               "container_image_source": "library://",

       $ grep CONTAINER_IMAGE lolcow.batch
       #PBS -v CONTAINER_IMAGE=sylabs-jms/testing/lolcow

     

    2. Use an image file on the server

    Image Path: /home/pbsdata/singularity_img/lolcow_latest.sif

    Configure container_image_source in the container hook configuration, and specify CONTAINER_IMAGE without the container_image_source prefix.

       # grep container_image_source    /var/spool/pbs/server_priv/hooks/PBS_hpc_container.CF
               "container_image_source": "/home/pbsdata/singularity_img/",

       $ grep CONTAINER_IMAGE lolcow.batch
       #PBS -v CONTAINER_IMAGE=lolcow

    Note that the image file you want to use has to be named in the form "*_*.sif", because PBS modifies the name given in CONTAINER_IMAGE when it looks for the file.

    When you specify a name without ":", PBS looks for a file with "_latest.sif" appended to the name.

       CONTAINER_IMAGE=lolcow -> /home/pbsdata/singularity_img/lolcow_latest.sif

       CONTAINER_IMAGE=lolcow.sif -> /home/pbsdata/singularity_img/lolcow.sif_latest.sif

    On the other hand, when you specify a name with ":", PBS replaces the ":" with "_", appends ".sif" to the result, and looks for that file.

       CONTAINER_IMAGE=lolcow:sample -> /home/pbsdata/singularity_img/lolcow_sample.sif
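
    To check in advance which file name will be looked up, the naming rule above can be sketched as a small shell function. This is only my reconstruction from the three examples above, not the actual PBS hook code; resolve_image is a hypothetical helper name, and I assume that only the first ":" in the name is treated as the separator.

```shell
#!/bin/sh
# Hypothetical helper: predicts the image file name from the behaviour
# observed in this thread. It is NOT taken from the PBS hook source.
resolve_image() {
    source_dir=$1   # value of container_image_source (with trailing slash)
    image=$2        # value of CONTAINER_IMAGE
    case $image in
        *:*)
            # "name:tag" -> split at the first ":" (assumption)
            name=${image%%:*}
            tag=${image#*:}
            ;;
        *)
            # no ":" -> the tag defaults to "latest"
            name=$image
            tag=latest
            ;;
    esac
    printf '%s%s_%s.sif\n' "$source_dir" "$name" "$tag"
}

resolve_image /home/pbsdata/singularity_img/ lolcow
resolve_image /home/pbsdata/singularity_img/ lolcow.sif
resolve_image /home/pbsdata/singularity_img/ lolcow:sample
```

    The three calls print the same paths as the examples above, so if "ls" does not show the printed file name in the container_image_source directory, the job will probably not find the image.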

    Thank you.