
SLURM -- Upgrade from stretch to bullseye



Dear all,

as you can see from this post on the slurm-users mailing list:

https://groups.google.com/g/slurm-users/c/amearC8F6vs

I was a bit stupid and hastily upgraded our cluster from stretch to bullseye without thinking. Initially, I was unable to start slurmctld again because the format of the database seems to have changed. Upgrading the MariaDB tables with `mariadb-upgrade -u root -p` did the trick, and it seems I can now use the cluster again. Hopefully things will go more smoothly than I feared.

I noticed that the package names have changed since stretch, and I wanted to clean up my system. For instance, running `ls -d /etc/slur*` returns

/etc/slurm  /etc/slurm-llnl

but the package slurm-llnl is no longer there. Should I remove /etc/slurm-llnl? (Shouldn't it have been removed by the dist-upgrade?) I am also a bit confused about which packages should be installed on the master and on the computing nodes. I tried running the example from

https://slurm.schedmd.com/quickstart.html

(the one with my.script), but it fails with the following output:

cat my.stdout
FX31
/tmp/slurmd/job06775/slurm_script: 4: srun: not found
/tmp/slurmd/job06775/slurm_script: 5: srun: not found
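
To double-check this on a node, something like the following could be used. It is only a minimal sketch: the function name check_commands is my own, and the list of commands passed to it is just an example of what a job script might call.

```shell
# Report, for each command name given, whether it is found on PATH.
# check_commands is a hypothetical helper, not part of Slurm or Debian.
check_commands() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd: found"
        else
            echo "$cmd: not found"
        fi
    done
}

# Usage on a node (not run here):
#   check_commands srun sbatch squeue
```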

Indeed, srun is not available on the computing nodes. Is this the intended setup? Running `dpkg -l | grep slurm` on the master and on the computing nodes yields:

Master:
ii  slurm-client                   20.11.7+really20.11.4-2 amd64        SLURM client side commands
ii  slurm-wlm-basic-plugins        20.11.7+really20.11.4-2 amd64        SLURM basic plugins
ii  slurm-wlm-doc                  20.11.7+really20.11.4-2 all          SLURM documentation
ii  slurmctld                      20.11.7+really20.11.4-2 amd64        SLURM central management daemon
ii  slurmdbd                       20.11.7+really20.11.4-2 amd64        Secure enterprise-wide interface to a database for SLURM

Computing nodes:
ii  slurm-wlm-basic-plugins        20.11.7+really20.11.4-2 amd64        SLURM basic plugins
ii  slurmd                         20.11.7+really20.11.4-2 amd64        SLURM compute node daemon

Is this the correct installation? I apologize for all these questions, but I could not find a detailed installation procedure for Debian.
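
Regarding the leftover /etc/slurm-llnl directory mentioned above: one way to look for remnants of old packages might be to list everything dpkg reports in state "rc" (removed, configuration files still present). The snippet below is only a sketch under that assumption; list_rc_packages is a name I made up, and I have not confirmed that the old slurm-llnl package would show up this way on our machines.

```shell
# Print the names of packages that dpkg reports in state "rc"
# (removed, but configuration files still present).
# Reads `dpkg -l` output on stdin, so it also works on saved output.
# list_rc_packages is a hypothetical helper name.
list_rc_packages() {
    awk '$1 == "rc" { print $2 }'
}

# Usage on a Debian machine (not run here):
#   dpkg -l | list_rc_packages
#   dpkg -S /etc/slurm-llnl    # asks dpkg which package, if any, claims the path
```

If an old package does appear in that list, it can be inspected before deciding whether to purge it or to remove the directory by hand.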

Best,

Julien

