Anybody familiar with GNU Queue?
Hi All,
I'm trying to set up a machine farm using GNU Queue as the job scheduler,
but I'm having difficulty understanding exactly what I need to do.
The situation is...
I have a number of machines, let's call them MachineA, MachineB, MachineC,
etc.
So my qhostsfile looks like this:
---------------------------------------------
MachineA
MachineB
MachineC
MachineD
MachineE
MachineF
MachineG
MachineH
MachineI
MachineJ
---------------------------------------------
Machines A-F are single-processor with 3GB of RAM.
Machines G-J are dual-processor with 2GB of RAM.
I have a number of different types of job that I want to run, each of which
requires licences that I have a fixed number of. For example, only 6 jobs of
Type 1 can run simultaneously, and only 8 of job Type 2. All jobs are
expected to be run non-interactively.
JobType 1 requires a machine with 3GB of RAM, and so mustn't be allocated a
2GB box or share with other jobs.
JobType 2 can run on 2GB boxes and can share.
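To sanity-check those numbers, here is a rough sketch that writes the constraints down as data and counts the slots each job type could get. None of this is GNU Queue itself: the `Host` and `JobType` classes, the field names and the `slots` function are made up for illustration, and treating "can share" as "one job per CPU" is my own assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    cpus: int
    ram_gb: int


@dataclass(frozen=True)
class JobType:
    name: str
    licences: int    # maximum simultaneous jobs the licences allow
    min_ram_gb: int  # smallest machine the job may be placed on
    can_share: bool  # may run alongside other jobs on the same machine


# Machines A-F: single processor, 3GB. Machines G-J: dual processor, 2GB.
HOSTS = [Host(f"Machine{c}", 1, 3) for c in "ABCDEF"] + [
    Host(f"Machine{c}", 2, 2) for c in "GHIJ"
]

JOB_TYPES = [
    JobType("JobType1", licences=6, min_ram_gb=3, can_share=False),
    JobType("JobType2", licences=8, min_ram_gb=2, can_share=True),
]


def slots(job, host):
    """Jobs of this type that one host could take at once.

    Assumption: a sharing job gets one slot per CPU, a non-sharing job
    gets the whole machine, and a host with too little RAM gets none.
    """
    if host.ram_gb < job.min_ram_gb:
        return 0
    return host.cpus if job.can_share else 1


def total_slots(job, hosts):
    return sum(slots(job, h) for h in hosts)


if __name__ == "__main__":
    for job in JOB_TYPES:
        print(job.name, "licences:", job.licences,
              "slots:", total_slots(job, HOSTS))
```

Under those assumptions JobType1 only fits on the six 3GB machines, which matches its six licences exactly, while the four dual-processor machines on their own already give JobType2 the eight slots its licences allow.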
My plan is to make a queue for each different job type.
# cd /var/lib/queue
# ls -l
drwxr-xr-x 3 root root 512 Feb 27 12:00 JobType1
drwxr-xr-x 3 root root 512 Feb 27 12:01 JobType2
drwxr-xr-x 3 root root 512 Feb 27 11:50 now
drwxr-xr-x 3 root root 512 Jan 16 16:01 wait
So now I'm coming to write the profiles for the queues and I'm getting
stuck (primarily because the documentation is little more than reference
material). I've copied what I've got so far below.
Do these look about right?
Is there any way to ensure jobs don't get started on machines that have only
a small amount of free memory?
Do the rlimit variables set limits (i.e. like the shell limit command)?
Is there any other setup I need to do?
I don't have access to the machine farm at the moment, so I'm trying to
set up as much as possible beforehand; hence I can't try any of this just
yet. Any insights anybody can give me will be useful.
Thanks
Paul
-------------------- Job Type 1 Profile ---------------------
exec on
mail /var/lib/queue/JobType1/mail_log
supervisor /var/lib/queue/JobType1/mail_log2
host MachineA pfactor 100
host MachineB pfactor 100
host MachineC pfactor 100
host MachineD pfactor 100
host MachineE pfactor 100
host MachineF pfactor 100
host MachineG pfactor 1
host MachineH pfactor 1
host MachineI pfactor 1
host MachineJ pfactor 1
maxexec 6
host MachineA vmaxexec 1
host MachineB vmaxexec 1
host MachineC vmaxexec 1
host MachineD vmaxexec 1
host MachineE vmaxexec 1
host MachineF vmaxexec 1
host MachineG vmaxexec 0
host MachineH vmaxexec 0
host MachineI vmaxexec 0
host MachineJ vmaxexec 0
-------------------- Job Type 2 Profile ---------------------
exec on
mail /var/lib/queue/JobType2/mail_log
supervisor /var/lib/queue/JobType2/mail_log2
host MachineA pfactor 1
host MachineB pfactor 1
host MachineC pfactor 1
host MachineD pfactor 1
host MachineE pfactor 1
host MachineF pfactor 1
host MachineG pfactor 200
host MachineH pfactor 200
host MachineI pfactor 200
host MachineJ pfactor 200
maxexec 8
host MachineA vmaxexec 0
host MachineB vmaxexec 0
host MachineC vmaxexec 0
host MachineD vmaxexec 0
host MachineE vmaxexec 0
host MachineF vmaxexec 0
host MachineG vmaxexec 2
host MachineH vmaxexec 2
host MachineI vmaxexec 2
host MachineJ vmaxexec 2
-------------------------------------------------------------
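In case it is easier to review or tweak, the two profiles above could also be generated from a small table instead of being typed by hand. This is only a sketch: it emits exactly the directives I have used above (`exec`, `mail`, `supervisor`, `host ... pfactor`, `maxexec`, `host ... vmaxexec`) and says nothing about whether they are the right ones, which is what I am asking. The function name `render_profile` and its parameters are invented for this example.

```python
def render_profile(queue_dir, maxexec, hosts):
    """Build profile text in the same layout as the profiles above.

    hosts is a list of (hostname, pfactor, vmaxexec) tuples.
    """
    lines = [
        "exec on",
        f"mail {queue_dir}/mail_log",
        f"supervisor {queue_dir}/mail_log2",
    ]
    lines += [f"host {name} pfactor {pf}" for name, pf, _ in hosts]
    lines.append(f"maxexec {maxexec}")
    lines += [f"host {name} vmaxexec {vm}" for name, _, vm in hosts]
    return "\n".join(lines) + "\n"


SINGLE = [f"Machine{c}" for c in "ABCDEF"]  # single processor, 3GB
DUAL = [f"Machine{c}" for c in "GHIJ"]      # dual processor, 2GB

job_type1 = render_profile(
    "/var/lib/queue/JobType1",
    6,
    [(h, 100, 1) for h in SINGLE] + [(h, 1, 0) for h in DUAL],
)

job_type2 = render_profile(
    "/var/lib/queue/JobType2",
    8,
    [(h, 1, 0) for h in SINGLE] + [(h, 200, 2) for h in DUAL],
)

if __name__ == "__main__":
    print(job_type1)
    print(job_type2)
```

Changing a pfactor or a per-host limit then means editing one tuple rather than ten lines.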
--
Paul Sargent
mailto: Paul.Sargent@3Dlabs.com