Re: how to refrain only use certain number of processors
On 20120130_223623, Jochen Spieker wrote:
> lina:
> >
> > Yes. the ultimate goal is:
> >
> > for i in {0..108}
> > do
> > cat A_$i.txt B_$i.txt C_$i.txt -o ABC_$i.txt (output as ABC_$i.txt)
> > done
>
> Ok, so you don't actually have only A_$i filenames, but B_$i and C_$i as
> well. That alone makes my previous approach useless (as I predicted!).
> The other problem is that you need to redirect output (cat doesn't have
> an -o option). This makes things a little bit tricky. The best way to
> deal with both problems is probably to make xargs spawn a new shell
> which receives the current number as positional argument ($1) and uses
> it in multiple places:
>
> $ cat A_1.txt B_1.txt C_1.txt
> a1
> b1
> c1
>
> $ seq 1 3 | xargs --verbose -n1 -P8 -I{} sh -c \
> 'cat "A_$1.txt" "B_$1.txt" "C_$1.txt" > "ABC_$1.txt"' -- '{}'
> sh -c cat "A_$1.txt" "B_$1.txt" "C_$1.txt" > "ABC_$1.txt" -- 1
> sh -c cat "A_$1.txt" "B_$1.txt" "C_$1.txt" > "ABC_$1.txt" -- 2
> sh -c cat "A_$1.txt" "B_$1.txt" "C_$1.txt" > "ABC_$1.txt" -- 3
>
> $ cat ABC_1.txt
> a1
> b1
> c1
>
> This should be quite robust when encountering whitespace in filenames as
> well.
>
> > but here I wish to use only 8 processors at most, total is 16.
> > the administrator of the cluster asked me not to use whole, cause
> > someone else needs SMP server.
>
> Are you sure that your task is CPU-bound? Reading and writing files is
> most probably limited by your storage. Or is cat just another example?
>
> As a sidenote: it took me quite some time to find this solution. I only
> made this effort because I was interested in the solution myself. In the
> future, you should try to present the whole problem upfront or otherwise
> people might get frustrated trying to help you while you keep changing
> the problem. And please trim your quotes more thoroughly.
If I recall correctly, the original for-loop over {0..108} contained
a command that ended in an ampersand.
I think bash offers an option to the jobs builtin that causes it to
emit a report of running background jobs only (jobs -r, if memory
serves). You can combine that with wc -l to get a count of the number
of running jobs. Test this count against your desired upper limit of
running background jobs inside a wait loop. Place this loop before the
done that ends the main loop over $i. The wait loop will keep the code
from going on to the next $i value while there are already as many
jobs running as your limit allows.
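Something along these lines is what I have in mind. It is only a rough
sketch, not tested on your cluster. The function name concat_limited and
its three arguments are made up for illustration, the 0.1 second polling
interval is an arbitrary choice, and it assumes bash plus a sleep that
accepts fractional seconds (GNU coreutils does).

```shell
#!/bin/bash
# Hypothetical helper (name and arguments are assumptions, not from the thread):
# concatenate A_i, B_i, C_i into ABC_i for i = first..last while keeping
# at most max_jobs background jobs running at any one time.
concat_limited() {
    local max_jobs=$1 first=$2 last=$3 i
    for (( i = first; i <= last; i++ )); do
        # Start the work for this $i in the background (note the ampersand).
        cat "A_$i.txt" "B_$i.txt" "C_$i.txt" > "ABC_$i.txt" &

        # Wait loop: jobs -r lists only running jobs, wc -l counts them.
        # Poll until the count drops below the limit before the next $i.
        while [ "$(jobs -r | wc -l)" -ge "$max_jobs" ]; do
            sleep 0.1   # assumed polling interval
        done
    done
    # With no arguments, wait blocks until all remaining jobs have finished.
    wait
}
```

For your case the call would be: concat_limited 8 0 108. The polling is
crude, but it needs nothing beyond the jobs builtin and wc.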
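Bash has a massive man page. The options for the jobs builtin are in
there somewhere (typing help jobs at a bash prompt gives a short
summary). Bash also has a wait builtin, but on its own it doesn't do
what you want: called without arguments, wait waits for all running
background jobs to complete.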
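To show what a bare wait does give you, here is a rough sketch that
runs the work in batches instead. Again the function name
concat_batched and its arguments are made up for illustration. Each
batch has to finish completely before the next one starts, so a single
slow job holds up its whole batch, which is why I would prefer the
jobs-counting loop.

```shell
#!/bin/bash
# Hypothetical helper (name and arguments are assumptions, not from the thread):
# same task as before, but in batches of batch_size jobs separated by wait.
concat_batched() {
    local batch_size=$1 first=$2 last=$3 i started=0
    for (( i = first; i <= last; i++ )); do
        cat "A_$i.txt" "B_$i.txt" "C_$i.txt" > "ABC_$i.txt" &
        started=$(( started + 1 ))
        # After every batch_size jobs, block until ALL of them are done.
        if [ $(( started % batch_size )) -eq 0 ]; then
            wait
        fi
    done
    wait   # pick up the last, possibly partial, batch
}
```

Usage would be: concat_batched 8 0 108.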
HTH
--
Paul E Condon
pecondon@mesanetworks.net