
Re: how to refrain only use certain number of processors

On 20120130_223623, Jochen Spieker wrote:
> lina:
> > 
> > Yes. the ultimate goal is:
> > 
> > for i in {0..108}
> > do
> > cat A_$i.txt B_$i.txt C_$i.txt -o ABC_$i.txt  (output as ABC_$i.txt)
> > done
> Ok, so you don't actually have only A_$i filenames, but B_$i and C_$i as
> well. That alone makes my previous approach useless (as I predicted!).
> The other problem is that you need to redirect output (cat doesn't have
> an -o option). This makes things a little bit tricky. The best way to
> deal with both problems is probably to make xargs spawn a new shell
> which receives the current number as positional argument ($1) and uses
> it in multiple places:
> $ cat A_1.txt B_1.txt C_1.txt 
> a1
> b1
> c1
> $ seq 1 3 | xargs --verbose -n1 -P8 -I{} sh -c \
>     'cat "A_$1.txt" "B_$1.txt" "C_$1.txt" > "ABC_$1.txt"' -- '{}'
> sh -c cat "A_$1.txt" "B_$1.txt" "C_$1.txt" > "ABC_$1.txt" -- 1 
> sh -c cat "A_$1.txt" "B_$1.txt" "C_$1.txt" > "ABC_$1.txt" -- 2 
> sh -c cat "A_$1.txt" "B_$1.txt" "C_$1.txt" > "ABC_$1.txt" -- 3 
> $ cat ABC_1.txt 
> a1
> b1
> c1
> This should be quite robust when encountering whitespace in filenames as
> well.
> > but here I wish to use only 8 processors at most, total is 16.
> > the administrator of the cluster asked me not to use whole, cause
> > someone else needs SMP server.
> Are you sure that your task is CPU-bound? Reading and writing files is 
> most probably limited by your storage. Or is cat just another example?
> As a sidenote: it took me quite some time to find this solution. I only
> made this effort because I was interested in the solution myself. In the
> future, you should try to present the whole problem upfront or otherwise
> people might get frustrated trying to help you while you keep changing
> the problem. And please trim your quotes more thoroughly.

If I recall correctly, the original for-loop over 108 values contained
a command that ended in an ampersand.

Bash's jobs builtin has an option, -r, that restricts its report to
running background jobs. You can pipe that report through wc -l to get
a count of the number of running jobs. Test this count against your
desired upper limit of running background jobs inside a wait loop.
Place this loop before the done that ends the main loop over the 108.
The wait loop will keep the code from going on to the next $i value
while there are already enough jobs running to suit your fancy.

Bash has a massive man page. The options for the jobs builtin are
described there under SHELL BUILTIN COMMANDS. Bash also has a wait
builtin, but on its own it doesn't do what you want: called with no
arguments, it waits for all running jobs to complete.
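
The wait loop described above might look roughly like the sketch
below. It is untested against your real workload and makes several
assumptions: the limit of 8 and the cat command come from earlier in
the thread, the max_jobs and workdir names and the sample input files
are made up so the script runs on its own, and the fractional sleep
assumes GNU sleep.

```shell
#!/bin/bash
# Sketch: cap the number of concurrent background jobs at max_jobs.
max_jobs=8                      # upper limit mentioned in the thread

workdir=$(mktemp -d)            # hypothetical scratch directory for the demo
cd "$workdir" || exit 1

# Create small sample inputs so the sketch is self-contained.
for i in {0..108}; do
    echo "a$i" > "A_$i.txt"
    echo "b$i" > "B_$i.txt"
    echo "c$i" > "C_$i.txt"
done

for i in {0..108}; do
    # The real command goes here, ended with an ampersand.
    cat "A_$i.txt" "B_$i.txt" "C_$i.txt" > "ABC_$i.txt" &

    # Wait loop: hold here while max_jobs or more jobs are still running.
    # jobs -r prints one line per running background job.
    while [ "$(jobs -r | wc -l)" -ge "$max_jobs" ]; do
        sleep 0.1               # fractional seconds assume GNU sleep
    done
done

wait    # no arguments: block until every remaining job has finished
```

The sleep inside the wait loop only keeps the polling from spinning;
a longer interval is fine if the real jobs run for minutes.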


Paul E Condon           
