Re: need help making shell script use two CPUs/cores
On Mon, 10 Jan 2011 12:04:19 -0600, Stan Hoeppner wrote:
> Camaleón put forth on 1/10/2011 8:08 AM:
>> Good. It would be nice to see the results when you finally get it
>> working the way you like ;-)
> 
> Bob's xargs suggestion got it working instantly many hours ago.  I'm not
> sure of the results you refer to.  Are you looking for something like
> "watch top" output for Cpu0 and Cpu1?  See for yourself.
Didn't you run any tests? Okay... (now downloading the sample images)
> 2.  On your dual processor, or dual core system, execute:
> 
> for k in *.JPG; do echo $k; done | xargs -I{} -P2 convert {} -resize
> 3072 {} &
I used a VM to get as close as possible to the environment you seem to
have (a low-resource machine), and the above command (timed) gives:
real	1m44.038s
user	2m5.420s
sys	1m17.561s
It uses 2 "convert" processes, so the files are processed in pairs.
And you can get the job done even faster with -P8:
real	1m25.255s
user	2m1.792s
sys	0m43.563s
No need to have a quad core with HT. Nice :-)
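For what it's worth, the same pipeline can be wrapped so that file names containing spaces survive and the job count becomes a parameter. This is only a minimal sketch, assuming GNU findutils (-maxdepth, -print0, -0 and -P are extensions, not POSIX); run_parallel is a made-up helper name, not something from this thread:

```shell
# Hypothetical helper: run a command once per *.JPG in the current
# directory, at most NJOBS at a time. Every {} in the command is
# replaced by the file name.
run_parallel() {
    njobs=$1
    shift
    # NUL-delimited names keep spaces and quotes in file names intact.
    find . -maxdepth 1 -type f -name '*.JPG' -print0 |
        xargs -0 -P "$njobs" -I{} "$@"
}

# Usage (overwrites the files in place, like the quoted command):
#   time run_parallel 2 convert {} -resize 3072 {}
#   time run_parallel 8 convert {} -resize 3072 {}
```

Timing the function call rather than a backgrounded pipeline also makes sure "time" covers the whole run.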
> Now, to compare the "xargs -P" parallel process performance to standard
> serial performance, clear the temp dir and copy the original files over
> again.  Now execute:
> 
> for k in *.JPG; do convert $k -resize 3072 $k; done &
This gives:
real	2m30.007s
user	2m11.908s
sys	1m42.634s
Which is ~46 s slower than the -P2 run. Not that bad.
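To put the three "real" times side by side, the arithmetic can be done with a few lines of awk. A rough sketch; to_seconds and speedup are names I made up, and it assumes the "NmS.SSSs" format printed by the shell's time keyword:

```shell
# Convert a duration such as "2m30.007s" to seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the time saved and the speedup of a parallel run over a serial run.
# Arguments: serial "real" time, parallel "real" time.
speedup() {
    awk -v s="$(to_seconds "$1")" -v p="$(to_seconds "$2")" \
        'BEGIN { printf "%.1fs saved, %.2fx\n", s - p, s / p }'
}

# Usage with the numbers from this message:
#   speedup 2m30.007s 1m44.038s    # serial vs. -P2
#   speedup 2m30.007s 1m25.255s    # serial vs. -P8
```

With the figures above, that works out to roughly 46 s saved (about 1.44x) for -P2 and roughly 65 s saved (about 1.76x) for -P8.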
> and launch top.  You'll see only a single convert process running. 
> Again, you can wrap this with the time command if you like to compare
> total run times. What you'll find is nearly linear scaling as the number
> of convert processes is doubled, up to the point #processes equals
> #cores.  Running more processes than cores merely eats memory wastefully
> and increases total processing time.
Running more processes than real cores seems fine. Did you try it?
 
> Linux is pretty efficient at scheduling multiple processes among cores
> in multiprocessor and/or multi-core systems and achieving near linear
> performance scaling.  This is one reason why "fork and forget" is such a
> popular method used for parallel programming.  All you have to do is
> fork many children and the kernel takes care of scheduling the processes
> to run simultaneously.
Yep. It handles the processes quite nicely.
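In plain shell, "fork and forget" boils down to "&" plus "wait". A minimal sketch, where work is only a placeholder for the real per-item job (for example a convert call) and fork_all is a made-up name:

```shell
# Placeholder job: replace the body with the real work for one item.
work() {
    sleep 1
    echo "done $1"
}

# Fork one background child per argument and let the kernel schedule
# them across the cores, then wait for all of them to finish.
fork_all() {
    for item in "$@"; do
        work "$item" &
    done
    wait
}

# Usage:
#   fork_all a.JPG b.JPG c.JPG
```

Unlike xargs -P, this starts every child at once with no upper limit, so it is probably only sensible for a small number of items.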
Greetings,
-- 
Camaleón