Re: need help making shell script use two CPUs/cores
On Mon, 10 Jan 2011 12:04:19 -0600, Stan Hoeppner wrote:
> Camaleón put forth on 1/10/2011 8:08 AM:
>> Good. It would be nice to see the results when you finally got it
>> working the way you like ;-)
>
> Bob's xargs suggestion got it working instantly many hours ago. I'm not
> sure of the results you refer to. Are you looking for something like
> "watch top" output for Cpu0 and Cpu1? See for yourself.
Didn't you run any tests? Okay... (now downloading the sample images)
> 2. On your dual processor, or dual core system, execute:
>
> for k in *.JPG; do echo $k; done | \
>   xargs -I{} -P2 convert {} -resize 3072 {} &
I used a VM to get an environment as close as possible to yours (a low-
resource machine), and the above command (timed) gives:
real 1m44.038s
user 2m5.420s
sys 1m17.561s
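To be exact, this is roughly what I timed; it assumes the working
directory holds only the sample .JPG files, and the trailing "&" is
dropped so that "time" can measure the whole pipeline:

  time ( for k in *.JPG; do echo "$k"; done | \
         xargs -I{} -P2 convert {} -resize 3072 {} )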
It uses 2 "convert" processes, so the files are processed in pairs.
And you can get the job done even faster with -P8:
real 1m25.255s
user 2m1.792s
sys 0m43.563s
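Same pipeline, only the -P value changes:

  time ( for k in *.JPG; do echo "$k"; done | \
         xargs -I{} -P8 convert {} -resize 3072 {} )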
No need to have a quad core with HT. Nice :-)
> Now, to compare the "xargs -P" parallel process performance to standard
> serial performance, clear the temp dir and copy the original files over
> again. Now execute:
>
> for k in *.JPG; do convert $k -resize 3072 $k; done &
This gives:
real 2m30.007s
user 2m11.908s
sys 1m42.634s
Which is ~46 s. of extra time compared to the -P2 run (2m30s vs. 1m44s).
Not that bad.
> and launch top. You'll see only a single convert process running.
> Again, you can wrap this with the time command if you like to compare
> total run times. What you'll find is nearly linear scaling as the number
> of convert processes is doubled, up to the point #processes equals
> #cores. Running more processes than cores merely eats memory wastefully
> and increases total processing time.
Running more processes than real cores seems fine here; did you try it?
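If you want to check it on your box, something along these lines should
do (the path to the pristine copies is of course a placeholder):

  # re-copy the untouched originals and time each -P value
  for p in 1 2 4 8 16; do
      cp /path/to/originals/*.JPG .
      echo "== -P$p =="
      time ( for k in *.JPG; do echo "$k"; done | \
             xargs -I{} -P$p convert {} -resize 3072 {} )
  done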
> Linux is pretty efficient at scheduling multiple processes among cores
> in multiprocessor and/or multi-core systems and achieving near linear
> performance scaling. This is one reason why "fork and forget" is such a
> popular method used for parallel programming. All you have to do is
> fork many children and the kernel takes care of scheduling the processes
> to run simultaneously.
Yep. It handles the processes quite nicely.
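In shell terms that "fork and forget" pattern is roughly this; note it
fires off one convert per file all at once, so with hundreds of images
the xargs -P throttle above is the saner option:

  for k in *.JPG; do
      convert "$k" -resize 3072 "$k" &   # fork one child per file
  done
  wait                                   # collect all children before moving on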
Greetings,
--
Camaleón