Re: depending on a customized library



>>>>> "Michael" == Michael K Edwards <m.k.edwards@gmail.com> writes:

Michael> On 5/18/05, Hubert Chan <hubert@uhoreg.ca> wrote:
Hubert> Commenting out those lines, and compiling multi-threaded, gives
Hubert> performance similar to the single-threaded case.  So what does
Hubert> this mean?  I doubt that Ryan will want to disable
Hubert> THREAD_LOCAL_ALLOC Debian-wide.

BTW, I'll ask my upstream to try it too and see if his results agree
with mine.

Michael> It means someone ought to beat on the spin-then-queue locking
Michael> implementation enabled by THREAD_LOCAL_ALLOC until it isn't
Michael> retrograde for the common single-threaded case.  That's really
Michael> a job for oprofile, which I'm starting to get spun up on now;
Michael> but code inspection, informed by some knowledge about NPTL,
Michael> might be enough.

OK.  That's probably beyond me at the moment.

Michael> By the way, if you want to use oprofile, you might as well use
Michael> the 0.8.2 release.  ...

I'll take a look at that if/when I get around to trying oprofile.  It
looks more involved than what I want to take on right now.  (I have
other things that need attention first.)

Hubert> I also tried compiling with THREAD_LOCAL_ALLOC, but using
Hubert> GC_local_malloc instead of GC_malloc, but performance is similar
Hubert> to just using GC_malloc.

Michael> From http://www.hpl.hp.com/personal/Hans_Boehm/gc/scale.html :

scale.html> The easiest way to switch an application to thread-local
scale.html> allocation is to

scale.html> 1. Define the macro GC_REDIRECT_TO_LOCAL, and then include
scale.html> the gc.h header in each client source file.

Yup, did that.

scale.html> 2. Invoke GC_thr_init() before any allocation.

That seems to be a typo.  It should be GC_init().  If I just call
GC_thr_init, I get a segfault when I try to allocate memory.

scale.html> 3. Allocate using GC_MALLOC, GC_MALLOC_ATOMIC, and/or
scale.html> GC_GCJ_MALLOC.

My upstream redefines GC_MALLOC so that it throws a C++ exception if
allocation fails.  So I just edited his redefinition to call
GC_local_malloc instead of GC_malloc (which is what GC_REDIRECT_TO_LOCAL
does anyway).
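
For illustration, a throwing wrapper of that kind might look roughly
like the sketch below.  This is not upstream's actual code: the names
checked_alloc, alloc_fn and always_fail are hypothetical, and the
allocator is passed in as a function pointer so that the sketch builds
without the collector.  In the real build the pointer would presumably
be GC_local_malloc or GC_malloc.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the collector's allocation entry point
// (GC_local_malloc or GC_malloc in the real build).
using alloc_fn = void *(*)(std::size_t);

// Allocate n bytes through `alloc`; throw std::bad_alloc on failure
// instead of handing a null pointer back to the caller.
inline void *checked_alloc(alloc_fn alloc, std::size_t n) {
    void *p = alloc(n);
    if (p == nullptr) {
        throw std::bad_alloc();
    }
    return p;
}

// Hypothetical allocator that always fails, to exercise the error path.
inline void *always_fail(std::size_t) { return nullptr; }
```

With a wrapper like this, switching allocators is a one-line change at
the call site, which is why swapping GC_malloc for GC_local_malloc was
easy to try.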

Michael> Oddly, -DPARALLEL_MARK may improve the situation for UP
Michael> thread-local allocation, because it results in the use of an
Michael> implementation of GC_malloc_many (used to refill thread-local
Michael> free lists) that may be better tuned for thread-local usage
Michael> patterns (as well as more concurrent).

Hmm.  I'll take a look at that.

-- 
Hubert Chan <hubert@uhoreg.ca> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.


