
Bug#465652: libc6: Occasional failed wakeup in pthread_cond_wait



Package: libc6
Version: 2.7-5
Severity: normal


In my multithreaded application, threads blocked in pthread_cond_wait
are occasionally not woken by pthread_cond_broadcast.

Some possibly relevant factors:
* This is a single CPU, single core box
* There are typically 1-3 threads calling pthread_cond_wait
* There's a single global cond used, but each thread has its own lock
* A maintenance thread (although the roles change) acquires all of the
  threads' locks, ensuring they're all asleep.  It then calls
  pthread_cond_broadcast, followed by releasing all their locks
* The maintenance thread does this repeatedly, successfully waking up
  other threads from the cond, as well as repeatedly acquiring and
  releasing the hung thread's lock
* I've verified with my own logging and strace that the maintenance
  thread is acquiring the same lock passed to pthread_cond_wait by the
  hung thread
* A snippet from strace's log (full size 43 megs):
  http://pastebin.com/f41c0c791
* I've verified with gdb that the hung thread is in "#0  0xb7edf820 in
  pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0"
* Attaching and detaching gdb causes the hung thread to wake up and
  finish normally.
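
For readers who want something concrete, the setup in the list above
can be sketched roughly as follows.  This is a minimal illustration,
not the application's real code: the names (struct worker,
worker_main, run_broadcast_demo) and the generation counter used as
the wait predicate are all assumptions made for the example.  Note
that POSIX leaves the behaviour undefined when concurrent waits on one
condition variable use more than one mutex, so this shows the reported
arrangement rather than a recommended design.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 3            /* the report mentions 1-3 waiting threads */

/* One global condition variable shared by every worker. */
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

struct worker {                  /* hypothetical per-thread state */
    pthread_mutex_t lock;        /* each worker has its own lock */
    int generation;              /* bumped by the maintenance thread */
    int seen;                    /* rounds this worker has observed */
    int target;                  /* rounds to observe before exiting */
    pthread_t tid;
};

static struct worker workers[NUM_WORKERS];

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    /* The worker holds its lock except while blocked in the wait. */
    pthread_mutex_lock(&w->lock);
    while (w->seen < w->target) {
        /* Predicate loop: tolerates spurious and early wakeups. */
        while (w->generation == w->seen)
            pthread_cond_wait(&cond, &w->lock);
        w->seen++;
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

/* Acts as the maintenance thread; returns the total rounds observed. */
int run_broadcast_demo(int rounds)
{
    int total = 0;

    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_mutex_init(&workers[i].lock, NULL);
        workers[i].generation = 0;
        workers[i].seen = 0;
        workers[i].target = rounds;
        int rc = pthread_create(&workers[i].tid, NULL,
                                worker_main, &workers[i]);
        assert(rc == 0);
        (void)rc;
    }

    for (int r = 0; r < rounds; r++) {
        /* Acquire every worker's lock, then broadcast, then release. */
        for (int i = 0; i < NUM_WORKERS; i++)
            pthread_mutex_lock(&workers[i].lock);
        for (int i = 0; i < NUM_WORKERS; i++)
            workers[i].generation++;
        pthread_cond_broadcast(&cond);
        for (int i = 0; i < NUM_WORKERS; i++)
            pthread_mutex_unlock(&workers[i].lock);
    }

    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_join(workers[i].tid, NULL);
        total += workers[i].seen;
        pthread_mutex_destroy(&workers[i].lock);
    }
    return total;
}
```

If every broadcast is delivered, run_broadcast_demo(rounds) returns
NUM_WORKERS * rounds; a worker that misses its final wakeup would
instead leave the pthread_join call blocked, which is the kind of hang
described here.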

I was told on IRC about a patch, but was later told it did not affect
x86 (which I'm using).  For posterity, here's what I had written:

Additionally, I was told on IRC of a patch set to glibc's locking code
that came out after 2.7.  I haven't verified whether these would fix
it, or whether they're even related, but it's something to consider.
http://sources.redhat.com/bugzilla/show_bug.cgi?id=5240
Three changed files are linked there.  I was given a 4th on IRC, which
I'm told is a correction.
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/nptl/sysdeps/unix/sysv/linux/lowlevellock.c.diff?cvsroot=glibc&r1=1.18&r2=1.19


-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing'), (1, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 2.6.22-2-k7 (SMP w/1 CPU core)
Locale: LANG=en_CA.UTF-8, LC_CTYPE=en_CA.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages libc6 depends on:
ii  libgcc1                       1:4.2.2-1  GCC support library

libc6 recommends no packages.

-- debconf information excluded


-- 
Adam Olsen, aka Rhamphoryncus


