Bug#174521: Threading bug leading to zombies? / licq problems
[resuming this thread after months. Preserving quoted material because
the list archives have a hole between 18 Sep 2002 and 1 Oct 2002.]
On Sun, Sep 29, 2002 at 06:05:21PM +0200, Christian Jaeger wrote:
> Hello
>
> I've problems running licq on my powerbook G3. Frequently it leaves
> zombies around until at 33 threads (including the zombies) it opens a
> dialog saying that it can't create new threads because of no
> resources being available. (Strangely, ulimit -u says 2560, and a
> perl test script can in fact create that many zombies, so the limit
> must be set somewhere else but I don't care now about that)
>
> I'm noticing some other strange things on my machine:
>
> chris@lombi chris > ps aux|grep ' Z '
> chris 26793 0 0 ? Z Sep24 0:00 [nc <defunct>]
> mysql 28948 0 0 ? Z 06:25 0:00 [mysqld <defunct>]
> mysql 28956 0 0 ? Z 06:25 0:00 [mysqld <defunct>]
> chris 23281 0 0 ? Z 15:26 0:00 [netstat <defunct>]
> (trimmed to fit the line width)
>
> mysqld leaving zombies behind? That must be a bug on my system, it
> would be known pretty fast if that would be a mysql bug.
>
> (netstat is forked off by galeon-bin (huh, what for?? but anyway) so
> that's probably only a bug of galeon)
>
> I'm also seeing segfaults of nedit since about the 7th of August. I've
> already reinstalled libmotif and recompiled/upgraded nedit without
> any change: every time it opens a dialog it crashes. I thought that
> maybe some library is damaged but I ran debsums and except libc6
> which doesn't have md5sums everything seemed ok, and even after a
> recent libc6 security upgrade it's still happening (the nedit crash
> as well as the zombies).
>
> So my hypotheses:
>
> - libc6 might have a ppc dependent bug in thread handling.
> - the kernel I'm running has a bug.
>
> Has anyone else seen those problems?
I've been investigating this problem. I need more data for a bug report.
If anyone else is still observing it, please do the following:
* Compile the attached pthread-test.c program with:
'gcc -o pthread-test pthread-test.c -lpthread'
* Run it while the zombie problem is occurring. Count how many zombies it
creates. Zombies are reported at the end of the run.
* If you're in X, kill X, run pthread-test again, and see how many
zombies it reports.
* If it's now reporting no zombies, or only one or two zombies:
Create /tmp/foo.sh with the contents:
    #!/bin/sh
    sleep 1 &
    sleep 30
Make it executable (chmod +x /tmp/foo.sh) and run it. Wait a second, then,
while it's still running, run pthread-test in another console and see how
many zombies it reports.
* Send me mail (off-list; I'll summarize) with zombie counts, the kernel
version that you're running, your kernel configuration (if it's not a
pre-compiled Debian kernel), and the results of 'dpkg -l libc6' and
'lsmod'. If there's any other information you think pertinent, please
send it, too.
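For anyone unsure what is being counted: a zombie is a child that has exited but that its parent has not yet waited for. The following is only a rough sketch of that life cycle, assuming Linux with /proc mounted. The names proc_state and zombie_demo are made up for illustration and are not part of pthread-test.c.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: one-letter process state from /proc/<pid>/stat,
   or '?' if it cannot be read. */
char proc_state(pid_t pid) {
    char path[64];
    char buf[512];
    char *p;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long) pid);
    f = fopen(path, "r");
    if (f == NULL)
        return '?';
    p = fgets(buf, sizeof buf, f);
    fclose(f);
    if (p == NULL)
        return '?';
    /* The state letter follows the last ')' of the command name. */
    p = strrchr(buf, ')');
    return (p != NULL && p[1] == ' ') ? p[2] : '?';
}

/* Fork a child that exits at once, look at it before reaping, then reap it.
   Returns the state seen before the reap, or '?' on error. */
char zombie_demo(void) {
    siginfo_t info;
    char state;
    int exited;
    pid_t pid = fork();

    if (pid < 0)
        return '?';
    if (pid == 0)
        _exit(0);                       /* child: exit immediately */

    /* WNOWAIT blocks until the child has exited but leaves it unreaped. */
    exited = (waitid(P_PID, (id_t) pid, &info, WEXITED | WNOWAIT) == 0);
    state = exited ? proc_state(pid) : '?';
    waitpid(pid, NULL, 0);              /* reaping removes the zombie */
    return state;
}
```

On a healthy system the state seen before the reap should be 'Z', and the entry should be gone once waitpid() returns. Zombies that linger are children whose parent never gets that far.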
Thanks in advance!
--
William Aoki waoki@umnh.utah.edu /"\ ASCII Ribbon Campaign
B1FB C169 C7A6 238B 280B <- key change \ / No HTML in mail or news!
99AF A093 29AE 0AE1 9734 prev. expired X
/ \
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HOW_MANY_THREADS 10

/* Thread body: print a few lines, then exit. */
void *threadguts(void *d) {
    int id = (int) (intptr_t) d;  /* thread index, passed by value in the pointer */
    int i;

    printf("Thread %i spawned.\n", id);
    for (i = 0; i < 3; i++) {
        printf("Thread %i doing stuff.\n", id);
    }
    pthread_exit(NULL);
}

/* Make a few threads that do stuff, then die. */
int main(void) {
    pthread_t thread[HOW_MANY_THREADS];
    int created[HOW_MANY_THREADS];  /* nonzero if thread[i] is valid */
    pthread_attr_t attr;
    int i;
    int e;

    pthread_attr_init(&attr);
    for (i = 0; i < HOW_MANY_THREADS; i++) {
        printf("Spawning thread %i\n", i);
        e = pthread_create(&thread[i], &attr, threadguts, (void *) (intptr_t) i);
        created[i] = (e == 0);
        if (e != 0) {
            /* pthread_create returns the error code; it does not set errno. */
            printf("pthread_create: %s\n", strerror(e));
        }
    }
    /* Make sure they're dead. */
    for (i = 0; i < HOW_MANY_THREADS; i++) {
        if (!created[i]) {
            continue;  /* never started, so there is nothing to join */
        }
        printf("Waiting to join thread %i\n", i);
        if (pthread_join(thread[i], NULL)) {
            printf("Can't join thread %i\n", i);
        } else {
            printf("Joined thread %i\n", i);
        }
    }
    pthread_attr_destroy(&attr);
    /* This won't work if too many threads were created - it will
       be unable to fork
    */
    system("/bin/ps | grep defunct");
    return 0;
}
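
The ps pipeline at the end of pthread-test.c needs a fork, which is exactly what may fail once the thread limit has been hit. One possible fork-free alternative is to read the process states straight out of /proc. This is only a sketch, assuming Linux with /proc mounted. count_zombies is a hypothetical helper, not part of the attached program, and it counts zombies system-wide, whereas a bare /bin/ps lists only a subset of processes, so the two numbers need not agree.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count processes in state 'Z' by reading
   /proc/<pid>/stat. Returns -1 if /proc cannot be opened. */
int count_zombies(void) {
    DIR *proc = opendir("/proc");
    struct dirent *ent;
    int zombies = 0;

    if (proc == NULL)
        return -1;

    while ((ent = readdir(proc)) != NULL) {
        char path[300];
        char buf[512];
        char *p;
        FILE *f;

        /* Only directories with numeric names are processes. */
        if (!isdigit((unsigned char) ent->d_name[0]))
            continue;

        snprintf(path, sizeof path, "/proc/%s/stat", ent->d_name);
        f = fopen(path, "r");
        if (f == NULL)
            continue;  /* process vanished between readdir and fopen */

        if (fgets(buf, sizeof buf, f) != NULL) {
            /* Line format: pid (comm) state ...
               comm may contain spaces or ')', so search for the last ')'. */
            p = strrchr(buf, ')');
            if (p != NULL && p[1] == ' ' && p[2] == 'Z')
                zombies++;
        }
        fclose(f);
    }
    closedir(proc);
    return zombies;
}
```

In pthread-test.c, a line such as printf("%d zombies\n", count_zombies()); could stand in for the system() call, at the cost of no longer showing which processes are defunct.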