threads uses crash debian
We have been testing the Cyclone News router from HighWind
(www.highwind.com) for the past month and a half, and it has been less
than stable. HighWind seems to have tracked the cause down to the problem
described below. I ran the included program and it did indeed crash the
box. Any ideas, help, or fixes would be appreciated. Please cc all replies
to me as well as support@highwind.com.
Thanks,
paonia
---------
After literally a month of debugging and torture testing, I have isolated the
kernel/machine crash to a specific set of circumstances and produced the
program below which reproduces it.
Here's the skinny:
Under moderate load, we have been seeing one or more of the following:
1) SEGV or Bus Error in some random piece of code
2) Total machine lockup
3) kernel messages similar to the following:
zonda kernel: general protection: 0000
Jan 28 15:00:28 zonda kernel: CPU: 0
Jan 28 15:00:28 zonda kernel: EIP: 0010:[tcp_ack+678/2244]
Jan 28 15:00:28 zonda kernel: EFLAGS: 00010206
...
lots of registers
...
Jan 28 15:00:28 zonda kernel: Call Trace: [udp_v4_rehash+62/104] [shm_close+163/236] [udp_v4_lookup+129/184] [sys_shmat+45/716] [move_addr_to_user+123/156] [do_signal+309/632]
This has been reported on at least 4 different machines with a variety of
different hardware, kernels and libc's up to and including 2.0.33 with
glibc-2.0.6-8.
It appears that the kernel and/or libc is having difficulty under the
following circumstances:
1) High TCP load.
2) High DNS load. Having several simultaneous outstanding DNS requests
seems to greatly increase the probability of a crash, though we HAVE
reproduced it with only 1 DNS thread.
Below is some C code that should reproduce the problem.
Fixing this problem is of CRITICAL importance to the short and long
term success of our products on Linux. Any further help is GREATLY
appreciated.
Thanks,
Bill Waters
HighWind Software
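A practical note on the test program that follows: IP_ADDRESS ships as the placeholder "999.999.999.999", which is not a valid dotted-quad address, so the reverse-lookup thread cannot do useful work until the constant is replaced. The sketch below is one way to sanity-check the value before starting the threads. The helper name is_valid_ipv4 is hypothetical and not part of the original program; it relies only on the standard inet_pton() call.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper (not in the original test.c): returns 1 if text is a
   valid dotted-quad IPv4 address, 0 otherwise. */
static int is_valid_ipv4(const char *text)
{
    struct in_addr addr;

    assert(text != NULL);   /* side-effect-free precondition */
    /* inet_pton() returns 1 on success and 0 for a malformed address. */
    return inet_pton(AF_INET, text, &addr) == 1;
}
```

For example, is_valid_ipv4("127.0.0.1") yields 1 while the unedited placeholder yields 0, so main() could refuse to start until the constant has been set.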
===========================================================================
/*
   file: test.c

   This C program spawns 3 types of threads and lets them run.
   The first simply loops reverse DNS'ing an IP address.
   The second loops forward DNS'ing a hostname.
   The third simply generates network traffic by reading from a neighboring
   machine's chargen port.

   To compile & run:
   1) Set the following constants:
*/
#define HOSTNAME "other.host.yourname.com"  /* For forward DNS & chargen */
                                            /* Best to use a machine besides */
                                            /* the one on which we're running */
#define IP_ADDRESS "999.999.999.999"        /* for reverse DNS */
#define CHARGEN_PORT 19                     /* Probably don't need to change */
/*
   2) Type: gcc -o test test.c -lpthread
   3) Type: ./test

   This crashes, usually generating kernel syslogs and/or crashing the box */
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#define MAXHOSTENT 16384
static pthread_t spawnThread(void *(*function)(void *))
{
    pthread_t tid;
    pthread_attr_t attr;

    /* sanity */
    if (!function) { return 0; }

    /* Initialize thread attributes */
    assert(!pthread_attr_init(&attr));
    assert(!pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED));
    assert(!pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM));
    assert(!pthread_create(&tid, &attr, function, 0));
    assert(!pthread_attr_destroy(&attr));
    return tid;
}
void *reverseDNS(void *arg)
{
    struct hostent ent;
    struct hostent *pRes = 0;
    char *buf = (char *)malloc(sizeof(char) * MAXHOSTENT);
    int err, IP;

    while (1) {
        IP = inet_addr(IP_ADDRESS);
        assert(!gethostbyaddr_r((char *)(&IP), sizeof(int),
                                AF_INET, &ent, buf, MAXHOSTENT, &pRes, &err));
    }
    free(buf);
    return 0;
}
void *forwardDNS(void *arg)
{
    int err;
    char *buf = (char *)malloc(sizeof(char) * MAXHOSTENT);
    struct hostent ent;
    struct hostent *pRes = 0;

    memset(buf, 0, MAXHOSTENT);
    while (1) {
        assert(!gethostbyname_r(HOSTNAME, &ent, buf, MAXHOSTENT, &pRes, &err));
    }
    free(buf);
    return 0;
}
void *generateTraffic(void *arg)
{
    /* Look up the IP */
    struct hostent hp, *pRes = 0;
    char *buf = (char *)malloc(sizeof(char) * MAXHOSTENT);
    int err, fd;
    struct sockaddr_in location;
    char buffer[1024];

    assert(!gethostbyname_r(HOSTNAME, &hp, buf,
                            MAXHOSTENT * sizeof(char), &pRes, &err));
    /* The lookup can return 0 yet leave pRes null; stop before dereferencing */
    assert(pRes != 0);
    memset(&location, 0, sizeof(location));
    location.sin_family = AF_INET;
    memcpy(&location.sin_addr.s_addr, pRes->h_addr_list[0],
           sizeof(location.sin_addr.s_addr));
    location.sin_port = htons(CHARGEN_PORT);
    free(buf);

    /* Create a socket & connect */
    assert((fd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) != -1);
    assert(connect(fd, (struct sockaddr *)(&location),
                   sizeof(location)) != -1);
    while (1) {
        assert(read(fd, buffer, 1024) != -1);
    }
    return 0;
}
int main(int argc, char **argv)
{
    int i;

    for (i = 0; i < 5; i++) {
        spawnThread(forwardDNS);
        spawnThread(reverseDNS);
        spawnThread(generateTraffic);
    }

    /* Sleep without generating signals */
    poll(0, 0, 60 * 1000);
    return EXIT_SUCCESS;
}
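
One caveat about the program above: every pthread, DNS, and socket call is wrapped in assert(), so building with -DNDEBUG would compile those calls away and the test would do nothing. A minimal sketch of an alternative, assuming a hypothetical CHECK macro that is not part of the original code, keeps each call in place regardless of NDEBUG. The fake_call() function is only a stand-in for a real call such as pthread_attr_init().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical macro (not in the original test.c): always evaluates expr
   exactly once, even when NDEBUG is defined, and aborts if it is false. */
#define CHECK(expr)                                              \
    do {                                                         \
        if (!(expr)) {                                           \
            fprintf(stderr, "%s:%d: CHECK failed: %s\n",         \
                    __FILE__, __LINE__, #expr);                  \
            abort();                                             \
        }                                                        \
    } while (0)

static int calls = 0;   /* counts how often the stand-in call ran */

/* Stand-in for a call such as pthread_attr_init(): returns 0 on success. */
static int fake_call(void)
{
    calls++;
    return 0;
}

/* Runs the stand-in through CHECK and reports how many times it executed. */
static int run_checked(void)
{
    calls = 0;
    CHECK(!fake_call());
    assert(calls >= 0);   /* plain assert is fine for side-effect-free checks */
    return calls;
}
```

In spawnThread(), assert(!pthread_attr_init(&attr)) would then become CHECK(!pthread_attr_init(&attr)), and likewise for the other wrapped calls.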