
threads uses crash debian

We have been testing the Cyclone News router from HighWind
(www.highwind.com) for the past month and a half, and it has been less
than stable. HighWind seems to have tracked the instability down to the
problem described below. I ran the included program and it did indeed
crash the box. Any ideas, help, or fixes would be appreciated. Please cc
all replies to me as well as support@highwind.com.

After literally a month of debugging and torture testing, I have isolated the
kernel/machine crash to a specific set of circumstances and produced the 
program below which reproduces it.

Here's the skinny:
Under moderate load, we have been seeing one or more of the following:
1) SEGV or Bus Error in some random piece of code
2) Total machine lockup
3) kernel messages similar to the following:

zonda kernel: general protection: 0000
Jan 28 15:00:28 zonda kernel: CPU:    0
Jan 28 15:00:28 zonda kernel: EIP:    0010:[tcp_ack+678/2244]
Jan 28 15:00:28 zonda kernel: EFLAGS: 00010206
lots of registers
Jan 28 15:00:28 zonda kernel: Call Trace: [udp_v4_rehash+62/104] [shm_close+163/236] [udp_v4_lookup+129/184] [sys_shmat+45/716] [move_addr_to_user+123/156] [do_signal+309/632] 

This has been reported on at least 4 different machines with a variety of 
different hardware, kernels and libc's up to and including 2.0.33 with 

It appears that the kernel and/or libc is having difficulty under the 
following circumstances:

1) High TCP load.
2) High DNS load.  Having one or more simultaneous outstanding DNS requests 
   seems to greatly increase the probability of a crash, though we HAVE
   reproduced it with only 1 DNS thread.
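
For reference, the DNS load in item 2 comes from repeated calls to the reentrant resolver functions. The following is a minimal sketch of one such forward lookup with explicit error handling. It is not HighWind's code: the helper name lookup_host, the 64-byte starting buffer, and the 1 MiB cap are assumptions chosen for illustration. gethostbyname_r is the GNU libc interface that the test program also uses.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* gethostbyname_r is a GNU libc extension */
#endif
#include <assert.h>
#include <errno.h>
#include <netdb.h>
#include <stdlib.h>

/* Hypothetical helper (not in the original report): resolve `name` with
 * gethostbyname_r, doubling the scratch buffer whenever the call reports
 * ERANGE.  On success returns 0, *result points at `ent`, and *out_buf is
 * the scratch buffer that backs it (the caller frees it).  On failure
 * returns -1; *h_err holds the resolver's error code if it was reached. */
static int lookup_host(const char *name, struct hostent *ent,
                       struct hostent **result, char **out_buf, int *h_err)
{
    size_t len = 64;           /* assumed starting size, grows as needed */
    char *buf = NULL;

    assert(name && ent && result && out_buf && h_err);
    *result = NULL;
    *out_buf = NULL;
    *h_err = NETDB_INTERNAL;

    while (len <= ((size_t)1 << 20)) {      /* assumed 1 MiB upper bound */
        char *tmp = realloc(buf, len);
        int rc;

        if (tmp == NULL)
            break;
        buf = tmp;

        rc = gethostbyname_r(name, ent, buf, len, result, h_err);
        if (rc == ERANGE) {                 /* buffer too small: retry */
            len *= 2;
            continue;
        }
        if (rc == 0 && *result != NULL) {   /* success with an entry */
            *out_buf = buf;
            return 0;
        }
        break;                              /* lookup failed */
    }

    free(buf);
    *result = NULL;
    return -1;
}
```

Unlike an assert() around the call, this distinguishes "buffer too small" from "host not found" and never dereferences a null result.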

Below is some C code that should reproduce the problem.

Fixing this problem is of CRITICAL importance to the short and long
term success of our products on Linux.  Any further help is GREATLY
appreciated.


Bill Waters
HighWind Software



/*
 * file: test.c
 *
 * This C program spawns 3 types of threads and lets them run.
 *   The first simply loops reverse DNS'ing an IP address
 *   The second loops forward DNS'ing a hostname
 *   The third simply generates network traffic by reading from a neighboring
 *   machine's chargen port
 *
 * To compile & run:
 *   1) Set the following constants:
 */
#define HOSTNAME "other.host.yourname.com"  /* For forward DNS & chargen */
                                         /* Best to use a machine besides */
                                         /* the one on which we're running */

#define IP_ADDRESS "999.999.999.999"     /* for reverse DNS */
#define CHARGEN_PORT 19                  /* Probably don't need to change */

/*
 *   2) Type: gcc -o test test.c -lpthread
 *   3) Type: ./test
 *      This crashes, usually generating kernel syslogs and/or crashing the box
 */

#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAXHOSTENT 16384

static pthread_t spawnThread(void *(*function)(void *))
{
    pthread_t tid;
    pthread_attr_t attr;

    /* sanity */
    if (!function) { return 0; }

    /* Initialize thread attributes */
    assert(!pthread_attr_init(&attr));
    assert(!pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED));
    assert(!pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM));
    assert(!pthread_create(&tid, &attr, function, 0));

    return tid;
}

void *reverseDNS(void *arg)
{
    struct hostent ent;
    struct hostent *pRes = 0;
    char *buf = (char *)malloc(sizeof(char) * MAXHOSTENT);
    int err, IP;

    /* Loop forever, reverse-resolving the same address */
    while (1) {
	IP = inet_addr(IP_ADDRESS);
	assert(!gethostbyaddr_r((char *)(&IP), sizeof(int),
				AF_INET, &ent, buf, MAXHOSTENT, &pRes, &err));
    }

    return 0;
}

void *forwardDNS(void *arg)
{
    int err;
    char *buf = (char *)malloc(sizeof(char) * MAXHOSTENT);
    struct hostent ent;
    struct hostent *pRes = 0;

    memset(buf, 0, MAXHOSTENT);

    /* Loop forever, forward-resolving the same hostname */
    while (1) {
	assert(!gethostbyname_r(HOSTNAME, &ent, buf, MAXHOSTENT, &pRes, &err));
    }

    return 0;
}

void *generateTraffic(void *arg)
{
    /* Look up the IP */
    struct hostent hp, *pRes;
    char *buf = (char *)malloc(sizeof(char) * MAXHOSTENT);
    int err, fd;
    struct sockaddr_in location;
    char buffer[1024];

    assert(!gethostbyname_r(HOSTNAME, &hp, buf,
                           MAXHOSTENT * sizeof(char), &pRes, &err));
    memset(&location, 0, sizeof(location));
    location.sin_family = AF_INET;
    memcpy(&location.sin_addr.s_addr, pRes->h_addr_list[0],
           sizeof(location.sin_addr.s_addr));
    location.sin_port = htons(CHARGEN_PORT);

    /* Create a socket & connect */
    assert((fd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) != -1);
    assert(connect(fd, (struct sockaddr *)(&location),
                  sizeof(location)) != -1);

    /* Read chargen output forever to keep TCP traffic flowing */
    while (1) {
	assert(read(fd, buffer, 1024) != -1);
    }

    return 0;
}

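int main(int argc, char **argv)
{
    int i;

    /* The loop body is missing from the archived copy.  Spawning one
       thread of each type per iteration is an assumption based on the
       description at the top of this file. */
    for (i = 0; i < 5; i++) {
	spawnThread(reverseDNS);
	spawnThread(forwardDNS);
	spawnThread(generateTraffic);
    }

    /* Sleep without generating signals */
    poll(0, 0, 60 * 1000);

    return EXIT_SUCCESS;
}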
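
The spawnThread() helper above makes its pthread calls inside assert(), so they disappear if the program is built with -DNDEBUG. A minimal sketch of the same detached, system-scope spawn with return codes checked explicitly might look like the following. The names spawn_detached, demo_worker, and spawn_demo are hypothetical and not part of the original test program.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical variant of spawnThread(): returns 0 on success or the
 * pthread error number on failure.  No call with side effects is placed
 * inside assert(), so behavior is the same with or without NDEBUG. */
static int spawn_detached(void *(*function)(void *), void *arg,
                          pthread_t *tid)
{
    pthread_attr_t attr;
    int rc;

    assert(function != NULL && tid != NULL);   /* precondition only */

    rc = pthread_attr_init(&attr);             /* must precede the setters */
    if (rc != 0)
        return rc;

    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0)
        rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_create(tid, &attr, function, arg);

    pthread_attr_destroy(&attr);
    return rc;
}

/* Demo support: the worker posts a semaphore so the caller can tell
 * that the detached thread actually ran. */
static sem_t demo_done;

static void *demo_worker(void *arg)
{
    (void)arg;
    sem_post(&demo_done);
    return NULL;
}

/* Spawns one detached worker and waits for its signal.
 * Returns 0 on success, nonzero on any failure. */
static int spawn_demo(void)
{
    pthread_t tid;
    int rc;

    if (sem_init(&demo_done, 0, 0) != 0)
        return -1;

    rc = spawn_detached(demo_worker, NULL, &tid);
    if (rc != 0)
        return rc;

    while (sem_wait(&demo_done) != 0) {
        if (errno != EINTR)                    /* retry only on a signal */
            return -1;
    }
    return 0;
}
```

In the test program, each spawnThread(fn) call could be replaced by spawn_detached(fn, NULL, &tid) followed by a check that the result is 0.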
