
Re: Bind9 stopped after 34 days of uptime



Hi list, here I am again, and a happy New Year to all :)

----- Original message -----
On Wed, 25 Dec 2002 20:54:02 +0100 (CET)
Richard <richard@unixguru.nl> wrote in message
<20021225205127.W51437-100000@mail.unixguru.nl>:

> On Wed, 25 Dec 2002, J.Reilink wrote:
> 
> > I've had exactly the same on our corporate primary nameserver
> > (Slackware with bind 9.2.1). Because there was no logging, I couldn't
> > find out why bind stopped working.
> 
> Take a look at memory usage when Bind stops working and monitor for
> some time how much memory Bind is using. If that amount is growing,
> Bind probably has a memory leak (it wouldn't be the first time :( ).
> 

The .pid file stays small[1] and so does the memory usage[2].

# uptime
 11:43pm  up 179 days, 52 min,  1 user,  load average: 0.03, 0.07, 0.0

The first crash was 8 days ago, so it first crashed after an uptime of
(approx.) 171 days. In these last 8 days, Bind has crashed 3 times
without any apparent reason.

After the last crash yesterday I remembered CA-2002-19
<http://www.cert.org/advisories/CA-2002-19.html> and thought this might
be the problem. Because there was no logging[3] I'm still not sure, but
I've reconfigured syslog.conf to also log kernel messages, which it
didn't at first.

Our Bind version is not vulnerable (Bind 9.2.1) but perhaps our libc
version (Bind is dynamically linked) is. As far as I can tell it is
older than 2.2.5 [4], and I did see some strange messages in dmesg about
``UDP: bad checksum''. In fact, this is the reason why I turned on the
kernel messages and installed tcpdump :)
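A rough sketch of how the libc question could be checked more directly
than by listing /usr/lib. It asks the dynamic linker which libc the
binary resolves and compares the reported version against 2.2.5. The
path /usr/sbin/named is an assumption (adjust NAMED_BIN), the helper
names are made up for illustration, and 2.2.5 is only the figure quoted
in this message, not something checked against the advisory.

```shell
#!/bin/sh
# Sketch only. ASSUMPTIONS: named is installed as /usr/sbin/named, and
# the threshold 2.2.5 is simply the version mentioned in this message.

# version_lt A B: succeeds when dotted version A sorts strictly below B.
version_lt() {
    [ "$1" = "$2" ] && return 1
    lowest=$(printf '%s\n%s\n' "$1" "$2" |
        sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
    [ "$lowest" = "$1" ]
}

# glibc_version: last word of the first line printed by `ldd --version`.
glibc_version() {
    ldd --version 2>/dev/null | head -n 1 | awk '{print $NF}'
}

# Show the libc that the dynamic linker resolves for the binary, if present.
NAMED_BIN=${NAMED_BIN:-/usr/sbin/named}
if [ -x "$NAMED_BIN" ]; then
    ldd "$NAMED_BIN" | grep 'libc\.so' || true
fi

v=$(glibc_version)
if [ -n "$v" ] && version_lt "$v" 2.2.5; then
    echo "glibc $v is older than 2.2.5"
else
    echo "glibc version: ${v:-unknown}"
fi
```

On a glibc system the library itself can usually also be executed
(for example /lib/libc.so.6) to print its version banner.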

Perhaps it's a DoS? To be honest, I'm waiting until the next crash to
be sure (to see what the logs are telling me).
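While waiting for the next crash, the memory monitoring suggested in
the quoted reply could be done with a small sampler along these lines.
It is a minimal sketch: the function names, the log file and the
interval are invented for illustration, and the pid file path is the
one from footnote [1].

```shell
#!/bin/sh
# Sketch only. ASSUMPTIONS: the log file name and the sampling interval
# used in the usage example are hypothetical.

# sample_rss PID: print "epoch-seconds PID resident-set-size-in-KB".
# Fails (non-zero status) when the process does not exist.
sample_rss() {
    rss=$(ps -o rss= -p "$1" 2>/dev/null | tr -d ' ')
    [ -n "$rss" ] && echo "$(date +%s) $1 $rss"
}

# watch_pid PID LOGFILE INTERVAL: append one sample per interval until
# the process can no longer be found, then note that in the log.
watch_pid() {
    while sample_rss "$1" >> "$2"; do
        sleep "$3"
    done
    echo "$(date) pid $1 no longer running" >> "$2"
}
```

It could be started in the background with something like
watch_pid "$(cat /var/run/named/named.pid)" /var/log/named-mem.log 300 &
so that the last lines of the log show whether memory was growing and
roughly when named went away.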

Since we don't use Debian :( it's rather off-topic on this list, but
perhaps it's interesting enough? :)

[1] /var/run/named# ls -la
-rw-r--r--    1 root     root            5 Jun  6  2002 named.pid

[2] According to ps aux and top, it stays around 6.2% and the process is
running 5 times.

[3] There are some known issues with our secondary nameserver: it
generates a lot of error messages in /var/log/*. That's why logging was
off as much as possible.

[4] 	/usr/lib# ls -la |grep libc
	....
	-rw-r--r--    1 root     root      2347326 May 28 2001 libstdc++-3-libc6.2-2-2.10.0.a
	-r-xr-xr-x    1 root     root       274724 May 28  2001 libstdc++-3-libc6.2-2-2.10.0.so*
	lrwxrwxrwx    1 root     root           30 Jun  1  2002 libstdc++-libc6.2-2.a.3 -> libstdc++-3-libc6.2-2-2.10.0.a
	lrwxrwxrwx    1 root     root           31 Jun  1  2002 libstdc++-libc6.2-2.so.3 -> libstdc++-3-libc6.2-2-2.10.0.so

Guess these are the files I'm looking for...

Regards, Jan

-- 
/"\  ASCII Ribbon Campaign
\ /  No HTML in mail or news!
 X
/ \ 		DSINet: http://www.dsinet.org


