
no buffer space available



Hi!

We are a small ISP, and we are experiencing a strange problem with
our two PPPoE servers. They run Debian Etch, with all software from
the official repositories. Our management prefers the Gentoo Linux
distribution, so as soon as they become aware of this problem they
will blame Debian first. ;-)

The machines have identical hardware: an nvidia-based motherboard
(desktop-grade stuff, nobody asked us when the hardware was bought),
a Realtek 8139 network card, a dual-core AMD64 processor, 2 GB of RAM,
and on-board SATA (nvidia mcp51). Each machine has about 50 VLAN
interfaces, each carrying a private /24 network. Services include
DHCP, BIND, and PPPoE in kernel mode.

Each server has about 500 PPPoE interfaces and carries about
10 Mbit/s of traffic on average. They run fine for some time (a week
or two), then the following lines appear in the system log:

Mar 20 11:45:57 pppoe1 named[4267]: client 10.1.67.154#1049: error
sending response: not enough free resources
Mar 20 11:45:58 pppoe1 named[4267]: client 10.1.55.135#1164: error
sending response: not enough free resources

The kernel also logs:

Mar 20 11:49:45 pppoe1 kernel: Neighbour table overflow.
Mar 20 11:49:50 pppoe1 kernel: printk: 26 messages suppressed.
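
For anyone who wants to compare numbers: the "Neighbour table
overflow" message is normally tied to the neighbour cache limits
(gc_thresh1..3), so a minimal sketch for collecting the relevant
figures could look like the one below. It assumes the standard Linux
paths /proc/net/arp and /proc/sys/net/ipv4/neigh/default/; the
function names are made up for illustration, and the entry count is
only approximate.

```python
from pathlib import Path


def count_arp_entries(arp_text):
    """Count entries in /proc/net/arp-style text (first line is a header)."""
    lines = [ln for ln in arp_text.splitlines() if ln.strip()]
    return max(len(lines) - 1, 0)


def read_gc_thresholds(base="/proc/sys/net/ipv4/neigh/default"):
    """Return gc_thresh1..3 as ints, or None where a file cannot be read."""
    result = {}
    for name in ("gc_thresh1", "gc_thresh2", "gc_thresh3"):
        try:
            result[name] = int(Path(base, name).read_text().strip())
        except (OSError, ValueError):
            result[name] = None
    return result


if __name__ == "__main__":
    try:
        entries = count_arp_entries(Path("/proc/net/arp").read_text())
    except OSError:
        entries = None  # not on Linux, or /proc not mounted
    print("ARP entries:", entries)
    print("thresholds:", read_gc_thresholds())
```

If the entry count sits close to gc_thresh3 while the problem is
present, that would at least point at the neighbour cache rather than
at the applications.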

When I run 'ping localhost' or 'telnet localhost 22', these commands
fail most of the time with the error 'no buffer space available'.
Strace shows that the 'connect' system call fails with the ENOBUFS
error code. After several attempts a command may succeed, but then it
fails again. What is strange is that when I ping or telnet to
neighbouring routers, it works on every attempt.
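
To make the failure rate easier to quantify than with repeated manual
telnet runs, something like the following sketch could be used. It
simply repeats a TCP connect and tallies the outcome by errno name;
the function name, the attempt count and the timeout are arbitrary
choices for illustration, not part of any existing tool.

```python
import errno
import socket


def probe_connect(host, port, attempts=10, timeout=2.0):
    """Try TCP connects and tally results by errno name ('OK' on success)."""
    tally = {}
    for _ in range(attempts):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(timeout)
        try:
            s.connect((host, port))
            key = "OK"
        except socket.timeout:
            key = "TIMEOUT"
        except OSError as e:
            # e.g. 'ENOBUFS' or 'ECONNREFUSED'; fall back to the raw number
            key = errno.errorcode.get(e.errno, str(e.errno))
        finally:
            s.close()
        tally[key] = tally.get(key, 0) + 1
    return tally


if __name__ == "__main__":
    # Compare loopback against another address on the same machine.
    print("127.0.0.1:22 ->", probe_connect("127.0.0.1", 22))
```

Running it against 127.0.0.1 and against a neighbouring router should
show whether ENOBUFS is really limited to loopback destinations.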

I have googled a lot, but have not found anything useful. Most posts
are about FreeBSD. Where this happens on Linux machines, some suggest
that the loopback interface may not be configured (that is not the
case here; by the way, the loopback interface also carries an
additional real IP address that serves as the server side of the
PPPoE connections), or that the network cards are bad.

Once the problem appears, only a reboot fixes it. I tried shutting
down all processes except sshd, in the hope that some process was
holding kernel structures that would be freed when it exited. No luck.

At this point, I suspect that this is a kernel issue (maybe specific
to our unfortunate hardware).

Any hints will be greatly appreciated. I am ready to provide any
additional info if required. If the problem is not solved, sooner or
later we will be forced to reinstall with Gentoo Linux, which imho
will not help.. :-/

The kernels are custom-compiled from kernel.org sources, not Debian
stock kernels. The versions are 2.6.20.1 and 2.6.18.2.

--
Timur Irmatov, xmpp:irmatov@jabber.ru