
Memory Black Hole



Hi all,
The subject of this message could have been 'Memory Leak' but honestly
that doesn't sound dramatic enough for my problem.

Warning: this turned out to be a longish email.  If you're impatient,
please skip to the section marked SUMMARY at the bottom.

Last summer (2004) I installed Debian Woody on a P3 to use as a
development platform for PHP and MySQL.  Everything worked perfectly. 
No complaints.

Then, three weeks ago, a very odd problem started to plague me.  At
first I saw only the symptoms.  I was editing php pages, and suddenly
vim wouldn't start.  It complained about being unable to load linked
libraries.  I checked my library file (it was fine).  The next time I
tried vim it worked fine.  I was mystified.  The symptom recurred.  I
upgraded to version 3.1 (sarge) with apt-get upgrade.  The process did
not go smoothly, halting several times when various libraries could
not be loaded, mostly while a Perl script was executing.

I muddled through the upgrade and rebooted the computer.  From this
point on I had no problems with loading linked libraries.  The
computer rebooted fine and all services worked as before.  I started
KDE and the computer almost ground to a halt.  top showed me that all
256 MB of RAM were used up.  I quit KDE and shut down X.  No change.

I brought down Apache, Apache-ssl, Exim, pop3ad, mysqld, and smbd.  No
change.  I actually ended up halting every process except for kernel
processes, 5 gettys, 1 bash, and 1 top.  The memory was not freed.  I
unloaded every kernel module (mostly iptables and ethernet device
drivers), and the memory was still not freed.

I rebooted the computer so that I could monitor the memory usage from a
blank slate.  This is the behaviour I noticed:  Every five seconds or
so a chunk of memory averaging between 4k and 16k would be allocated
and never freed.  I watched long enough to assure myself of the
pattern, then started killing off processes one by one.  After each
process exited, the memory was still being leaked.  I continued until
I was down to the bare processes listed above, and no modules.  In
less than two hours, all my memory was gone, save for 4 or 5k.  The
leaking stopped there, not cutting into swap space.
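
For anyone who wants to reproduce the measurement without staring at
top, here is a rough sketch of how the sampling could be automated.
It assumes /proc/meminfo has "MemFree:", "Buffers:" and "Cached:"
lines with values in kB.  The function names (meminfo_kb,
read_meminfo, watch_meminfo) are made up for this illustration and
are not from any library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Return the kB value of a "Key:   12345 kB" line in meminfo-style text,
   or -1 if no line starts with that key. */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            return strtol(line + klen + 1, NULL, 10);
        line = strchr(line, '\n');
        if (line != NULL)
            line++;                 /* step past the newline */
    }
    return -1;
}

/* Read /proc/meminfo into buf as a NUL-terminated string.
   Returns 0 on success, -1 if the file cannot be opened. */
int read_meminfo(char *buf, size_t size)
{
    FILE *f = fopen("/proc/meminfo", "r");
    size_t n;

    if (f == NULL)
        return -1;
    n = fread(buf, 1, size - 1, f);
    buf[n] = '\0';
    fclose(f);
    return 0;
}

/* Print MemFree (with the change since the previous sample), Buffers and
   Cached, `samples` times, sleeping `interval` seconds between samples. */
int watch_meminfo(int samples, unsigned interval)
{
    char buf[8192];
    long prev = -1, free_kb, delta;
    int i;

    for (i = 0; i < samples; i++) {
        if (read_meminfo(buf, sizeof buf) != 0)
            return -1;
        free_kb = meminfo_kb(buf, "MemFree");
        delta = (prev < 0) ? 0 : free_kb - prev;
        printf("MemFree %ld kB (delta %ld)  Buffers %ld kB  Cached %ld kB\n",
               free_kb, delta,
               meminfo_kb(buf, "Buffers"), meminfo_kb(buf, "Cached"));
        prev = free_kb;
        if (i + 1 < samples)
            sleep(interval);
    }
    return 0;
}
```

Calling watch_meminfo(720, 5) from a small main() would log about an
hour of samples at the five-second interval described above.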

At this point I think one of three things has happened.  There is a
memory leak in the kernel (I have 2.4.18 at this point, not likely). 
Or there is a hidden process gobbling up all my memory.  Or there is a
physical problem that is preventing memory from being deallocated.  I
briefly considered the possibility that top was reporting memory usage
incorrectly, but a brief inspection of /proc (especially /proc/mem,
more on this later) showed that this was not the case.

I began with a tool to show hidden processes, psreal.  It found none. 
I downloaded kernel sources for 2.6.8, configured and built that
kernel, figuring this would both A) show hidden processes missed by
psreal and B)  solve my unlikely problem of a kernel memory leak.

Although the 2.6.8 kernel had no better luck holding onto the memory
than the 2.4.18 kernel did, there was a change in behaviour; I was
losing memory in 60k chunks now.  This seems like an important clue,
but I can't decipher it.

What I've Done in My Futile Attempts to Diagnose or Repair this Problem
*)
To test the kernel's ability to deallocate memory, I wrote a program
that looked something like this:

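#include <stdlib.h>

int main(void) {
  char *p;  int i = 1024;
  /* Request ever-larger blocks, never freed or touched, until malloc fails. */
  while ((p = malloc(i)) != NULL) i += 1024;
  return 0;
}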

There was a bit more to it than that, but I didn't write or access the
malloc'd memory in any way.  This showed expected behaviour, quickly
gobbling up memory and then returning it when it exited or was killed.

*)
I dumped /proc/mem into a file on a workstation and opened it up in a
hex editor, to examine all 256 MB of data.  This is interesting, and
it seems like a very important clue that I am incapable of
deciphering.  I am finding that some files from the root filesystem
have been inserted into the memory.  A tarball from one user's
directory appears twice in main memory.  /etc/passwd appears 11 times
in memory!

The root partition is reiserfs.  The boot partition, which is always
mounted, is ext2.  There are no files from /boot in memory that I can
find.
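
Counting copies of a file in the dump by hand in a hex editor is
tedious.  A minimal sketch of doing the count in code follows,
assuming the dump (or a chunk of it) has already been read into a
buffer.  count_pattern is a hypothetical helper name, and a pattern
such as the first line of /etc/passwd is only an example of what one
might search for.

```c
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of pat (plen bytes) in buf (len bytes).
   Compares raw bytes, so NUL bytes inside a memory dump are handled. */
size_t count_pattern(const unsigned char *buf, size_t len,
                     const unsigned char *pat, size_t plen)
{
    size_t count = 0, i = 0;

    if (plen == 0 || plen > len)
        return 0;
    while (i + plen <= len) {
        if (buf[i] == pat[0] && memcmp(buf + i, pat, plen) == 0) {
            count++;
            i += plen;      /* skip past this match */
        } else {
            i++;
        }
    }
    return count;
}
```

For a dump of 256 MB the file would have to be read in pieces, and a
match that straddles two pieces would be missed unless the pieces
overlap by at least plen - 1 bytes.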

---------------------------
SUMMARY:
Files or chunks of files from the root (reiserfs) partition are being
inserted into memory at the rate of 4-16k/5 secs (2.4.18) or 60k/5
secs (2.6.8).  This memory is never freed.  This insertion is not
being caused by any user space program.  If the only programs running
are kernel processes, getty, bash, and top, it will still occur. 
Memory will be eaten up until about 5k is left, and then it
stabilizes.  Swap space will not be used.  This behaviour occurred
under Debian Woody and Sarge.  Sarge was tested with kernels of
version 2.4.18 and 2.6.8.

Does anyone have any idea what could possibly be causing this?  Even
pointers to other references would be greatly appreciated.


