Re: Follow-up to Error Message...
Seth Kurtzberg wrote:
Have you run a hardware test on the system memory? I've seen something
very similar on a machine that had an intermittent memory problem. I
realize that the trace doesn't precisely suggest that; nevertheless I've
seen it happen (of course the specifics such as the LBA, etc. were
different). Especially since this seems to occur intermittently and,
apparently, at different times (that is, it doesn't always happen at the
same % done), it is definitely worth testing the memory hardware.
The indication that once any memory error occurs, _every_ subsequent
operation involving memory allocation fails is suggestive. While it is
possible that they are all caused by the same thing (that is, there is
just one error propagating the error message up the stack), it doesn't
look that way to me. Flushing the cache, I _believe_, is a separate
Of course, it doesn't have to be hardware; it could be an OS bug
related to memory handling. That also sounds improbable but it is quite
likely that this application stresses the memory handling code more than
any other. I would put my bet on hardware.
Apologies if this has already been suggested and I missed it.
Jeff Ross wrote:
Not a very descriptive subject line, and I apologize for that.
When running the backup script I posted in the previous message this
morning from the command line, I got the following error message:
89.98% done, estimate finish Wed Jan 19 09:26:09 2005
90.36% done, estimate finish Wed Jan 19 09:26:09 2005
90.74% done, estimate finish Wed Jan 19 09:26:09 2005
91.12% done, estimate finish Wed Jan 19 09:26:09 2005
:-( unable to WRITE@LBA=1248c0h: Cannot allocate memory
builtin_dd: 1198272*2KB out @ average 2.2x1385KBps
:-( write failed: Cannot allocate memory
/dev/rcd0c: flushing cache
:-( unable to FLUSH CACHE: Cannot allocate memory
:-( unable to SYNCHRONOUS FLUSH CACHE: Cannot allocate memory
I actually got this 2 times in a row--the first time it failed at 67%.
Curiouser and curiouser...
Thanks for the suggestion. Given the overall stability of this system:
jross@samba:/home/jross $ uptime
3:10PM up 208 days, 21:54, 1 user, load averages: 0.10, 0.10, 0.08
I'm going to have a hard time believing this is a memory problem. I
just downloaded and am running memtest now, so we'll see.