
Re: computer freezes, no obvious reason



On Thu, Oct 11, 2007 at 10:58:36AM -0400, Kaloyan M. Penev wrote:
> Hi,
> 
> I recently installed debian stable. Immediately after that I downloaded
> and compiled myself the latest kernel version: 2.6.22.8. Since my machine
> is quad core I enabled SMP and other things that seemed appropriate. The
> system boots up and works almost fine, however I am experiencing the
> following weird problems:
> 
> When I try to compile something with g++ every now and then I get the
> following message:

Compiling is a great way to find hardware problems. My first thought
is bad memory. Use a systematic approach: test each stick in each
slot, one at a time. Lots of people recommend memtest, but I find it
doesn't always find the errors.
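
One rough way to make that systematic is to rerun the exact same compile
many times with each stick/slot combination and count how often it fails.
This is only a sketch: the function name and the run count are my own
choices, and matrix.cpp is simply the file from your error message. A
command that fails only some of the time on identical input suggests
hardware rather than the source code.

```python
import subprocess
import sys


def count_failures(cmd, runs=20):
    """Run the same command `runs` times and return how many runs failed.

    `cmd` is an argument list, e.g. ["g++", "-c", "matrix.cpp", "-o", "/dev/null"]
    (a hypothetical invocation; substitute whatever build step crashes for you).
    """
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # Any non-zero status counts, including a compiler killed by a signal.
        if result.returncode != 0:
            failures += 1
    return failures
```

Call it as count_failures(["g++", "-c", "matrix.cpp", "-o", "/dev/null"])
after each hardware change and note which configuration brings the
count to zero.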

> 
> ./matrix.cpp:602: internal compiler error: Segmentation fault
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <URL:http://gcc.gnu.org/bugs.html> for instructions.
> For Debian GNU/Linux specific bug reporting instructions,
> see <URL:file:///usr/share/doc/gcc-4.1/README.Bugs>.
> The bug is not reproducible, so it is likely a hardware or OS problem.
> 
> Also because I had to install the fglrx driver from the radeon web page
> (v 8.41.7), which for some reason requires libstdc++5, and everything else
> requires libstdc++6 when I compile opengl applications I get a warning
> that there may be possible conflicts between libstdc++5 and libstdc++6 and
> sometimes the compiled executable works and sometimes does not.
> 
> Also every now and then everything just freezes and the only thing left
> for me to do is press the reset button. Looking at kern.log and syslog
> shows nothing suspicious around the time this happens. I did search
> through kern.log and syslog for suspicious messages and I have attached all
> I found below (perhaps there are more):
> 
> kern.log:
> potato kernel: No NUMA configuration found
> potato kernel: Faking a node at 0000000000000000-000000007fef0000
> potato kernel: swsusp: Registered nosave memory region: 000000000009b000 -
> 000000000009c000
> 
> potato kernel: usb 2-2: reset high speed USB device using ehci_hcd and
> address 2 (this repeats many times while running, every few minutes if I
> use the device)
> 
> potato kernel: ClientGUI[7156] general protection rip:2b6259cf66ce
> rsp:7fff534bf4c0 error:0
> potato kernel: ClientGUI[7296]: segfault at ffffffffffffffe8 rip
> 00002b9d486c76d4 rsp 00007fff64aecaf0 error 4
> potato kernel: ClientGUI[8543] general protection rip:2b2c244416ce
> rsp:7fff88d72d80 error:0
> potato kernel: ClientGUI[8706]: segfault at 0000000000000110 rip
> 00002b363e36c6ce rsp 00007fff6ee47e50 error 4
> (where ClientGUI is an opengl application that I am writing, uses
> wxwidgets, gl and glu)
> 
> Oct  4 15:03:26 potato kernel: WARNING: at fs/buffer.c:570 __remove_assoc_queue()
> Oct  4 15:03:26 potato kernel:
> Oct  4 15:03:26 potato kernel: Call Trace:
> Oct  4 15:03:26 potato kernel:  [<ffffffff802991b6>] drop_buffers+0xa4/0xec
> Oct  4 15:03:26 potato kernel:  [<ffffffff80299255>] try_to_free_buffers+0x57/0x9b
> Oct  4 15:03:26 potato kernel:  [<ffffffff8026030f>] shrink_inactive_list+0x4ed/0x83a
> Oct  4 15:03:26 potato kernel:  [<ffffffff8025eef4>] __pagevec_release+0x19/0x22
> Oct  4 15:03:26 potato kernel:  [<ffffffff8025fc40>] shrink_active_list+0x483/0x491
> Oct  4 15:03:26 potato kernel:  [<ffffffff804a7c8c>] thread_return+0x0/0xdb
> Oct  4 15:03:26 potato kernel:  [<ffffffff80260750>] shrink_zone+0xf4/0x11d
> Oct  4 15:03:26 potato kernel:  [<ffffffff8026116e>] kswapd+0x2da/0x489
> Oct  4 15:03:26 potato kernel:  [<ffffffff8023e10e>] autoremove_wake_function+0x0/0x2e
> Oct  4 15:03:26 potato kernel:  [<ffffffff80260e94>] kswapd+0x0/0x489
> Oct  4 15:03:26 potato kernel:  [<ffffffff8023dfee>] kthread+0x47/0x75
> Oct  4 15:03:26 potato kernel:  [<ffffffff8020a378>] child_rip+0xa/0x12
> Oct  4 15:03:26 potato kernel:  [<ffffffff8023dfa7>] kthread+0x0/0x75
> Oct  4 15:03:26 potato kernel:  [<ffffffff8020a36e>] child_rip+0x0/0x12
> 
> 
> The same messages repeat of course in syslog.
> 
> The last symptom that I notice is that when I mount an external SATA over
> USB disk it starts reading from the disk and it takes it about 30 sec to a
> minute to stop (the same exact disk does not behave that way when I plug
> it into another computer running debian).

I don't think this is necessarily indicative of anything unless you
know you've got the same packages installed on both machines, although
the repeated USB reset message above may point to a flaky controller or
driver. Look at the package differences between the two systems first,
as that's fairly easy to compare. Then run tail -f on the logs while
using that disk to see if those errors crop up during the spin-up time,
and try plugging into a different USB port to see if you can get on a
different controller and start isolating that as a potential problem
(lsusb helps here).
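
For the package comparison, something along these lines might do. It is
a sketch that assumes you have saved the output of "dpkg-query -W" on
each machine (one "name<TAB>version" line per package); the function
names are made up for illustration.

```python
def parse_packages(text):
    """Parse 'name<whitespace>version' lines into a {name: version} dict.

    Lines without a version field (packages dpkg knows about but that are
    not installed) are skipped.
    """
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            packages[parts[0]] = parts[1]
    return packages


def diff_packages(a, b):
    """Return (only_in_a, only_in_b, version_differs) as sorted name lists."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differs = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_a, only_b, differs
```

Read each saved file with open(path).read(), pass the text through
parse_packages, and hand both dicts to diff_packages. Anything
USB-related, storage-related or kernel-related in the three lists is
where I would look first.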

Finally, don't discount the possibility of a failing power supply. A
failing power supply can wreak subtle havoc everywhere.

A
