Stable computing hardware (was: Re: AMD vs. Intel)
Most of this is stuff I gathered from various sources, and some of my nearly
forgotten experience with designs for highly-stable and durable generator
static field controllers.
On Wed, 07 Apr 2004, Christian Schnobrich wrote:
> On Wed, 2004-04-07 at 12:44, martin f krafft wrote:
> > I am a huge fan of AMD, not only because their processors are
> > cheaper.
> > Recently, however, I have experienced random crashes on two machines
> > that run AMDs. The crashes seem to be related to IO and happen
Let me guess: VIA chipset? I have an A7V motherboard that does the
same, unpredictably. The PCI bus just hangs the entire machine. After that
one, I tried to learn a thing or two about common consumer computer
Now I know why we paid US$1000 for i386-based industrial controllers to run
the generators ;-)
> I have no first-hand experience, but from stories I heard it seems that
> Intel has some benefits over AMD. Stories like USB stuff going haywire
> after every two-dozen (dis)connects and other more or less obscure,
> hard-to-track-down issues that might well never turn up in home use.
This is probably related to the chipset or motherboard, and not the CPU
itself. AMD CPUs (and chipsets made BY AMD) are good stuff.
Decide if you want performance or stability, and buy your system (and
components) accordingly. If you need easy to find stable systems for mostly
desktop use, you will probably have to go with an Intel motherboard and
Intel CPU. If you can hunt for uncommon hardware, you'll find very stable
AMD solutions, too.
If you get a motherboard with one of those "for gamers and casual users"
chipsets (i.e. not *made* by AMD, Intel or ServerWorks AFAIK), don't expect
a computer that never flips bits at the wrong time. If you get a
motherboard from someone that specializes in boards for gamers, you will get
a "gamer" board no matter what they tell you.
The same goes for your memory modules. A rule of thumb is that if it
doesn't have a lifetime warranty, it should not be anywhere close to a
computer that cannot crash. Also, be careful with "extreme performance"
memory modules: being conservative is good for continued stability. For
anything above 256MB of RAM in the system, get ECC modules. People don't
seem to grasp just how much luck they need to avoid a bit flip with today's
memory sizes, if they leave the computer running for any extended period of
time.
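To put a rough shape on that luck argument: if bit flips arrive
independently at some constant rate per gigabyte-hour (a Poisson model,
which is my assumption here, not something measured), the chance of at
least one flip grows with both memory size and uptime. The sketch below is
only illustrative; the function name and the rate in the usage loop are
made-up placeholders, so plug in a figure you trust for your own modules.

```python
import math

def p_at_least_one_flip(flips_per_gib_hour, gib, hours):
    """Chance of one or more bit flips, assuming independent (Poisson) events."""
    expected_flips = flips_per_gib_hour * gib * hours
    # 1 - exp(-x), computed with expm1 to stay accurate for tiny x
    return -math.expm1(-expected_flips)

# PLACEHOLDER rate, chosen only to show the trend; NOT a measured value.
ASSUMED_RATE = 1e-5

for gib in (0.25, 1, 4):
    p = p_at_least_one_flip(ASSUMED_RATE, gib, hours=24 * 365)
    print(f"{gib} GiB, one year of uptime: P(flip) = {p:.3f}")
```

Whatever rate you assume, the trend is the same: more memory and longer
uptime both push the probability up.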
ECC memory is far more resilient to corruption. It WILL experience bit
flips as often as common memory, obviously... But you need two bit flips *in
a certain area* (that must happen before the affected area is accessed
again), to get memory corruption. That is far less likely to happen.
Really bad electric or electromagnetic noise will defeat ECC memory, though.
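As a toy illustration of why one flip is harmless and a second one in the
same word is not, here is a sketch of a SECDED (single-error-correct,
double-error-detect) code: Hamming(7,4) plus one overall parity bit. Real
ECC modules typically protect 64-bit words with 8 check bits and do this in
hardware; the 4-bit word size and the names `encode` and `decode` are
simplifications I picked for the example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (list of 0/1)."""
    code = [0] * 8  # index 0: overall parity, indexes 1..7: Hamming positions
    # Data bits live at the non-power-of-two positions 3, 5, 6, 7.
    for bit, pos in enumerate((3, 5, 6, 7)):
        code[pos] = (nibble >> bit) & 1
    # Each parity bit p covers every position whose index has bit p set.
    for p in (1, 2, 4):
        for i in range(1, 8):
            if i != p and (i & p):
                code[p] ^= code[i]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1s even
    return code

def decode(code):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i  # XOR of set positions is 0 for a valid codeword
    parity_error = sum(code) % 2
    fixed = list(code)
    if syndrome == 0 and not parity_error:
        status = "ok"
    elif parity_error:
        # Odd number of flips: treat as a single flip at index `syndrome`
        # (syndrome 0 means the overall parity bit itself flipped).
        fixed[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a non-zero syndrome: two bits flipped.
        return None, "uncorrectable"
    nibble = fixed[3] | (fixed[5] << 1) | (fixed[6] << 2) | (fixed[7] << 3)
    return nibble, status

word = encode(0b1011)
word[5] ^= 1                 # one bit flip
print(decode(word))          # the flip is repaired
word[2] ^= 1                 # a second flip before the word is read again
print(decode(word))          # now it can only be reported, not repaired
```

With this kind of code a double flip is at least detected instead of being
silently handed to the CPU; it takes more flips in the same word to slip
through unnoticed.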
You need a top-notch power supply and good cooling too, of course. Most
power supplies aren't adequate for error-free operation. You have to
handpick them. And the good ones ain't cheap.
Bad power supplies will KILL your system. Usually the memories go first,
then the disks, and finally the motherboard and PCI cards. This can be
silent, as power fluctuations can degrade the characteristics of components
throughout the system slowly. Eventually it gets to a point where something
important gives in, causing a crash. You can fix whatever broke, but the
system as a whole will never be as stable as it once was.
I suggest premium top-of-line power supplies from specialized vendors: ones
that come with active PFC and, if at all possible, *without* variable-speed
fans (unless the fan is driven by a closed-loop PWM control). They're
(usually) better in the electric noise department. If you can find a power
supply that has a PWM closed loop for fan control, please tell me, I want
one :-).
The problem is that these power supplies are often noisy as all heck... and
they can cost 10x more than the usual crap power supplies you find
everywhere. Yes, there are very good power supplies that don't cost as
much. Good luck finding one, and get your power-supply validation
workbench, with a non-linear load tester and an oscilloscope, ready...
Aged electronics are also a no-no in modern computers. I have seen old XT
computers last for 10 years (as long as you didn't power them down). New
machines seem to need an electronic component overhaul after 3 years to keep
their stability -- high power drain and very tight timing constraints come
with a price. And I don't see tantalum capacitors in motherboards often;
electrolytic capacitors are pretty much timebombs with a short fuse.
In short: AFAIK, unless you take great pains (in either time or money), what
you will get when you buy a common computer is pretty much a product that
will work right 99% of the time if it is a "top of line gamer product", or
60% of the time if you go with the really crap cheap-o products for the
"value customer". It will work for no more than 3 years before it degrades
noticeably. It doesn't matter what CPU it uses, usually.
If your system uses so-so components, downclock the system and relax the
memory timings (downclocking the PCI bus can cause all sorts of stability
problems depending on the specific components and PCI boards connected to
it, so test it first).
Schedule a 24-hour memtest86+ run every month. But try to get good memory
and at least a reasonably good power supply. Nothing will save you from
constant crashes if these two are el-cheap-o crap.
> It could also be that some chip is insufficiently cooled, perhaps the
> NIC that is mounted in the lowest slot. And so on. The list is endless.
Indeed. Get top-quality components, if you want stability. Don't go with
first-revision chipsets or CPUs, they often have very scary errata. And
never push the system.
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot