
New Test Tool



Hi,

We're in the process of assembling a new i386 cluster here at the Bilkent CS Dept.,
but we're not very sure about the quality of the hardware we have purchased. The
computing nodes are part of a research grant, but they were used for 2-3 days at a
convention organized by our magnificent institute of science and technology, which
very willingly gave us the grant. :) Anyway, our goal is to make sure that all nodes
are in good order. That means testing each component of every machine individually,
and the idea of manually stress-testing computing nodes that have no video, keyboard
or mouse didn't sound very attractive to me.

Instead, I thought the test process could be automated to some extent. I considered
implementing a client/server test system for the final cluster, using some custom
TCP protocol, or I don't know, perhaps XML over HTTP :), to transmit test requests and
replies. A request would say which tests to perform, and the reply would contain the
results of those tests. I think it could be made pretty much text-based, using /proc and
the output of familiar tools and logs. OTOH, that's not very easy to do, so I decided to
first do a very lame hack: a test disk that boots a customized kernel and a root image, so
that the node starts up, tries to mount some NFS dir, performs some selected tests (like
testing the hd) and writes the results back to a simple file on the NFS mount. Initially, I
think testing the hd and reporting on the status of the network config is enough. If it
boots up and does that, it'd be fantastic.
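
To make the node-side part of the lame hack a bit more concrete, here is a minimal
sketch in Python. Everything in it is an assumption for illustration: the function names,
the "key: value" report format, the <node>.txt file naming, and the choice of a simple
write/read-back pattern as the "hd test". Since the read-back may be served from the
page cache, it is only a smoke test, not a real surface scan.

```python
import os
import socket


def parse_net_dev(text):
    """Parse the text of /proc/net/dev into {interface: (rx_bytes, tx_bytes)}."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = counters.split()
        if len(fields) >= 9:
            # field 0 is received bytes, field 8 is transmitted bytes
            stats[name.strip()] = (int(fields[0]), int(fields[8]))
    return stats


def disk_pattern_test(scratch_dir, size_mb=4, block_size=1 << 20):
    """Write a known pattern to a scratch file, read it back and compare."""
    path = os.path.join(scratch_dir, "disktest.tmp")
    block = bytes(range(256)) * (block_size // 256)
    try:
        with open(path, "wb") as f:
            for _ in range(size_mb):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        with open(path, "rb") as f:
            for _ in range(size_mb):
                if f.read(block_size) != block:
                    return False
        return True
    except OSError:
        return False
    finally:
        if os.path.exists(path):
            os.remove(path)


def write_report(report_dir, node, results):
    """Write results as 'key: value' lines to <report_dir>/<node>.txt."""
    path = os.path.join(report_dir, node + ".txt")
    with open(path, "w") as f:
        for key in sorted(results):
            f.write("%s: %s\n" % (key, results[key]))
    return path


def run_node_tests(report_dir, scratch_dir, node=None, net_dev_path="/proc/net/dev"):
    """Run the hd smoke test, collect interface counters, write one report file."""
    node = node or socket.gethostname()
    results = {"disk_test": "ok" if disk_pattern_test(scratch_dir) else "FAILED"}
    with open(net_dev_path) as f:
        for name, (rx, tx) in parse_net_dev(f.read()).items():
            results["net_" + name] = "rx=%d tx=%d" % (rx, tx)
    return write_report(report_dir, node, results)
```

On the test disk, the init script would call something like
run_node_tests("/mnt/testresults", "/tmp") once the NFS dir is mounted; both paths
are hypothetical, with /mnt/testresults standing for the NFS mount and /tmp for a
directory on the local hd.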

Of course, the complete tool would be pretty handy. It could automate testing for clusters
and increase reliability. It could mail the admin if things go weird or, if a certain expected
anomaly (!) arises, perform some corrective operation (okay, the simplest I can think of is
re-installing everything on that node automagically if it doesn't respond at all to our test
server!!). The disk is a good idea too. Like an install disk, a test disk would boot by itself.
It could use BOOTP (or RARP?) to configure the network, and it would contain the test client,
or perhaps a stripped-down test client, I should say, and go from there. The server package
could have an option to create a test disk with the desired tests on it. That might be a neat
hack, and I'd really like to see it done the Debian way.
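
The "mail the admin if things go weird" part could look roughly like the sketch below,
again in Python and again under loud assumptions: each node is supposed to drop a
<node>.txt file of "key: value" lines into a shared report directory, a line
"disk_test: ok" means the hd test passed, and a node whose report is missing or older
than max_age_s seconds counts as not responding. None of these names or conventions
come from an existing tool.

```python
import os
import time
from email.message import EmailMessage


def read_report(path):
    """Read 'key: value' lines from a node report into a dict."""
    result = {}
    with open(path) as f:
        for line in f:
            if ":" in line:
                key, value = line.split(":", 1)
                result[key.strip()] = value.strip()
    return result


def classify_nodes(report_dir, expected_nodes, max_age_s=3600, now=None):
    """Label each expected node as 'ok', 'failed' or 'silent'."""
    now = time.time() if now is None else now
    status = {}
    for node in expected_nodes:
        path = os.path.join(report_dir, node + ".txt")
        if not os.path.exists(path):
            status[node] = "silent"      # never reported
        elif now - os.path.getmtime(path) > max_age_s:
            status[node] = "silent"      # report is too old
        elif read_report(path).get("disk_test") != "ok":
            status[node] = "failed"      # reported, but a test failed
        else:
            status[node] = "ok"
    return status


def build_alert(status, admin):
    """Build a mail listing the problem nodes, or return None if all are ok."""
    bad = [(node, state) for node, state in sorted(status.items()) if state != "ok"]
    if not bad:
        return None
    msg = EmailMessage()
    msg["To"] = admin
    msg["From"] = "cluster-test@localhost"  # hypothetical sender address
    msg["Subject"] = "cluster test: %d node(s) need attention" % len(bad)
    msg.set_content("\n".join("%s: %s" % item for item in bad) + "\n")
    return msg
```

The returned message could be handed to smtplib.SMTP("localhost").send_message(msg) on
the test server, assuming a local mail daemon is running there. Nodes labelled "silent"
would be the candidates for the automagic re-install.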

Comments welcome.

Indeed, some feedback would be appreciated. In particular, I need some advice on what
features are necessary, what other programs such a tool could use or would have to
interface with, and whether there is any need for this hack in the first place.

Thanks,



-- 
 ++++-+++-+++-++-++-++--+---+----+----- ---  --  -  - 
 +  Eray "eXa" Ozkural                   .      .   .  . . .
 +  CS, Bilkent University, Ankara             ^  .  o   .      .
 |  mail: erayo@cs.bilkent.edu.tr                .  ^  .   .
 
