
Re: Device detection?



Matt Kern <mwk20@cam.ac.uk> writes:

> Checkpointing every n devices could cause valid devices to be missed
> and should therefore probably be avoided.  Basically it boils down
> to having to checkpoint for every device.

Try using binary search.

If you know that you just rebooted from a failure, and you know what
the last successful test was and how many tests are run per
checkpoint, then you can easily isolate (perhaps with a few extra
reboots) what the offending test was.  This is a nice approach because
it makes things quite fast when there are no problems, and guarantees
no more than about log_2(N) reboots per bad device, where N is the
initial number of tests between checkpoints.  Presuming that most
people won't have lockups, the overall inconvenience might be less
this way, particularly if it dramatically cuts the time to run
successful tests.
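
To make the idea concrete, here is a rough sketch in Python of the
bisection step, run after a lockup has been traced to one batch of
tests.  The names isolate_bad_test and run_range are made up for
illustration; run_range(a, b) stands in for "run tests a..b-1 and
report whether the machine survived", which in a real installer would
mean writing a checkpoint and possibly rebooting.  It is a simulation
of the logic only, not actual detection code.

```python
def isolate_bad_test(lo, hi, run_range):
    """Find the first test in [lo, hi) that locks up the machine.

    Assumes [lo, hi) is already known to contain a lockup (the batch
    that was running when the machine died).  run_range(a, b) is a
    hypothetical callback: it runs tests a..b-1 and returns True if
    they all completed, False if the machine locked up (one reboot).

    Returns (index_of_bad_test, reboots_used_while_bisecting).
    """
    reboots = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if run_range(lo, mid):
            # First half completed: checkpoint, then search the rest.
            lo = mid
        else:
            # Lockup in the first half: reboot and narrow the search.
            hi = mid
            reboots += 1
    return lo, reboots


# Simulated machine: test 37 is the one that hangs (made-up number).
bad_tests = {37}

def simulated_run(a, b):
    return not any(t in bad_tests for t in range(a, b))

# The batch of N = 16 tests, numbers 32..47, was running at the lockup.
print(isolate_bad_test(32, 48, simulated_run))
```

The reboot count here excludes the initial lockup that exposed the bad
batch.  In the worst case (the bad test is the first one in the batch)
every probe locks up, which for N = 16 gives log_2(16) = 4 reboots.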

-- 
Rob Browning <rlb@cs.utexas.edu> PGP=E80E0D04F521A094 532B97F5D64E3930

