
Re: Scary bugs



On Sat, 29 Jan 2000, Thierry Laronde wrote:
> On Fri, Jan 28, 2000 at 06:55:58PM -0500, Andrew Pimlott wrote:

[about hwclock in package util-linux]
> I've just downloaded the new version of the package. The problem is that
> something has been done, but this is the wrong thing !

Yes, I also agree that the fix is not the right one.

> So what /etc/adjtime is made for isn't used ! /etc/adjtime is supposed to

Which is a good thing, as --adjust is badly broken as is (read on).

> And the problem is that the "offending" line is still here :
[..]
> 		hwclock --systohc $GMT
[..]
> The result is that the hwclock is spoiled by the system clock if this one
> is not accurate, and that /etc/adjtime is modified by the instruction
> `hwclock --systohc'. In this case, there is no hope for someone modifying
> his hardware clock to have something working.
> 
> So this is definitively *not* an upstream problem. The bug sent by you 
> Andrew has been closed because something has been done, but IMHO what
> has been done is a mistake.

I disagree. It is also an upstream problem, and not an easy one to fix.

(What follows is based on the hwclock docs and man page. I skimmed the
source, but I may have missed a few things.)

--adjust is based on a (IMHO bad) premise: that *nothing* other than hwclock
will ever touch the RTC.  This isn't true in pure Debian, since we have the
ntp package, and it isn't true in the Debian plus local packages scenario
either.

The hwclock man page makes it very clear that if, for example, you have NTP
running on your machine (which implies "11 minute mode" in the kernel), you
must *never* call --hctosys or --adjust, and shouldn't even call hwclock
--systohc, as that disables 11 minute mode. This isn't documented in the
README.Debian of either ntp or util-linux.


Consider the following scenario, where hwclock --adjust; hwclock --hctosys
is called on startup.

-> User uses something other than hwclock to set the RTC, such as ntp, M$
   Windows, DOS or BIOS.

The drift file for hwclock is now invalid, and can cause severe clock
misadjustments if the error is cumulative and --adjust is ever used.

If the user set the clock in the BIOS/Windows/DOS, he's left wondering what
the heck happened... if he notices the problem before it's too late, that is.
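
To make the failure concrete, here is a minimal sketch of the arithmetic,
not hwclock's actual code. It assumes a drift file reduced to two values (a
per-day correction and the time of the last hwclock write); the names
`DriftRecord` and `pending_correction` and all the numbers are hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400.0


@dataclass
class DriftRecord:
    """Hypothetical stand-in for what a drift file remembers."""
    correction_per_day: float  # seconds to add to the RTC per day
    last_adjust: float         # epoch seconds of the last hwclock write


def pending_correction(rec: DriftRecord, now: float) -> float:
    """Correction an --adjust style step would apply, in seconds."""
    days = (now - rec.last_adjust) / SECONDS_PER_DAY
    return rec.correction_per_day * days


# Assumed scenario: the RTC loses 2 s/day and hwclock last wrote it
# 10 days ago.
now = 20 * SECONDS_PER_DAY
rec = DriftRecord(correction_per_day=2.0,
                  last_adjust=now - 10 * SECONDS_PER_DAY)

# One day ago the user set the RTC exactly right from the BIOS, so the
# real error today is only the single day of drift since then.
real_error = -2.0

applied = pending_correction(rec, now)      # still computed over 10 days
error_after_adjust = real_error + applied   # what the RTC is left with
```

With these assumed numbers the adjustment adds 20 seconds to a clock that
was only 2 seconds slow, leaving it 18 seconds fast. The drift file has no
way of knowing about the BIOS change.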

If ntp is being used, hwclock should *never* be used. If it is, and ntpdate
is not run on bootup, severe clock misadjustment might happen and ntp could
refuse to start. This should be better documented in proper, easy-to-find
README.Debian-like files, and I think I'll file bugs against ntp and
util-linux regarding this after I give it a bit more thought (and read the
BTS to avoid duplicates :-) ).

Do notice that if the user has the RTC in local time and has just entered
or left DST (by hand or because of Windows), and the date for entering or
leaving DST does not match exactly the one in the Unix timezone data, the
drift file will most likely be wrong by one hour, and a severe clock
screw-up ensues.
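
A rough illustration of how one stray hour pollutes a drift estimate. This
is a sketch under the assumption that the drift is estimated as observed
error divided by days since the last calibration; the function name and
the figures are made up for the example.

```python
def estimated_drift_per_day(observed_error_s: float,
                            days_since_calibration: float) -> float:
    """Naive drift estimate: seconds of error per day elapsed."""
    return observed_error_s / days_since_calibration


# Assumed case: the RTC really drifted 60 s over 30 days.
real = estimated_drift_per_day(60.0, 30.0)

# Same period, but the RTC was also moved by one hour for DST on a date
# the timezone data does not agree with, so the hour looks like drift.
polluted = estimated_drift_per_day(60.0 + 3600.0, 30.0)
```

Here the real drift is 2 s/day, but the estimate jumps to 122 s/day, and
every later --adjust would apply that bogus rate.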

If you think this timezone mess is unlikely, well, it's been the _rule_ in
Brazil for a few years now. I wouldn't be surprised if this is a common
problem in many countries other than the USA, Canada and most of Europe...

I think the above is reason enough to never enable --adjust by default,
unless you teach hwclock to detect and act upon external tampering with the
RTC.


Here's a possible (and probably broken somewhere :-) ) strategy to do this.
It is not implementation-complete. For example, one would need to invalidate
the drift file, as well as implement the idea of a drift file that only has
the systematic drift stored, but not when it was last applied (I guess, I
didn't study the hwclock code).


1. Train hwclock so that it learns the RTC's drift without tampering. I'll
   call this a "training period" where the user is strongly advised (in a
   bothersome, impossible not to notice way) not to touch the RTC, and told
   to deal with the crap himself if he does change the RTC.

2. The RTC drift is reasonably constant (otherwise the RTC is useless).

   hwclock should refuse to change the RTC drift too much unless in training
   mode. How much is 'too much' must be found by some careful testing, and
   should probably be configurable.

   If the drift is too big, warn the user and do not change the systematic
   drift in the drift file.

3. hwclock should refuse to run --adjust if the drift to be applied to the
   RTC is "too big" (quite likely after a few days of downtime), warn the
   user that the clock has gained/lost too many seconds/hours/days (just
   like Solaris does), and maybe request the user to verify and run
   something like hwclock --adjust --force-adjust; hwclock --hctosys
   manually.

This requires hwclock to store the RTC training state, which could go in
/etc/adjtime. (1) also means this should not be the default for Debian if
hwclock is in the default list of packages (the user is nagged too much
during first installation already, and (1) should not be done in automated
installs).  (3) makes sure the user is warned of clock screw-ups, and has a
chance to do something about it.
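
The checks in (2) and (3) could be sketched roughly as below. None of this
exists in hwclock: `Policy`, `accept_drift`, `may_adjust` and both threshold
values are hypothetical, and the `force` argument stands for the
"--force-adjust"-like override suggested in (3).

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical, configurable limits; real values need testing."""
    max_drift_change: float = 1.0    # s/day the stored drift may move
    max_adjust_step: float = 300.0   # s a single adjust may apply


def accept_drift(old: float, new: float, training: bool,
                 policy: Policy) -> bool:
    """Step 2: outside training, reject big jumps in the stored drift."""
    if training:
        return True
    return abs(new - old) <= policy.max_drift_change


def may_adjust(correction_s: float, force: bool, policy: Policy) -> bool:
    """Step 3: refuse a too-large correction unless the user forces it."""
    return force or abs(correction_s) <= policy.max_adjust_step


policy = Policy()

# A small change in drift is accepted, a DST-sized jump is not.
small_ok = accept_drift(2.0, 2.4, training=False, policy=policy)
jump_ok = accept_drift(2.0, 122.0, training=False, policy=policy)

# A two-hour correction is refused until the user explicitly forces it.
auto_ok = may_adjust(7200.0, force=False, policy=policy)
forced_ok = may_adjust(7200.0, force=True, policy=policy)
```

When either check fails, the caller would warn the user and leave the drift
file or the RTC alone, as described in the list above.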

Ah, let's not forget something important:

4. Teach hwclock about syslog, and allow it to be used to log the warnings.


The Debian util-linux package should:

   Warn the user that hwclock must be properly trained for the RTC. This
   means only hwclock can be used to set the RTC (and make it clear that
   this means no BIOS clock setting, no DOS clock setting, no Windows clock
   setting, no ntp...) for a day or two.

   Not enable --adjust in the rc scripts until sure that the user has read
   the above notice, and started the "RTC training" period.

   Document all this crap clearly in README.Debian, explaining the issues
   with ntp, and any other RTC tampering.

Other Debian packages which write to the RTC should:

   Be aware of hwclock training and warn the user, or disable hwclock
   somehow.


> We do need to break, *by default*, the vicious circle sets by the init script.

Which means leaving --adjust disabled in the init scripts right now. I am
not sure about --systohc in shutdown, but I'll leave that one for you :-)

Actually, I'd prefer hwclock to be in a package of its own, so that ntp (and
any other packages of the kind) could conflict with it.

It'd be nice to have hwclock as the backup for ntp, but that requires a lot
of tweaking. You need to train hwclock without ntp first. Then you must
freeze the drift file, which requires a new behaviour for --systohc that
only stores the 'last updated' time in the drift file but does not
recalculate the drift (because it'll be zero due to ntp)...
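
A sketch of what such a "frozen" drift file could look like. It assumes a
deliberately simplified model in which a normal --systohc replaces the
stored drift with the observed error per day; the `frozen` field and
`record_systohc` are hypothetical and not part of hwclock.

```python
from dataclasses import dataclass, replace

SECONDS_PER_DAY = 86400.0


@dataclass(frozen=True)
class DriftRecord:
    drift_per_day: float   # trained systematic drift, s/day
    last_adjust: float     # epoch seconds of the last RTC write
    frozen: bool           # hypothetical "do not retrain" flag


def record_systohc(rec: DriftRecord, now: float,
                   observed_error_s: float) -> DriftRecord:
    """Update the record after the RTC is set from the system clock."""
    if rec.frozen:
        # Frozen: remember when the RTC was written, keep the drift.
        return replace(rec, last_adjust=now)
    days = (now - rec.last_adjust) / SECONDS_PER_DAY
    if days <= 0:
        return replace(rec, last_adjust=now)
    # Simplified retraining: drift becomes observed error per day.
    return replace(rec, drift_per_day=observed_error_s / days,
                   last_adjust=now)


now = 10 * SECONDS_PER_DAY
trained = DriftRecord(drift_per_day=2.0, last_adjust=0.0, frozen=True)
untrained = replace(trained, frozen=False)

# With ntp keeping the RTC in sync, the observed error is about zero.
kept = record_systohc(trained, now, observed_error_s=0.0)
lost = record_systohc(untrained, now, observed_error_s=0.0)
```

In this model the frozen record keeps its trained 2 s/day, while the
unfrozen one is overwritten with 0 s/day and is useless once ntp goes away.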

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh 

