RFC: proposed changes to hwclock
Hello Developers,
This RFC (Request For Comments) tries to deal with the issue of drift
correction and timekeeping robustness for "woody", when applied to Debian's
"default" RTC (hardware clock) control utility: "hwclock" from the
util-linux package.
Please read it through, think about the issues it tries to address and if
the proposed patches would actually help or not, and reply with your
suggestions.
I'm making the assumption that Debian would want for woody a reasonably
user-friendly "poor man's RTC drift compensation", either active by default
or easy to set up (as long as users RTFM, and we tell them exactly where it
is). This is based on the number of people complaining that "my clock
keeps gaining/losing time", and the fact that "hwclock --adjust" was
actually activated by default for a while.
The RFC (amended with any suggestions sent to me or discussed on -devel)
will be filed as a wishlist bug against util-linux (and, I hope, forwarded
upstream by the util-linux Debian maintainer) in a few days. I currently
don't have the time to actually implement the proposed changes; anyone
willing to write some code is more than welcome to.
RTC is used in this RFC to mean the same as "hardware clock" in hwclock's
documentation. Suggested modifications are grouped by the target problem
they're supposed to (help) solve.
NOTE: I am not intimately familiar with the hwclock source, so a few of the
modifications below might be already there, or need tweaking.
Group 1: Other approaches to RTC systematic drift computation must be made
possible
Reasoning:
hwclock's default drift computation is severely lacking in various ways,
the most obvious being that it is extremely sensitive to external
tampering with either the system date or the RTC. This is a problem,
because (Debian) users are being hit by this limitation a lot.
Currently, it is completely impossible to use another tool to do the RTC
systematic drift error computation for hwclock's use because hwclock
will overwrite the results.
RTC drift calculation can be done in a much safer way by other
utilities, such as adjtimex, chrony and even NTP. hwclock's main
function is to access the RTC, and that's where it excels; we should be
able to use other tools which excel in precise timekeeping to calculate
clock drift.
Modifications to implement:
1. hwclock should be configurable to CHANGE DRIFT and NOT CHANGE DRIFT
(change or not change the RTC drift in /etc/adjtime). NOT CHANGE DRIFT
should effectively turn the RTC drift into a read-only value for hwclock.
Do note that /etc/adjtime would still be changed, as the other
variables in there must still be updated.
IMPORTANT: When the RTC drift is in NOT CHANGE DRIFT state, hwclock
must *not* compute a new drift. It must use the drift in /etc/adjtime
if it has to do any drift correction. This is especially true for the
--adjust function.
2. (partially Debian-specific): The default (user has not given any
special options in hwclock's command line) must be made "NOT CHANGE
DRIFT", either by changing the default behaviour for hwclock (requires
agreement of upstream), or by introducing a /etc/hwclock.conf file
where defaults can be kept.
The second option (a defaults file) is the better choice, as it allows
users who still want to use hwclock for drift computation to set the
default to CHANGE DRIFT without recompiling (in Debian).
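To make the intended NOT CHANGE DRIFT behaviour concrete, here is a minimal
sketch in Python. It is only an illustration of the rule in modification 1,
not hwclock code: the AdjTime record, its field names and the
update_after_set() function are all hypothetical, and the record only models
the values this RFC talks about (the drift and the time of the last
correction).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AdjTime:
    """Hypothetical in-memory model of the values kept in /etc/adjtime."""
    drift_per_day: float  # systematic RTC drift, in seconds per day
    last_adjust: int      # epoch seconds of the last drift correction


def update_after_set(adj, now, measured_drift, change_drift):
    """Return the record to write back after the RTC has been set.

    In NOT CHANGE DRIFT mode (change_drift=False) the stored drift is
    treated as read-only and measured_drift is ignored, but the other
    variables are still updated.
    """
    drift = measured_drift if change_drift else adj.drift_per_day
    return replace(adj, drift_per_day=drift, last_adjust=now)
```

With change_drift=False the stored drift survives even if the measured value
is nonsense (for example after someone ran "date" by hand), which is the
point of making NOT CHANGE DRIFT the default.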
Group 2: hwclock should be more robust when applying the drift correction.
Reasoning:
The modifications in this group should reduce the number of incidents,
pleas for help in the mailing lists and general annoyance regarding
hwclock and corrupted/incorrect drift corrections. Those drift errors
are caused by a number of factors, including but not limited to
misbehaving system clocks, human error and interaction with ntp,
ntpdate, date, chrony and other programs.
Improved robustness is required if the modifications in group 1 are to be
of any practical use for Debian.
Modifications to implement:
1. hwclock should be able to refuse to apply drift corrections which
exceed a given absolute value. Note that "drift correction" means the
total drift as given by |(drift ratio)x(time elapsed since last RTC
update)|.
This absolute limit should be configurable and enabled by default,
using a reasonable default limit (such as 30 minutes).
Setting the absolute drift limit to zero should revert hwclock to the
unlimited drift correction behaviour.
hwclock should issue a warning and not perform write operations which
use the drift correction if the absolute drift limit is violated
(e.g.: --adjust).
2. (optional, but recommended)
Whenever an operation which would access (read or write) the RTC is
requested, hwclock must detect whether the kernel is synchronized with
an external source and updating the RTC by itself (the so-called
"11 minute mode" in Linux), if the underlying kernel supports and
requires this.
Should this condition be detected, hwclock must reset the "drift last
applied to RTC" variable in /etc/adjtime to "unknown" (or to the
current time, if unknown is not supported). If in CHANGE DRIFT mode, it
should also set the drift to zero and issue a warning.
Should this condition be detected in an RTC or system clock write
operation (--adjust, --hctosys, --systohc...), a warning must be
issued and the operation must not be carried out. A command line
option to force the operation to be carried out should be provided.
(For Debian: Debian shutdown scripts should force the RTC update
regardless of the kernel RTC update mode).
3. hwclock should warn the user of possible /etc/adjtime corruption
should the absolute drift ratio stored in the file (or calculated if
in CHANGE DRIFT mode) be larger than a given absolute limit, in a
way similar to that explained for the total drift correction in patch 1
of group 2 above.
Should this condition be detected, the RTC drift should be set to 0 if
hwclock is in CHANGE DRIFT mode, a warning should be issued and the
time of last RTC drift correction set to unknown/now.
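The limit checks in modifications 1 and 3 of this group could look roughly
like the Python sketch below. Function and parameter names are made up for
illustration; the 30 minute default is the one suggested above, while the
drift ratio limit has no proposed default in this RFC and is therefore a
required argument. I am also assuming that a limit of zero disables the
ratio check, by analogy with modification 1.

```python
SECONDS_PER_DAY = 86400.0


def pending_correction(drift_per_day, last_adjust, now):
    """Total correction in seconds: (drift ratio) x (time elapsed)."""
    return drift_per_day * (now - last_adjust) / SECONDS_PER_DAY


def correction_allowed(drift_per_day, last_adjust, now, limit_s=1800.0):
    """Return False if the total correction exceeds the absolute limit.

    A limit of zero means "unlimited", i.e. the current behaviour.
    """
    if limit_s == 0:
        return True
    return abs(pending_correction(drift_per_day, last_adjust, now)) <= limit_s


def ratio_plausible(drift_per_day, max_ratio):
    """Return False if the stored drift ratio itself looks corrupted.

    Assumption: max_ratio == 0 disables the check.
    """
    if max_ratio == 0:
        return True
    return abs(drift_per_day) <= max_ratio
```

For example, a drift of 10 s/day left uncorrected for 10 days gives a 100 s
correction, which passes; the same drift after 200 days gives 2000 s, which
exceeds the 1800 s limit, so --adjust would only warn.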
Group 3: hwclock's builtin RTC drift computation should be made more robust.
Reasoning:
If it's there, someone will want to use it. This function is not
hwclock's main function and it is severely limited by design, but
since it is there we should at least add a few sanity checks to make
it safer.
Modifications to implement:
1. Detect sign changes in the computed drift _when_in_CHANGE_DRIFT_mode_,
by comparing the new calculated value with the old one in
/etc/adjtime.
Should this happen, hwclock should issue a severe warning and
invalidate all drift correction data (zeroing drift, setting the
RTC-last-updated time to unknown/now).
[A 'normal', undisciplined RTC won't change its systematic drift
direction unless it breaks down, or its drift is too near zero. RTCs
which (knowingly or not) do change their systematic drift like this
(automatic compensation?) are incompatible with hwclock's model of
drift adjustment. Still, a command-line option to disable this test
might be provided just in case]
2. Detect excessively large variations in the computed RTC drift
_when_in_CHANGE_DRIFT_mode_ by comparing it to the last calculated
drift in the drift file (/etc/adjtime).
The limit against which the variation of the RTC drift is compared
(i.e. how much is 'too large') should be configurable, with a
reasonable default (e.g.: 10s/day). Zero disables this test.
Should this condition be detected, a warning must be issued, and
all drift correction data must be invalidated.
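As an illustration of the two sanity checks in this group, here is a small
Python sketch. The function name, the returned strings and the parameter
names are invented for this example; only the 10 s/day default and the
"zero disables" rule come from the text above. The caller is assumed to
invalidate the drift data whenever the returned list is not empty.

```python
def check_new_drift(old_drift, new_drift, max_change=10.0):
    """Compare a newly computed drift (s/day) with the stored one.

    Returns a list of problems found; an empty list means the new value
    looks sane. Only meaningful in CHANGE DRIFT mode.
    """
    problems = []
    # A strictly negative product means the drift direction flipped.
    if old_drift * new_drift < 0:
        problems.append("sign change")
    # max_change == 0 disables the variation test.
    if max_change != 0 and abs(new_drift - old_drift) > max_change:
        problems.append("variation too large")
    return problems
```

Note that a stored drift of exactly zero never triggers the sign test in
this sketch, which matches the case of freshly invalidated drift data.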
--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh