
Re: hard crash on leap second



We're getting a little OT here, but let's carry on. It's fun ...

On Fri, Jan 02, 2009 at 12:23:18AM +0100, Vincent Lefevre wrote:
> On 2009-01-01 11:58:14 -0700, Paul E Condon wrote:
> > IMHO, chrony's current behavior is already 'correct'. To me, it is
> > far more important that the reported time is always increasing than
> > that it quickly settles into synchronism with a source that exhibits
> > sudden jumps or extended periods of stasis.
> 
> I agree that sudden jumps or extended periods of stasis are bad.
> However this is how leap seconds currently work (it would have been
> better to have a continuous synchronization between UTC and UT1[*]).
> The consequence is that a machine using chrony can have 1 second
> difference with other machines. When the files are stored remotely
> (e.g. on a NFS server), this can yield problems, especially with
> tools like "make" (even though one doesn't like tools based on
> timestamps).
> 

When doing a build on a remote host, I think one should use timestamps
that were generated on that remote host. It strikes me as very risky to
have a bunch of build objects scattered across a worldwide array of
hosts and to rely on locally generated timestamps to make sure they
are in sync. It seems to me that there must be a central authority, a
single host that assigns timestamps to all shared files. This is
doable, and if done, the local time on any single machine is
unimportant: it can be wildly wrong, and the system still works.


> [*] According to Wikipedia, a vote towards this solution was planned
> in 2008. But Wikipedia is not up-to-date.
> 

I think there are fundamental, irreconcilable differences. Atomic clocks
and the Earth actually run at different rates, and the difference is
growing with time. It may be possible to hide one of the time standards
so that its existence is known only to a special few, but that really
won't solve the problem because people who know only about the visible
system will persist in designing stuff that can only work if there is
a single, universal definition of time, which there is NOT.

> > Real time clocks are a quantized form of time, with the quantum
> > being the period of the cpu 'clock'. Reporting system event times to
> > a precision of 1e-9 sec, as in done in the kernel, is crazy. The
> > last digit of that 'time' is surely not accurate.
> 
> Isn't the goal is to avoid equalities or to have accurate *relative*
> times on a machine?

For finding pairs of files that are actually identical, one should use
a message digest such as md5sum. Timestamps are used only because they
are easy. Isn't it possible to specify that timestamps be preserved in a
file copy? A preserved timestamp is surely not rewritten during copy
to compensate for a mismatch in the time settings of the clocks on the
two computers. So in a system that uses timestamps, if the timestamps
differ, then, at the very least, the copy was not done correctly.
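
To make the digest idea concrete, here is a minimal sketch in Python.
The function names (file_digest, same_content) are made up for
illustration, and I default to SHA-256 only because it is always
available in hashlib; passing "md5" should give the same digest that
md5sum prints on builds where MD5 is enabled.

```python
import hashlib

def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file's contents, read in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def same_content(path_a, path_b, algorithm="sha256"):
    """True if both files hash to the same digest; timestamps are ignored."""
    return file_digest(path_a, algorithm) == file_digest(path_b, algorithm)
```

Two copies with different timestamps still compare equal this way,
which is the whole point: the answer does not depend on any clock.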

In comparing timestamps there is a shoddiness in the kernel. If a file
is open, its timestamp is reported in kernel-time which is carried to
a precision of 1 nanosecond (1e-9 sec). This precision is actually
meaningless, but at least it discourages developers from testing for
equality. When the file is closed, the reported time stamp is the 
number that is recorded in the inode on disk, currently accurate to
a whole second. I call the kernel shoddy because its precision is
only for appearance. The reality is that the kernel developers have
no way of knowing how accurate this time actually is because they
have no independent measure of time with which to compare it at the
nanosecond level. So open vs closed affects the timestamp in a way
unknowable to the user. 
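
If one must compare timestamps anyway, a defensive approach is to
compare only at the granularity that is guaranteed to survive a trip
to disk. The sketch below assumes that granularity is one whole
second, as described above; the names mtime_whole_seconds and
newer_than are hypothetical. It uses os.stat(), whose st_mtime_ns
field reports the modification time in integer nanoseconds.

```python
import os

NS_PER_SEC = 1_000_000_000  # assumption: only whole seconds are trustworthy

def mtime_whole_seconds(path):
    """Modification time of path, truncated to whole seconds."""
    return os.stat(path).st_mtime_ns // NS_PER_SEC

def newer_than(path_a, path_b):
    """True if path_a is newer than path_b at whole-second granularity.

    Files modified within the same second are treated as the same age,
    so the answer does not depend on sub-second digits.
    """
    return mtime_whole_seconds(path_a) > mtime_whole_seconds(path_b)
```

The price is that two files written within the same second are
indistinguishable, which is exactly the situation the on-disk value
already puts you in.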

For me, the first goal is to have a timestamp system that is self-
consistent on a single host. Jerking the clock setting around because
of transient (mis)information from a remote clock obtained over a
noisy channel doesn't improve the measurement of local time. Once
there are reasonable measures of local time on all hosts, it becomes
reasonable to think about synchronizing these measures, but only to
the precision that is meaningful, given the noise in the communication
channel. For time data, jitter in the transmission delay of messages is
noise. 
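
As a rough illustration of "only to the precision that is meaningful",
here is a sketch that takes several offset measurements from a remote
clock, keeps the one with the shortest round trip, and declines to
correct the local clock unless the offset exceeds what the channel can
resolve. This is not chrony's algorithm; the function names and the
sample format are my own assumptions. The only fact it relies on is
that a single request/reply exchange cannot pin down the offset more
tightly than half the round-trip delay.

```python
def estimate_offset(samples):
    """Pick the most trustworthy offset from noisy measurements.

    samples: list of (offset_seconds, round_trip_delay_seconds) pairs.
    Returns (offset, uncertainty), both in seconds.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # The sample with the shortest round trip suffered the least delay
    # jitter, so its offset is the least contaminated by channel noise.
    best_offset, best_delay = min(samples, key=lambda s: s[1])
    # The split between outbound and return delay is unknown, so the
    # offset is uncertain by up to half the round trip.
    return best_offset, best_delay / 2

def worth_correcting(offset, uncertainty):
    """True only if the measured offset exceeds the channel's resolution."""
    return abs(offset) > uncertainty
```

For example, with samples of (0.120, 0.300), (0.010, 0.020) and
(-0.200, 0.500), the second sample is chosen, and its 10 ms offset is
no larger than its 10 ms uncertainty, so the local clock is left alone.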

I'm really not as cranky as I seem ;-)

-- 
Paul E Condon           
pecondon@mesanetworks.net

