Re: ntp problem: the server clock slowly recedes



On 8/9/2011 9:11 AM, owl700@gmail.com wrote:
> 2011/8/9 Stan Hoeppner <stan@hardwarefreak.com>:
>> On 8/9/2011 6:28 AM, owl700@gmail.com wrote:
>>
>>> Hi Stan, thanks for your reply. We have a bare-metal host (HP DL580 G7),
>>> not a VM guest.
>>>
>>> After this post we added nomodify to the localhost lines, and the offset
>>> seems acceptable now.
>>
>> It *seems* acceptable because you just restarted ntpd.  Watch 'ntpq -p'
>> for 3 days and those numbers will be right back up where they were.  I have
>> one old self built 2-way SMP system (non NUMA) that keeps incredibly
>> accurate time, currently up for 43 days, due to new kernel installation.
>>  It was up for 181 days prior to that.
> 
> Before inserting nomodify the clock lost 2 minutes in 1 hour; now it is
> stable.

That's the rate of change you see right now, but it may or may not
remain constant.  I'd give it a few days before stamping it as "stable"
or "problem solved".

> I can't understand how localhost can modify ntpd, but this was
> the only explanation. Can you help me understand?

Well, "nomodify" prevents local *programs* such as ntpq and ntpdc from
reconfiguring your ntpd.  But enabling 'nomodify' alone doesn't tend to
explain the correction of a drift of 2 minutes every hour.  That would
mean, basically, that some local program had been modifying ntpd enough
to cost you 2 minutes every hour.
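
To put that loss in perspective, here is a rough Python sketch of the
arithmetic.  It assumes ntpd's usual maximum slew rate of 500 ppm; the
function and constant names are just for illustration.

```python
# Assumption: ntpd will not slew the clock faster than about 500 ppm.
NTPD_MAX_SLEW_PPM = 500.0


def drift_ppm(lost_seconds: float, elapsed_seconds: float) -> float:
    """Return clock drift in parts per million."""
    return lost_seconds / elapsed_seconds * 1_000_000


# The figure reported in this thread: 2 minutes lost in 1 hour.
reported = drift_ppm(lost_seconds=120, elapsed_seconds=3600)
print(f"reported drift: {reported:.0f} ppm")
print(f"beyond what ntpd can slew away: {reported > NTPD_MAX_SLEW_PPM}")
```

A clock that far off cannot be disciplined by slewing alone, which is why
the offsets keep climbing between steps.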

Speaking of other tools, what does "ntpdate -q ntp1.inrim.it" return?
It'll tell you how far off you are in seconds.  I can't remember if
offset and jitter reported by ntpq are in seconds or milliseconds.
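
For reference, the ntpq peers display reports delay, offset and jitter in
milliseconds, and the first character of each peer line is a tally code
rather than part of the hostname ('*' is the system peer, '+' a candidate,
'x' a falseticker).  A minimal Python sketch for reading such lines
follows; the Peer type and parse_peer_line helper are made up for this
example.

```python
from typing import NamedTuple


class Peer(NamedTuple):
    tally: str        # first column: '*' system peer, '+' candidate, 'x' falseticker
    remote: str
    offset_ms: float
    jitter_ms: float


def parse_peer_line(line: str) -> Peer:
    """Parse one peer line of `ntpq -p` output (helper name is illustrative)."""
    tally, fields = line[0], line[1:].split()
    # delay, offset and jitter are always the last three columns.
    return Peer(tally, fields[0], float(fields[-2]), float(fields[-1]))


# Peer lines copied from the output quoted in this thread.
SAMPLE = [
    "xntp1.inrim.it   .CTD.            1 u   22   64  377   23.785  432.491 134.794",
    "*ntp2.inrim.it   .CTD.            1 u   32   64  377   25.653  221.217 130.149",
]

for peer in map(parse_peer_line, SAMPLE):
    print(f"{peer.tally} {peer.remote}: offset {peer.offset_ms / 1000:.3f} s")
```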

>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> xntp1.inrim.it   .CTD.            1 u   22   64  377   23.785  432.491 134.794
> *ntp2.inrim.it   .CTD.            1 u   32   64  377   25.653  221.217 130.149
> +alarich.online- 192.53.103.108   2 u   39   64  377   67.928  300.899  75.004
> +ntps1-0.cs.tu-b .PPS.            1 u   41   64  373   57.509  217.705 117.816
> 
>>
>>> # Local users may interrogate the ntp server more closely.
>>> restrict 127.0.0.1 nomodify
>>> restrict ::1 nomodify
>>>
>>> mcube@ssh-player:~$ ntpq -p
>>>      remote           refid      st t when poll reach   delay   offset  jitter
>>> ==============================================================================
>>> *ntp1.inrim.it   .CTD.            1 u  131   64  376   23.940  478.163 144.085
>>> +ntp2.inrim.it   .CTD.            1 u    6   64  375   25.877  540.437 173.659
>>> +alarich.online- 192.53.103.108   2 u   64   64  377   70.188  511.878 145.782
>>> +ntps1-0.cs.tu-b .PPS.            1 u   60   64  337   57.674  513.015 167.303
>>>
>>> Do you think there is a HW clock problem?
>>
>> My old 2-way:
>>
>>  remote         refid      st t when poll reach   delay   offset  jitter
>> ========================================================================
>> *navobs1.wustl.e .GPS.      1 u  452 1024  377   63.322   -3.093   1.269
>> +ntp.okstate.edu .USNO.     1 u  135 1024  377   73.731    1.812   1.889
>> +tick.uh.edu     .GPS.      1 u  759 1024  257   78.868    1.829   1.288
>>
>> Linux greer 2.6.38.6 #2 SMP Tue May 17 23:54:39 CDT 2011 i686 GNU/Linux
>>  06:47:26 up 43 days,  4:56,  2 users,  load average: 0.01, 0.04, 0.06
>>
>> I run a custom kernel, rolled from vanilla kernel.org source, on this
>> Squeeze system.  It has no time keeping tweaks nor custom kernel boot
>> parameters.  Note my offset and jitter compared to yours.  I've had
>> uptimes of 300+ days with offset and jitter no different than at 1 day
>> of uptime.
>>
>> To answer your question, you need to determine if the Linux kernel is at
>> fault.  Try a different clock source.  Clock sources can be changed at
>> runtime by writing the new clocksource name to the file
>> /sys/devices/system/clocksource/clocksource0/current_clocksource
>> but be aware that changing to an unstable/broken clock source can hang
>> the system. Changing tsc or jiffies to acpi_pm should be okay. (The list
>> of available sources is in the file available_clocksource in the same
>> directory.)
> 
> I don't think this is a good idea in my case: I can't access the server
> directly if it hangs.
> I changed the NTP servers in ntp.conf, but the problem is the same.

No, probably not, given your circumstances.  You do need to know if this
is an ntp configuration problem or a kernel clock drift problem though.
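
You can at least look at which clock source the kernel is using without
changing anything.  A read-only Python sketch, assuming the sysfs path
mentioned earlier in this thread (the function name is mine, and it never
writes to current_clocksource):

```python
from pathlib import Path

# sysfs directory named earlier in this thread.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base: Path = CLOCKSOURCE_DIR):
    """Return (current, available); (None, []) if the sysfs files are missing."""
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        return None, []
    return current, available


if __name__ == "__main__":
    current, available = read_clocksources()
    print("current:", current)
    print("available:", ", ".join(available))
```

Comparing the output on the DL580 with the DL360 that keeps good time
would show whether the two machines picked different clock sources.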

>> If changing the clock source helps, you'll need to add a boot parameter
>> to grub to make it permanent.  See section "Timer-Specific Options" in
>> this document:
>> www.kernel.org/pub/linux/kernel/people/gregkh/lkn/lkn_pdf/ch09.pdf
>>
>> Given that this is a DL580 system I doubt it's faulty hardware, although
>> it's possible.  Assuming the hardware is ok, I'm surprised a default
>> Debian kernel won't keep time accurately on it.  Google didn't spit back
>> any such Debian+DL580 clock issues...
> 
> I have an HP DL360 with the same kernel, the same 64-bit Squeeze, etc., and
> there aren't any issues with ntp, as you can see:
> 
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *ntp1.inrim.it   .CTD.            1 u  168 1024  377   24.407   -0.040   0.189
> +ntp2.inrim.it   .CTD.            1 u  580 1024  377   23.556    0.105   0.09
> 
> Before last Sunday the HP DL580 didn't have this ntp problem.

What do your system logs tell you about changes made last Sunday that
might cause this problem?

Also, please show your entire ntp.conf.  It sounds as if you may not
have been running the Debian default setup, which should be something
like this, IIRC:

...
# By default, exchange time with everybody, but don't allow configuration.
restrict -4 default kod notrap nomodify nopeer noquery
#restrict -6 default kod notrap nomodify nopeer noquery

# Local users may interrogate the ntp server more closely.
restrict 127.0.0.1
#restrict ::1
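
If you want to compare your file against that quickly, here is a small
Python sketch that lists which "restrict" entries lack nomodify.  The
parser is deliberately simplistic (it only handles the forms shown above
plus an optional "mask"), and the function name is hypothetical.

```python
def parse_restrict(conf_text: str):
    """Return a list of (address, flags) for each active restrict line."""
    rules = []
    for raw in conf_text.splitlines():
        tokens = raw.split("#", 1)[0].split()   # drop comments
        if len(tokens) < 2 or tokens[0] != "restrict":
            continue
        args = [t for t in tokens[1:] if t not in ("-4", "-6")]
        if not args:
            continue
        address, rest = args[0], args[1:]
        if rest[:1] == ["mask"]:                 # skip "mask <netmask>"
            rest = rest[2:]
        rules.append((address, set(rest)))
    return rules


# The default lines quoted above.
SAMPLE_CONF = """\
restrict -4 default kod notrap nomodify nopeer noquery
#restrict -6 default kod notrap nomodify nopeer noquery
restrict 127.0.0.1
#restrict ::1
"""

for address, flags in parse_restrict(SAMPLE_CONF):
    status = "nomodify set" if "nomodify" in flags else "may modify ntpd"
    print(f"{address}: {status}")
```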


-- 
Stan

