
Help understanding NTP behaviour



Hello list,

I'm not sure if this is necessarily Debian-specific (could be, I don't know of any implementation differences) but I'm hoping someone here has been in the same boat before.

One of our wireless subscriber networks has a bunch of Canopy gear from Motorola. In a firmware update a year or so ago they added the ability for their CMMs (GPS receiver/switches/POE injectors all in one) to act as nice GPS-powered NTP servers. I am not sure of the exact performance of these.

I got it into my head that it would be nice to serve time to our network from our own little network of GPS receivers. I've built two Debian servers for this function, but there are problems. Since these devices are wireless, I understand that there will be jitter. What I don't get is the offset varying so much! Frequently you can check in on either of the two servers and see wild numbers for the GPS receivers. I restarted ntpd on both hosts a few minutes ago, so the numbers will be all crazy for a while. What confuses me is the offsets. Whenever I check in on these, it seems random which receivers are within a few ms, which are at +1000 ms and which are at -1000 ms:


TIME-SRV-A:/etc# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 172.19.10.36    .STEP.          16 u  112  128    0    0.000    0.000   0.000
 172.20.8.10     .GPS.            1 u   20   64    1    6.485   -0.335   0.000
 172.20.8.20     .GPS.            1 u   20   64    1    6.445   -0.331   0.431
 172.20.8.30     .GPS.            1 u   32   64    1    5.190    0.313   0.000
 172.20.12.5     .GPS.            1 u   20   64    1    6.900   -0.569   0.579
 172.20.12.10    .GPS.            1 u   19   64    1    5.578    0.120   0.095
 172.20.12.20    .GPS.            1 u   18   64    1    3.156  -998.66 1000.07
 172.20.12.30    .GPS.            1 u   17   64    1   14.958   -4.571  14.381
 172.20.12.35    .GPS.            1 u   16   64    1    6.506   -0.339   0.005
 172.20.12.40    .GPS.            1 u    1   64    1    3.634    1.071 999.984
 172.20.12.50    .GPS.            1 u    1   64    1    6.595   -0.393 1000.20
 172.20.16.10    .GPS.            1 u    2   64    1    7.462   -0.833   1.190
 216.234.161.11  69.25.96.13      2 u   14   64    1   53.982  222.394   1.554
 216.194.70.2    132.163.4.103    2 u    -   64    1   51.324  220.696   5.650
 67.212.74.220   64.90.182.55     2 u    7   64    1   34.935  231.701   7.908
 74.3.161.36     140.142.16.34    2 u    -   64    1   70.567  239.736   5.535
 127.127.1.0     .LOCL.          10 l    -   64    0    0.000    0.000   0.000


TIME-SRV-B:/etc# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 172.19.10.4     172.20.8.30      2 u   35   64    0    0.000    0.000   0.000
+172.20.8.10     .GPS.            1 u    5   64    7    6.670    0.746 655.461
*172.20.8.20     .GPS.            1 u    5   64    7    6.666    0.737 378.067
+172.20.8.30     .GPS.            1 u    2   64    7    5.411    1.414 534.611
x172.20.12.5     .GPS.            1 u   65   64    7    6.588  -999.19 755.154
x172.20.12.10    .GPS.            1 u   65   64    7    7.054  -999.40 756.159
+172.20.12.20    .GPS.            1 u   63   64    7    3.601    2.290 654.686
+172.20.12.30    .GPS.            1 u   63   64    7    8.712   -0.287 534.525
x172.20.12.35    .GPS.            1 u   39   64   37    6.475  -999.13 925.761
x172.20.12.40    .GPS.            1 u   39   64   37    4.294  -998.05 845.099
+172.20.12.50    .GPS.            1 u   47   64   17   10.292   -1.035 654.064
+172.20.16.10    .GPS.            1 u   50   64   17    6.817    0.703 654.670
-205.189.158.228 209.87.233.53    3 u   39   64   37   28.000   56.084  31.056
-184.107.229.26  209.51.161.238   2 u   49   64   17   27.789   69.598  30.689
-208.69.56.110   209.51.161.238   2 u   51   64   17   34.166   64.142  30.792
#199.85.124.148  209.87.233.53    3 u   36   64   37   36.244   67.480  31.339
 127.127.1.0     .LOCL.          10 l  391   64  100    0.000    0.000   0.000
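
For anyone reading the reach column in the listings above: it is an octal shift register of the last eight polls, so values such as 1, 7, 17 and 37 just reflect how few polls have completed since the restart. A minimal Python sketch of that decoding follows; the helper names `decode_reach` and `replies_received` are made up for illustration.

```python
def decode_reach(reach_octal):
    """Return the last eight poll results, newest first (True = reply received).

    `reach_octal` is the reach column from ntpq as a string, e.g. "37".
    """
    value = int(reach_octal, 8)
    # Bit 0 is the most recent poll, bit 7 the oldest one still remembered.
    return [bool((value >> bit) & 1) for bit in range(8)]


def replies_received(reach_octal):
    """Count how many of the last eight polls got a reply."""
    return sum(decode_reach(reach_octal))


if __name__ == "__main__":
    for reach in ("1", "7", "17", "37", "100"):
        print(reach, replies_received(reach), decode_reach(reach))
```

Read this way, a reach of 100 on the local clock means a single reply that is now seven polls old, and 37 means the five most recent polls all succeeded.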


The two boxes were originally identical, but in troubleshooting I've changed some settings to no avail:

TIME-SRV-A:
	IBM HS20 blade
	ACPI timing only
	P4-based Xeon, 2 GB Reg. ECC DDR2, flat configuration
	LSI SAS hardware RAID1
	Debian 6.0.3, Linux 2.6.32-5-amd64 #1 SMP Mon Oct 3 03:59:20 UTC 2011 x86_64 GNU/Linux
	Dual bnx2 NICs (tg3 driver?), only single Gbit Ethernet enabled

	*** NTP.CONF FROM TIME-SRV-A ***
	driftfile /var/lib/ntp/ntp.drift
	
	statistics loopstats peerstats clockstats
	filegen loopstats file loopstats type day enable
	filegen peerstats file peerstats type day enable
	filegen clockstats file clockstats type day enable

	keys /etc/ntp.keys
	trustedkey 9
	
	peer TIME-SRV-B key 9 iburst
	
	# CMMs in Wireless Land
	server 172.20.8.10  iburst
	server 172.20.8.20  iburst
	server 172.20.8.30  iburst
	server 172.20.12.5  iburst
	server 172.20.12.10 iburst
	server 172.20.12.20 iburst
	server 172.20.12.30 iburst
	server 172.20.12.35 iburst
	server 172.20.12.40 iburst
	server 172.20.12.50 iburst
	server 172.20.16.10 iburst
	
	# Regular NTP servers for backup
	server 0.debian.pool.ntp.org iburst
	server 1.debian.pool.ntp.org iburst
	server 2.debian.pool.ntp.org iburst
	server 3.debian.pool.ntp.org iburst
	
	# Local backup clock
	server  127.127.1.0
	fudge   127.127.1.0 stratum 10

	restrict -4 default kod notrap nomodify nopeer noquery
	restrict -6 default kod notrap nomodify nopeer noquery
	restrict 127.0.0.1

	restrict 172.16.0.0   mask 255.240.0.0 nomodify notrap
	restrict 192.168.0.0  mask 255.255.0.0 nomodify notrap
	

TIME-SRV-B:
	IBM HS21 blade
	HPET enabled, and same BIOS options we use for ESXi hosts
	Xeon 5140, 4 GB Reg. ECC DDR2, "sparing" config so only 2 GB usable 
	LSI SAS hardware RAID1
	Debian 6.0.4, Linux 2.6.32-5-amd64 #1 SMP Sat May 5 01:12:59 UTC 2012 x86_64 GNU/Linux
	Dual bnx2 NICs (tg3 driver?), LACP bonding enabled
	Followed the suggestions at http://www.math.ucla.edu/~jimc/documents/bugfix/12-ntp-wont-sync.html

	*** NTP.CONF FROM TIME-SRV-B ***
	driftfile /var/lib/ntp/ntp.drift
	
	statistics loopstats peerstats clockstats
	filegen loopstats file loopstats type day enable
	filegen peerstats file peerstats type day enable
	filegen clockstats file clockstats type day enable
	
	keys /etc/ntp.keys
	trustedkey 9
	
	peer TIME-SRV-A key 9 iburst
	
	# CMMs in Wireless Land
	server 172.20.8.10  iburst
	server 172.20.8.20  iburst
	server 172.20.8.30  iburst
	server 172.20.12.5  iburst
	server 172.20.12.10 iburst
	server 172.20.12.20 iburst
	server 172.20.12.30 iburst
	server 172.20.12.35 iburst
	server 172.20.12.40 iburst
	server 172.20.12.50 iburst
	server 172.20.16.10 iburst
	
	# Regular NTP servers for backup
	server 0.debian.pool.ntp.org iburst
	server 1.debian.pool.ntp.org iburst
	server 2.debian.pool.ntp.org iburst
	server 3.debian.pool.ntp.org iburst
	
	# Local backup clock
	server  127.127.1.0
	fudge   127.127.1.0 stratum 10
	
	restrict -4 default kod notrap nomodify nopeer noquery
	restrict -6 default kod notrap nomodify nopeer noquery
	restrict 127.0.0.1
	
	restrict 172.16.0.0   mask 255.240.0.0 nomodify notrap
	restrict 192.168.0.0  mask 255.255.0.0 nomodify notrap
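
Both configurations enable peerstats, so a per-poll history should already be on disk. The sketch below counts, per peer, how many logged samples sit within a few milliseconds of a whole second. It assumes the usual peerstats layout (MJD, seconds past midnight, peer address, status word, then offset, delay, dispersion and jitter in seconds), which should be checked against the actual files. The sample lines are illustrative only: the timestamps, status words and dispersion values are invented, and the function name `whole_second_share` is hypothetical.

```python
from collections import Counter

# Illustrative lines only. Timestamps, status words and dispersion are made up;
# the addresses and offsets echo the ntpq listings, converted to seconds.
SAMPLE = """\
56075 61200.123 172.20.12.5 9014 -0.999190 0.006588 0.187500 0.755154
56075 61264.125 172.20.12.5 9014 0.000913 0.006364 0.187500 0.535021
56075 61201.456 172.20.8.20 9614 0.000737 0.006666 0.187500 0.378067
"""


def whole_second_share(text, tolerance_s=0.005):
    """Map each peer to (samples near a whole-second offset, total samples)."""
    totals, stepped = Counter(), Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        peer, offset = fields[2], float(fields[4])
        totals[peer] += 1
        seconds = round(offset)
        # Flag offsets close to a nonzero whole number of seconds.
        if seconds != 0 and abs(offset - seconds) <= tolerance_s:
            stepped[peer] += 1
    return {peer: (stepped[peer], totals[peer]) for peer in totals}


if __name__ == "__main__":
    for peer, (hits, total) in whole_second_share(SAMPLE).items():
        print(f"{peer}: {hits}/{total} samples near a whole second")
```

Running it over a day of real peerstats data would show whether every receiver flips to a whole-second offset at some point or only certain ones do.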


I can't find any network-related explanation for this. The GPS units have different wireless backhaul on different frequencies off of a fiber-fed system. Our network is running MPLS and I don't see any congestion or load balancing that could explain something like this.

Any suggestions or advice would be greatly appreciated!

Thanks

---
Ross Halliday
Network Operations
WTC Communications

Office: 613-547-6939 x203
Helpdesk: 866-547-6939 option 2
http://www.wtccommunications.ca



Before I hit send, here are two more examples from TIME-SRV-B:

TIME-SRV-B:/etc# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 172.19.10.4     LOCAL(0)        11 u   38   64    1    5.717  -174.29  16.433
x172.20.8.10     .GPS.            1 u   46   64  177    6.779  -999.27 925.589
+172.20.8.20     .GPS.            1 u   48   64  177   10.386   -1.088 533.666
*172.20.8.30     .GPS.            1 u   40   64  177    5.412    1.420 378.064
x172.20.12.5     .GPS.            1 u   45   64  377    9.923  -1000.8 926.336
+172.20.12.10    .GPS.            1 u   46   64  377    6.531    0.841 534.398
+172.20.12.20    .GPS.            1 u   42   64  377    3.479    2.380 534.558
x172.20.12.30    .GPS.            1 u   37   64  377    6.826  -999.31 911.511
x172.20.12.35    .GPS.            1 u   16   64  377    5.750  -998.79 844.180
+172.20.12.40    .GPS.            1 u   15   64  377    5.676    1.247 654.208
+172.20.12.50    .GPS.            1 u   25   64  377    6.731    0.747 534.527
+172.20.16.10    .GPS.            1 u   28   64  377    9.888   -0.849 653.950
-205.189.158.228 209.87.233.53    3 u   12   64  377   28.576   95.607  35.210
-184.107.229.26  209.51.161.238   2 u   23   64  377   27.660  110.014  36.308
-208.69.56.110   209.51.161.238   2 u   24   64  377   35.119  104.465  35.787
#199.85.124.148  209.87.233.53    3 u   10   64  377   36.519  107.890  36.206
 127.127.1.0     .LOCL.          10 l  703   64    0    0.000    0.000   0.000


TIME-SRV-B:/etc# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
#172.19.10.4     LOCAL(0)        11 u   49   64    6    5.729  -189.66  25.368
+172.20.8.10     .GPS.            1 u   50   64  377    8.411   -0.079 534.256
+172.20.8.20     .GPS.            1 u   49   64  377    8.182    0.014 377.755
+172.20.8.30     .GPS.            1 u   43   64  377    5.020    1.607 534.943
+172.20.12.5     .GPS.            1 u   47   64  377    6.364    0.913 535.021
+172.20.12.10    .GPS.            1 u   48   64  377    5.307    1.454 534.725
x172.20.12.20    .GPS.            1 u   42   64  377    3.599  -997.68 925.621
*172.20.12.30    .GPS.            1 u   38   64  377    5.414    1.399 378.568
+172.20.12.35    .GPS.            1 u   17   64  377    5.545    1.343 654.863
+172.20.12.40    .GPS.            1 u   17   64  377    4.283    1.965 534.544
x172.20.12.50    .GPS.            1 u   28   64  377    5.678  -998.73 844.263
+172.20.16.10    .GPS.            1 u   30   64  377    6.112    1.048 378.634
#205.189.158.228 209.87.233.53    3 u   16   64  377   27.839  111.925  35.568
#184.107.229.26  209.51.161.238   2 u   26   64  377   27.297  125.874  36.176
-208.69.56.110   209.51.161.238   2 u   27   64  377   34.639  120.508  35.971
#199.85.124.148  209.87.233.53    3 u   12   64  377   36.132  123.903  36.155
 127.127.1.0     .LOCL.          10 l  840   64    0    0.000    0.000   0.000
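
The suspect offsets in the listings above all sit within a few milliseconds of a whole second, so they are easy to pick out mechanically. Here is a rough Python sketch that parses `ntpq -pn` output and reports the peers whose offset is close to a nonzero multiple of 1000 ms. The rows are copied from the first TIME-SRV-B example above; the function names and the 5 ms tolerance are arbitrary choices for illustration.

```python
# Rows copied from the first of the two TIME-SRV-B listings above.
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 172.19.10.4     LOCAL(0)        11 u   38   64    1    5.717  -174.29  16.433
x172.20.8.10     .GPS.            1 u   46   64  177    6.779  -999.27 925.589
+172.20.8.20     .GPS.            1 u   48   64  177   10.386   -1.088 533.666
*172.20.8.30     .GPS.            1 u   40   64  177    5.412    1.420 378.064
x172.20.12.5     .GPS.            1 u   45   64  377    9.923  -1000.8 926.336
-205.189.158.228 209.87.233.53    3 u   12   64  377   28.576   95.607  35.210
"""

TALLY_CODES = "x.-+#*o"


def parse_peers(text):
    """Yield (tally, remote, offset_ms) for each peer row of `ntpq -pn` output."""
    for line in text.splitlines():
        fields = line.split()
        # Peer rows have ten columns; skip the header and the ==== rule.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        first = fields[0]
        tally = first[0] if first[0] in TALLY_CODES else " "
        remote = first[1:] if tally != " " else first
        yield tally, remote, float(fields[8])


def whole_second_outliers(text, tolerance_ms=5.0):
    """Return (remote, seconds) for peers whose offset is near a whole second."""
    flagged = []
    for _tally, remote, offset_ms in parse_peers(text):
        seconds = round(offset_ms / 1000.0)
        if seconds != 0 and abs(offset_ms - seconds * 1000.0) <= tolerance_ms:
            flagged.append((remote, seconds))
    return flagged


if __name__ == "__main__":
    for remote, seconds in whole_second_outliers(SAMPLE):
        print(f"{remote}: offset is about {seconds:+d} s")
```

Feeding it successive snapshots from both hosts would make it easier to see whether the same receivers are flagged on A and B at the same moment.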

