[time-nuts] Tracking NTP displacement and correlation between two clients.
hmurray at megapathdsl.net
Fri Oct 5 18:26:14 EDT 2012
bownes at gmail.com said:
> The problem is that they start in sync and over the course of a day drift
> that far apart despite having NTP running. We're not sure why NTP isn't
> correcting it along the way. Though at this point, we are looking at a
> firmware bug.
I wouldn't think of it as two systems drifting apart, but rather at least one
system with a broken clock.
Is it only one system that is broken?
How many systems do you have running the same firmware? OS? Hardware?
Are the two systems that drift apart running on the same hardware and OS?
Do any other similar systems have troubles?
I wouldn't rule out an OS or ntpd bug.
It's fairly easy to set up a system to monitor the time on several/many other
systems. For each system you want to monitor, add a line like this to your
ntp.conf:
  server xxxx noselect minpoll x maxpoll y
ntpq -p will quickly show you any boxes that are way out of tune. Anything
off by a second will stand out. Or scan rawstats or peerstats.
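If you'd rather scan peerstats than eyeball ntpq, something like this rough sketch would do. It assumes the usual peerstats layout (MJD, seconds past midnight, peer address, status word, offset, delay, dispersion, jitter, with offset in seconds). The function name, the 1-second threshold, and the sample lines are made up for illustration; they are not real log data.

```python
def find_outliers(lines, threshold_s=1.0):
    """Return (peer_address, offset_seconds) for each record whose
    absolute offset exceeds threshold_s. Malformed lines are skipped."""
    hits = []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # not a complete peerstats record
        try:
            offset = float(fields[4])  # 5th field: clock offset in seconds
        except ValueError:
            continue
        if abs(offset) > threshold_s:
            hits.append((fields[2], offset))  # 3rd field: peer address
    return hits


# Hypothetical sample records, not taken from a real system.
sample = [
    "56205 80774.123 192.0.2.10 9014 0.000412 0.000801 0.003921 0.000102",
    "56205 80790.456 192.0.2.11 9014 -3.271845 0.000912 0.004013 0.001544",
]

for peer, offset in find_outliers(sample):
    print(f"{peer} is off by {offset:+.3f} s")
```

In practice you would pass it an open peerstats file instead of the sample list; where that file lives depends on the statsdir setting in your ntp.conf.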
noselect goes through all the work of polling the target site, including
logging, but then discards the data rather than using it to control the local
clock. It's great for monitoring other systems.
Normally, if ntpd is off by more than 128 ms, it will step the clock. That
puts a line in the log file. So it's more than a bit strange that the clocks
get off by many seconds.
I'd double check that ntpd really is still running.
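To put rough numbers on why many seconds of error looks wrong, here is a small back-of-the-envelope sketch. The 128 ms step threshold is the one mentioned above. The 5 seconds per day figure is a made-up example, and the 500 ppm limit is my addition (ntpd's usual maximum frequency correction), so treat that part as an assumption about your configuration.

```python
STEP_THRESHOLD_S = 0.128   # default step threshold discussed above
SECONDS_PER_DAY = 86400
MAX_SLEW_PPM = 500.0       # assumed: ntpd's usual frequency correction limit


def drift_ppm(error_s, interval_s=SECONDS_PER_DAY):
    """Average frequency error (ppm) that accumulates error_s over interval_s."""
    return error_s / interval_s * 1e6


def would_step(offset_s):
    """True if an offset this large is beyond the step threshold."""
    return abs(offset_s) > STEP_THRESHOLD_S


error = 5.0  # hypothetical: clocks 5 seconds apart after one day
rate = drift_ppm(error)
print(f"{error} s/day is about {rate:.1f} ppm")
print("beyond step threshold:", would_step(error))
print("within assumed slew range:", rate <= MAX_SLEW_PPM)
```

A drift of that size is well inside what a running ntpd should be able to correct, and the offset is far past the point where it should have stepped and logged it. That is why a stopped or wedged ntpd is worth ruling out first.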
Are your drift-apart systems using only your 2 local stratum-2 servers? If
so, that may be the problem. If those servers don't agree, which one do you
believe? (There is endless discussion in the NTP community about how many
servers you need. 3 lets you out-vote 1 bad guy. 4 lets you out-vote a bad
guy if one of them is down. ...)
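The out-voting argument can be illustrated with a toy majority check. This is not ntpd's actual selection algorithm (which works on intervals of offset plus error bounds); it is a simplified sketch in which servers "agree" if their offsets are within an arbitrary 10 ms tolerance. All names and numbers are hypothetical.

```python
def largest_agreeing_group(offsets, tolerance_s=0.010):
    """offsets maps server name -> offset in seconds, or None if unreachable.
    Returns (names in the largest mutually-near group, count of live servers)."""
    live = {name: off for name, off in offsets.items() if off is not None}
    best = []
    for anchor in live.values():
        group = sorted(n for n, off in live.items()
                       if abs(off - anchor) <= tolerance_s)
        if len(group) > len(best):
            best = group
    return best, len(live)


def verdict(offsets):
    """Return the majority group, or None if no strict majority agrees."""
    group, live = largest_agreeing_group(offsets)
    if len(group) * 2 > live:
        return group
    return None


# Two servers that disagree: nobody can be out-voted.
print(verdict({"a": 0.000, "b": 2.500}))
# Three servers: the two good ones out-vote the bad one.
print(verdict({"a": 0.001, "b": 0.002, "c": 2.500}))
# Four servers with one down: still enough to out-vote one bad guy.
print(verdict({"a": 0.001, "b": None, "c": 0.002, "d": 2.500}))
```

With only two servers the first case is the one you are stuck in whenever they disagree.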
These are my opinions. I hate spam.