[time-nuts] question on apparent offset between two Endrun CDMA clocks

starlight at binnacle.cx starlight at binnacle.cx
Mon Sep 19 19:45:35 UTC 2011


At Mon Sep 19 05:23:43 UTC 2011, Hal Murray wrote:
>
>>I have a minor mystery and if any ready explanation is 
>>available I would be curious to learn of it.
>>
>> Have two Endrun CDMA time sources: 
>>
>>     ntpd 4.1.1c-rc2 at 1.866
>
>That's quite old.

Ancient, in fact.  :-)

But I got it for $300 a couple of years ago.

>>'ntpq -pn' and 'ntpdate -q' queries from both systems
>>and from other client systems consistently show the
>>Praecis Cntp between 30 and 80 microseconds
>>ahead of the Praecis Cf device.  The median offset
>>seems to be about 60 microseconds.
>
>You might ask Endrun.  (If you get a good answer,
>please tell us.)

They essentially agreed with my theory of code-path
asymmetry amplified by a very slow CPU, along with the
possibility that HDX carrier switching on the Cntp
might be a distorting factor as well.  They told me
to check the PseudoNoise Offset (PNO) value, and this
proved that both CDMA clocks are synced to the same
tower and therefore should read within one or two
microseconds of each other.

>>Round trip time to the Cntp is about 960 microseconds and 
>>round-trip time to the P4/Cf is about 60 microseconds.
>>'ntpdate -d' shows that the Cntp takes a total of 3600
>>microseconds to service a request where the P4 takes 65
>> microseconds.
>
>[I'm not sure what it means to have 3600 uSec service time with 
>960 uSec round trip time.]

I'm pretty sure it means that it takes ~500
microseconds in each direction for network and
IP stack traversal and ~2500 microseconds of CPU
time to process the request on the fabulously slow
Am5x86-WB 133 MHz processor.

>I'm not all that surprised that two similar setups are off by 
>something like  60 microseconds.  60 out of 3600 is pretty small.

Yes.  Hence my amplified code-path asymmetry theory.
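
To put a number on the asymmetry theory, here is a minimal
sketch of the standard NTP offset and delay arithmetic on the
four packet timestamps.  The 540/420 microsecond split of the
960 microsecond round trip is a made-up illustration, not a
measurement, and simulate() is a hypothetical helper.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1 = client transmit, t2 = server receive,
    t3 = server transmit, t4 = client receive.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate(true_offset, out_path, back_path, service):
    """Hypothetical helper: build the four timestamps for one exchange.

    All values are in microseconds.  out_path and back_path are the
    one-way times (network plus stack) in each direction, service is
    the time the server holds the packet between its two stamps.
    """
    t1 = 0.0
    t2 = t1 + out_path + true_offset            # stamped by server clock
    t3 = t2 + service                           # stamped by server clock
    t4 = t1 + out_path + service + back_path    # stamped by client clock
    return ntp_offset_delay(t1, t2, t3, t4)


# Assumed numbers: clocks agree exactly, but the request path takes
# 120 microseconds longer than the reply path.
offset, delay = simulate(0.0, 540.0, 420.0, 2500.0)
print(offset, delay)
```

With those assumed numbers the server appears 60 microseconds
ahead while the delay still comes out at 960.  The apparent
offset is half the path asymmetry, and time spent between the
two server stamps cancels out.  Only time spent outside that
window (before the receive stamp, after the transmit stamp)
counts as path.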

>One likely source of quirks is interrupt coalescing which is 
>common on Gigabit ethernet chips.  (That's probably not
>happening if the round trip time is only 60 microseconds.)

Nope, disabled:
options e1000 InterruptThrottleRate=0,0 FlowControl=0,0 copybreak=2048

>ntpq -p and ntpdate give you two different sorts of data.
>
>ntpq tells you what the target thinks is going on.
>ntpdate tells you the difference between the local clock and the 
>target clock.

Yes, knew that but thanks.

>If I wanted more info on this type of quirk, I would
>setup ntpd on a monitoring system using both systems
>as servers and turn on  rawstats.  Use the noselect
>keyword if you don't really want to use it as a server.

Have 'noselect' on everything that is not a
precision time source.  Doesn't seem to make any
difference from stratum 2 relative to the two
CDMA-synced clocks, so there I just have a 'prefer'
on the Cntp even though the P4 probably has the
better time; the P4 is sometimes busy doing other
work and may wander.
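
For the rawstats route, a minimal sketch of turning one
rawstats line into offset and delay.  It assumes the documented
layout (MJD, seconds past midnight, peer address, local address,
then the originate, receive, transmit and destination
timestamps); newer ntpd versions append more fields, which are
ignored here.  The sample line and its addresses are invented.

```python
from decimal import Decimal


def parse_rawstats_line(line):
    """Return (peer, mjd, secs, offset, delay) for one rawstats line.

    Decimal is used because NTP-era timestamps carry about 19
    significant digits, more than a float holds exactly.
    """
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("short rawstats line: %r" % line)
    mjd = int(fields[0])
    secs = Decimal(fields[1])
    peer = fields[2]
    # originate, receive, transmit, destination timestamps
    org, rec, xmt, dst = (Decimal(f) for f in fields[4:8])
    offset = ((rec - org) + (xmt - dst)) / 2
    delay = (dst - org) - (xmt - rec)
    return peer, mjd, secs, offset, delay


# Invented sample line, not taken from a real log.
sample = ("55823 71135.000 192.168.1.10 192.168.1.2 "
          "3525450335.000000000 3525450335.000540000 "
          "3525450335.003040000 3525450335.003460000")
peer, mjd, secs, offset, delay = parse_rawstats_line(sample)
print(peer, offset, delay)
```

Running this over the rawstats entries for both servers and
plotting offset against time per peer gives the kind of graph
Hal describes.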

>Set minpoll and maxpoll to get the desired
>amount of data.

Always lock the local systems to min/max = 4
(16 seconds), except for the Windows boxes which
are locked to min/max = 0 (1 second).  A quick
experiment showed that Windows keeps much better
time via the network at one second polling than
with 16 second polling of the Ct via a real
(not USB) serial port.  Didn't bother hacking
the Trimble clock driver to poll at one second
since it's clear the problem with Windows is
the poor stability of the multimedia timer
interpolation and that won't ever get fixed.
So no point in directly attaching time sources
under Windows.

>Then graph the results.
>(Poke me off-list if you want more details.)
>
>If you have access to ntp.conf on the servers, turn on 
>loopstats, peerstats, and clockstats.  They might show
>something interesting.

Have done this in the past; too much trouble
at present as I'm pretty sure nothing exciting
will appear.

A while ago the graphs did help find a glitch,
once or twice a day, with the Cntp where the
departure timestamp was set before the arrival
timestamp.  The glitch would biff 'ntpd' into
wild jump and oscillation events till I patched
it to discard insane replies.
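
The check amounts to something like the sketch below.  This is
not the actual patch to 'ntpd' (which is C); it is a Python
illustration that assumes "insane" means the server's transmit
stamp is earlier than its receive stamp.  is_sane_reply() is a
hypothetical name.

```python
def is_sane_reply(t1, t2, t3, t4):
    """Reject replies whose timestamps cannot be in causal order.

    t1 = client transmit, t2 = server receive,
    t3 = server transmit, t4 = client receive.
    t2 and t3 come from the server clock, t1 and t4 from the
    client clock, so only same-clock pairs are compared.
    """
    if t3 < t2:
        return False    # server "sent" the reply before receiving the request
    if t4 < t1:
        return False    # client "received" the reply before sending the request
    return True


# Illustrative values in microseconds.
replies = [
    (0.0, 540.0, 3040.0, 3460.0),   # normal exchange
    (0.0, 540.0, 500.0, 3460.0),    # glitch: departure before arrival
]
good = [r for r in replies if is_sane_reply(*r)]
print(len(good))
```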

----------

My present plan is to put up a Trimble Palisade
(Accutime 2000 actually) running on a third system
with PPS for a higher accuracy reference point.  Bit
of a pain as I don't have the converter box and
will have to buy an RS-422 interface, solder up a
custom connector, and back-port Rodolfo Giometti's
PPS kernel patch to CentOS 5's 2.6.18 (can't abide
poorly performing newer kernels running on the box best
suited for the GPS source).  Good project for days
when I can't stand doing real work.




