[time-nuts] clock-block any need ?

Magnus Danielson magnus at rubidium.dyndns.org
Thu Jan 3 03:18:04 UTC 2013


On 02/01/13 02:57, Dennis Ferguson wrote:
>
> On 27 Dec, 2012, at 15:13 , Magnus Danielson<magnus at rubidium.dyndns.org>  wrote:
>> On GE, a full-length packet is about 12 us, so a single packet's head-of-line blocking can be anything up to that amount; multiple packets... well, it keeps adding. Knowing how switches work doesn't really help, as packets arrive at a myriad of rates; they interact and cross-modulate and create strange patterns and dance in interesting ways that are ever changing in unpredictable fashion.
>
> I wanted to address this bit because it seems like most
> people base their expectations for NTP on this complexity,
> as does the argument being made above, but the holiday
> intervened.  While I suspect many people are thoroughly
> bored of this topic by now I can't resist completing the
> thought.

Be advised that this was a short description of a much lengthier discussion.

> Yes, the delay of a sample packet through an output queue
> will be proportional to the number of untransmitted bits in
> the queue ahead of it, yes, the magnitude of that delay can
> be very large and largely variable and, even, yes, the
> statistics governing that delay may often be unpredictable and
> non-gaussian, exhibiting dangerously heavy tails.  The thing is,
> though, that this doesn't necessarily have to matter so much.  A
> better approach might avoid relying on the things you can't know.

It is hard to avoid the fundamental properties of transmission, at least 
once they have been made fundamental properties.

Recall that the queue length is quantized in steps, and that various 
"padding" (preamble sequence, header, trailer, postamble sequence) is 
added. An 8-bit quantization is safe to assume as the minimum step for 
GE, due to its 8B10B encoding format on the optical channel. For optical 
GE, the event resolution is therefore 8 ns.
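
As a quick sanity check on those numbers (a back-of-the-envelope sketch, 
not tied to any particular switch implementation):

# Back-of-the-envelope numbers for GE, assuming the usual 1 Gbit/s payload
# rate carried as 1.25 Gbaud 8B10B code groups on the fibre.
DATA_RATE = 1e9              # payload bits per second
BYTE_TIME = 8 / DATA_RATE    # one octet = 8 data bits = 10 line bits = 8 ns

# Worst-case head-of-line blocking from one maximum-size frame:
# 1518-byte frame + 8-byte preamble/SFD + 12-byte inter-frame gap.
FULL_FRAME_BYTES = 1518 + 8 + 12

print("event resolution: %.1f ns" % (BYTE_TIME * 1e9))            # 8.0 ns
print("max single-frame blocking: %.2f us"
      % (FULL_FRAME_BYTES * BYTE_TIME * 1e6))                     # ~12.3 us

which lines up with the "about 12 us" figure quoted above.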

> To see how, consider a different question: what is the
> probability that any two samples sent through that queue
> will experience precisely the same delay (i.e. find precisely
> the same number of bits queued in front of it when it
> gets there)?  I think it is fairly conservative to predict
> that the probability that two samples will arrive at a non-empty
> output queue with exactly the same number of bits in front of
> them will be fairly small; the number of bits in the queue will
> be continuously changing, so the delay through a non-empty queue
> should have a near-continuous (and unpredictable) probability
> distribution, as you point out, and if the sampling is uncorrelated
> with the competing traffic it is unlikely that any pair of
> samples will find exactly the same point on that distribution.

Yes and no. It is hard to do with a low asking rate, but some properties 
can improve with a high asking rate.

> The exception to this, of course, is a queue length of
> precisely 0 bits (which is precisely why the behaviour
> of a switch with no competing traffic is interesting).  The
> vast majority of queues in the vast majority of network
> devices in real networks are nowhere near continuously
> occupied for long periods.  The time-averaged fractional load
> on the circuit a queue is feeding is also the probability of
> finding the queue not-empty.  If the average load on the
> output circuit is less than 100% then multiple samples are
> probably going to find that queue precisely empty; if the
> average load on the output circuit is 50% (and that would be
> an unusually high number in a LAN, though maybe less
> unusual in other contexts) then 50% of the samples that pass
> through that queue are going to find it empty.  Since samples
> that found the queue empty will have experienced pretty much
> identical delays, the "results" (for some value of "result")
> from those samples will cluster closely together.  The
> results from samples which experienced a delay will
> differ from that cluster but, as discussed above, will also
> differ from each other and generally won't form a cluster
> somewhere else.  The cluster marks the good spot independent
> of the precise (and precisely unknowable) nature of the statistics
> governing the distribution of samples outside the cluster.  If
> we can find the cluster we have a result which does not depend
> on understanding the precise behaviour of samples outside the
> cluster.

Whenever you want to do this, you need to measure the network more 
furiously, thus the asking rate goes up.
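
To make the clustering point concrete, here is a toy simulation (made-up 
numbers: one output queue at a given fractional load, probes that find 
it empty see only the fixed floor delay, the rest pick up a heavy-tailed 
queueing delay on top):

import numpy as np

rng = np.random.default_rng(0)

floor_delay = 50e-6   # fixed propagation + serialization floor (made up)
load = 0.5            # time-averaged load on the output circuit
n = 10_000            # number of probe packets

# A probe finds the queue busy with probability `load`; if busy it picks up
# a heavy-tailed queueing delay on top of the floor, otherwise nothing.
busy = rng.random(n) < load
queueing = np.where(busy, rng.pareto(2.0, n) * 12e-6, 0.0)
delays = floor_delay + queueing

# The "cluster" is the spike of samples at (within one byte-time of) the floor.
cluster = delays[delays < floor_delay + 8e-9]
print("fraction in cluster: %.2f" % (len(cluster) / n))      # about 1 - load
print("cluster estimate of floor: %.3f us" % (cluster.mean() * 1e6))

The point is that the estimate built from the cluster does not care what 
distribution the busy samples came from.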

> Given this it is also worthwhile to consider "jitter", which
> intuition based on a normal distribution assumption might suggest
> should be predictive of the quality of the result derived from a
> collection of samples.  In the situation above, however, the
> dominant contributors to "jitter", however measured, are going
> to be the samples outside the cluster since they are the ones
> that are "jittering" (it is that property we are relying on to
> define the cluster).  If jitter mostly measures information
> about the samples the estimate doesn't rely on then it tells you
> little about the samples the estimate does rely on, and hence
> can provide no prediction about the quality of an estimate
> derived from those samples alone.  In fact, in a true perversion
> of normal intuition, high jitter and heavy-tailed probability
> distributions might even make it easier to get a good result
> by making it easier to identify the cluster.  Saying "I see
> a lot of jitter" doesn't necessarily tell you anything about
> what is possible.

I think one has to realize that what queues and scheduling do to packet 
delays defies the normal "jitter" statistics quite a bit.

The delay varies, and the properties of the delay vary; it is an 
ever-shifting property. There are, however, a few known properties of 
this "jitter". For one thing, it always increases the delay (assuming 
that we do not change path in the network).
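
That non-negativity is what the minimum-type pre-filters exploit. A 
minimal sketch of the idea (my own illustration, not any particular 
implementation): treat each observed delay as d_i = d_min + q_i with 
q_i >= 0 and track the minimum over a sliding window:

from collections import deque

def sliding_min(delays, window):
    """Running minimum over the last `window` delay samples.

    Queueing can only add delay, so the window minimum tends toward the
    fixed propagation + serialization floor, provided at least one sample
    in the window found the queues empty.
    """
    buf = deque(maxlen=window)
    out = []
    for d in delays:
        buf.append(d)
        out.append(min(buf))
    return out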

> While the argument gets a lot more complex in a hurry, and
> too much to attempt here (the above is too much already), I
> believe this general approach can scale to a whole large network
> of devices with queues (though even the single-switch case has real
> life relevance too).  That is, I think it is possible to find a
> sample "result" for which there is a strong tendency for "good"
> samples to cluster together while "bad" samples are unlikely to do
> so, with the quality of the result depending on the population and
> nature of variability of the cluster but hardly at all on the
> outliers, and with the lack of a measurable cluster telling you
> when you might be better off relying on your local clock rather
> than the network.  The approach relies on the things we do know
> about networks and networking equipment while avoiding reliance on
> things we can't know: it mostly avoids making gaussian statistical
> assumptions about distributions that may not be gaussian.  The field
> of robust statistics provides tools addressing this which might
> be of use.

It just isn't a good set of tools. This is why lots of effort has been 
put into research. A few search terms for you: min-TDEV and MAFE

min-TDEV is one of a number of algorithms in which a block-min 
pre-filter is applied prior to the TDEV measure. As the number of 
samples in each block increases, the TDEV measure lowers.
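
A rough sketch of the block-min idea, assuming evenly spaced time-error 
samples x (in seconds); the exact min-TDEV definitions in the standards 
work differ in detail, so treat this as illustrative only:

import numpy as np

def block_min(x, block):
    """Pre-filter: keep the minimum of each non-overlapping block of samples."""
    m = len(x) // block
    return np.array([np.min(x[i * block:(i + 1) * block]) for i in range(m)])

def tdev(x, n):
    """Time deviation TDEV(n*tau0) from time-error samples x; needs len(x) > 3*n."""
    N = len(x)
    sums = [np.sum(x[j + 2 * n:j + 3 * n] - 2 * x[j + n:j + 2 * n] + x[j:j + n])
            for j in range(N - 3 * n + 1)]
    return np.sqrt(np.mean(np.square(sums)) / (6.0 * n ** 2))

# e.g. pre-filter raw packet time errors, then evaluate TDEV on the block minima:
# tdev(block_min(raw_x, 16), n=10)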

There are also cluster and percentile approaches being looked at, but 
the common trend here is that the asking rate becomes higher, much higher.

> I guess it is worth completing this by mentioning what it
> says about ntpd.  First, ntpd knows all of the above, probably
> much, much better than I do, though it might not put it in
> quite the same terms.

Yes and no. NTPD implements impressive filtering. However, it sends far 
too few packets to probe the network delays for the filters to eat down 
through the jitter enough. PTP allows for higher asking rates, and that 
is one of the things it has going for it compared to NTP.
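
For a sense of scale (an illustrative ntp.conf fragment with an example 
server name, not a recommendation): ntpd probes each server every 
2^minpoll to 2^maxpoll seconds, so even pushed toward the low end you 
get one packet exchange every 16 to 64 seconds, while PTP profiles 
commonly run several Sync messages per second.

# Hypothetical ntp.conf fragment: push the poll interval toward its floor.
# 2^4 = 16 s ... 2^6 = 64 s between exchanges with this (example) server.
server ntp.example.org iburst minpoll 4 maxpoll 6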

>  If you make the assumption that the
> stochastic delays experienced by samples are evenly distributed
> between the outbound and inbound paths (this is not a good match
> for the real world, by the way, but there are constraints...) then
> round trip delay becomes a stand-in measure of "cluster", and ntpd
> does what it can with this.

The wedge dispersion plot is nice. The top and bottom edges of the wedge 
hold the minimum one-way delay samples in the inbound and outbound 
directions respectively. It's not a bad solution, but it needs more 
samples to chew on.
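
For reference, the wedge is just the scatter of clock offset against 
round-trip delay computed from the four timestamps of each exchange; a 
minimal sketch of that arithmetic:

def offset_and_delay(t1, t2, t3, t4):
    """Standard NTP estimates from the four timestamps (seconds).

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # clock offset estimate
    delay = (t4 - t1) - (t3 - t2)            # round-trip network delay
    return offset, delay

Plotting offset against delay for many exchanges gives the wedge; the 
true offset lies within +/- delay/2 of the estimate, and the low-delay 
samples at the edges are the ones where the queues were empty in at 
least one direction.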

>  The fundamental constraint that limits
> what ntpd can do, in a couple of ways, is the fact that the final
> stage of its filter is a PLL.

That is the traditional view, yes.

>  The integrator in a PLL assumes
> that the errors in the samples it is being fed are zero-mean and
> normally distributed, and will fail to arrive at a correct answer if
> this is not the case, so if you want to filter samples for which
> this is unlikely to be the case you need to do it before they get
> to the PLL.  The problem with doing this well, however, is that a
> PLL is also destabilised by adding delays to its feedback path,
> causing errors of a different nature, so anything done before the
> PLL is severely limited in the amount of time it can spend doing
> that, and hence the number of samples it can look at to do that.
> Doing better probably requires replacing the PLL; the "replace
> it with what?" question is truly interesting.

The integrator does not expect zero-mean samples. Its infinite gain at 
DC drives the detector to produce zero-mean samples. If a set of samples 
whose average is not zero comes in, their DC content steers the 
integrator state such that the frequency shifts, the phase ramp chases 
the offset, and the phase detector starts producing zero-mean samples 
again. These are the properties of the PI-style PLL being used. It's how 
it should be.
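
A toy sketch of that behaviour (my own illustration, not ntpd's actual 
discipline code): a persistent non-zero-mean phase error keeps shifting 
the integrator (frequency) state until the residual error comes back to 
zero mean:

def pi_step(phase_err, integ, kp=0.1, ki=0.01):
    """One update of a toy PI clock discipline.

    integ is the accumulated frequency correction (the integrator); its
    infinite gain at DC means any persistent non-zero-mean phase error
    keeps adjusting the frequency until the error is driven back to zero
    mean. Gains kp and ki are placeholders.
    """
    integ += ki * phase_err                 # integral (frequency) term
    correction = kp * phase_err + integ     # proportional + integral
    return correction, integ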

To your point, the unstable delay as measured by NTP causes the phase to 
wobble around. Long-term frequency is actually safe; the long span of 
time-stamps ensures that. Phase and frequency stability, however, are 
affected. It's not the PI-loop that is the culprit, but the instability 
of the measure. A Kalman filter for timing turns out to be quite near a 
self-tuned PI-loop, BTW. If you want to combat this noise, you need to 
do it with some model of it and some means to create a quieter product. 
Some of that is in the public, some of it isn't.
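
For what it's worth, a two-state (phase, frequency) Kalman filter 
settles to fixed gains on the phase innovation, which is structurally 
the same proportional-plus-integral correction; a rough sketch under 
that simple clock model (process and measurement noise values are 
assumptions to be tuned):

import numpy as np

def kalman_clock_step(x, P, z, dt, q_phase, q_freq, r):
    """One predict/update step for a [phase, frequency] clock model.

    x: state [phase_offset, freq_offset], P: 2x2 covariance,
    z: measured phase offset, r: measurement noise variance,
    q_phase/q_freq: process noise intensities (assumed, tune to taste).
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])            # phase advances by freq*dt
    Q = np.array([[q_phase * dt, 0.0], [0.0, q_freq * dt]])
    H = np.array([[1.0, 0.0]])                       # we only observe phase

    # Predict
    x = F @ x
    P = F @ P @ F.T + Q

    # Update: the gain K plays the role of the (self-tuned) P and I gains.
    S = H @ P @ H.T + r
    K = P @ H.T / S
    x = x + (K * (z - x[0])).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P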

As for delay in the feedback path, this has been systematically 
investigated, and there is a lovely paper that shows that to maintain 
the same damping, the bandwidth needs to go down as the delay goes up. 
If you have low enough bandwidth, you need to trim your damping 
coefficient instead. It's not a flaw in the traditional PI PLL; it's 
just that the property was not taken into account, and hence the wrong 
model and stability analysis were applied to the situation. Do the 
homework and you get back to safe ground.
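
To put a number on that trade-off (assumed loop gains, not the values 
from that paper): the feedback delay Td adds a phase lag of omega*Td at 
the loop's crossover, eating into the phase margin as the 
bandwidth-delay product grows:

import numpy as np

def phase_margin_deg(kp, ki, delay):
    """Phase margin of the open loop (kp + ki/s) * (1/s) * exp(-s*delay).

    Evaluated numerically at the gain-crossover frequency; kp, ki and
    delay are illustrative values only.
    """
    w = np.logspace(-4, 2, 200000)
    mag = np.abs(kp + ki / (1j * w)) / w          # the delay has unit magnitude
    wc = w[np.argmin(np.abs(mag - 1.0))]          # gain crossover frequency
    phase = np.angle(kp + ki / (1j * wc)) - np.pi / 2 - wc * delay
    return np.degrees(phase) + 180.0

for d in (0.0, 1.0, 5.0):
    print("feedback delay %4.1f s -> phase margin %5.1f deg"
          % (d, phase_margin_deg(0.1, 0.01, d)))

With the bandwidth held constant, the margin (and hence the effective 
damping) drops as the delay grows, which is exactly why the bandwidth 
has to come down instead.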

> I suspect I've gone well off topic for this list, however, and for
> that I apologize.  I just wanted to make sure it was understood that
> there is an argument for the view that we do not yet know of any
> fundamental limits on the precision that NTP, or a network time
> protocol like NTP, might achieve, so any effort to build NTP servers
> and clients which can make their measurements more precisely is not
> a waste of time.  It instead is what is required to make progress
> in understanding how to do this better.

I think you have misunderstood my intentions here. NTP isn't a bad 
build, it's quite impressive. There are a number of things to improve on 
it. PTP has taken a lead in some fields, but is lagging behind NTP in 
others. NTP is just not operating in touch with what I believe is the 
state of the art in packet delay measurements for timing. There are 
several things that would need to be changed in NTP for it to compete 
well; some of them are in what the standard says, others lie in how the 
standard or the system is being used. You can do a lot more within the 
realm of NTP, but some of the design decisions previously made would 
have to be scrapped.

Cheers,
Magnus



