[time-nuts] Allan variance by sine-wave fitting

Attila Kinali attila at kinali.ch
Wed Nov 22 22:51:36 UTC 2017


Hi,

On Wed, 22 Nov 2017 07:58:19 -0800
Ralph Devoe <rgdevoe at gmail.com> wrote:

>       I've been working on a simple, low-cost, direct-digital method for
> measuring the Allan variance of frequency standards. It's based on a
> Digilent oscilloscope (Analog Discovery, <$300) and uses a short Python
> routine to get a resolution of 3 x 10^-13 in one second. This corresponds
> to a noise level of 300 fs, one or two orders of magnitude better than a
> typical counter. The details are in a paper submitted to the Review of
> Scientific Instruments and posted at arXiv:1711.07917 .

I guess you are asking for comments, so here are some:

> The DD method has minimal dependence on analog circuit issues, but
> requires substantial hardware and firmware to perform the I/Q
> demodulation, synthesize the local oscillators, and process the
> data through the digital logic.

This is not true. The logic needed for the mixing process is minimal
and easily done [1]. It is actually so fast that you can do it on
your PC in real time as well, i.e. you could just pipe the 100 Msps
your system gets into your PC and do the DD method there. Though
that would be a huge waste of resources: it's much more efficient
to have the tiny bit of logic in an FPGA (which you need anyway
to interface the ADC to the PC) and do the down-conversion and
decimation there.
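
To illustrate how little computation that is, here is a minimal
down-conversion sketch in Python/numpy. The sample rate, carrier and
decimation factor are illustrative values, not taken from your setup:

    import numpy as np

    fs = 100e6                              # ADC sample rate, 100 Msps
    f0 = 10e6                               # carrier frequency
    R  = 1000                               # decimation factor (made up)

    n = np.arange(1_000_000)
    x = np.sin(2*np.pi*f0/fs*n)             # stand-in for the raw ADC samples

    lo = np.exp(-2j*np.pi*f0/fs*n)          # complex local oscillator
    bb = x * lo                             # mix down to baseband I/Q

    # crude low-pass + decimation: average non-overlapping blocks of R samples
    bb = bb[:len(bb)//R*R].reshape(-1, R).mean(axis=1)

    phase = np.unwrap(np.angle(bb))         # phase vs. time, in radians

This is a few multiply-accumulates per sample, exactly the kind of
work an FPGA does trivially.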

Additionally, your method has to synthesize a local signal as well,
and not just once, but many times until a fit is found.

> The limiting ADEV of that method is comparable to that of DMTD systems,
> with commercial devices specifying ADEV of 10^-13 to 10^-15 for t = 1 sec[10].

An order of magnitude is not comparable; it's a huge difference in this
area. Also, I am surprised you haven't referenced [2], which has been
discussed a few times here on the time-nuts mailing list. The SDR method
Sherman and Jördens used is, as far as I am aware, the current record
holder in terms of phase-difference measurement precision.


> The resolution derives from the averaging inherent in a least-square fit:
> a fit of O(10^4) points will enhance the resolution for stationary random
> noise processes by a factor O(10^2). 

This is a very cumbersome way of writing that if you average over
a Gaussian i.i.d. random variable, the standard deviation goes
down with the square root of the number of samples.
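
A two-line numpy demonstration of that scaling (sizes are made up):

    import numpy as np

    x = np.random.randn(1000, 10_000)   # 1000 experiments, 10^4 i.i.d. samples
    print(x.mean(axis=1).std())         # ~ 1/sqrt(10^4) = 0.01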

Please note that you are making the inherent assumption that your
random variable is Gaussian i.i.d. This assumption is _not_ true
in general, and especially not in high-precision measurements.
You need to state explicitly why you think your random variable (the
noise) is i.i.d., and you need to verify it later.

Also, writing O(10^4) is not how O-notation works; see [3] for details.
People I work with would consider this not just sloppy notation, but
plain wrong. So please be careful about it.


> Since the fit phase depends on the values of the signal over many cycles,
> it is insensitive to artifacts near the zero-crossing and to dc offsets.

This is not true. The quality of the least-squares fit of a sinusoid
depends to quite a large extent on the correct estimation of the DC
offset. Hence low-frequency noise on the signal will deteriorate the
quality of the fit.


> Data analysis was performed by a Python routine which read each file and
> performed a least-squares fit of the unknown and reference signals to a
> function of the form
>                   S(t) = A sin(2*pi*f0*t + phi) + epsilon
> where A is the amplitude, phi the phase, f0 the frequency, and epsilon the
> DC offset. The fitting routine used was the curve fit method of the
> scipy.optimize library. The fit finds the values of these parameters which
> minimize the residual R defined by
>              R = (1/N) * sum_{i=0}^{N} |D(i) - S(i*t_c)|^2


You are doing a least-squares fit of a non-linear model. Yes, one can
do that, but it is not going to be good. Your objective function is
definitely not well chosen for the task at hand. Due to the way LS
works, the estimate quality of the amplitude will have an effect on the
estimate quality of your phase. Or in other words, amplitude noise will
result in an estimation error of your phase, in addition to the phase
noise you already have. I.e. you have an inherent AM-noise to PM-noise
conversion. You want to change your objective function such that it
rejects amplitude noise as much as possible. The DD and DMTD approaches
do this by the use of a phase detector.
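
To make this AM-to-PM conversion concrete, here is a small Monte Carlo
sketch of the fit as described above (scipy.optimize.curve_fit; f0 is
held fixed for simplicity, and all numbers are made up). It injects pure
amplitude noise, with zero phase noise, and still gets scatter in the
fitted phase:

    import numpy as np
    from scipy.optimize import curve_fit

    f0, fs, M = 10e6, 100e6, 4096
    t = np.arange(M) / fs

    def model(t, A, phi, eps):
        return A * np.sin(2*np.pi*f0*t + phi) + eps

    rng = np.random.default_rng(0)
    phis = []
    for _ in range(200):
        am = 1 + 1e-3 * rng.standard_normal(M)   # 0.1% amplitude noise, no PM
        d = am * np.sin(2*np.pi*f0*t)            # the true phase is exactly zero
        p, _ = curve_fit(model, t, d, p0=[1.0, 0.0, 0.0])
        phis.append(p[1])

    print(np.std(phis))   # nonzero: amplitude noise leaked into the phase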


> The inset shows a histogram of the data which follows a Gaussian
> distribution with a standard deviation sigma of 220 fs. This is somewhat
> greater than the timing jitter specified for this ADC which is
> ~= 140 fs. Note that during this run, which lasted for 1000 sec, the
> phase of each channel varied over the full 100 nS period of
> the inputs, while their difference had a sigma = 220 fs, a ratio of
> 3e-6.


Note that the aperture jitter for the ADC is an _absolute_ value, which
is larger than the relative aperture jitter between the two channels.
The reason why your jitter is so large could also be a noisy local
oscillator, which does not fully cancel out.

> One of the simplest methods of estimating the power of a fitting algorithm
> is to compute the Cramers-Rao lower bound[14]. In this case we want to 
> estimate the standard deviation of the phase sigma_phi resulting from a
> least-square fit to a noisy sinusoidal signal. Assume a sinusoidal signal
> of amplitude A with white Gaussian additive noise with a standard deviation
> sigma_A . Then the Cramers-Rao lower bound is given by
>     sigma_phi = (sigma_A*C)/(A*sqrt(M))


First: it's Cramér, not Cramers.
Second: Cramér-Rao only holds for unbiased estimators. You have not
given any indication why your estimator of phi should be unbiased.
Third: you are mixing phase noise and amplitude noise. It is true that,
if you add Gaussian i.i.d. noise on top of a signal, half of the noise
energy will end up in the phase noise. But you neither mention this nor
process it correctly. Instead you assume that the complete noise energy
ends up in phase noise. If you were doing an upper-bound estimation this
would work, but for a lower bound it's wrong.
And last but not least: you are using an equals sign where you have an
inequality. One could argue that this is sloppy notation, but as we see
below, it leads to an important mistake:

>         sigma ~= 1/(2*pi*f0*2^N*sqrt(M))

Suddenly, your inequality has become an approximation. What was a lower
bound before is now an estimate. This is a big mistake.
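
Spelled out in the notation of the quoted text (LaTeX):

    % what Cramér-Rao actually gives: a lower bound
    \sigma_\phi \ge \frac{C \,\sigma_A}{A \sqrt{M}}

    % what the paper silently turns it into: an estimate
    \sigma \approx \frac{1}{2\pi f_0 \, 2^N \sqrt{M}}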

> It is important to emphasize that this result is a statistical lower
> bound assuming only quantization noise. For the conditions of the above
> experiment, where N ~= 12, M = 4096, and f0 = 10 MHz, Eq. 6 yields
> sigma = 60 fs

Where does the N = 12 come from? You have a 14-bit ADC.
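
Just plugging numbers into Eq. 6 as quoted shows that the 60 fs only
falls out for N = 12:

    from math import pi, sqrt

    f0, M = 10e6, 4096
    for N in (12, 14):
        print(N, 1 / (2*pi*f0 * 2**N * sqrt(M)))
    # N=12: ~6.1e-14 s, the quoted 60 fs
    # N=14: ~1.5e-14 s, what the 14-bit ADC should give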


> The analysis program computes the difference of two uncorrelated fits
> for the signal and reference channels, so that the errors add in quadrature,
> giving a lower bound of 85 fs. This should be compared to the 220 fs measured
> value. Note that the differential timing jitter of the ADC[13] is ~140 fs,
> which may account for the difference.

You assume here that the noise on the two channels is not only i.i.d.,
but also independent of each other. This is a very strong assumption,
especially considering you have a dual-channel ADC where a lot of the
noise sources are coupled and shared between the ADCs. You can argue
that, due to this correlation in the noise, it cancels out when taking
the difference. But then the 60 fs would be the correct value.
Additionally, you measured the same input signal twice. If we assumed
the ADC to be ideal and to have no noise besides quantization (which is
what you have implicitly done), then you should get zero difference in
the phase estimates. Which reveals another flaw in your analysis.
But the main reason why your estimate does not fit what you see is that
you used a lower bound. And as the name implies, it is a lower bound.
If your measured error were smaller, that would be reason for concern.
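
A quick numerical illustration of the two extremes, using the 60 fs
per-channel figure from Eq. 6 and a made-up split between shared and
independent noise:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 100_000
    shared = rng.standard_normal(n)       # noise common to both channels
    a = rng.standard_normal(n)            # channel 1 private noise
    b = rng.standard_normal(n)            # channel 2 private noise

    indep = 60*a - 60*b                   # fully independent channels
    print(indep.std())                    # ~85 fs: errors add in quadrature

    corr = (60*shared + 10*a) - (60*shared + 10*b)
    print(corr.std())                     # ~14 fs: the shared part cancels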


> There is a simple graphical construction which provides a more intuitive
> explanation of Eq. 6. Consider first how a single ADC measurement can be
> used to predict the zero-crossing time of a sinusoid, which is equivalent
> to a phase measurement. Assume that the measurement is made during a linear
> part of the sine function, e.g., where the sine is between -0.5 and + 0.5 so
> that sin(x) ~= x. Further assume that the ADC measurement yields the integer 
> value n, that is, an amplitude n/2^(N-1) . The exponent N-1 arises since the
> N bits must cover both polarities. Then an extrapolation to the zero
> crossing at time t0 gives

Apart from limiting yourself to only quantization noise, this part of
the analysis is, as an estimate, correct. Though assuming, without any
further explanation, that half of the samples contribute to the
zero-crossing measurement is a bit iffy. The real value would be
sqrt(2)/2, btw (simplified to the point where it's almost wrong: half of
the estimation "power" goes into the phase, the other half goes into the
amplitude, hence sqrt(1/2) of the samples contribute to the phase).
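
The quoted construction as a quick computation, using the paper's
N = 12, M = 4096 and f0 = 10 MHz, with the averaging factor taken as
1/sqrt(M) (which is what the numbers quoted below imply) and the
sqrt(2)/2 refinement above left out:

    from math import pi, sqrt

    f0, N, M = 10e6, 12, 4096
    dt_single = (1 / 2**(N-1)) / (2*pi*f0)  # single-sample timing resolution
    print(dt_single)                        # ~7.8e-12 s, the "10 ps" in the text
    print(dt_single / sqrt(M))              # ~1.2e-13 s, near the quoted ~100 fs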

> This derivation separates the resolution into two factors:
> the single-shot timing resolution ~ 1/2^N and an averaging factor of 1/M .
> Since the predicted resolution of ~= 100 fs is about 10^-6 of the 100 nS
> period of f0 it is important to realize that ~ 10^4 of this (10 pS)
> is due to the ADC timing resolution and only ~10^2 is due to averaging.

Uhm... you are making a lot of implicit assumptions here about the
statistical properties of the noise involved. None of them are motivated
in any way, and for quite a few of them I am not sure whether they hold
true.

In general, your method works and does lead to decent results.
But it is not any more computationally efficient than the DMTD or DD
methods. What you do is hide a lot of the complexity in a scipy
function, which is neither efficient nor fast. Your noise/jitter
analysis lacks mathematical rigor and has a lot of implicit assumptions
that are neither clearly stated nor generally true. Panek did a
noise-to-jitter analysis in [4] in a setting that is not unlike yours.
Even though his analysis has problems as well (e.g. he does not account
for the number of samples taken), he is much closer to what is going on.
If you adapt his analysis to your case, you will get a much better result.


And a final nitpick: Please use s for seconds instead of sec.
Also, please note that the s is in lower case, i.e. it's ns and µs
not nS or µS. All SI units that are not derived from names are written
in lower case [5].


			Attila Kinali


[1] "The CORDIC Trigonometric Computing Technique*", by Jack E. Volder, 1959

[2] "Oscillator metrology with software defined radio", by Sherman and Jördens,
2016

[3] https://en.wikipedia.org/wiki/Big_O_notation

[4] "Random Errors in Time Interval Measurement Based on SAW Filter
Excitation", by Petr Panek, 2008

[5] https://en.wikipedia.org/wiki/International_System_of_Units#General_rules

-- 
You know, the very powerful and the very stupid have one thing in common.
They don't alter their views to fit the facts, they alter the facts to
fit the views, which can be uncomfortable if you happen to be one of the
facts that needs altering.  -- The Doctor


