[time-nuts] The need for quartz crystals and mains frequency (was: Mains Frequency)

Sat Feb 13 01:55:34 UTC 2021

Hi Andy and Attila,

On 2021-02-13 01:27, Attila Kinali wrote:
> On Fri, 12 Feb 2021 18:23:54 +0000
> Andy Talbot <andy.g4jnt at gmail.com> wrote:
>
>> Why should the microcontroller have a crystal at all?
> Because you need accurate time or frequency.

Often stability is the issue. Depending on what you do, either or both
may be significant.

Also, these days it need not be a crystal; other forms of reference may
be used, such as SAW and MEMS devices.

Size, power needs and a relatively high Q of the device are the relevant
parameters. Sufficient stability against environmental effects, such as
those described in IEEE Std 1193, also kicks in, but that is a variant
of the accuracy and stability issue. The Q is relevant because, as
explained by the Leeson model, it is key to the phase noise of the
resulting oscillator, along with the white and flicker noise of the
oscillator. There are a number of details in how that plays out, but we
are on the hand-waving part of it here, just to roughly show the shapes
of how things fit together.
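
To make the role of Q a bit more concrete, here is one common textbook
form of the Leeson model. The symbols are the usual textbook ones and
are not taken from this thread: f_0 is the carrier frequency, f_m the
offset frequency, Q_L the loaded Q, F the noise factor of the sustaining
amplifier, k Boltzmann's constant, T the temperature, P_s the signal
power and f_c the flicker corner frequency. Treat it as a sketch of the
shape, not as a complete oscillator noise model.

```latex
\mathcal{L}(f_m) = 10 \log_{10}\!\left[
    \frac{F k T}{2 P_s}
    \left( 1 + \left( \frac{f_0}{2 Q_L f_m} \right)^{2} \right)
    \left( 1 + \frac{f_c}{f_m} \right)
\right]
```

Inside the resonator half-bandwidth f_0 / (2 Q_L), the phase noise rises
as the offset shrinks, so a higher Q_L pushes that corner closer to the
carrier and lowers the close-in noise.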

>
> E.g.: You have a USB connected device. The USB specs say
> that the reference clock for the device must be accurate
> to 0.2% (2000ppm) under all operation conditions (including
> temperature). Yes, modern USB device implementations can get
> away with a less accurate reference clock by locking the local
> clock to the frame clock coming from USB. But that only works
> for some classes of devices (i.e. has to run with 12MBit/s or less).
> And it does not work for anything that can also be a USB host
> as well (aka USB on the go).

For Ethernet, you've been in the +/- 100 ppm range essentially since its
conception. Requirements like that are common. In my line of business,
something as loose as +/- 100 ppm is used only for "rough" clocks for
internal use or for non-important Ethernet ports.

>
> Or: I was involved in the design of a logging device for shipment
> tracking for insurance reason. Requirement from customer was to
> achieve better than 10minutes over 2 years. That's 20ppm.
> And we only got 10minutes after we told them that 1minute was
> not physically possible given the size and power constraints.
> And even that we only achieve when the parcel is constantly
> in an air conditioned room, which, of course, is never the case.
>
> Or: Any kind of radio/wireless application. Channel separation
> requirements, even for low speed ISM band stuff are stringent
> enough that you have to select your crystal carefully and can't
> just take the cheapest one. Things that operate within the
> 2.4GHz band, like BT/BTLE, are even worse.
>
> BTW: IoT devices are currently one of the major drivers behind
> more accurate 32kHz crystals. Whether you have to wake up
> for 10ms every hour or for 100ms makes a huge difference in
> battery lifetime (in the order of factor 5). Similarly, cellphones
> are a driving force behind (small) AT cut crystal accuracy.. 
> or rather short-term drift. As less frequency drift means smaller
> guard bands between different channels and within a channel. Which
> directly translates into higher frequency utilization and thus
> available bandwidth and money.

The phones actually lock up to the base station. Depending on mode, they
lock up in frequency or in frequency and phase. The base stations
typically have a frequency requirement of +/- 50 ppb. For some, the
phase of the base station becomes important, to assist in hand-over and
lately SFN transmissions to a mobile from multiple towers/antennas.

However, for this to be meaningful, you need a relatively accurate and
stable oscillator to start with, and that ends up being a crystal, SAWR
and/or MEMS oscillator.

>
> And we haven't even talked about anything that does precision
> stuff, where having an accurate and stable clock source is often 
> paramount for having accurate measurement. Neither have we talked
> about anything highspeed (i.e. beyond 50MHz) where timing margins 
> become low enough that being even 0.1% off would not do.

A very important aspect of any communication link is the bit-error rate
(BER), which comes from the combination of the jitter tolerance and the
random noise jitter.

The jitter tolerance curve is a fantastic compliance tool for signal
integrity design and testing. It is there to check that systematic
(non-random) noise does not break the bit-error rate. So, it is modeled
as the largest amplitude a sinusoidal phase modulation of a given
frequency can have. If the output jitter of a device is lower than the
jitter tolerance curve, and the input tolerates more than the jitter
tolerance curve, the devices are compliant with each other. The limits
of tolerance come from the bandwidth of the receiver PLL and the size of
the jitter damping buffers. The shape of the curves is such that
tolerance increases for lower frequencies, and they tend to have
6 dB/octave slopes or flat levels (buffer size). So, the jitter
tolerance puts requirements on PLL bandwidth and buffer size. On the
source side, it puts requirements on jitter control and on how much
environmental issues may pop out. It turns out that when these things do
not comply, the compliance issues are real, and I've had to handle that.

A rule of thumb is that the first PLL bandwidth shall be 1/1667 of the
baud rate of the signal. That is respected by Fibre Channel and to some
degree also by Gigabit Ethernet (which actually inherited the bandwidth
number from Fibre Channel even though it has a slightly higher baud
rate). For lower frequencies, the same basic properties are shifted over
to the MTIE curves, which cut over at about 10 Hz and 0.1 s, but then
convey the same requirement of a sinusoidal response limit.

Random noise tends to have TDEV limits, but at high frequency you end up
needing to have it below the limit at the upper end of the jitter
tolerance curve. It needs to be below 1/14 of a symbol length in RMS
value for the BER to be below the target of 1E-12, so the jitter
tolerance curve ends up at 0.07 UI, where the unit interval (UI) is the
length of a symbol.

All these things end up putting requirements on our clocks and how we
build them. Trust me when I say it hurts when it does not work and we
have not respected the requirements. Quite a few oscillators had to be
measured just to be screened out from use as the reference for the
high-speed links.
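
The two rules of thumb above can be checked with a few lines of Python.
This is only a sketch under stated assumptions: the jitter is taken to
be purely Gaussian, the BER is modeled as the one-sided tail
0.5*erfc(Q/sqrt(2)), and the random jitter alone is allowed to span one
full UI peak-to-peak at the target BER. The function names are made up
for illustration, and the baud rates (1.0625 GBd for 1G Fibre Channel,
1.25 GBd for Gigabit Ethernet) are the nominal line rates, not numbers
from this thread.

```python
import math


def q_factor_for_ber(target_ber):
    """Bisect for Q such that 0.5 * erfc(Q / sqrt(2)) equals target_ber."""
    lo, hi = 0.0, 20.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if 0.5 * math.erfc(mid / math.sqrt(2.0)) > target_ber:
            lo = mid  # tail still too fat, Q must be larger
        else:
            hi = mid
    return (lo + hi) / 2.0


def rms_jitter_limit_ui(target_ber):
    """RMS jitter (in UI) whose +/- Q sigma span fills exactly one UI."""
    return 1.0 / (2.0 * q_factor_for_ber(target_ber))


def pll_corner_hz(baud_rate_hz, ratio=1667):
    """Rule-of-thumb first PLL bandwidth: baud rate divided by 1667."""
    return baud_rate_hz / ratio


if __name__ == "__main__":
    q = q_factor_for_ber(1e-12)
    print("peak-to-peak / RMS multiplier:", round(2 * q, 2))
    print("RMS jitter limit in UI:", round(rms_jitter_limit_ui(1e-12), 3))
    print("Fibre Channel corner, kHz:", round(pll_corner_hz(1.0625e9) / 1e3))
    print("Gigabit Ethernet corner, kHz:", round(pll_corner_hz(1.25e9) / 1e3))
```

Under these assumptions the peak-to-peak to RMS multiplier comes out
close to 14, which is where the 1/14 (about 0.07 UI) figure comes from,
and the Fibre Channel corner lands near 637 kHz.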

Now, on top of that, if your clock runs "hot" compared to the receiver's
clock, a full-rate stream of packets will quickly overflow the receiver,
so there is a limit to how long back-to-back packets may be transmitted
before one is dropped because of buffer overflow. Sure, that one hurt
too. So with +/- 2000 ppm, you can have a 4000 ppm difference, and hence
for every 250 packets sent, one more packet piles up in the buffer, and
once it overflows, every 250th packet is dropped. If you have a
16-packet buffer, sure, it takes a little bit of time for the buffer to
fill up, but then in 1/16th of that time you lose another packet. As you
move to +/- 100 ppm you are down to every 5000th packet, which is still
terrible in BER terms but more tolerable for the corner case.
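
The arithmetic behind those packet counts, as a small Python sketch. It
assumes the worst case where transmitter and receiver sit at opposite
ends of the tolerance, equal-sized back-to-back packets, and a receiver
that drains at exactly its own clock rate. The function names are only
illustrative.

```python
def packets_per_excess_packet(tolerance_ppm):
    """Packets sent for each surplus packet that piles up in the receiver.

    Worst case: the two clocks differ by twice the +/- tolerance.
    """
    worst_case_offset_ppm = 2 * tolerance_ppm
    return 1_000_000 / worst_case_offset_ppm


def packets_until_first_drop(tolerance_ppm, buffer_packets):
    """Back-to-back packets that can be sent before the buffer is full."""
    return buffer_packets * packets_per_excess_packet(tolerance_ppm)


if __name__ == "__main__":
    for tol in (2000, 100):
        print(tol, "ppm:",
              "one surplus packet per", packets_per_excess_packet(tol),
              "sent; 16-packet buffer fills after",
              packets_until_first_drop(tol, 16), "packets")
```

With a 16-packet buffer at +/- 2000 ppm, the buffer absorbs the first
4000 back-to-back packets, and after that one packet is lost for every
250 sent.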

>
>
>> Many have factory trimmed RC oscillators, typical 1% accuracy, because
>> accurate timing for other than timekeeping is rarely needed.
> Keep in mind that the 1% RC oscillator is something relatively
> new and they are 1% only at 25°C. Just 10 years ago, you
> were lucky to get a device with an internal oscillator that would
> be +/-10% at 25°C and 30% over temperature. Even a modern device
> like the STM32F7xx family (IIRC 2-3 years old) is spec'ed at 4%
> over temperature.
Not to mention the effective Q and the resulting phase-noise response.
>
>> A minute per month is 10ppm, typical of a bog standard crystal, and given
>> the choice of that or mains timing for a clock, I'd use the latter any day.
> A standard AT cut crystal is 10-100ppm accuracy out of factory
> at 25°C and with 100% accurate capacitive loading. After soldering,
> you are probably off by another 10-30ppm. And, depending on the 
> actual cut angle, temperature variations add another 20-100ppm
> on top of that. Yes, the "10ppm" value is misleading.
>
> If you are talking about a 32kHz crystal, then its quadratic
> temperature dependence becomes a problem. E.g. at 0°C you are already
> off by an additional -22ppm, at -10°C it's -43ppm. If we go
> to the other extreme, it's -71ppm at 70°C and -106ppm at 80°C.
> Those numbers are, of course, if the temperature coefficient is
> nominal. If you take the maximum tempco from the specs, the
> numbers become -55ppm (-10°C), -28ppm (0°C), -91ppm (70°C)
> and -136ppm (80°C). And we are still talking about quality
> crystals here, with tightly controlled specs. A run of the
> mill el-cheapo crystal will be quite a bit worse. Crystals
> with >200ppm deviation over temperature are not uncommon.
The AT-cut crystal actually has a cubic temperature response.
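
The 32 kHz figures quoted above follow a parabola around a turnover
temperature. A minimal Python sketch that reproduces them is shown
below. The coefficients (-0.035 ppm/°C² nominal, -0.045 ppm/°C² maximum)
and the 25°C turnover are inferred from the quoted numbers and are not
stated in the thread, so treat them as assumptions and check the
datasheet of the actual part.

```python
def tuning_fork_offset_ppm(temp_c, coeff_ppm_per_c2=-0.035, turnover_c=25.0):
    """Parabolic frequency offset of a 32 kHz tuning-fork crystal, in ppm.

    coeff_ppm_per_c2 and turnover_c are assumed values, see the text.
    """
    return coeff_ppm_per_c2 * (temp_c - turnover_c) ** 2


if __name__ == "__main__":
    for temp_c in (-10, 0, 70, 80):
        nominal = tuning_fork_offset_ppm(temp_c)
        worst = tuning_fork_offset_ppm(temp_c, coeff_ppm_per_c2=-0.045)
        print(temp_c, "C: nominal", round(nominal), "ppm, max tempco",
              round(worst), "ppm")
```

The offset is always negative and grows with the square of the distance
from the turnover point, which is why both the cold and the hot end of
the range lose time.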
> Yes, this is a reason why Microcrystal crystals costs several
> times of what you'd pay in China. And people are happy to pay that
> premium as it shaves off a few dozen ppm from the end product and
> crystals exhibit less aging, which in turn makes calibration
> techniques work better. (My Swiss watches, after 20 years, are still
> within 2ppm of nominal frequency over complete temperature range).

I've used Microcrystal crystals, and I've used them as references
because for their size they have been pretty darn good. So, when
comparing them with others in the same DIP14 package, I used to tease
the oscillator sales folks that I could feel the quality of them just by
holding them in my hand. The main reason Microcrystal/Oscilloquartz gave
for that performance was that the crystal was mounted inside a "coffin"
of FR-4 board, giving it a fair amount of thermal stability just from
the sheer weight, which I naturally could feel as I was tossing them
around in my hands. That helped create stability against thermal
variations, which turns out to be quite important in our type of
equipment, where the density of electronics forces us to use forced
convection with those pesky fans, driving the variations of the
temperature in the room and the AC onto the board and hence the
oscillator. These oscillators handled it the best, and we still
preferred to protect them some. Even as you lock things up, the
bandwidth of the lock PLL becomes a compromise between competing
requirements, and in the end also the jitter tolerance we see. The most
fun, naturally, is when it is hard to find a strict definition of the
requirement.

Let's just say that using not only crystal oscillators but good crystal
oscillators ended up being the only viable option.

>
> And, please do not forget that modern mains frequency control
> is something quite recent as well. Especially outside (west) Europe.
> Having mains frequency powered clocks being off several minutes
> per month was the norm 50-70 years ago. This is, what drove
> people to buy quartz crystal clocks back in the days.
> Also, events have shown us, that gaining/losing a minute or two
> within a month is still something you have to worry about
> even with modern mains frequency control. Now think about places
> where people don't have the Swiss, with their pedantic time keeping,
> taking care of mains frequency.

Yeah, I have a bunch of Swiss-made clocks here, 6 of which are cesium
and 1 of which is hydrogen. :)

But for mains-driven clocks in everyday life, like ovens or alarm
clocks, sure, the mains on average keeps good enough long-term stability
that you can live with it. However, if you look at just how much it can
deviate from actual time, and there are entertaining stop-frame movies
illustrating that, you end up wanting to use something more stable. If
you want your computer to be within 1 s, the power grid is not very
helpful really. I have an HP counter that in standard form uses the 50
or 60 Hz as reference, but I am fortunate to have the 100 kHz high
stability reference, which is a crystal oscillator.

So, in many forms, crystal oscillators end up providing the key
performance we need as we engineer things, through the many different
ways that stability and accuracy end up affecting us. Actually, if you
look at the alternatives, doing that part right ends up making the rest
of the design so much less expensive that it is the cheap alternative.
Also, for higher performance equipment, it just lays the ground floor to
build upon. This is also true for synchronization. Learn how to do it
right, and you get a simpler, leaner design that just works robustly.

Cheers,
Magnus