[time-nuts] Re: Experience writing a millisecond-accurate NTP client on the ESP32

Fri Feb 11 14:08:27 UTC 2022

I did something similar for the ESP8266 -- this is part of the
nodemcu-firmware project. I did actually implement the clock model to slew
the clock rather than insert discontinuities. I did not figure out the
promiscuous mode trick!!

Performance is limited by the wifi stack and the asymmetry introduced by
the 802.11 protocol itself. The local crystal is clearly temperature
sensitive as the adjustment required to keep the clock in sync varies
throughout the day.

Philip

On Fri, Feb 11, 2022 at 1:36 AM Jeremy Elson <jelson at gmail.com> wrote:

> Fellow nuts,
>
> I'd like to report my experience building and characterizing an NTP client
> on the ESP32. My client was able to synchronize to UTC using an Internet
> time server about 4ms away with the following accuracy in terms of absolute
> error:
>
> Median: 216us
> 95th percentile: 801us
> 99th percentile: 1,607us
> 99.9th percentile: 2,461us
>
> --Background--
>
> The ESP32 is a family of very low-cost but highly capable microcontroller
> chips that have become popular with hobbyists. You can buy a complete
> module on Amazon -- dual-core, 160mhz, with built-in WiFi and bluetooth,
> plus a USB interface for programming -- for about $7. The
> previous-generation ESP8266 is sold for the even more absurd price of about
> $4!
>
> Lately I've been interested in trying to collect sensor data with an ESP32
> and report it to a server using the chip's built-in wifi. Of course, I'd
> like the sensor readings to be properly timestamped. Although there are a
> few NTP libraries available for the ESP32, none that I found had reports on
> their performance. In addition, they all did "one-shot" synchronization,
> i.e., at one point in time, determining the current offset of the local
> clock to the NTP timescale. They did not attempt to keep a history of
> observations and do any outlier rejection or correct for local clock rate
> error. I was convinced I could do better.
>
> Searching the archives of this list, I have not seen much discussion of the
> ESP32 other than speculation a few years ago that one might be used as an
> NTP server by attaching a GPS to it.
>
> --Implementation--
>
> I wrote an NTP client for the ESP32 (
> https://github.com/jelson/rulos/tree/main/src/lib/chip/esp32/periph/ntp)
> that periodically sends a request to an NTP server and keeps the history of
> recent observations of the offset between the local clock and the NTP
> server's clock. The one-way latency is assumed to be half the round trip
> time, minus the server's reported processing delay and a calibration
> constant described below. When first starting up, it sends a request every
> 4 seconds until it has 10 responses. Then, in the steady-state, it sends a
> request every 60 seconds, keeping a history of the past 30 minutes of
> observations.
>
> Each time a new observation is received, it performs simple least-squares
> linear regression on all the observations to construct a linear model
> relating the local clock to the NTP timescale. The user interface to my
> library is simple -- "get current epoch timestamp" -- which works by
> reading the local clock and applying the linear model to predict the
> corresponding NTP timestamp. Linear regression has two good effects: first,
> it averages away noise in individual observations (assuming they are
> uncorrelated and unbiased). Second, it models the rate difference between
> the local clock and the NTP timescale, which significantly improves the
> accuracy even if time has elapsed since the most recent observation.
>
> --Reducing Timestamp Jitter--
>
> The ESP32 network stack is built on a lightweight TCP/IP stack called
> "LWIP". It has an API that looks much like the Berkeley socket API, e.g.
> sendto() and recvfrom() to send and receive UDP packets. Unfortunately,
> LWIP has chosen not to propagate interrupt-time reception timestamps up to
> the application as metadata. Early revisions of my NTP client did the same
> thing that all the other ESP32 NTP clients do: record the time before
> sending a UDP packet, and again when we read the data out of LWIP's buffers
> using recv(). This leads to high jitter (on the scale of 10s of ms), as
> timestamp acquisition is delayed behind all packet processing, which itself
> is subject to the whims of the ESP32's FreeRTOS scheduler.
>
> However, the ESP32 does offer a shortcut: the WiFi driver has "promiscuous
> mode" in which user programs can receive a callback directly from the WiFi
> driver whenever a packet is received. This is meant for applications such
> as WiFi sniffing, but it has the nice side effect of giving me a low-jitter
> event to record packet reception times. In fact, the promiscuous mode
> callback contains wifi metadata including a reception timestamp acquired
> directly *in the wifi hardware*. Using that hardware timestamp would
> require some extra work to relate the wifi clock to the system clock, but
> could be another source of jitter reduction that I have not implemented.
>
> There seems to be no similar way to get precise send-time timestamps.
>
> --High RTT Filter--
>
> My tests were all against a public Stratum 2 NTP server run by the
> University of Washington. From my home in downtown Seattle, the RTT to that
> server from a Wifi-connected laptop is about 4ms. However, despite using
> promiscuous mode to reduce jitter on the receive timestamps, my client
> experienced a long tail of jitter in the RTTs, as seen in the graph below:
>
>
> https://www.lectrobox.com/projects/esp32-ntp/graphs/calibrated-wifi-interrupts-long.log.rtt_hist.png
>
> Since my laptop does not see this jitter, it is likely coming from some
> part of the ESP32's network stack or WiFi hardware. Before using
> promiscuous mode for receive-side timestamps, the distribution of errors
> was symmetrical; after adding promiscuous mode, the distribution became
> more one-sided. This leads me to conclude that the remaining jitter is
> primarily variability on the sending side, i.e. ,the time it takes the LWIP
> network stack and underlying WiFi hardware to send a packet.
>
> Of course, these sorts of delays contribute directly to error.  As a simple
> filter, my client discards observations in the top quartile of RTTs before
> feeding the remainder into the linear regression algorithm.
>
> --Latency Asymmetry Correction--
>
> As people familiar with NTP know, its accuracy depends on symmetry in the
> forward and reverse path latencies. Only the round-trip-time can be
> measured directly, so we infer that the one-way latency is half the RTT.
> This assumption is only true if the forward and reverse latencies are
> equal, and any asymmetry in those two latencies contributes directly to
> error. This more-or-less works on the Internet, and though sometimes packet
> queues will increase the latency in one direction or the other, in the
> long-term it's often an unbiased estimate.
>
> My early testing indicated that the latency asymmetry on the ESP32 was
> biased. That is, when I compared the NTP-acquired timestamp to ground truth
> in a long-term test (as described in the next section) I found the errors
> were not centered around zero, but were biased in one direction with a
> median of about 200us, as seen in the histogram below:
>
>
> https://www.lectrobox.com/projects/esp32-ntp/graphs/zerocal-wifi-interrupts-2.log.gps_vs_ntp.hist.png
>
> I suspect this is a difference in the time required to send a packet
> through the ESP32's stack vs the time required to receive it. Note in the
> histogram that the errors are highly asymmetrical; this is because there's
> high jitter in the estimate of when we send a packet, but very little in
> the reception timestamps. I chose to calibrate this asymmetry out, by
> subtracting 200us from the latency estimate as a "hard-coded constant".
>
> --Evaluation--
>
> I evaluated my NTP client by using it to timestamp the PPS output of a
> uBlox M10 GNSS. My antenna has a view of about 180 degrees of sky. The
> uBlox is configured to listen to the GPS, Galileo and GLONASS
> constellations, and during most of this test was using about 20 satellites
> for its solution. Once per second, at the top of each UTC second, the GNSS
> generates a pulse, which I wired into one of the ESP32's GPIO pins. A GPIO
> interrupt handler on the ESP32 gets the current timestamp from the my NTP
> service and emits it to the serial port. I record the results and compute
> the error in the reported timestamp, e.g., a reported timestamp
> of 1644310833.999917 has an error of 83 microseconds. During my tests, the
> ESP32 sent an NTP request to the University of Washington's public Stratum
> 2 NTP server over the internet every 60 seconds. The nominal RTT to that
> server from my home via WiFi is about 4ms. I collected over 100,000 data
> points.
>
> This graph shows time-series data of the errors over the course of the
> experiment:
>
>
> https://www.lectrobox.com/projects/esp32-ntp/graphs/calibrated-wifi-interrupts-long.log.gps_vs_ntp.timeseries.png
>
> Here is a histogram of the absolute errors recorded, excluding the first
> ten minutes of warmup. The median is close to 0, thanks to the calibration
> constant I mentioned in the previous section. Also note that the long tail
> of errors is one-sided (other than some outliers). As described earlier,
> this is because I can acquire low-jitter reception timestamps but can not
> do so for send-time timestamps.
>
>
> https://www.lectrobox.com/projects/esp32-ntp/graphs/calibrated-wifi-interrupts-long.log.gps_vs_ntp.hist.log.png
>
> Numerically, abs(error) can be summarized as:
>
> Median: 216us
> 95th percentile: 801us
> 99th percentile: 1,607us
> 99.9th percentile: 2,461us
>
> --Drawbacks--
>
> My implementation has several drawbacks compared to a real NTP client.
> Perhaps foremost is that it makes no effort to create a monotonic
> timescale. Every time a new observation arrives, a new linear regression is
> performed, and all time queries immediately start using that new model.
> This can result in a discontinuity in the timescale, e.g., two sensor
> observations collected 10ms apart might have timestamps that differ by
> significantly more than 10ms. This is in contrast to the reference NTP
> client which gradually slews the clock with the goal of keeping the local
> timescale continuous and keeps the instantaneous rate close to SI seconds,
> even while adjustments are happening. The ESP32 does support an adjtime()
> call so a continuous timescale should be possible.
>
> Another drawback is that my outlier rejection is not very sophisticated,
> and least-squares fitting is very sensitive to big outliers. The next step,
> if I ever return to this project, would be an iterative algorithm that
> draws a linear regression line, looks for outliers, discards them, and
> redraws the regression line with the remaining points.
>
> --Summary--
>
> Despite being cheap hardware, and having a slow network stack that suffers
> from high jitter, it's possible for an ESP32 to get a view of UTC typically
> accurate to under 1 millisecond using only an Internet NTP server accessed
> using its built-in WiFi. Though worse than the performance one would expect
> from a desktop computer, it's in the same ballpark, and could be improved
> by investing more time into better outlier rejection.
>
> Regards,
> -Jeremy N3UUO
> _______________________________________________
> time-nuts mailing list -- time-nuts at lists.febo.com -- To unsubscribe send
> an email to time-nuts-leave at lists.febo.com
> To unsubscribe, go to and follow the instructions there.
>