[time-nuts] Cheap jitter measurements

Wed Apr 11 23:01:02 UTC 2018

kb8tq at n1k.org said:
> Except that's not the way most timers run. The silicon needed to get a
> programmable divider to work at 2.4 GHz is expensive. If you dig into the
> hardware descriptions, the clock tree feeds something much slower to the
> "top end" of the typical timer in a CPU or MCU. The exception is the high
> perf timers in some of the Intel chips. There the issue is getting them to
> relate to anything "outside" the chip.

I think I got started in this area back in the early DEC Alpha days.  They 
had a register that counted raw clock cycles.  Simple.  I got stuck thinking 
that was the obvious/clean way to do things.

Many thanks for giving me a poke to go learn more about this area.

That was back before battery operation was as interesting as it is today.  I 
suspect power, rather than silicon cost, is more likely the critical factor.  
Half the power of a counter goes into the low order bit, so counting by 4 
every 4th cycle rather than by 1 every cycle saves 3/4 of the power.
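
The 3/4 figure can be checked with a small sketch.  It assumes a plain 
binary (ripple-style) counter whose power is proportional to the number of 
bit toggles, which is a simplification and not a model of any particular 
chip.  The function name count_toggles is made up for this example.

```python
def count_toggles(step, cycles):
    """Count bit flips in a counter that adds `step` once every `step` cycles.

    Assumption: power is proportional to the number of bits that change.
    """
    total = 0
    value = 0
    for _ in range(cycles // step):
        nxt = value + step
        # XOR marks the bits that differ between the old and new value.
        total += bin(value ^ nxt).count("1")
        value = nxt
    return total


CYCLES = 1 << 16
by_one = count_toggles(1, CYCLES)    # +1 every cycle
by_four = count_toggles(4, CYCLES)   # +4 every 4th cycle
print(f"by 1: {by_one} toggles, by 4: {by_four} toggles, "
      f"ratio {by_four / by_one:.3f}")
```

Under that assumption the counter that steps by 4 does about a quarter of 
the toggling: the low order bit accounts for half of the flips and the 
next bit for another quarter.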

> That may be what the kernel does, but it implements the result as a drop /
> add to a counter.  

If the source of time is a register counting CPU clock ticks, and the CPU 
clock (2 or 3 GHz) is faster than the resolution of the clock (1 ns) it will 
be hard to see any drop/add.  However, if the time register is significantly 
slower, then the drop/add is easy to spot.  But all that is lost in the noise 
of cache misses and such.

Here is a histogram from an Intel Atom running at 1.6 GHz.

First pass, using rpcc.
    cycles      Hits
        24     86932
        36    904825
        48      8011
        60       122
        72         1
       144        11
...
So it looks like the cycle counter gets bumped by 12.  That's a strange 
number.  I suspect it's tangled up with changing the clock speed to save 
power.  There are conflicting interests in this area.  If you want to keep 
time, you need a register that ticks at a constant rate as you change speed.  
If you are doing performance analysis, you want a register that counts cycles 
at whatever speed the CPU is running.  Or maybe I'm confused.

Second pass, using clock_gettime.
      nSec      Hits
       698         2
       768         5
       769         2
       838         3
       908         2
       977         1
       978         3
      1047    237102
      1048    383246
      1117    204072
      1118    172490
      1187       275
      1188       135
      1257       263
      1258        47
      1326         7
      1327       216
...
The clock seems to be ticking in 70 ns steps.  That doesn't match 12 clock 
cycles (7.5 ns at 1.6 GHz), so I assume they are using something else.
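
The mismatch can be checked with a few lines of arithmetic.  This is only 
a sketch: the bin values are copied from the histogram above (one per 
cluster), and the 1.6 GHz figure is the nominal clock, which may not be 
what the CPU was actually running at.

```python
CPU_HZ = 1.6e9    # nominal Atom clock speed quoted above
CYCLE_STEP = 12   # step seen in the cycle-counter histogram

# One bin per cluster, copied from the clock_gettime histogram above.
bins_ns = [698, 768, 838, 908, 977, 1047, 1117, 1187, 1257, 1326]


def mean_step(values):
    """Average spacing between consecutive bins."""
    steps = [b - a for a, b in zip(values, values[1:])]
    return sum(steps) / len(steps)


cycle_step_ns = CYCLE_STEP / CPU_HZ * 1e9   # 12 cycles expressed in ns
clock_step_ns = mean_step(bins_ns)          # spacing seen via clock_gettime

print(f"12 cycles       = {cycle_step_ns:.2f} ns")
print(f"histogram step  = {clock_step_ns:.2f} ns")
print(f"step in cycles  = {clock_step_ns * CPU_HZ / 1e9:.1f}")
```

The histogram step works out to a bit under 70 ns, which is not a whole 
multiple of 12 cycles.  It is close to the period of a 14.318 MHz clock 
(about 69.84 ns), a frequency often used for PC timers such as the HPET, 
but that is a guess and not something measured here.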

From another system:
Second pass, using clock_gettime.
      nSec      Hits
        19     45693
        20    347538
        21    591129
        22     15284
        23        63
        24        34
        25        32
...
Note that this is 50 times faster than the previous example.
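
For anyone wanting to collect a similar histogram, here is a minimal 
sketch in Python.  It is not the program that produced the numbers above.  
The names delta_histogram and format_histogram are made up for this 
example, and the `read` parameter exists only so a different clock source 
can be swapped in.  Interpreter overhead will dominate the deltas, so 
expect much larger values than a C version would show.

```python
import time
from collections import Counter


def delta_histogram(samples, read=None):
    """Histogram of differences between back-to-back clock readings (ns)."""
    if read is None:
        # Default: CLOCK_MONOTONIC via clock_gettime (Unix only).
        def read():
            return time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    hist = Counter()
    prev = read()
    for _ in range(samples):
        now = read()
        hist[now - prev] += 1
        prev = now
    return hist


def format_histogram(hist):
    """Render the histogram in the same two-column layout used above."""
    lines = [f"{'nSec':>10}{'Hits':>10}"]
    for delta, hits in sorted(hist.items()):
        lines.append(f"{delta:>10}{hits:>10}")
    return "\n".join(lines)
```

Calling print(format_histogram(delta_histogram(1_000_000))) prints one row 
per distinct delta.  If the underlying clock ticks in coarse steps, the 
deltas should cluster at multiples of that step, as in the first example.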

I haven't figured out the kernel and library software for reading the clock.  
There is a special path for some functions like reading the clock that avoids 
the overhead of getting in/out of the kernel.  I assume there is some shared 
memory.
  https://en.wikipedia.org/wiki/VDSO

Again, thanks Bob.

TICC arrived today.

-- 
These are my opinions.  I hate spam.