Friday, 30 March 2007

NTP offset graphs

I solved an NTP problem a few days ago. Well, I didn't really solve it: instead of fixing the Windows 2003 W32Time service, we simply replaced it with the Meinberg NTP port.

Graphs created with gnuplot were crucial in convincing the network admin that his Windows box was the problem. I was able to show that my machine agreed with the other chimers in the network while the Windows 2003 chimer was jumping back and forth.

The red line represents the time offset to the W2K3 chimer, while the other lines refer to the remaining NTP chimers. The client is clearly struggling to sync with the lower-stratum Windows 2003 NTP server, which keeps travelling in time. The offset to the other servers stays close to zero, indicating that they agree on the correct time.

To create these graphs I used the peerstats files that the NTP daemon writes to /var/log/ntp (the path may differ on your distro, and the statistics may not be enabled at all), a Perl script, gnuplot and some patience.

The peerstats files are full of lines like this:
54188 331.225 a.b.c.d 9614 -0.001258450 0.002503000 0.014872032 0.001373005
I was interested in the first, second, third and fifth fields: the date as a Modified Julian Day (MJD), the seconds past midnight UTC, the peer IP address and the offset in seconds.
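
For illustration, here is a minimal Python sketch of picking those four fields out of a peerstats line. It is not part of the workflow described in this post, and the names PeerSample and parse_peerstats_line are made up for the example. It assumes the fields are separated by whitespace, as in the sample line above.

```python
from typing import NamedTuple


class PeerSample(NamedTuple):
    """The four peerstats fields used for the graphs (hypothetical type)."""
    mjd: int         # field 1: Modified Julian Day
    seconds: float   # field 2: seconds past midnight UTC
    peer: str        # field 3: peer address
    offset: float    # field 5: clock offset in seconds


def parse_peerstats_line(line: str) -> PeerSample:
    # Field 4 (the status word) and fields 6 and up are ignored here.
    fields = line.split()
    return PeerSample(int(fields[0]), float(fields[1]), fields[2], float(fields[4]))


sample = parse_peerstats_line(
    "54188 331.225 a.b.c.d 9614 -0.001258450 0.002503000 0.014872032 0.001373005"
)
print(sample.mjd, sample.seconds, sample.peer, sample.offset)
```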

To convert the MJD to a time format accepted by gnuplot I created the script conv.pl:

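#!/usr/bin/perl
# Convert "MJD seconds offset" lines into "YYYY-MM-DD,HH:MM:SS offset".

use strict;
use warnings;
use DateTime::Format::Epoch::MJD;

while (<STDIN>) {
    chomp;
    # Expect three space-separated fields: MJD, seconds past midnight, offset.
    if (/^(\S+) (\S+) (\S+)$/) {
        my ($mjd, $secs, $offset) = ($1, int($2), $3);
        # ymd() returns the date part as YYYY-MM-DD.
        my $date = DateTime::Format::Epoch::MJD->parse_datetime($mjd)->ymd;
        # Zero-pad the time fields for a regular HH:MM:SS format.
        printf "%s,%02d:%02d:%02d %s\n", $date,
            int($secs / 3600) % 24, int($secs / 60) % 60, $secs % 60, $offset;
    }
}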
The script doesn't deserve any style points and it doesn't consider time zones (although the MJD module might; go check yourself if you are worried). You will need the DateTime::Format::Epoch Perl module (Debian users have it easy).
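
If you would rather not install the Perl module, the same conversion can be sketched with nothing but the Python standard library. This is only an illustration: the function name mjd_to_timestamp is made up. It relies on MJD 0 being 1858-11-17 00:00 UTC and keeps everything in UTC, so like conv.pl it does no time zone conversion. Fractional seconds are truncated.

```python
from datetime import datetime, timedelta, timezone

# Day zero of the Modified Julian Day count.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def mjd_to_timestamp(mjd: int, seconds: float) -> str:
    """Return 'YYYY-MM-DD,HH:MM:SS' (UTC) for an MJD plus seconds past midnight."""
    moment = MJD_EPOCH + timedelta(days=mjd, seconds=int(seconds))
    return moment.strftime("%Y-%m-%d,%H:%M:%S")


# Date and time fields of the sample peerstats line.
print(mjd_to_timestamp(54188, 331.225))  # prints 2007-03-29,00:05:31
```

The output format matches the timefmt string used in the gnuplot configuration in this post.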

After that you can prepare your peerstats file(s) for use with gnuplot:
grep -F 'a.b.c.d' peerstats1 | cut -d" " -f1,2,5 | ./conv.pl > stats1
Your grep parameters may vary, but make sure you select only one peer at a time, because the gnuplot configuration I use below expects just two columns per file (date/time and offset).
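
With many peers, running one grep per address gets tedious. As a rough sketch (not what was used for the graphs here), the Python function below groups the three relevant fields by peer in a single pass, producing lines in the same "MJD seconds offset" shape that conv.pl reads. The name split_by_peer, the 192.0.2.x addresses and the sample values are all made up for illustration.

```python
from collections import defaultdict


def split_by_peer(lines):
    """Map each peer address to its 'MJD seconds offset' lines."""
    per_peer = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        mjd, seconds, peer, offset = fields[0], fields[1], fields[2], fields[4]
        per_peer[peer].append(f"{mjd} {seconds} {offset}")
    return dict(per_peer)


# Invented sample input: two peers, three records.
sample_lines = [
    "54188 331.225 192.0.2.1 9614 -0.001258450 0.002503000 0.014872032 0.001373005",
    "54188 395.100 192.0.2.2 9614 0.120000000 0.003000000 0.015000000 0.002000000",
    "54188 459.300 192.0.2.1 9614 -0.001100000 0.002400000 0.014000000 0.001200000",
]

for peer, rows in split_by_peer(sample_lines).items():
    print(peer, rows)
```

Each list can then be written to its own file and piped through the conversion script, one file per peer.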

The gnuplot configuration looks like this (stats.conf):
# feel free to change image size
set terminal png size 1600,1024
set xdata time
set timefmt "%Y-%m-%d,%H:%M:%S"
set output "ntp-offset.png"
# if you want to change the scale, uncomment
#set yrange [-0.1:0.1]
set grid
set xlabel "Time"
set ylabel "Offset"
set title "Description: Time Offsets"
set key left box
plot "stats1" using 1:2 index 0 title "NTP client 1" with lines, \
"stats2" using 1:2 index 0 title "NTP client 2" with lines
Adjust the file to your own needs and run gnuplot:
$ gnuplot stats.conf
The file ntp-offset.png will be created. Use your favourite program to view the results.
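
If you plot more than a couple of peers, writing the plot command by hand gets error-prone. The following Python sketch builds it from a list of (data file, title) pairs. The function name build_plot_command is hypothetical. The clause format is copied from the configuration above, and only the last clause is left without a trailing comma and backslash.

```python
def build_plot_command(series):
    """Build a gnuplot 'plot' command from (data_file, title) pairs."""
    if not series:
        raise ValueError("at least one data file is required")
    clauses = [
        f'"{path}" using 1:2 index 0 title "{title}" with lines'
        for path, title in series
    ]
    # Join with a comma plus line continuation; the final clause gets neither.
    return "plot " + ", \\\n     ".join(clauses)


print(build_plot_command([("stats1", "NTP client 1"), ("stats2", "NTP client 2")]))
```

The printed text can be pasted at the end of stats.conf in place of the hand-written plot command.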