This is perhaps not a topic people consider very often, but I have seen it recently in a couple of cases: running ntpd in slew mode on a VM can cause the time to be wrong by several seconds, or even several minutes. So I thought I would write a blog post.
Case 1: Incorrect time after migrating VMs
In the first case the problem was that, when migrating VMs (using Red Hat Virtualization), the time on the VM would be off by several seconds. This was causing hundreds of internal customer tickets to be generated. Here is the initial ntpq output:
# ntpq -np
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*10.229.70.27    10.229.70.221    2 u  487 1024  377    1.070  4305.55 176.409
+10.249.70.27    169.254.0.1      3 u 1012 1024  377    2.170  4449.82 106.301
As you can see, the clock was off by over four seconds (offsets of 4305 and 4449 ms). This was not opened as a TAM case, but it was causing problems for the customer so I investigated.
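When many VMs are affected, reading ntpq output by eye gets tedious. The following is a minimal sketch, not a supported tool, of how the peer list could be parsed and checked against a limit. The function names and the 128 ms default limit (borrowed from ntpd's default step threshold) are my own choices for illustration, and the sketch assumes the ten-column layout shown above.

```python
def parse_ntpq_peers(text):
    """Parse `ntpq -np` output into a list of peer dicts (offset and jitter in ms)."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Peer rows have 10 columns; skip the header and the ===== separator.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        remote = fields[0]
        tally = ""
        # A leading tally character (such as * or +) marks the peer's status.
        if remote[0] in "*+-x.#o":
            tally, remote = remote[0], remote[1:]
        peers.append({
            "tally": tally,
            "remote": remote,
            "offset_ms": float(fields[8]),
            "jitter_ms": float(fields[9]),
        })
    return peers


def large_offsets(peers, limit_ms=128.0):
    """Return the peers whose absolute offset exceeds limit_ms."""
    return [p for p in peers if abs(p["offset_ms"]) > limit_ms]


SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*10.229.70.27    10.229.70.221    2 u  487 1024  377    1.070  4305.55 176.409
+10.249.70.27    169.254.0.1      3 u 1012 1024  377    2.170  4449.82 106.301
"""

for peer in large_offsets(parse_ntpq_peers(SAMPLE)):
    print(f"{peer['remote']}: offset {peer['offset_ms'] / 1000:.2f} s")
```

In practice the text would come from running `ntpq -np` on each VM; how it is collected is left out here.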
First, there are some existing knowledge base documents on how to sync the clock on VMs, such as:
Best practices for accurate timekeeping for Red Hat Enterprise Linux running on Red Hat Virtualization https://access.redhat.com/solutions/27865 (It essentially says 'run ntpd'.)
How to troubleshoot NTP issues https://access.redhat.com/solutions/64868
Understanding Red Hat Enterprise Linux System Clocks And Time Protocol Implementations https://access.redhat.com/articles/1456843
When looking through their sosreport logs (from two affected clients), I noticed a few things:
The systems were using slew mode (ntpd -x).
The systems were using only two NTP servers.
The systems were not using 'tinker panic 0'.
The systems were bypassing the server selection algorithms by using the 'true' option.
ntpdate was enabled.
There were some other configuration issues (e.g. SYNC_HWCLOCK=yes was set in /etc/sysconfig/ntp, but this is an ntpdate option).
Let’s review each of these items.
1. The systems were using slew mode (ntpd -x)
Of these items, the most likely cause of the problem was slew mode. With slew mode, the time is slewed, not stepped, even if a large (up to 600s) offset is seen. When migrating VMs, the VM is paused, not restarted, so there can be a time difference of several seconds or more when the VM resumes. Without slew mode, this time difference will be stepped within 15 minutes. From the man page:
$ man ntpd
Under ordinary conditions, ntpd slews the clock so that the time is effectively continuous and never runs backwards. If due to extreme network congestion an error spike exceeds the step threshold, by default 128 ms, the spike is discarded. However, if the error persists for more than the stepout threshold, by default 900 s, the system clock is stepped to the correct value. In practice the need for a step has is extremely rare and almost always the result of a hardware failure. With the -x option the step threshold is increased to 600 s.
-x Normally, the time is slewed if the offset is less than the step threshold, which is 128 ms by default, and stepped if above the threshold. This option sets the threshold to 600 s, which is well within the accuracy window to set the clock manually.
So with -x, if the time is off by up to 600s, ntpd will gradually adjust it rather than step it. If the offset exceeds 600s, ntpd will step it. Without -x, ntpd will step the time if the offset is greater than 128ms. In both cases (with and without -x), ntpd will only step the time after the offset has persisted for longer than the stepout threshold (default 900s). To force ntpd to step sooner, you can lower the stepout threshold.
You can set 'tinker panic' and 'tinker stepout' options in /etc/ntp.conf:
tinker [ allan allan | dispersion dispersion | freq freq | huffpuff huffpuff | panic panic | step step | stepout stepout ]
panic panic
    Specifies the panic threshold in seconds with default 1000 s. If set to zero, the panic sanity check is disabled and a clock offset of any value will be accepted.
stepout stepout
    Specifies the stepout threshold in seconds. The default without this command is 900 s.
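To make the interaction of these thresholds concrete, here is a deliberately simplified model of the decision described in the man page excerpts above. It is a sketch for illustration only, not ntpd's actual code: the function name and return strings are hypothetical, and it ignores the stepout timer (a step only happens once the offset has persisted) and other startup options.

```python
def ntpd_action(offset_s, slew_mode=False, panic_s=1000.0):
    """Roughly classify what ntpd does with a persistent offset (in seconds).

    slew_mode=True models `ntpd -x` (step threshold raised to 600 s).
    panic_s=0 models `tinker panic 0` (panic check disabled).
    """
    step_threshold_s = 600.0 if slew_mode else 0.128
    magnitude = abs(offset_s)
    if panic_s > 0 and magnitude > panic_s:
        return "panic"  # ntpd gives up and exits
    if magnitude > step_threshold_s:
        return "step"   # clock is set in one jump, after the stepout period
    return "slew"       # clock is adjusted gradually


for offset in (0.05, 4.3, 123.6, 3600.0):
    print(offset, ntpd_action(offset), ntpd_action(offset, slew_mode=True))
```

The 4.3 s and 123.6 s offsets correspond to the two cases in this post: without -x they are stepped, with -x they are slewed.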
After removing -x, the offset was much better (0.3 ms):
# ntpq -np
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+10.229.70.27    10.229.70.221    2 u    4    8    1    1.512    0.311   0.076
*10.249.70.27    169.254.0.1      3 u    4    8    1    0.385    0.048   0.086
See also: What does "ntpd -x" mean? Can I run NTP in slewmode?
The other issues were secondary but still fall under the category of ‘best practices’.
2. The systems were using only two NTP servers
Four NTP servers are recommended. See http://support.ntp.org/bin/view/Support/SelectingOffsiteNTPServers
"With two, it is impossible to tell which one is better, because you don't have any other references to compare them with. This is actually the worst possible configuration -- you'd be better off using just one upstream time server and letting the clocks run free if that upstream were to die or become unreachable."
See also: Best practices for NTP
"Use at least 4 NTP servers"
3. The systems were not using 'tinker panic 0'
VMs should use ‘tinker panic 0’ because they can be paused.
See also: The NTP configuration directive `tinker panic 0` is recommended for virtual machines, but is it also recommended for physical machines?
"The tinker panic value of 0 tells NTP that no matter what the time offset is, not to panic and exit. This is recommended for virtual machines because virtual machines have no physical clock and can be paused at anytime and started back up hours later. "
4. The systems were bypassing the server selection algorithms by using the 'true' option
Ntpd has built-in algorithms to select the best clock source. By using the ‘true’ option, these algorithms are bypassed.
$ man ntp.conf
true Mark the association to assume truechimer status; that is, always survive the selection and clustering algorithms. This option can be used with any association, but is most useful for reference clocks with large jitter on the serial port and precision pulse-per-second (PPS) signals. Caution: this option defeats the algorithms designed to cast out falsetickers and can allow these sources to set the system clock. This option is valid only with the server and peer commands.
See also: Can NTP be used with 2 NTP servers, specifying one as primary and another as backup?
"If two NTP servers are required for redundancy, one server can be specified as the "primary" NTP server by using the true option. This will force that server to always be selected and for NTP client to follow it blindly. Note that this option defeats the purpose of NTP's timesource selection algorithms. If the specified time source is unstable, the system will not able to identify the problem."
5. ntpdate was enabled
Red Hat Enterprise Linux 5 runs ntpdate as part of the ntpd init script. Red Hat Enterprise Linux 6 and 7 have (deprecated) ntpdate init/systemd scripts which will run on startup, before ntpd starts. This will normally step the time to be in sync with the upstream time server. As you can see (Red Hat Enterprise Linux 6 shown here), ntpdate (57) will run before ntpd (58):
# grep chkconfig ntpdate ntpd
ntpdate:# chkconfig: - 57 75
ntpd:# chkconfig: - 58 74
See also: How do I run ntpdate on system startup?
So, running ntpdate is supported. However, be aware that a VM can be paused, so relying on ntpdate with ntpd running in slew mode can cause large time drifts. In this case (migrating VMs), the systems were not restarted and so ntpdate did not run.
Case 2: Large time offset/long recovery time
In the second case, the VMs were running on VMware, and the offset was much worse (over 120 sec). They were also running ntpd in slew mode.
# ntpq -np
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+192.168.1.2     220.127.116.11   3 u   47   64  377    0.137  123638.  94.439
*192.168.1.1     18.104.22.168    3 u   13   64  377    0.115  123685.  81.705
Because the slewing rate is limited to 0.5 ms/s (500 PPM), with this offset (about 123.6 s) it would take almost three days for the clock to synchronize.
From "ntpd - Network Time Protocol (NTP) daemon":
"The maximum slew rate possible is limited to 500 parts-per-million (PPM) by the Unix kernel. As a result, the clock can take 2000 s for each second the clock is outside the acceptable range."
See also `man ntpd`.
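The arithmetic behind these recovery times is simple enough to check. A minimal sketch, assuming the clock is slewed at the full 500 PPM the whole time (a best case; the function names are my own):

```python
MAX_SLEW_PPM = 500  # kernel limit quoted above: 500 parts per million


def slew_seconds(offset_s, slew_ppm=MAX_SLEW_PPM):
    """Seconds needed to slew away offset_s at a constant rate of slew_ppm."""
    return abs(offset_s) / (slew_ppm * 1e-6)


def slew_days(offset_s, slew_ppm=MAX_SLEW_PPM):
    """Same as slew_seconds, expressed in days."""
    return slew_seconds(offset_s, slew_ppm) / 86400


# Offsets in seconds: one second, the Case 2 offset, and the -x maximum.
for offset in (1.0, 123.638, 600.0):
    print(f"{offset:8.3f} s offset: {slew_seconds(offset):10.0f} s = {slew_days(offset):5.2f} days")
```

At 500 PPM each second of offset costs 2000 seconds of slewing, which is where the "almost three days" figure for the 123.6 s offset comes from.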
In addition to configuring ntpd according to best practices (number of NTP servers, use of 'tinker panic 0', enable ntpd server selection, etc.), be aware that running ntpd in slew mode on VMs can cause large time drifts. It can take almost 14 days to synchronize a clock off by the maximum offset allowed by slew mode of 600 seconds.
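Several of the configuration points above can be checked mechanically. The following is a rough sketch of such a check for the contents of /etc/ntp.conf. The function name, the wording of the findings, and the sample configuration are hypothetical, and the parsing is intentionally naive (it only looks at `server` and `tinker` lines). It does not detect slew mode, because -x is a command-line option rather than an ntp.conf setting.

```python
def check_ntp_conf(text):
    """Return a list of findings for the ntp.conf contents in `text`."""
    servers = []
    true_servers = []
    panic_disabled = False
    for raw in text.splitlines():
        words = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if not words:
            continue
        keyword, args = words[0], words[1:]
        if keyword == "server" and args:
            # 127.127.x.x addresses are reference clock drivers, not network servers.
            if args[0].startswith("127.127."):
                continue
            servers.append(args[0])
            if "true" in args[1:]:
                true_servers.append(args[0])
        elif keyword == "tinker":
            # tinker takes name/value pairs, e.g. "tinker panic 0".
            for name, value in zip(args[0::2], args[1::2]):
                if name == "panic" and float(value) == 0:
                    panic_disabled = True

    findings = []
    if len(servers) < 4:
        findings.append(f"only {len(servers)} NTP server(s) configured; at least 4 recommended")
    if true_servers:
        findings.append("'true' option bypasses server selection for: " + ", ".join(true_servers))
    if not panic_disabled:
        findings.append("'tinker panic 0' is not set (recommended for VMs)")
    return findings


# Hypothetical configuration resembling the one described in Case 1.
SAMPLE_CONF = """\
# two servers, one forced as truechimer
server 10.229.70.27 true
server 10.249.70.27
"""

for finding in check_ntp_conf(SAMPLE_CONF):
    print(finding)
```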
A Red Hat Technical Account Manager (TAM) is a specialized product expert who works collaboratively with IT organizations to strategically plan for successful deployments and help realize optimal performance and growth. The TAM is part of Red Hat’s world class Customer Experience and Engagement organization and provides proactive advice and guidance to help you identify and address potential problems before they occur. Should a problem arise, your TAM will own the issue and engage the best resources to resolve it as quickly as possible with minimal disruption to your business.
Donald Berry is a TAM in the Northeast/Canada region. He has been working with UNIX/Linux in development and support for more than 25 years.