S 4.348 Time synchronisation in virtual IT systems

Initiation responsibility: Head of IT

Implementation responsibility: Administrator

Many applications require a correct system time in order to function properly. On file servers, for example, every stored file is given a time stamp. Other systems use the system time differently: authentication systems such as Kerberos and token-based systems require the correct system time in order to work reliably, and monitoring systems such as MRTG normally use the system time as an index for the records they store in a database.

For these reasons, it must be ensured that the system time of a virtual IT system is correct at all times. With virtualisation products based on complete server virtualisation, this is frequently not guaranteed automatically.

The calculation of the system time by counting cycles

State-of-the-art operating systems do not determine the system time by continuously reading the system clock, but by counting processor cycles and comparing the count to an external time source. This external time source may be a time server or a hardware clock. The reason for this method, which may seem impractical at first glance, is that state-of-the-art processors need a time source with a higher resolution than most clocks provide. This resolution must be in the range of a single processor cycle. Continuously comparing the processor cycles to the reliable time source yields a conversion factor with which processor cycles can be converted into time. At certain intervals, this conversion factor is corrected by comparing the elapsed cycles to the time source in order to compensate for any inaccuracy in the calculation.
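The principle can be illustrated with a minimal sketch. The class `CycleClock` and its field names are hypothetical and chosen for illustration only; real operating systems implement this in the kernel with considerably more care (smoothing, bounded adjustments).

```python
from dataclasses import dataclass


@dataclass
class CycleClock:
    """Toy clock that derives time from a cycle counter (illustrative only)."""

    ref_time: float           # reference time in seconds at the last comparison
    ref_cycles: int           # cycle counter value at the last comparison
    seconds_per_cycle: float  # current conversion factor

    def now(self, cycles: int) -> float:
        """Convert a cycle counter reading into a time estimate."""
        return self.ref_time + (cycles - self.ref_cycles) * self.seconds_per_cycle

    def correct(self, cycles: int, reference_time: float) -> None:
        """Recompute the conversion factor against the reliable time source."""
        elapsed_cycles = cycles - self.ref_cycles
        if elapsed_cycles > 0:
            self.seconds_per_cycle = (reference_time - self.ref_time) / elapsed_cycles
        # The comparison point becomes the new baseline for later readings.
        self.ref_time = reference_time
        self.ref_cycles = cycles


# Assumed starting point: a counter believed to run at 1 GHz.
clock = CycleClock(ref_time=0.0, ref_cycles=0, seconds_per_cycle=1e-9)
estimate = clock.now(2_000_000_000)   # about 2.0 seconds by the old factor
clock.correct(2_000_000_000, 2.5)     # the time source reports 2.5 seconds
```

In this example the time source shows that the counter actually advanced more slowly than assumed, so the conversion factor is raised from 1e-9 to 1.25e-9 seconds per cycle for subsequent readings.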

The majority of server virtualisation products assign processor cycles to the virtual IT systems, and therefore to their virtual processors, dynamically depending on load. From the virtual machine's point of view, the processor cycle counter therefore runs at varying speeds. As a result, the algorithm for determining and correcting the time obtains a different value in each correction interval, so that the system time in a virtual IT system also appears to advance at varying speeds. This can cause deviations of several minutes in virtual IT systems. In extreme cases, the counters are over-corrected and the system time of the virtual IT system appears to go backwards.
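The effect can be reproduced with a deliberately simplified simulation. It assumes that the guest converts cycles using the factor calibrated in the previous interval and is set back to the true time at the end of every interval; the function name and the cycle figures are illustrative assumptions, not measurements.

```python
def simulate_guest_clock(cycles_per_interval, interval_s=1.0):
    """Return the guest's time error in seconds at the end of each interval.

    cycles_per_interval: number of cycles the guest's counter advanced
    during each real interval of length interval_s.
    """
    errors = []
    # Initial calibration is assumed to match the first interval.
    factor = interval_s / cycles_per_interval[0]
    for cycles in cycles_per_interval:
        estimated_elapsed = cycles * factor
        errors.append(estimated_elapsed - interval_s)
        # Recalibrate against the true elapsed time for the next interval.
        factor = interval_s / cycles
    return errors


uniform = simulate_guest_clock([500, 500, 500])            # no error
alternating = simulate_guest_clock([1000, 500, 2000, 1000])
```

With a uniform allocation the error stays at zero regardless of how many cycles the guest receives. With the alternating allocation the guest clock is half an interval slow, then three intervals fast, then half an interval slow again. Correcting a clock that has run ahead means setting it back, which is how the system time can appear to go backwards.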

Normally, the system clock in virtual IT systems with uniform processor utilisation is sufficiently accurate. It does not matter whether the utilisation is high or low; what matters is that it is uniform. Systems whose utilisation alternates between high and low are subject to the effects described above. How strongly they are affected differs considerably between operating systems and depends on their configuration.

Correction methods and their limits

The majority of virtualisation products are equipped with a mechanism for correcting the system time in the virtual IT systems. This is frequently implemented as a function of the guest tools. For example, the products of the manufacturers Citrix and VMware include a function for synchronising the system time of the virtual IT systems with the system time of the virtualisation server.

However, these mechanisms are not always sufficient for the applications operated in a virtual IT system, since they normally do not affect all timers of the operating system, but only the so-called time-of-day clock. Furthermore, synchronisation is not performed continuously, but at certain intervals. These intervals are usually only fractions of a second, but are frequently still too long for accurate time adjustment.

This aspect must be taken into account when operating applications in virtual IT systems. Either the applications must be able to cope with a system clock that does not run uniformly, or configuration changes that increase the accuracy of the system clock of the virtual IT systems must be made to the virtualisation server or the virtual IT system.

Such configuration changes consist of querying an external time source more frequently than is the case by default. This can be done using the guest tools if they provide a corresponding configuration option. Alternatively, the virtual IT system can be configured to query an NTP server more frequently and to correct its system clock in this way. This shortens the intervals during which the clock runs at the wrong speed, and the conversion factor for the processor cycles is adapted faster. Normally, it does not make sense to combine these two options, since doing so entails minor losses of performance. For Unix operating systems, kernels optimised for virtualisation must frequently be used. Depending on the Unix derivative used, the corresponding parameters must be set, e.g. in the boot loader. It may also be necessary to compile such a kernel specifically for this purpose.
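Why a shorter query interval helps can be estimated with simple arithmetic: the deviation that builds up between two corrections is at most the rate error of the clock multiplied by the length of the interval. The following sketch assumes a constant rate error; the function name and the figures (500 ppm, 1024 and 64 seconds) are illustrative assumptions only.

```python
def worst_case_offset(rate_error_ppm: float, interval_s: float) -> float:
    """Deviation in seconds that accumulates between two corrections,
    assuming the clock runs wrong by a constant rate_error_ppm."""
    return rate_error_ppm * 1e-6 * interval_s


long_interval = worst_case_offset(500, 1024)   # about 0.512 seconds
short_interval = worst_case_offset(500, 64)    # about 0.032 seconds
```

Under these assumptions, querying sixteen times as often reduces the accumulated deviation by the same factor.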

Generally, an approach must be established that ensures that problems with the synchronisation of the system time can be detected and eliminated before the virtual systems are rolled out. During pilot operations of a new virtual IT system, the system time of the system must be monitored specifically. It must be determined whether the internal clock of the virtual IT system deviates from the actual time. If it does, it must be checked whether this has adverse effects on the application operated in the virtual IT system, and corrective actions must be taken if required. The success of the corrective actions must be reviewed during further pilot operations and also after the transition to production operations.
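The evaluation of such monitoring data during pilot operations could look like the following sketch. It assumes that pairs of reference time and guest time have already been recorded by some means; the function name, the sample values, and the tolerance of one second are hypothetical, and the acceptable tolerance depends on the application operated.

```python
from typing import Dict, List, Tuple


def evaluate_offsets(samples: List[Tuple[float, float]], tolerance_s: float) -> Dict[str, object]:
    """Evaluate (reference_time, guest_time) pairs in seconds, in the order recorded."""
    max_deviation = 0.0
    went_backwards = False
    previous_guest = None
    for reference, guest in samples:
        # Largest absolute difference between guest clock and reference.
        max_deviation = max(max_deviation, abs(guest - reference))
        # A guest reading lower than the previous one means the clock was set back.
        if previous_guest is not None and guest < previous_guest:
            went_backwards = True
        previous_guest = guest
    return {
        "max_deviation_s": max_deviation,
        "within_tolerance": max_deviation <= tolerance_s,
        "went_backwards": went_backwards,
    }


# Hypothetical recording: the guest runs ahead and is then set back.
report = evaluate_offsets([(0.0, 0.0), (1.0, 1.2), (2.0, 4.0), (3.0, 3.5)], tolerance_s=1.0)
```

For this recording the report shows a maximum deviation of two seconds, a tolerance violation, and a backwards step of the guest clock, all of which would call for corrective action before roll-out.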

Review questions: