S 6.138 Drawing up a business continuity plan for virtualisation component failure

Initiation responsibility: Head of IT, IT Security Officer

Implementation responsibility: Administrator

The failure of virtualisation servers normally has wide-ranging consequences for the information system, because the failure affects not only the virtualisation component itself but also all virtualised IT systems operated on it.

Therefore, the failure of a virtualisation component must not be considered in isolation. When planning the virtualisation of IT systems in computer centres, it must be taken into consideration that the extent of the damage caused by a failure grows with the degree of hardware consolidation: the more IT systems are consolidated on one component, the greater the damage when that component fails. Therefore, the protection requirements of the entirety of the virtual IT systems must be mapped to the protection requirements of the virtualisation components. In so doing, the maximum principle and the accumulation principle must be taken into consideration.
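The mapping described above can be illustrated with a minimal sketch. The three levels (normal, high, very high), the function name host_requirement and the accumulation threshold are assumptions made for illustration only; when accumulation actually justifies a higher level is an organisational decision and not a fixed number.

```python
from enum import IntEnum


class Level(IntEnum):
    # Assumed protection requirement categories, for illustration only.
    NORMAL = 1
    HIGH = 2
    VERY_HIGH = 3


def host_requirement(vm_levels, accumulation_threshold=5):
    """Derive the protection requirement of a virtualisation component
    from the requirements of the virtual IT systems it hosts.

    accumulation_threshold is a hypothetical value: the number of systems
    at the highest level from which the level is raised by one.
    """
    if not vm_levels:
        return Level.NORMAL

    # Maximum principle: the host inherits the highest requirement.
    highest = max(vm_levels)

    # Accumulation principle: many systems at the same level can add up
    # to greater overall damage, so the level is raised by one step.
    count_at_highest = sum(1 for lvl in vm_levels if lvl == highest)
    if count_at_highest >= accumulation_threshold and highest < Level.VERY_HIGH:
        return Level(highest + 1)
    return highest
```

For example, a server hosting one system with a high requirement and several with a normal requirement is rated high, while a server hosting many systems with a normal requirement may also be rated high once the assumed threshold is reached.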

Moreover, it is usually not sufficient to consider only the failure of the virtualisation servers on which virtualised IT systems are operated. Additional IT systems required for operating the virtualisation servers must also be incorporated, since their failure may limit the availability of the virtualisation systems. Therefore, an approach regarding the failure of the following systems, if applicable, must be defined:

Depending on how the virtualisation systems have been integrated into the information system, additional systems such as directory services and name resolution services must be considered as well.

Since infrastructure services such as directory services or name resolution services may themselves be executed on virtualised IT systems, the failure of one or several virtualisation components may result in a much more complex situation. For example, restarting a heavily virtualised computer centre requires detailed planning due to the service dependencies common in such computer centres.
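One way to plan such a restart is to record the service dependencies explicitly and derive a start order from them. The following is a minimal sketch using the Python standard library (graphlib, Python 3.9 or later); the service names and the function name restart_order are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def restart_order(dependencies):
    """Return a start order in which every service comes after the
    services it depends on.

    dependencies maps each service to the set of services it requires.
    Raises CycleError if the dependencies are circular, e.g. if the name
    resolution service runs on a virtual IT system whose virtualisation
    server needs name resolution in order to start.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical example: names are placeholders, not taken from the safeguard.
example = {
    "directory-service": {"name-resolution"},
    "virtualisation-management": {"directory-service", "name-resolution"},
    "business-application": {"virtualisation-management"},
}
order = restart_order(example)
```

A CycleError raised during planning indicates a circular dependency that has to be resolved before an emergency occurs, for example by operating one of the affected services on a physical IT system.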

Generally, the following aspects must be taken into consideration:

Various scenarios where the virtualisation systems or parts thereof have been compromised should be examined within the framework of contingency planning. For these scenarios, there must be a precise description as to which reactions are required and which actions must be executed. The procedure should be drilled regularly.

Contingency planning that is prepared in good time and contains specific instructions that can also be followed by personnel who are not familiar with the administration of the system in detail may lessen the consequences in the event of damage. The corresponding documents for emergency situations must be available to authorised persons. However, since these documents contain important information, they must be stored securely.

Each of the following emergency situations should be examined:

Attack

If attacks on the virtualisation systems have been discovered, it must not be assumed that the attacks were restricted to the virtualisation systems themselves. It must also be checked whether the virtual IT systems operated on the virtualisation systems have been compromised. In so doing, it must be taken into consideration that malware (backdoors, Trojan horses) may have been installed not only on the virtualisation servers themselves, but also on the virtual IT systems. Moreover, it is possible that undesired communication paths have been opened via the network configuration of the virtualisation servers. Furthermore, virtual IT systems may have been copied.

In order to remove such malware reliably, it is advisable to completely restore the virtualisation components. The data backups created, as well as the documentation of the system configuration and the installation instructions, may be used to this end. If the virtualisation environment has its own user administration for controlling administrative access, the user accounts, particularly those of the super users, must be checked for proper group memberships. All passwords should be changed in order to reduce the chances of success for follow-up attacks.

For the virtualised IT systems that were operated on the compromised virtualisation servers, the safeguards described in the corresponding business continuity plans should be performed.

Theft of (physical) virtualisation servers

When virtualisation servers have been stolen, all accounts for administrating the virtualisation servers must be provided with new passwords. It must be taken into account that virtual IT systems may also have been stolen together with the virtualisation server, particularly if these were stored on local hard disks of the virtualisation server. Even if this is not the case, it must be assumed that the thief gained knowledge of large parts of the system configuration of the virtual IT systems and the virtualisation infrastructure in the computer centre. Therefore, it must be verified to what extent improvements or changes to the virtualisation infrastructure may increase its resistance to future attacks. When in doubt, the entire virtual infrastructure should be re-designed.

Theft of virtual IT systems

Normally, no physical access to the computer centre is required in order to steal a virtual IT system. For example, an attacker may copy virtual IT systems using the functions of the virtualisation servers. All the attacker needs for this is network access to the storage resources on which the virtual IT systems are stored.

Preventively, safeguards making these options more difficult must be developed (S 2.477 Planning a virtual infrastructure, S 4.349 Secure operation of virtual infrastructures). Furthermore, the extent to which such attacks can be detected must be checked.

Since such a theft cannot be ruled out entirely, the contingency planning for virtual IT systems should include regulations describing the procedure after such a theft.

Misconfigurations

Misconfigurations of virtualisation servers may have wide-ranging negative consequences for computer centre operations. Therefore, the virtualisation software must be checked regularly for misconfigurations within the framework of contingency planning. If such misconfigurations are discovered, their extent must be assessed. Here, it must be checked in particular whether virtual IT systems are affected by the misconfigurations.
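A regular check of this kind can be sketched as a comparison between a documented target configuration and the configuration actually found. The sketch below is an illustration under stated assumptions: the setting names are hypothetical, and how the actual configuration is read out depends on the virtualisation product used.

```python
# Marker for a setting that is absent on one side of the comparison.
_MISSING = "<missing>"


def find_misconfigurations(baseline, actual):
    """Compare the actual configuration against the documented baseline.

    Both arguments are dictionaries mapping a setting name to its value.
    Returns a dictionary mapping each deviating setting to a tuple
    (expected value, value found).
    """
    findings = {}
    for key in sorted(baseline.keys() | actual.keys()):
        expected = baseline.get(key, _MISSING)
        found = actual.get(key, _MISSING)
        if expected != found:
            findings[key] = (expected, found)
    return findings


# Hypothetical settings, for illustration only.
baseline = {"remote_shell_enabled": False, "time_server": "ntp.example.org"}
actual = {"remote_shell_enabled": True, "time_server": "ntp.example.org",
          "guest_clipboard": True}
deviations = find_misconfigurations(baseline, actual)
```

Each deviation found in this way still has to be assessed manually, in particular with regard to which virtual IT systems are affected.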

Depending on their severity, the changes required to eliminate configuration errors can be performed directly. However, it must be taken into consideration that virtual IT systems could be adversely affected during such changes. Therefore, it may be necessary to shut down the virtual IT systems prior to performing configuration changes to the virtualisation systems.

Failures due to force majeure

The threats posed by force majeure, e.g. by earthquakes, flooding, fire, storm damage, and cable damage, may have adverse effects on the availability of the virtualisation servers. Adequate safeguards to increase the availability must be taken into consideration, e.g. the use of redundant communication links for the IT systems.

Review questions: