S 4.210 Secure operation of the z/OS operating system
Initiation responsibility: Head of IT, IT Security Officer
Implementation responsibility: Administrator
A z/OS operating system normally runs largely autonomously and requires little intervention by operating personnel. However, if the functions of a z/OS operating system must be available without disruption, a number of safeguards must be taken to protect operations:
Surveillance
HMC control
The HMC (Hardware Management Console) must be checked regularly for reported errors (hardware, microcode, software). Errors reported to the manufacturer by the RSF (Remote Support Facility) should be known within the organisation before the manufacturer calls.
WTOR monitoring
WTOR messages (Write To Operator with Reply) of the z/OS operating system must be monitored so that new requests from the operating system can be answered immediately, if required. The same applies to important WTO messages (Write To Operator) of the operating system or its components that may require a prompt reaction.
System Tasks
It must be ensured that all planned system tasks are active. This can usually be determined from certain messages issued during start-up or from the responses of the respective system task to queries. Checking that a task is present using display commands is usually not sufficient; the response of the system task should be checked as well.
Capacity control
It must be ensured that the capacity limits of the system are not exceeded. This means that the planning specifications must be complied with, and compliance must be checked regularly.
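The regular check against the planning specifications can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a description of any z/OS facility: the resource names, the limits and the 90 % warning ratio are hypothetical, and the measured values are assumed to come from whatever monitor is in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityLimit:
    """Planned upper limit for one resource (all names are hypothetical)."""
    resource: str
    planned_limit: float      # same unit as the measured value
    warn_ratio: float = 0.9   # assumption: warn at 90 % of the planned limit


def check_capacity(limits, measured):
    """Return (resource, status) pairs.

    status is 'ok', 'warn', 'exceeded' or 'missing'.
    """
    findings = []
    for limit in limits:
        value = measured.get(limit.resource)
        if value is None:
            status = "missing"    # a missing measurement is itself a finding
        elif value > limit.planned_limit:
            status = "exceeded"
        elif value >= limit.planned_limit * limit.warn_ratio:
            status = "warn"
        else:
            status = "ok"
        findings.append((limit.resource, status))
    return findings


# Hypothetical planning specifications and one set of measurements.
limits = [
    CapacityLimit("cpu_busy_percent", 85.0),
    CapacityLimit("spool_used_percent", 70.0),
    CapacityLimit("aux_storage_used_percent", 50.0),
]
measured = {"cpu_busy_percent": 80.0, "spool_used_percent": 72.5}

for resource, status in check_capacity(limits, measured):
    print(f"{resource}: {status}")
```

Treating a missing measurement as a finding is a deliberate choice in this sketch: a limit that is never measured cannot be shown to be complied with.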
Monitoring security violations
Checks must be performed to ensure that the security specifications are complied with. Security violations must be reported using the defined mechanisms (see S 2.292 Monitoring of z/OS systems).
System utilisation
System utilisation must be monitored using suitable tools; corrective action must be taken in the event of overload, e.g. by reducing the number of JES2/3 (Job Entry Subsystem) initiators.
It must be considered whether specific monitors should be used in addition to the default functions (RMF - Resource Measurement Facility) in order to monitor the system more effectively.
Automation system
It must be considered whether an automation function (an in-house development or a finished product) should be used to perform the routine checks of the system regularly. Examples include the target-actual comparison of active tasks and active NJE (Network Job Entry) connections, as well as checks of outstanding replies, system performance, JES2/3 queue utilisation, etc. This makes it possible to issue one uniform "system alive" message instead of many unstructured messages, which may simplify monitoring significantly.
If several z/OS systems are monitored centrally by one function, presenting the exception information (events) on a single console (alert management) should be considered. Various manufacturers offer corresponding programs as part of their automation packages.
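The target-actual comparison of active tasks and the resulting uniform alive message can be sketched as follows. This is a minimal Python illustration, not an existing automation product: the function names, the system name SYSA, the message layout and the task lists are hypothetical, and the list of active tasks is assumed to be supplied by the monitoring function.

```python
def compare_tasks(planned, active):
    """Target-actual comparison of two collections of task names."""
    planned_set, active_set = set(planned), set(active)
    return {
        "missing": sorted(planned_set - active_set),      # planned but not active
        "unexpected": sorted(active_set - planned_set),   # active but not planned
    }


def alive_message(system, result):
    """Condense the comparison into one uniform status line."""
    if not result["missing"] and not result["unexpected"]:
        return f"{system} ALIVE: all planned tasks active"
    parts = []
    if result["missing"]:
        parts.append("missing=" + ",".join(result["missing"]))
    if result["unexpected"]:
        parts.append("unexpected=" + ",".join(result["unexpected"]))
    return f"{system} EXCEPTION: " + " ".join(parts)


# Hypothetical target list and observed state.
planned = ["JES2", "RACF", "TCPIP", "VTAM"]
active = ["JES2", "VTAM", "TCPIP", "TESTSTC"]

result = compare_tasks(planned, active)
print(alive_message("SYSA", result))
```

The sketch only compares names. As stated under System Tasks, the mere presence of a task is usually not sufficient, so a real check would also have to verify that each task responds.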
Automation of batch jobs
It must be considered whether an automation function should be used to control the batch jobs. Above a certain number of batch jobs this is indispensable, since consistent monitoring cannot be achieved otherwise. Job schedulers capable of controlling thousands of batch jobs are available from various manufacturers.
Reducing the number of system messages
The number of system messages should be reduced so that only messages that are actually important are displayed. The use of message filters (MPF - Message Processing Facility) is recommended as part of the automation functions.
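The idea of a message filter can be illustrated with a short Python sketch. It is not MPF syntax and does not reproduce how MPF works internally; it only shows the principle of suppressing messages by message ID while allowing an explicit override. The function names, the suppression list and the sample message lines are hypothetical.

```python
from fnmatch import fnmatchcase


def message_id(line):
    """Take the first blank-delimited token of a line as its message ID."""
    return line.split(maxsplit=1)[0] if line.strip() else ""


def should_display(line, suppress, always_display=()):
    """Decide whether a message line is shown to the operator.

    Patterns may end in '*' to match a whole family of message IDs.
    An entry in always_display wins over any suppression pattern.
    """
    msg_id = message_id(line)
    if any(fnmatchcase(msg_id, pattern) for pattern in always_display):
        return True
    return not any(fnmatchcase(msg_id, pattern) for pattern in suppress)


# Hypothetical suppression list: routine job start and end messages.
SUPPRESS = ["IEF403I", "IEF404I"]

# Illustrative sample lines only.
messages = [
    "IEF403I PAYROLL1 - STARTED",
    "IEF404I PAYROLL1 - ENDED",
    "IEA404A SEVERE WTO BUFFER SHORTAGE - 100% FULL",
]

for line in messages:
    if should_display(line, SUPPRESS):
        print(line)
```

Any suppression list should be reviewed carefully, since a message that is filtered out can no longer trigger a reaction by the operating personnel.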
Focal point concept
If many z/OS operating systems are used, establishing a central control location (focal point) must be considered.
Securing the operating functions
Information security is not a one-time matter; it must be checked repeatedly and adapted to circumstances during live operations. Such adaptations often require security-relevant actions that must be protected accordingly. Therefore, the following recommendations must be taken into consideration for the secure operation of a z/OS system:
Controlled maintenance work
Maintenance work affecting production must not be performed on a running z/OS system outside the maintenance window, nor should any other changes be made outside the maintenance window. All changes, whether planned or unplanned, must be co-ordinated with the responsible specialists using a change management procedure. The change plan should be archived for tracking purposes.
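The two conditions named here, a maintenance window and a co-ordinated change, can be sketched as a simple pre-check. This is a minimal Python illustration under stated assumptions: the weekly window (Sunday 02:00 to 06:00), the class and function names and the approval flag are hypothetical, and windows that cross midnight are not handled.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class MaintenanceWindow:
    """Weekly maintenance window; weekday follows datetime (0 = Monday)."""
    weekday: int
    start: time
    end: time


def in_window(timestamp, windows):
    """True if the timestamp lies inside at least one maintenance window."""
    return any(
        w.weekday == timestamp.weekday() and w.start <= timestamp.time() < w.end
        for w in windows
    )


def change_permitted(timestamp, windows, has_approved_change):
    """A change needs both an approved change record and an open window."""
    return has_approved_change and in_window(timestamp, windows)


# Hypothetical window: Sundays from 02:00 to 06:00.
WINDOWS = [MaintenanceWindow(weekday=6, start=time(2, 0), end=time(6, 0))]

sunday_night = datetime(2024, 3, 3, 3, 30)   # a Sunday
monday_night = datetime(2024, 3, 4, 3, 30)   # a Monday

print(change_permitted(sunday_night, WINDOWS, has_approved_change=True))
print(change_permitted(monday_night, WINDOWS, has_approved_change=True))
print(change_permitted(sunday_night, WINDOWS, has_approved_change=False))
```

Such a check could be run by an automation function before a change is started; it does not replace the co-ordination with the responsible specialists.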
Software installation by SMP/E
Software must only be installed after it has been registered using the change management procedure. In order to avoid errors, a procedure such as SMP/E (System Modification Program/Extended) must be used for software installation.
Dynamic changes
Today, many security-relevant changes can be made dynamically, i.e. during live operations, without requiring an IPL (Initial Program Load). Dynamic changes to the system must only be made during planned maintenance work and/or upon request. Particularly security-relevant dynamic commands, e.g. SETAPF, REFRESH LLA, MODIFY, CONFIG, FORCE, or SET, must be protected using corresponding RACF profiles. Only trained personnel must be able to execute them.
SDSF
SDSF (System Display and Search Facility) must be protected in such a way that unauthorised persons are not able to misuse any system commands. For example, it must not be possible to start an arbitrary number of initiators. Furthermore, the priority control for jobs in the system (assignment of WLM service classes) must be protected in SDSF. For example, users must not be allowed to change the priority of their own jobs in order to obtain better performance for themselves.
This recommendation applies analogously to flashers, a JES3 support tool that corresponds to SDSF in terms of functionality.
Protecting the consoles
The protection of the consoles is described in safeguard S 4.207 Use and protection of system-related z/OS terminals. Corresponding RACF definitions must be established in order to prevent employees from accessing an EMCS (Extended Multiple Console Support) console without authorisation.
Protecting the MVS commands
z/OS system commands must only be executed by authorised persons. These commands must be protected by corresponding RACF profiles. It must be defined which employees require authorisation for which system commands and are therefore allowed to execute them. For example, it must be considered whether tasks are to be stopped and started by Operating alone.
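The definition of who may execute which command can be drafted as a simple authorisation matrix before it is implemented in RACF. The following Python sketch only models such a matrix; it is not RACF syntax and does not reflect how RACF evaluates profiles. The group names OPSJUN and OPSSEN, the matrix contents and the function name are hypothetical, and command abbreviations are not resolved.

```python
# Hypothetical matrix: command verb -> groups allowed to execute it.
COMMAND_MATRIX = {
    "DISPLAY": {"OPSJUN", "OPSSEN"},
    "START": {"OPSSEN"},
    "STOP": {"OPSSEN"},
    "MODIFY": {"OPSSEN"},
    "FORCE": {"OPSSEN"},
}


def may_execute(user_groups, command, matrix=COMMAND_MATRIX):
    """Return True if any of the user's groups is allowed to run the command.

    Only the first word (the command verb) is evaluated. Verbs that are not
    listed in the matrix are denied for everyone (default deny).
    """
    verb = command.split()[0].upper() if command.strip() else ""
    allowed = matrix.get(verb, set())
    return bool(allowed & set(user_groups))


print(may_execute({"OPSJUN"}, "DISPLAY A,L"))     # listed for OPSJUN
print(may_execute({"OPSJUN"}, "FORCE JOB1,ARM"))  # not listed for OPSJUN
print(may_execute({"OPSSEN"}, "CONFIG CPU(1),OFFLINE"))  # verb not in matrix
```

The default-deny behaviour for unlisted verbs is a design choice of this sketch: a command that nobody has explicitly been authorised for cannot be executed by anyone.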
HCD
Certain hardware settings can be defined subsequently during z/OS system operation. This is done using HCD (Hardware Configuration Definition). However, a new IOCDS (Input/Output Configuration Data Set) should only be activated within the framework of change management.
When defining hardware, it must be ensured that resources are not defined as shared across several stand-alone systems. For example, it should not be possible to access the same hard disk from two different stand-alone systems. For Parallel Sysplex configurations, resource sharing is part of the architecture and is therefore not a problem if configured properly.
Operation
Establishing two RACF groups for Operating should be considered: one for operators with many years of experience and a second one for new (still inexperienced) employees. All employees should only be granted the rights they require. They must be sufficiently trained for their work. Security-critical tasks in particular should be assigned to experienced employees.
Review questions:
- Is the Hardware Management Console (HMC) checked regularly for reported errors in z/OS?
- Are WTOR messages monitored in order to ensure that new requests from the z/OS operating system are answered immediately, if required?
- Has it been ensured that all planned system tasks of the z/OS system are active?
- Are the system utilisation and the compliance with the planned capacity limits of the z/OS system monitored appropriately?
- Is the compliance with the security specifications for the z/OS system monitored?
- Are z/OS system messages reduced in such a way that only important messages are presented?
- Is maintenance work on the z/OS system coordinated with the specialists responsible using a comprehensible change management procedure?
- Is a procedure such as SMP/E used for software installation under z/OS?
- Are the z/OS system commands protected by corresponding RACF profiles so that only authorised persons are able to execute these commands?
- Have the employees responsible for operating the z/OS system been trained sufficiently?