S 2.293 Maintenance of zSeries systems
Initiation responsibility: Head of IT, IT Security Officer
Implementation responsibility: Administrator
The maintenance concept includes the maintenance of the zSeries hardware, the z/OS operating system, the different program products, and the zSeries microcode (firmware). Maintenance refers to the entire lifecycle of a product, from new installation to permanent service, and up to disassembly.
Maintenance of the zSeries hardware
It is advisable to conclude a service agreement for maintenance of the zSeries hardware with the manufacturer and/or with partner companies certified by the manufacturer. Maintenance is either performed at regular intervals or becomes necessary when internal inspection programs detect errors and report them to the manufacturer or its representative via RSF (Remote Support Facility). In order to ensure the functionality of the hardware (and also of the basic software), the EREP reports (Environmental Record Editing and Printing Program) should be checked regularly. The information about hardware and software issues contained in the EREP report is provided by the hardware and the z/OS operating system.
Maintenance of the z/OS operating system
The maintenance of a z/OS system, including all subsystems, is extremely complex and therefore requires careful planning. The term maintenance covers the following activities:
- commissioning of a new system
- changes for functional extensions or retrofitting of functions
- elimination of reported errors by so-called PTFs (Program Temporary Fixes)
- preventive installation of PTFs based on the manufacturer's information (PTFs that close reported security gaps are of particular importance here)
- disassembly of systems
Normally, maintenance cannot be carried out on z/OS operating systems without interrupting operations.
The following recommendations must be taken into consideration when maintaining the z/OS operating system:
Maintenance schedules
Maintenance schedules must be drawn up specifying when changes to the system may be performed. IPL dates (Initial Program Load) must be determined and test scenarios must be developed. This must be coordinated with all persons involved. In order to be able to undo failed changes, if necessary, a fallback concept must be drawn up.
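The rule that changes may only be performed within scheduled time windows, and only if a fallback concept exists, can be illustrated with a minimal sketch. The class `MaintenanceWindow`, the function `change_allowed`, and all dates are hypothetical and are not part of any z/OS tooling; the sketch only assumes that the maintenance schedule is available as a list of time windows.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MaintenanceWindow:
    """An approved time slot in which system changes may be performed (hypothetical model)."""
    start: datetime
    end: datetime
    fallback_documented: bool  # True if a fallback concept exists for this window


def change_allowed(planned: datetime, windows: list[MaintenanceWindow]) -> bool:
    """Return True only if the planned change falls inside an approved
    window for which a fallback concept has been documented."""
    return any(
        w.start <= planned < w.end and w.fallback_documented
        for w in windows
    )


# Illustrative schedule; the dates are invented for this example.
windows = [
    MaintenanceWindow(datetime(2024, 3, 2, 22, 0), datetime(2024, 3, 3, 4, 0), True),
    MaintenanceWindow(datetime(2024, 3, 9, 22, 0), datetime(2024, 3, 10, 4, 0), False),
]

inside = change_allowed(datetime(2024, 3, 2, 23, 30), windows)
outside = change_allowed(datetime(2024, 3, 5, 10, 0), windows)
no_fallback = change_allowed(datetime(2024, 3, 9, 23, 0), windows)
```

In this sketch a change is rejected both when it lies outside every window and when the matching window has no documented fallback concept.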
Change management
All changes to definitions of the z/OS operating system (including dynamic changes during productive operations) must be planned and controlled by the change management team. The same applies to new installations.
New installation
A new installation becomes necessary if a z/OS operating system is to be used for the first time or if a new version (and/or a new release) is to replace the existing version. For this purpose, the manufacturer offers different, largely prepared product and system deliveries under the name CustomPac, some of which are free of charge and some of which are available within the framework of service agreements.
SystemPac is part of the CustomPac offer and allows installation of a largely prepared delivery of the z/OS operating system, possibly including some additional products. For a new installation, a separate system environment (see below) is required. By using SystemPac, the complexity and thereby also the probability of operating errors during a new installation can be reduced significantly. Therefore, use of the SystemPac mechanism should be considered when newly installing z/OS systems. At the same time, any additional costs this may incur must also be taken into account.
Permanent service of the components
The z/OS operating system and its program products require permanent service. Nearly all manufacturers provide patches (known as PTFs in the mainframe environment) for their programs, which are designed to eliminate errors. For the z/OS operating system, IBM provides these PTFs using different channels:
- as individual delivery upon request of the customer (e.g. due to an error situation): in this case, the customer is responsible for checking the general conditions, e.g. the dependencies
- as RefreshPac within the framework of preventive maintenance, adapted to the customer's system (pre-checked by IBM) or
- as OMIS delivery (Online Maintenance Information System). OMIS is based on the data of the customer's system and is also checked by IBM in advance.
It must be considered whether preventive maintenance is required to increase the operational reliability or whether PTFs are only to be installed for current errors. Security-relevant patches should be installed preventively and promptly upon release in any case. This particularly applies to systems with internet access. Information regarding security-relevant patches may be requested from IBM.
SMP/E maintenance
SMP/E (System Modification Program/Extended) must be used as the central maintenance tool. Maintaining the inventory of the software versions in the CSI (Consolidated Software Inventory) ensures that all information about modules, versions, and dependencies of the z/OS operating system is available, so that errors during patch installation are avoided as far as possible.
Independent Software Vendors
Software products from ISVs (Independent Software Vendors) should also be installed and maintained using SMP/E, if possible. It must be considered whether ISV products are to be installed separately or within the framework of the SystemPac mechanism.
Consolidated Software Inventory
There should be a CSI for the z/OS operating system. In the event of a SystemPac installation, the CSI(s) should be created as provided for in the procedure delivered by IBM. One separate CSI per manufacturer is recommended in order to prevent errors when PTFs from different manufacturers have the same names.
USERMODS
Changes performed by the users should only be installed with the help of SMP/E (as USERMODS). This ensures that the user's own changes are not overwritten by the manufacturer's changes without the user being informed about it. The changes must be reinstalled, and possibly adapted, upon every change in the release of the system and/or of the modules the changes are applied to. USERMODS should be kept to a minimum, since they entail permanent service efforts.
ACCEPT runs
An ACCEPT run permanently stores a PTF to the system, i.e. it can no longer be removed. An ACCEPT run should therefore only be performed if it is ensured that the PTFs eliminate the determined problems and do not cause any new perceptible errors.
APPLY CHECK
Before installing PTFs, an SMP/E APPLY CHECK run should be used to verify that the PTFs match the currently installed operating system environment and that no additional PTFs (so-called prerequisites or corequisites) are required.
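The idea behind such a check can be sketched in a few lines. The following is not SMP/E and does not reproduce its logic or report format; it is a simplified illustration assuming that the installed PTFs and the requisites of each PTF are known as plain data. The function name `missing_requisites` and all PTF identifiers are hypothetical.

```python
def missing_requisites(to_apply, installed, requisites):
    """Return {ptf: sorted list of missing requisite PTFs}.

    A requisite counts as satisfied if it is already installed or is
    part of the same installation run (simplified, hypothetical model).
    """
    available = set(installed) | set(to_apply)
    report = {}
    for ptf in to_apply:
        missing = sorted(set(requisites.get(ptf, ())) - available)
        if missing:
            report[ptf] = missing
    return report


# Invented identifiers for illustration only.
installed = {"UA00001", "UA00002"}
requisites = {
    "UA00010": ["UA00001"],             # satisfied: already installed
    "UA00011": ["UA00010", "UA00099"],  # UA00099 is neither installed nor selected
}

report = missing_requisites(["UA00010", "UA00011"], installed, requisites)
```

An empty report would indicate that, in this simplified model, the selected PTFs can be installed together; any entry names the PTFs that would have to be obtained first.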
Testing prior to production
The operational reliability of the delivered PTFs should first be checked on a test system before the PTFs are installed into a production system. During more comprehensive maintenance work (e.g. a so-called refresh including hundreds of PTFs), this procedure must be planned.
Cumulative operating system files
No operating system files should be copied bypassing SMP/E, since this may have adverse effects on the security of the maintenance. Cumulative files are files compiled from several files. If cumulative files are to be used, either the inventory maintained in SMP/E must be adapted or a separate procedure must be used in order to guarantee that the inventory is maintained. Therefore, it must be considered whether the additional effort is justified.
Alternative system environment
A second (alternative) system environment is to be used for installing PTFs. Separate hard disks containing a copy of the original system should be used for this. This allows installation during working hours without affecting ongoing operations, followed by a quick IPL (Initial Program Load) from the modified system residence (the hard disk from which the boot procedure is initiated). Moreover, this approach (flip-flop approach) supports quick fallback, since the hard disks of the previously active operating system components are still available.
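The flip-flop approach can be modelled as two disk sets that exchange roles at each IPL. The sketch below is purely conceptual: the class `FlipFlop` and the volume set names `RES_A` and `RES_B` are hypothetical, and no real IPL or disk operation is performed.

```python
from dataclasses import dataclass


@dataclass
class FlipFlop:
    """Two system residence sets that alternate roles (hypothetical model)."""
    active: str     # set the running system was IPLed from
    alternate: str  # set that receives maintenance

    def maintenance_target(self) -> str:
        """PTFs are installed on the alternate set, never on the active one."""
        return self.alternate

    def switch(self) -> None:
        """Model an IPL from the other set; the previous set stays available as fallback."""
        self.active, self.alternate = self.alternate, self.active


env = FlipFlop(active="RES_A", alternate="RES_B")
target = env.maintenance_target()        # maintenance goes to the inactive set
env.switch()                             # IPL from the maintained set
after_ipl = (env.active, env.alternate)
env.switch()                             # fallback: IPL from the untouched set again
after_fallback = env.active
```

The fallback is simply a second switch, which is why the previously active disks must remain unchanged until the new level has proven itself.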
System cloning
The term system cloning describes the process of copying the operating system components to a new set of hard disks, taking into consideration the definitions to be changed. It must be considered whether a procedure for system cloning should be established in order to be able to install alternative system environments quickly and securely.
Such a procedure must be developed by the institution itself in several steps, e.g. in the form of a batch job. The use of system variables is very helpful here.
Use of symbolic system variables
Symbolic variables should be used as far as possible for the z/OS parameter files. This significantly simplifies system cloning and also avoids erroneous definitions in many cases. Up to 800 variables are available in the z/OS operating system V1R4 and higher.
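Why symbolic variables simplify cloning can be shown with a small sketch: one parameter template serves several systems, and only the symbol table differs per system. The code below is a simplified stand-in for symbol substitution, not the z/OS implementation. It assumes the `&NAME.` notation with an optional trailing period; the function `resolve`, the template line, and all symbol values are hypothetical.

```python
import re

# Simplified pattern: '&', a symbol name, and an optional terminating period.
SYMBOL = re.compile(r"&([A-Z][A-Z0-9]*)\.?")


def resolve(text: str, symbols: dict[str, str]) -> str:
    """Replace known symbols with their values; leave unknown symbols untouched."""
    def repl(match: re.Match) -> str:
        return symbols.get(match.group(1), match.group(0))
    return SYMBOL.sub(repl, text)


# One shared template; the second period after &SYSNAME. is the literal qualifier separator.
template = "DSN=SYS1.&SYSNAME..PARMLIB,VOL=&SYSR1."

# Invented per-system values for illustration only.
production = resolve(template, {"SYSNAME": "PRD1", "SYSR1": "RES001"})
clone = resolve(template, {"SYSNAME": "TST1", "SYSR1": "RES002"})
unknown = resolve("VOL=&UNKNOWN.", {"SYSNAME": "PRD1"})
```

Because the template itself never changes, cloning a system in this model only requires a new symbol table, which reduces the risk of definitions that were missed during manual editing.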
Documentation
It must be considered whether a reporting system based on SMP/E should be established in order to be able to represent the current status of the entire software of the operating system at any time.
Maintenance of the zSeries microcode (firmware)
In order to eliminate code errors in the firmware, to update the firmware to new versions, and to enable or disable hardware components (e.g. processors, encryption hardware), the manufacturers perform microcode updates. For this, the following information must be taken into account:
Operator control
Updates by the manufacturer must only be performed upon consultation with the operator of the zSeries systems and only under the control of the operator's employees.
Manufacturer's declaration
The manufacturer of the operating system software should draw up a declaration of confidentiality.
Remote maintenance
The external access (remote access) must be protected as described in module S 4.4 VPN and specifically in safeguard S 4.207 Use and protection of system-related z/OS terminals. It must be ensured that changes to firmware components are only performed upon consultation with the zSeries system operator.
Disassembly of the z/OS operating system
More detailed information about the disassembly of a z/OS operating system can be found in S 2.297 Deinstallation of z/OS systems.
Review questions:
- Are there maintenance schedules containing defined time windows for changes to zSeries systems?
- Have IPL dates been specified and test scenarios been developed for the z/OS systems in consultation with all persons involved?
- Is there a fallback concept in order to be able to undo failed changes to the z/OS system?
- Are all changes, including the dynamic changes, to definitions of the z/OS operating system planned and controlled by the change management team?
- Is SMP/E used as the central maintenance tool for z/OS systems?
- When using system cloning: Are symbolic variables used as far as possible for the z/OS parameter files?
- Has it been ensured that updates performed by the manufacturer are only performed upon consultation with the operator of the zSeries systems and only under the control of the operator's employees?