S 4.221 Parallel Sysplex under z/OS

Initiation responsibility: Head of IT, IT Security Officer

Implementation responsibility: Specialists Responsible, Administrator

A Parallel Sysplex cluster consists of several z/OS systems that appear externally as a single system. These z/OS systems can run on one or more logical partitions (LPARs). All systems in the cluster are connected by a Coupling Facility for synchronisation purposes. When more than one LPAR is used, a Timer Facility must be used to synchronise the system time (clock). Additional information on this subject can be found in S 3.39 Introduction to the zSeries platform. Parallel Sysplex clusters are used when the requirements for availability and scalability are high.

All z/OS systems in a Parallel Sysplex cluster are loaded from the same set of hard disks. The individual z/OS operating systems are distinguished using individual system definitions.

The following recommendations should be considered when using Parallel Sysplex clusters:

Use of the Coupling Facility

The Coupling Facility (CF) connects the LPARs. It also provides shared memory that is divided into various objects, which are referred to as Coupling Facility structures. The CF is accessed via XES (Cross-System Extended Services). Three different types of structure can be defined in the CF:

Cache Structures

This structure provides high-performance memory that can be shared by several users. When data is read from the hard disk, a copy of the data is written to the user's own local memory buffer. Furthermore, an additional copy can be placed in the Cache Structure of the Coupling Facility as an option.

List Structures

This structure allows several users to exchange information among each other. The information is made available in lists (message passing) or in queues (queues of work).

Lock Structures

This structure can be used to control the use of resources in the Shared or Exclusive mode across all LPARs.
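The shared/exclusive behaviour described above can be illustrated with a small Python model. This is only a conceptual sketch, not the XES interface: the class LockTable, its methods, and the resource and system names are hypothetical and chosen purely for illustration. It assumes that any number of systems may hold a resource in shared mode, while exclusive mode is granted to one system only when no other system holds the resource.

```python
from collections import defaultdict


class LockTable:
    """Toy model of shared/exclusive resource locking across systems.

    Hypothetical illustration only; it does not reproduce the real
    Coupling Facility lock structure or its interfaces.
    """

    def __init__(self):
        # resource name -> set of systems holding it in shared mode
        self._shared = defaultdict(set)
        # resource name -> the single system holding it exclusively
        self._exclusive = {}

    def acquire(self, resource, system, exclusive=False):
        """Return True if the request is granted, False if it conflicts."""
        holder = self._exclusive.get(resource)
        if holder is not None and holder != system:
            return False  # another system holds the resource exclusively
        if holder == system:
            return True  # requester already holds the resource exclusively
        if exclusive:
            # Exclusive access is refused while other systems share it.
            if self._shared[resource] - {system}:
                return False
            self._shared[resource].discard(system)
            self._exclusive[resource] = system
            return True
        self._shared[resource].add(system)
        return True

    def release(self, resource, system):
        """Drop whatever hold the given system has on the resource."""
        self._shared[resource].discard(system)
        if self._exclusive.get(resource) == system:
            del self._exclusive[resource]
```

In this model, two systems can acquire the same resource in shared mode at the same time, whereas a third system requesting exclusive mode is refused until both shared holders have released the resource.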

Operation

If a Parallel Sysplex cluster is operated for availability reasons, for example, the Coupling Facility should be used with data sharing wherever possible. This applies at least to JES2/3 (Job Entry Subsystem), RACF (Resource Access Control Facility), VTAM (Virtual Telecommunication Access Method), the System Logger, CICS, IMS, and DB2. It should be examined whether the Coupling Facility needs to be designed redundantly to meet the availability requirements of the overall system.

Coupling Facilities are defined and initialised using the HMC (Hardware Management Console). Recommendations for the use of this console can be found in S 4.207 Use and protection of system-related z/OS terminals.

Couple datasets

Couple datasets are used by the XCF (Cross-System Coupling Facility) to monitor information on the LPARs, groups or members. All LPARs of the Parallel Sysplex cluster must be able to access these datasets. The use of alternate couple datasets is recommended. In z/OS, the couple datasets must be protected using RACF. Only those employees (and their substitutes) who need to edit the files to perform their work should have write access to them (see S 4.211 Use of the z/OS security system RACF).

The IXCL1DSU utility is available to format the couple datasets. This program should be protected by RACF (PROGRAM class). The administrative utility IXCMIAPU is used to define the Coupling Facility Resource Management (CFRM) policy. It should be protected by a corresponding profile in the RACF FACILITY class so that only authorised personnel can use it. Additional recommendations for protecting critical programs can be found in S 4.215 Protection of z/OS utilities that are critical to security.

Sysplex commands

The z/OS operating system provides the SETXCF system command for administration and monitoring purposes. It supports the following activities, among others:

To protect this command (and all other commands supporting the Parallel Sysplex cluster), corresponding RACF profiles must be defined (see S 4.210 Secure operation of the z/OS operating system).

XCF control

RMF (Resource Measurement Facility) generates an XCF Activity Report. Consideration should be given to using this report to monitor the message traffic between the z/OS operating systems so that communication bottlenecks and deadlock situations are detected early and preventive measures can be taken.

Consistent RACF database

A RACF database with uniform RACF definitions should be used for all LPARs in the entire Parallel Sysplex cluster.

Standards

To improve clarity and maintainability, standards should be introduced in the following areas:

Dimensioning

It must be ensured that the caches of the hard disk control units, the work disks, the Coupling Facility structures, and the SPOOL disks are correctly dimensioned. The size of the areas is derived first and foremost from the type and the requirements of the applications running on the Parallel Sysplex cluster. In many cases, the documentation from the software manufacturer will contain information in this regard.

Serialisation

Global Resource Serialization (GRS) must be configured so that system actions can be serialised across the cluster. The GRS mode (RING or STAR) must be defined in the IEASYSnn member of the PARMLIB. If possible, the more modern STAR mode should be selected, since this topology usually offers faster processing due to the resource name lists (RNLs) stored in the couple datasets. The STAR mode also offers advantages in terms of availability.

Warning: The STAR mode is only possible in conjunction with the Coupling Facility.
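The dependency between STAR mode and the Coupling Facility can be checked mechanically during a configuration review. The following Python sketch is an assumption-laden illustration: it takes the text of an IEASYSnn member as a string and looks for a GRS= parameter with a simple pattern. The function names are hypothetical, the parsing ignores the real member syntax rules (columns, continuation, comments), and every value other than STAR is simply treated as "not STAR".

```python
import re

# Matches "GRS=<value>" but not longer keywords such as GRSCNF= or GRSRNL=.
_GRS_PARAM = re.compile(r"(?<![A-Z0-9])GRS=([A-Z]+)", re.IGNORECASE)


def grs_setting(member_text):
    """Return the value of the GRS= parameter in upper case, or None."""
    match = _GRS_PARAM.search(member_text)
    return match.group(1).upper() if match else None


def check_grs(member_text, coupling_facility_available):
    """Return a list of review findings (empty if nothing was noticed)."""
    setting = grs_setting(member_text)
    if setting is None:
        return ["no GRS= parameter found"]
    findings = []
    if setting == "STAR" and not coupling_facility_available:
        # STAR mode is only possible in conjunction with the Coupling Facility.
        findings.append("GRS=STAR requires a Coupling Facility")
    if setting != "STAR" and coupling_facility_available:
        findings.append("Coupling Facility available: consider GRS=STAR")
    return findings
```

A review script could call check_grs() with the member text and a flag indicating whether a Coupling Facility is defined, and then add any findings to the audit record.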

High availability through redundancy

Where the availability requirements are high or very high, it should be examined whether the use of the following redundancy mechanisms is appropriate:

Additional information can be found in S 6.93 Contingency planning for z/OS systems.

Hard disk access

The following recommendations relating to hard disk access should be considered:

Symbolic variables

Symbolic variables should be used whenever possible in the PARMLIB definitions. This helps to avoid errors in the system administration and makes system cloning easier.
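The benefit for system cloning can be illustrated with a simplified Python model of symbol substitution. This is only a sketch of the principle, not the z/OS implementation: the function substitute(), the dataset name, and the system names SYSA and SYSB are hypothetical. It assumes that a symbol is written as an ampersand followed by a name and an optional closing period, and that unknown symbols are left unchanged.

```python
import re

# A symbol is "&NAME" with an optional trailing period as delimiter.
_SYMBOL = re.compile(r"&([A-Z0-9$#@_]+)\.?")


def substitute(text, symbols):
    """Replace each known &NAME. in text with its value from symbols."""

    def replace(match):
        name = match.group(1)
        # Unknown symbols are kept as written so that gaps remain visible.
        return symbols.get(name, match.group(0))

    return _SYMBOL.sub(replace, text)
```

With a single shared definition such as "PAGE.&SYSNAME..LOCAL", the call substitute(definition, {"SYSNAME": "SYSA"}) yields "PAGE.SYSA.LOCAL" on one system and the same definition yields "PAGE.SYSB.LOCAL" on another, so that only the symbol values differ between the systems.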

System Logger

The System Logger should be used with staging datasets, since other systems in the cluster will access these datasets in the event of an error.

Reducing the number of console messages

To keep the number of console messages manageable, it is recommended to enable message filtering (see S 4.210 Secure operation of the z/OS operating system). This is particularly important because the messages from all z/OS operating systems in a Parallel Sysplex cluster are displayed on a single MVS console.

Review questions: