S 2.83 Testing standard software

Initiation responsibility: Head of IT, Head of Specialised Department

Implementation responsibility: Tester

The testing of standard software can be divided up into the sections preparation, implementation and evaluation. The following tasks must be carried out in these sections:

Test preparation

determining the test methods for the individual tests (test types, processes and tools)
creating test data and test cases
establishing the necessary test environment

Test implementation

initial tests
functional tests
tests of additional functional features
security-specific tests
pilot application

Test evaluation

The various tasks are described below.

Test preparation

Determining the test methods for the individual tests (test types, processes and tools)

Methods for carrying out tests are, for example, statistical analyses, simulation, proof of correctness, symbolic program execution, review, inspection, failure analysis. It should be noted that some of these test methods can only be carried out if the source code is available. The suitable test method must be selected and determined in the preparation stage.

It must be clarified which processes and tools will be used for testing programs and checking documents. Typical processes for testing programs are, for example, black box tests, white box tests or penetration tests. Documents can be checked using informal methods, reviews or checklists, for example.

A black box test is a functionality test without knowledge of the internal program sequences. Here, the program is run with all data types for all test cases, including error handling and plausibility checks.

A white box test is a functionality test with disclosure of the internal program sequences, e.g. by source code checks or tracing. White box tests generally go beyond IT baseline protection and cannot normally be carried out for standard software as the source code is not disclosed by the manufacturer.

Functionality tests are intended to prove that the program is in accordance with the specification. Penetration tests are intended to determine whether known or assumed vulnerabilities can be exploited in practical operation, for example by attempts to manipulate the security mechanisms or to bypass them by manipulation at the operating system level.

The way the results are to be secured and evaluated should be stipulated, particularly as regards repeating tests. It should be clarified which data should be kept during and after the test.

Creating test data and test cases

The preparation of tests also includes the creation of test data. Methods and procedures should be stipulated and described in advance.

For each test, a number of test cases appropriate to the available testing time must be created. Each of the following categories should be taken into consideration:

Standard cases are cases which are used to test whether the defined functions are implemented correctly. The incoming data are called normal values or limit values. Normal values are data within the valid input area, limit values are threshold data.

Error cases are cases where attempts are made to provoke possible program error messages. The input values which should cause a predetermined error message to occur in the program are called false values.

Exceptional cases are cases where the program has to react differently than to standard cases. It must therefore be checked whether the program recognises these as such and then processes them correctly.

Examples:

If the input parameters can be between 1 and 365, tests are to be carried out with false values (e.g. 0 or 1000), the limit values 1 and 365 and with normal values between 1 and 365.
An appointment planning program should take national holidays into consideration. A special case is when a certain day is a holiday in all federal states except one. The program must then react appropriately for this federal state and this day.
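The first example can be sketched in code. The following is a minimal illustration, assuming an integer input range; the helper `build_test_values` and the function under test `is_valid_day` are hypothetical names introduced here and do not come from any particular product.

```python
def build_test_values(lower, upper):
    """Return normal, limit and false values for an integer input range."""
    if lower >= upper:
        raise ValueError("lower must be smaller than upper")
    return {
        "normal": [(lower + upper) // 2],   # a value inside the valid range
        "limit": [lower, upper],            # the thresholds themselves
        "false": [lower - 1, upper + 1],    # values just outside the range
    }


def is_valid_day(value, lower=1, upper=365):
    """Hypothetical function under test: accepts values in the valid range."""
    return lower <= value <= upper


values = build_test_values(1, 365)
for category, inputs in values.items():
    # Normal and limit values must be accepted, false values rejected.
    expected = category != "false"
    for value in inputs:
        result = is_valid_day(value)
        status = "ok" if result == expected else "FAILED"
        print(f"{category:6} value={value:4} accepted={result} {status}")
```

Exceptional cases such as the holiday example cannot be derived from the range alone and still have to be written by hand.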

In the event that it is too time-consuming or difficult to create test data, anonymous actual values can be used for the test. For reasons of confidentiality protection, actual data must be made anonymous. It should be noted that this anonymous data may not cover all limit values and exceptional cases, so that these have to be created separately.
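One possible way of making actual data anonymous before it enters the test environment is sketched below. It assumes that the identifying fields are known and replaces them with keyed hash tokens; the function name `anonymise_record`, the field names and the key are hypothetical. Strictly speaking, this is pseudonymisation, and whether it is sufficient depends on the data concerned and on the applicable internal rules and legal provisions.

```python
import hashlib
import hmac


def anonymise_record(record, identifying_fields, secret_key):
    """Return a copy of the record with identifying fields replaced by tokens."""
    result = dict(record)  # the original record is left unchanged
    for field in identifying_fields:
        if field in result:
            digest = hmac.new(
                secret_key, str(result[field]).encode("utf-8"), hashlib.sha256
            )
            # A shortened keyed hash: the same input yields the same token,
            # so relations between records are preserved.
            result[field] = digest.hexdigest()[:12]
    return result


# Hypothetical record and key, for illustration only.
record = {"name": "Erika Mustermann", "customer_no": 4711, "day_of_year": 200}
anonymous = anonymise_record(record, ["name", "customer_no"], b"test-only-key")
print(anonymous)
```

Fields that are not listed as identifying, such as `day_of_year` here, are passed through unchanged, so the test values themselves remain realistic.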

In addition to the test data, all types of possible user errors should be taken into consideration. Particularly difficult are all user reactions which are not planned in the program sequence and which are thus not correctly rejected.

Establishing the necessary test environment

The test environment described in the test plan must be established and the products to be tested installed. The components used should be identified and their configuration described. In the event that deviations from the described configuration arise when installing the product, this should be documented.

Test implementation

The test must be implemented based on the test plan. Each action, together with the test results, must be adequately documented and evaluated. In particular, if errors appear, they must be documented in such a way that they can be reproduced. Operating parameters suited to subsequent production operations must be determined and recorded to enable installation instructions to be drawn up at a later stage.

If additional functions are detected in the product which are not listed in the requirements catalogue but can nevertheless be of use, a short test for them must be carried out at the very least. If it becomes apparent that such a function is of particular importance for later operations, it must be tested in full. If additional testing time is incurred, an extension of the time limit must be requested from the person responsible if necessary. The test results must be included in the overall evaluation.

If, when processing individual test contents, it becomes apparent that one or several requirements of the requirements catalogue were not sufficiently specific, they must be put in more specific terms if necessary.

Example: In the requirements catalogue, encryption is demanded to safeguard the confidentiality of the data to be processed. During testing, it has become apparent that offline encryption is unsuitable for the intended purpose. An addition must therefore be made to the requirements catalogue with regard to online encryption. (Offline encryption must be initiated by the user and each of the elements to be encrypted must be specified; online encryption is carried out in a transparent way on behalf of the user with pre-set parameters.)

Initial tests

Before all other tests, the following basic aspects must first be tested, as any failure in these initial tests will lead to direct actions or the stopping of the test:

The absence of computer viruses in the product must be checked by a current virus search program.
It must be established in an installation test whether the product can be installed simply, completely and comprehensibly for the intended purpose at a later stage. Likewise, it must be checked how the product can be completely uninstalled.
The running capabilities of the product must be checked in the planned application environment; this comprises in particular a check of screen editing, printer output, mouse support, networking capability etc.
The completeness of the product (programs and manuals) must be checked, e.g. by comparing with the inventory list, the product description or similar.
Short tests of program functions, which are not explicitly mentioned in the requirements, should be performed with regard to function, plausibility, absence of errors etc.
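The completeness check against an inventory list lends itself to a simple automated comparison. The following sketch assumes that both the inventory list and the delivered items are available as lists of names; `check_completeness` and the file names are hypothetical.

```python
def check_completeness(inventory, delivered):
    """Compare the delivered items with the inventory list."""
    expected = set(inventory)
    found = set(delivered)
    return {
        "missing": sorted(expected - found),     # listed but not delivered
        "unexpected": sorted(found - expected),  # delivered but not listed
    }


# Hypothetical inventory list and delivery, for illustration only.
inventory = ["setup.exe", "manual.pdf", "readme.txt"]
delivered = ["setup.exe", "readme.txt", "extra.dll"]
report = check_completeness(inventory, delivered)
print(report)
```

Both directions are reported, since unexpected items can be as relevant to the test as missing ones.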

Functional tests

The functional requirements which were placed on the product in the requirements catalogue must be examined in terms of the following aspects:

Existence of the function by calling up in the program and evaluation of the items of program documentation.
Absence of errors and/or correctness of the function
To ensure the absence of errors and/or correctness of the function, depending on the test depth various test procedures should be used during the check such as black box tests, white box tests or simulated production operations.
The test data and test cases created in the initial phase are used in the functionality test. During the functionality tests, it is necessary to compare the test results with the specified requirements. In addition, it should be checked how the program reacts in the case of faulty input parameters or faulty operation. The function must also be tested with the limit values of the intervals of input parameters and with exceptional cases. They must be detected accordingly and handled correctly.
Suitability of the function
The suitability of the function is distinguished by the fact that the function
- actually fulfils the task to the required extent and in an efficient manner and
- can be integrated easily into normal work processes.
If the suitability of the function is not obvious, the solution is to test this in a simulated production operation, but still in the test environment.
Consistency
The consistency of the individual functions must be checked, in each case between the requirements catalogue, the documentation and the program. Any discrepancies must be documented. Discrepancies between the documentation and the program must be recorded in such a way that they can be incorporated into the additions to the documentation when the product is used at a later stage.

Tests of additional functional features

The additional features specified in the Requirements Catalogue besides the security-specific features and the functional features must also be checked:

Performance
The runtime response should be determined for all planned configurations of the product. In order to test performance adequately, general tests in which production operations are simulated, or a pilot application with selected users, are useful. It must be established whether the set performance requirements are being met.
Reliability
The behaviour during accidentally or deliberately caused system crashes ("crash test") must be analysed and it must be established what damage results from this. A record must be made of whether the product can be properly and correctly restarted following system crashes. It must also be checked whether databases can be accessed directly, independently of the regular program function. In many cases, such access can lead to the loss of data and should be prevented by the product. It should also be recorded whether the program supports possibilities of reversing "critical actions" (e.g. deleting, formatting).
User-friendliness
Whether the product is user-friendly depends, to a particular degree, on the subjective feeling of the tester. However, the following aspects can serve as orientation when making the assessment:
- technology of menu surfaces (pull-down menus, scrolling, drag & drop, etc.),
- design of menu surfaces (e.g. uniformity, comprehensibility, menu guidance),
- keyboard layout,
- error messages,
- trouble-free access to interfaces (batch operation, communication, etc.),
- readability of the user documentation,
- help functions.
The analysis of user-friendliness must describe possible modes of operation of the product, including operation following handling or operating errors, and their consequences and implications for maintaining secure operation.
Maintainability
Personnel and financial expenditure on the maintenance and care of the product should be determined during testing. This can be estimated with the aid, for example, of reference factors such as other reference installations, tests in journals or using the installation expenditure determined during testing. To do this, the number of manual interventions which were necessary during installation to arrive at the configuration sought must be documented. If experience with preceding versions of the tested product has already been made, it should be analysed how expensive their maintenance was.
Enquiries should be made regarding the extent to which support is offered by the manufacturer or vendor and under what conditions. If a hotline is offered by the manufacturer or vendor, its accessibility and quality should also be considered.
Documentation
The existing documentation must be checked as to whether it is complete, correct and consistent. In addition to this, it should be understandable, clear, error-free and easy-to-follow.
It must further be checked as to whether it is adequate for secure use and configuration. All security-related functions must be described.

In addition, the following additional points of the requirements catalogue must be tested:

compatibility requirements
interoperability
compliance with standards
adherence to internal rules and legal provisions
software quality

Security-specific tests

If security-specific requirements were placed on the product, the following aspects must be examined in addition to the checks and tests mentioned above:

effectiveness and correctness of the security functions
strength of the security mechanisms and
absolute necessity and unavoidability of the security mechanisms.

As the basis for a security check, the Manual for the Evaluation of the Security of Information Technology Systems (ITSEM) could, for example, be consulted. It describes many of the procedures shown below. The additional comments are an aid to orientation and serve as an introduction to the topic.

At the outset, it must first be demonstrated by functional tests that the product supplies the required security functions.

Following this, it must be checked whether all the required security mechanisms were mentioned in the requirements catalogue and, if necessary, this must be amended. In order to confirm or reject the minimum strength of the mechanisms, penetration tests must be carried out. Penetration tests must be performed after all other tests, as indications of potential vulnerabilities can arise out of these tests.

The test object or the test environment can be damaged or impaired by penetration tests. To ensure that such damage does not have any impact, data backups should be made before penetration tests are carried out.

Penetration tests can be supported by the use of security configuration and logging tools. These tools examine a system configuration and search for common vulnerabilities such as, for example, world-readable files and missing passwords.
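As an illustration of what such a configuration check does, the following sketch searches a directory tree for regular files that every user may read. It assumes a system with POSIX permission bits; on other systems the result is not meaningful. The function name `find_world_readable` is hypothetical and does not refer to any existing tool.

```python
import os
import stat


def find_world_readable(root):
    """Return paths of regular files under root that any user may read."""
    hits = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(directory, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # file vanished or cannot be examined
            # S_IROTH is the read permission bit for "others".
            if stat.S_ISREG(mode) and mode & stat.S_IROTH:
                hits.append(path)
    return sorted(hits)
```

Calling `find_world_readable` with the installation directory of the product in the test environment yields a list that can be compared with the files the documentation says should be generally readable.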

Using penetration tests, the product should be examined for design vulnerabilities by employing the same methods a potential 'attacker' would use to exploit vulnerabilities such as, for example

changing the pre-defined command sequence
executing an additional function
direct or indirect reading, writing or modification of internal data
executing data the execution of which is not planned
using a function in an unexpected context or for an unexpected purpose
activating the error recovery
using the delay between the time of checking and the time of use
breaking the sequence by interrupts or
generating an unexpected input for a function.

The mechanism strengths are defined using the terms specialised knowledge, opportunities and operating resources. These are explained in more detail in ITSEM. For example, the following rules can be used for defining the mechanism strength.

If the mechanism can be mastered by a lay person alone within minutes, it cannot even be classified as low.
If a successful 'attack' can be carried out by anyone except a lay person, the mechanism must be classified as low.
If an expert is required for a successful 'attack' and the expert takes some days with the available equipment, the mechanism must be classified as medium.
If the mechanism can only be mastered by an expert with special equipment and the expert takes months to do it and has to come to a secret arrangement with a system manager, it must be classified as high.

It must be ensured that the tests carried out cover all security-specific functions. It is important to note that only errors or differences from the specifications can ever be determined by testing, never the absence of errors.

Typical aspects of investigation can be shown by a number of examples:

Password protection:

Are there passwords which have been pre-set by the manufacturer? Typical examples of such passwords are the product name, the manufacturer's name, "SUPERVISOR", "ADMINISTRATOR", "USER" and "GUEST".
Which file changes if a password was changed? Can this file be replaced by an old version from a data backup to activate old passwords? Are the passwords stored in encrypted form or are they readable in plain text? Is it possible to make changes in this file in order to activate new passwords?
Is access actually blocked following several incorrect password entries?
Are programs offered in magazines or mailboxes which can determine the passwords of the product being examined? Such programs are available for some standard applications.
If files are protected by passwords, is it possible to determine the position at which the password is stored by a comparison of a file before and after the change in the password? Is it possible to enter changes or old values at this point in order to activate known passwords? Are the passwords stored in encrypted form? How is the position allocated if password protection is deactivated?
Can the password testing routine be interrupted? Are there key combinations with which password entry can be bypassed?
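The comparison of a file before and after a password change can be sketched as follows. The example works on byte strings and reports the byte ranges that differ; the function name `changed_ranges` and the sample contents are hypothetical. In a real test, the two versions would be read from copies of the protected file made in the test environment.

```python
def changed_ranges(before, after):
    """Return (start, end) byte ranges, end exclusive, where the inputs differ."""
    ranges = []
    start = None
    length = max(len(before), len(after))
    for i in range(length):
        # Positions beyond the end of the shorter input count as different.
        a = before[i] if i < len(before) else None
        b = after[i] if i < len(after) else None
        if a != b:
            if start is None:
                start = i
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, length))
    return ranges


# Hypothetical file contents before and after changing the password.
before = b"HEADER" + b"AAAAAAAA" + b"PAYLOAD"
after = b"HEADER" + b"BBBBBBBB" + b"PAYLOAD"
print(changed_ranges(before, after))
```

If only a small, fixed range changes, this suggests where the password or a value derived from it is stored; if the whole file changes, the password probably influences the way the content itself is encoded.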

Access rights:

In which files are access rights stored and how are they protected?
Can access rights be altered by unauthorised persons?
Can files be inserted using old access rights and which rights are needed for this?
Can the rights of the administrator be restricted so that he/she does not obtain access to the usage or log data?

Data backup:

Can data backups which have been created be reconstructed without any problems?
Can data backups be protected by a password? If so, can the password investigation approaches described above be used?

Encryption:

Does the product offer the possibility of encrypting files or data backups?
Are several different encryption algorithms offered? In this connection, generally speaking, the following rule of thumb should be observed: "The quicker an encryption algorithm produced in software is, the more insecure it is."
Where are the keys used for encryption and decryption stored?

In the case of local storage, there must be an examination of whether these keys are password-protected or protected by a second encryption with a further key. In the case of password protection, the above points must be taken into account. In the case of over-encryption, consideration must be given to how the accompanying key is protected.

In addition, the following points can be considered: Which file changes if a key was changed? By comparing this file before and after the change in the key, the point can be determined at which this key is stored. Is it possible to make changes at this point to activate new keys which are then employed by the user, without the latter noticing the compromise?
Are there keys which have been pre-set by the manufacturer which have to be changed before using the program for the first time?
What happens if an incorrect key is entered during decryption?
Following the encryption of a file, is the unencrypted variant deleted? If so, is it reliably overwritten? Is it checked as to whether the encryption was successful prior to deletion?

Logging:

Is access to logging data denied to unauthorised persons?
Are the activities to be logged fully recorded?
Does the administrator have the option, by virtue of his/her privileged rights, of obtaining access to log data without authorisation and unobserved, or can he/she deactivate the logging without being noticed?
How does the program react if the log storage becomes full?

In addition to this, it must be ascertained whether, as a result of the new product, security features will be circumvented elsewhere. Example: The product to be tested offers an interface to the operating system environment; previously however, the IT system was configured in such a way that no such interfaces existed.

Pilot application

Following the conclusion of all other tests, a pilot application, i.e. use under real conditions, might still be considered necessary.

If the test is carried out in the production environment using actual data, the correct and error-free operating method of the program must have been confirmed beforehand with a sufficient number of tests, in order not to jeopardise the availability and integrity of the production environment. For example, the product may be installed at the premises of selected users who will then use it for a set period in actual production conditions.

Test evaluation

Using the decisive criteria specified, the test results must be assessed and all results must be assembled and submitted along with the test documentation to the Purchasing Department or the person responsible for the test.

With the aid of the test results, a final judgement should be made regarding a product to be procured. If no product has passed the test, consideration must be given as to whether a new survey of the market should be undertaken, whether the requirements set were too high and must be changed, or whether procurement must be dispensed with at this time.

Example:

Using the example of a compression program, one way of evaluating test results is now described. Four products were tested and assessed in accordance with the three-point scale derived from S 2.82 Developing a test plan for standard software.

Property	Necessary/Desirable	Significance	Product 1	Product 2	Product 3	Product 4
Correct compression and decompression	N	10	2	2	Y	0
Detection of bit errors in a compressed file	N	10	2	2	N	2
Deletion of files only after successful compression	N	10	2	2	Y	2
DOS PC, 80486, 8 MB	N	10	2	2	Y	2
Windows-compatible	D	2	0	2	Y	2
Throughput of over 1 MB/s at 50 MHz	D	4	2	2	Y	2
Compression rate over 40%	D	4	2	1	N	0
Online help system function	D	3	0	0	N	2
Password protection for compressed files	D	2	2	1	N	2
Assessment			100	98	K.O.	K.O.
Pricing (maximum cost of 50.00 euro per licence)			49.00 euro	25.00 euro		39.00 euro

Table: Test plan for standard software

Product 3 had already failed at the preselection stage and was therefore not tested.

Product 4 failed in the test section "correct compression and decompression", because the performance of the feature was assessed with a 0, although it is a necessary feature.

In calculating the assessment scores for products 1 and 2, the grades were used as multipliers for the respective significance coefficient and the total finally arrived at:

Product 1: 10*2+10*2+10*2+10*2+2*0+4*2+4*2+3*0+2*2 = 100
Product 2: 10*2+10*2+10*2+10*2+2*2+4*2+4*1+3*0+2*1 = 98
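The calculation can also be expressed as a short program. The following sketch takes the significance values and grades from the table and reproduces its Assessment row (100 for product 1, 98 for product 2, K.O. for product 4). The names `CRITERIA`, `GRADES` and `assess` are chosen here for illustration; product 3 is omitted because it was not graded.

```python
# (property, necessary?, significance) as listed in the table
CRITERIA = [
    ("Correct compression and decompression", True, 10),
    ("Detection of bit errors in a compressed file", True, 10),
    ("Deletion of files only after successful compression", True, 10),
    ("DOS PC, 80486, 8 MB", True, 10),
    ("Windows-compatible", False, 2),
    ("Throughput of over 1 MB/s at 50 MHz", False, 4),
    ("Compression rate over 40%", False, 4),
    ("Online help system function", False, 3),
    ("Password protection for compressed files", False, 2),
]

# Grades on the three-point scale (0, 1, 2), in the order of CRITERIA
GRADES = {
    "Product 1": [2, 2, 2, 2, 0, 2, 2, 0, 2],
    "Product 2": [2, 2, 2, 2, 2, 2, 1, 0, 1],
    "Product 4": [0, 2, 2, 2, 2, 2, 0, 2, 2],
}


def assess(grades):
    """Return the weighted score, or "K.O." if a necessary property is graded 0."""
    total = 0
    for (_name, necessary, significance), grade in zip(CRITERIA, grades):
        if necessary and grade == 0:
            return "K.O."
        total += significance * grade
    return total


for product, grades in GRADES.items():
    print(product, assess(grades))
```

Treating a grade of 0 on a necessary property as an immediate K.O. mirrors the reasoning given for product 4.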

Following the test evaluation, product 1 is thus in first place, but is closely followed by product 2. The decision in favour of a product must now be made by the Purchasing Department on the basis of the test results and the price-performance ratio resulting from them.

Review questions:

Are the test methods for the individual tests with test types, processes and tools determined at the test preparation stage?
Are standard, error and exceptional cases taken into consideration in the scope of the checks?
When using actual data for test purposes: Is the actual data made anonymous for the tests?
Is the installation and configuration of the test environment documented?
Are the tests implemented based on test plans?
Are functional tests which also check incorrect input parameters carried out?
Are security-specific tests which also include penetration tests carried out?
Is there a documentation of the tests assessing all test results on the basis of the decisive criteria?