S 4.64 Verification of data before transmission / elimination of residual information
Initiation responsibility: IT Security Officer, Head of IT
Implementation responsibility: User, Administrator
Before a file is sent via e-mail, exchanged using a data medium or published on a web server, the file should be checked to see if it contains residual information not intended for the general public. Such residual information can come from a variety of sources, which means a corresponding variety of measures need to be taken to eliminate it. The most common sources of such residual information are described in the following.
In general, the files generated by standard software such as word processing and spreadsheet programs should be checked to see which additional information is stored in them. Users are aware of some of the information stored in a file, but may not be aware of the rest.
Before files are transferred, at least a random sample of them should be checked for the presence of additional, undesired information. An editor other than the one with which the file was originally created should be used for this purpose.
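As an illustration of such a check, the following minimal Python sketch lists the readable character runs in a file, independently of the program that created it. The function name printable_runs and the minimum run length of 4 are assumptions made for this example. The sketch only finds plain ASCII text, so text stored in other encodings (for example UTF-16) would not be detected.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return all runs of at least min_len printable ASCII characters.

    Hypothetical helper: it approximates viewing a file in a neutral
    editor by ignoring everything that is not readable ASCII text.
    """
    # \x20-\x7e covers the printable ASCII range (space to tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]
```

The function can be applied to the raw bytes of any file, e.g. printable_runs(open(path, "rb").read()), and the result reviewed for text that should not reach the recipient.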
Note, however, that not all residual information can simply be deleted without destroying the file format. For example, if certain bytes are deleted from a file generated by a word processor, the word processing program might no longer be able to recognise the file format. To eliminate residual information,
- the file can be stored in a different file format, e.g. as "text only" or as an HTML file,
- the user data can be copied into a second instance of the same standard software package running on an IT system on which no other application is running. This is especially recommended for files with a long history of changes.
To prevent passing on information that was originally added on purpose by the creator of the document but whose presence was later forgotten, i.e. text in a "hidden" format, it may make sense to print out the file. When printing out the file, all options causing text in hidden formats to be printed should be enabled.
Residual information / slack bytes
When exchanging data media, slack space can be a problem. Every operating system defines a smallest possible physical memory unit of a specific size. In DOS, this unit is one sector and has a size of 512 bytes. In Unix systems, the smallest unit is called a block, and its size depends on the version of Unix used. In DOS, the individual sectors of a partition are grouped logically into clusters. The number of sectors in a cluster depends on the size of the partition. When a file is created, one or more clusters are allocated to it.
The last cluster is not completely utilised when the size of the file to be stored is not an exact multiple of the cluster size.
This wastes storage space. The average amount of wasted space increases with the cluster size. Since the cluster size in turn increases with the partition size, partitions should not be too large. For example, for a partition size of 1024 to 2047 MB, each cluster is 32 KB in size. This results in an average of 16 KB (half a cluster) of unused storage space for each file.
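The arithmetic behind this figure can be reproduced with a short Python sketch. The function names slack_bytes and average_slack are chosen for this example only, and the average assumes that file sizes are spread evenly across one cluster, which is a simplification.

```python
def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Return the number of unused bytes in the last cluster of a file."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size > 0")
    # Distance from file_size up to the next multiple of cluster_size.
    return -file_size % cluster_size


def average_slack(cluster_size: int) -> float:
    """Average slack over all file sizes from 1 byte to one full cluster."""
    total = sum(slack_bytes(size, cluster_size)
                for size in range(1, cluster_size + 1))
    return total / cluster_size
```

For a cluster size of 32 KB (32768 bytes), average_slack(32768) comes out at roughly half a cluster, in line with the 16 KB stated above.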
Another problem (in the case of DOS-based operating systems) is that random data from memory is stored in the remaining bytes of the last cluster or block, which are referred to as slack bytes. Slack bytes can contain random data or information on the file structure, but also passwords. Depending on the cluster size, a file may also be padded with slack bytes when it is copied from one data medium to another.
Before transferring a file, it should be ensured that it does not contain any slack bytes. This can be checked using a suitable editor (e.g. a hex editor).
In addition, many Windows applications pose a problem because they do not completely overwrite the memory allocated for editing the file data with application data, so that leftover contents of this memory may end up in the stored file.
Hidden text / comments
A file can contain text passages that are formatted to be "hidden". Some programs also offer the ability to add comments, which often do not show up on printouts or monitors. Such text passages can include comments not intended to be seen by a recipient. For this reason, such additional information should be deleted from the files before they are transferred to external parties.
Marking changes
When editing files, it is sometimes necessary to mark the changes made to the files. Since these marks might not show up on printouts or on the monitor, the files should also be checked to see if they contain any change marks before being transferred.
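A check for hidden text, comments and change marks can be partly automated. The following Python sketch is an illustration only. It assumes the file is in the Office Open XML word processing format (.docx), which this safeguard does not prescribe. In that format, hidden text is marked with the element w:vanish, tracked insertions and deletions with w:ins and w:del, and comments are stored in the part word/comments.xml. The function name find_residual_markup is hypothetical, and the sketch assumes the conventional "w:" namespace prefix. It counts occurrences without evaluating attributes, so an element that explicitly switches hiding off would still be counted.

```python
import re
import zipfile

# Assumed markers in word/document.xml (WordprocessingML element names).
MARKERS = {
    "hidden text": rb"<w:vanish\b",
    "tracked insertion": rb"<w:ins\b",
    "tracked deletion": rb"<w:del\b",
}


def find_residual_markup(docx) -> dict[str, int]:
    """Count markers for hidden text, change marks and comments.

    docx may be a file path or a binary file-like object.
    """
    counts = {}
    with zipfile.ZipFile(docx) as archive:
        names = set(archive.namelist())
        body = b""
        if "word/document.xml" in names:
            body = archive.read("word/document.xml")
        for label, pattern in MARKERS.items():
            counts[label] = len(re.findall(pattern, body))
        comments = b""
        if "word/comments.xml" in names:
            comments = archive.read("word/comments.xml")
        counts["comment"] = len(re.findall(rb"<w:comment\b", comments))
    return counts
```

Any non-zero count indicates that the file should be reviewed in the originating application before it is passed on. A count of zero is not proof that the file is clean.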
Version management
Almost all currently available office programs allow you to store several different versions of a document in a single file. This makes it possible to restore an earlier version of a document when necessary. However, this can also lead to very large files, for example when the documents contain graphic objects. The "Automatically save version on close" option should never be enabled because this option also stores the entire previous version of the file every time it is closed.
File attributes
File attributes (file info) are stored in the file alongside the actual content. Depending on the application used, this file information can include the title, directory path, version, creator (who is not necessarily the person who signed the document), comments, editing times, the date of the last printout, document name and document description. Some of this information is generated by the program itself and cannot be influenced by the person editing the file. Other data needs to be entered manually. Before transferring a file to an external party, the file should be checked for additional information of this type.
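How such file information can be listed depends on the file format. As a minimal sketch, and assuming an Office Open XML file (for example .docx), which stores core properties such as the creator and the last editor in the part docProps/core.xml, the information could be read as follows. The function name read_core_properties is hypothetical. Application-specific and custom properties are stored in other parts and are not covered here.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_core_properties(document) -> dict[str, str]:
    """Return the core properties stored in an Office Open XML file.

    document may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(document) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))
    properties = {}
    for child in root:
        # Tags look like "{namespace}creator"; keep only the local name.
        name = child.tag.rsplit("}", 1)[-1]
        properties[name] = (child.text or "").strip()
    return properties
```

The returned dictionary can then be compared against what is intended to be disclosed to the recipient.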
Quick save functions
Word processing programs offer a quick save function so that only the changes made since the last save need to be stored instead of the entire document. The quick save procedure therefore takes less time than storing the entire document. However, a full save requires less storage space on the hard disk than a quick save. The main disadvantage of the quick save function is that, under some circumstances, the file might still contain text fragments that were removed during the editing process. For this reason, the quick save option should generally remain disabled.
If a user decides to use the quick save option in spite of this, then the user should always perform a full save in the following situations:
- whenever the user is finished editing a document,
- before starting another application that requires a large amount of storage space,
- before the document text is transferred to another application,
- before converting the document to a different file format, and
- before sending the document via e-mail or exchanging it using a data medium.
Review questions:
- Are the users informed with respect to the dangers of residual and additional information in files?
- Are files checked on a random basis for any residual information they contain?
- Is the additional information stored in files by standard software determined and checked before the files are transferred?
- Before transferring the files, is it ensured that they do not contain any slack bytes?
- Is storing different versions of a document in a file avoided?