S 2.258 Consistent indexing of documents during archiving

Initiation responsibility: Head of IT, Archive Administrator

Implementation responsibility: Archive Administrator, Head of IT, Administrator

When operating the archive, all documents and datasets must be indexed unambiguously so that they can be found reliably in later archive queries. Archive systems offer search query options for this purpose. Since a full-text search may take a very long time depending on the type and extent of the archived data, archive systems store a separate dataset containing index information for each document in a separate search database. The structure and extent of the index information can normally be configured and should have the following properties:

As a matter of principle, these parameters must be defined before commissioning the archive. Nevertheless, it may become necessary to change the properties over the course of time. Depending on the extent and type of changes to the index data, this may require very time-consuming re-indexing of the archive databases.

The context information for an individual document to be archived can be generated in different ways. Three procedures must be distinguished: manual, semi-automatic, and fully automatic generation of the index data.

The selection of the procedure depends on the expected data volume. If individual documents are archived at irregular intervals, a manual procedure that follows the defined specifications for generating the context is sufficient.

If large data volumes are archived regularly, a semi-automatic procedure should be selected for generating the index data. This makes it possible to check and correct the generated information manually before the document and its index are archived, after which they might no longer be changeable.

During fully automatic generation of index data, errors cannot be detected or corrected manually. In this case, an erroneous assignment of documents to be archived, for example to business processes, can be neither detected nor ruled out. Therefore, this procedure should only be used if all documents are structured in such a way that all index data can be extracted reliably and without any doubt in every case.
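One way to approximate the "without any doubt" condition in a fully automatic procedure is to reject every document whose index fields cannot be extracted unambiguously. The following is a sketch under stated assumptions: documents carry `key: value` header lines, and the required field names (`process_id`, `document_type`, `created`) and the function `extract_index_strict` are hypothetical. A document is accepted only if each required field occurs exactly once with a non-empty value.

```python
import re

# Hypothetical set of mandatory index fields.
REQUIRED_FIELDS = ("process_id", "document_type", "created")


class AmbiguousIndexError(Exception):
    """Raised when a required index field is missing or occurs more than once."""


def extract_index_strict(text: str) -> dict:
    """Return index data only if every required field is found exactly once."""
    index = {}
    for name in REQUIRED_FIELDS:
        # Match "name: value" on a single line; empty values do not match.
        pattern = rf"^{re.escape(name)}:[ \t]*(\S.*?)[ \t]*$"
        matches = re.findall(pattern, text, flags=re.MULTILINE)
        if len(matches) != 1:
            raise AmbiguousIndexError(
                f"field {name!r} found {len(matches)} times, expected exactly 1"
            )
        index[name] = matches[0]
    return index
```

Rejected documents would then be routed to a manual or semi-automatic procedure rather than archived with doubtful index data. Such a check only catches structural ambiguity; it cannot detect a value that is well-formed but wrong.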

Review questions: