T 2.77 Ineffectual transfer of paper data to electronic archives
Documents that initially exist only in paper form are regularly converted to electronic form and stored in archive systems. The conversion preserves selected features of the original document. Which features must be preserved depends on the purpose of the document. This may include conformity of the copy's appearance with the original, for example when an image file is used. Conformity of text excerpts, e.g. when a text file is used, or the capture of further features, e.g. biometric data or context data, may also be required.
Storage as a text or image file alone is not always sufficient to verify that the document is faithful to the original, since both manipulation and errors may occur:
- Text and image processing programs may be used to manipulate existing documents.
- Errors during scanning may distort the meaning of the stored data, which may lead to erroneous interpretations and calculations. For example, important parts of the document may be omitted during the scanning procedure.
In some archiving scenarios, the paper documents are intended to be destroyed after scanning due to a lack of storage space. Once the original document has been destroyed, it must be assumed that it is no longer possible to verify directly that the copy is faithful to the original.
This means that all features of the original document required for later verification must be captured and stored in a traceable manner when the document is converted to electronic form. If features are overlooked or omitted during conversion (e.g. the number of pages of the original document), the evidential value of the document may be limited significantly, since it is often not possible to collect the features of the original document later.
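As an illustration of capturing verification features at scan time, the following minimal Python sketch records the number of paper sides counted by the operator together with a SHA-256 digest of each scanned image. The record layout and all names (`CaptureRecord`, `build_capture_record`, `paper_side_count`) are hypothetical and not taken from any particular archive system. The digests only make later changes to the image files detectable; they do not prove that the scan was faithful to the paper original in the first place.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class CaptureRecord:
    """Features of the paper original recorded at scan time (hypothetical schema)."""
    document_id: str
    paper_side_count: int      # printed sides counted on the paper original
    image_sha256: List[str]    # one SHA-256 hex digest per scanned image
    captured_at: str           # ISO 8601 timestamp of the capture

    @property
    def scanned_image_count(self) -> int:
        return len(self.image_sha256)

    def is_complete(self) -> bool:
        # A mismatch suggests that a side (e.g. a reverse side) was not scanned.
        return self.scanned_image_count == self.paper_side_count


def build_capture_record(
    document_id: str,
    paper_side_count: int,
    page_images: Sequence[bytes],
    captured_at: Optional[str] = None,
) -> CaptureRecord:
    """Hash every scanned image and bundle the result with the counted sides."""
    digests = [hashlib.sha256(image).hexdigest() for image in page_images]
    if captured_at is None:
        captured_at = datetime.now(timezone.utc).isoformat()
    return CaptureRecord(document_id, paper_side_count, digests, captured_at)


if __name__ == "__main__":
    # Two printed sides were counted, but only one image was produced.
    record = build_capture_record("letter-001", 2, [b"front side image bytes"])
    print(record.is_complete())
```

In this sketch the completeness check has to run before the paper original is destroyed, because a missing side cannot be recovered afterwards.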
A poorly designed conversion process endangers the efficiency and traceability of subsequent document processing and, ultimately, the correctness of the archived documents.
Examples:
- The incoming correspondence of a government agency is scanned and stored in an archive for later electronic processing. However, the reverse side of a letter is occasionally not scanned. Since the incoming correspondence is destroyed after scanning, the original condition of the letter can no longer be verified.
- When documents are scanned and their text is captured automatically, passages that the OCR program (Optical Character Recognition, a procedure for recognising text in image files) does not detect correctly are left out or falsified. For example, this may affect text printed in a faint colour or an unclear font, but also handwritten comments in documents or blurred print from inkjet printers. Incorrectly recognised invoice amounts (e.g. decimal commas not recognised) are a further possible source of later misunderstandings.
- Handwritten signatures on documents are scanned as an image. In a later dispute regarding the authenticity of the document and signature, a graphological expert opinion may no longer be able to provide an unambiguous statement, since the image file presented may have been manipulated using an image processing program and/or copied from another document. Features of the original document, such as the texture and composition of the paper used or the amount of pressure exerted while signing, can no longer be traced.
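The first two examples can be partly mitigated by simple plausibility checks at capture time. The following Python sketch is only an illustration under stated assumptions: amounts are assumed to use a decimal comma with exactly two decimal places and optional dots as thousands separators (e.g. "1.234,56"), and the function names `parse_ocr_amount` and `missing_reverse_side_suspected` are hypothetical. Such checks catch only structurally implausible results; a misread digit inside an otherwise well-formed amount passes unnoticed.

```python
import re
from decimal import Decimal
from typing import Optional

# Assumed amount format: digits with optional dot-grouped thousands,
# followed by a decimal comma and exactly two decimal places.
AMOUNT_PATTERN = re.compile(r"(?:\d{1,3}(?:\.\d{3})+|\d+),\d{2}")


def parse_ocr_amount(text: str) -> Optional[Decimal]:
    """Return the amount, or None if the OCR text should go to manual review."""
    candidate = text.strip()
    if not AMOUNT_PATTERN.fullmatch(candidate):
        # For example, the decimal comma was not recognised by the OCR program.
        return None
    normalised = candidate.replace(".", "").replace(",", ".")
    return Decimal(normalised)


def missing_reverse_side_suspected(image_count: int, duplex_expected: bool) -> bool:
    """Flag duplex scan jobs that produced an odd number of images."""
    return duplex_expected and image_count % 2 == 1


if __name__ == "__main__":
    print(parse_ocr_amount("1.234,56"))             # well-formed amount
    print(parse_ocr_amount("123456"))               # comma lost: manual review
    print(missing_reverse_side_suspected(3, True))  # odd count in duplex mode
```

Returning `None` instead of guessing a value is a deliberate choice in this sketch, so that doubtful amounts are reviewed by a person while the paper original is still available.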