Documenting and archiving

Marc Groenewegen (marcg@dinkum.nl)

September 14, 2005

Document Information
Organisation: Hogeschool voor de Kunsten Utrecht (HKU)
Version: 0.3
Status: Proposal
Abstract:
This document provides an introduction to a way of working that focuses on software design and knowledge preservation for a team of developers.

Table Of Contents

1 Introduction

2 The software life cycle

3 Minimal requirements for all documents

4 Document types

5 From requirements to implementation: an example

6 Reviews

7 Archiving and version control

8 Doggie bag


1 Introduction

Documenting and archiving are a must for good software design. The following chapters describe what we need in a mature software project for describing and maintaining what we make.

1.1 Disclaimer notice

All contents of this document originate from the author. There is, however, a fair chance that some content resembles commonly used configuration management knowledge, such as that found within CMM and IEEE. One thing I have taken directly from Tansley and Hayball is their software life cycle diagram, because it illustrates the process so clearly.

1.2 Intended audience

This document is primarily intended to be used by system architects, developers and project managers. At HKU it serves as reading and reference material supporting a documenting course. The intention of the author is that the document will also be useful without additional explanation.

1.3 Why document?

Documenting serves a number of purposes. The most obvious reason for writing down what you make is not to forget the details. But there is more. Consider the following:

1.3.1 Design integrity

Finding design flaws early is one of the main advantages of writing down what you are going to make. Documenting all relevant requirements, details, considerations, decisions and relationships between subsystems, and having the design reviewed by others, gives you a good chance of finding design errors before you start implementing them. The cost of correcting an error grows rapidly with the time it stays undetected: the later you find it, the more of your own system, and of other systems, has to be redesigned. It is therefore very important to find design errors as early as possible. Describing your project before you start making it and asking people to judge your design will certainly help.

1.3.2 Reusability

A piece of code that works for one purpose might be useful for a different purpose as well, but if you don't know in detail what it does under all circumstances, what its inputs and outputs are and how it works, it is of little use. A piece of documentation describing such details will help you decide whether to use the product or, for example, to make it yourself.

1.3.3 Maintainability

After your design has been finished and taken into production it's very likely that changes have to be made to it. In many cases, extensions or functional changes are desired and sometimes changes have to be made because of failures or errors. Without documentation it is sometimes impossible to understand all consequences of a change. When correcting errors, documentation is often necessary to provide details about the system or the relations between different subsystems.

1.3.4 Information exchange

Documentation helps in explaining to others what your system does and how it is implemented. It is of crucial importance when the original designers leave the project. If nothing has been written down, all their work has to be done again. Documentation can also be a source of knowledge for designing a new system.

1.3.5 Negotiation

When you, as a designer, are making a system for somebody else (your customer), the two of you have to agree on the requirements for the end product. The requirements documentation serves as the base document for negotiations and must be made with care.

1.4 Why review?

Reviews are one of the most useful mechanisms in finding design flaws. The idea is that several people read and judge a document or a design and their comments help improve the design or even reveal errors.

1.5 Why archive?

In general, a project goes through a number of stages. In every stage, various things are produced, such as documentation and pieces of code. Software grows and changes as the project proceeds. In many cases, the software should be accessible to more than one designer. This calls for a central archiving system which gives read access to multiple developers simultaneously and write access for a module to only one designer at a time.

Another important reason for archiving is version control. A certain version of the software has specific features or bug-fixes. Sometimes it is necessary to revert to an older version. In such cases, older versions are taken from the archiving system, together with the proper documentation, tools etc. for that version.

2 The software life cycle

The software life cycle describes how a (software) project develops from the first ideas (the concept) into its eventual realisation. For every phase in this path, the life cycle model (Tansley and Hayball) indicates which documents are used to validate and conclude each phase.

The inclined arrows represent the time line. Starting from the top left, your project goes from a concept to the implementation, through several stages. Following the arrow from the bottom to the top right you'll come across several validation and test phases. The horizontal arrows represent the relation between stages at the same level.

Every phase in the project comes with its own type of documentation. After the initial ideas have been formulated, it is time to write down the demands on the system (requirements) in a requirements document. After that, we write down in a design document how we will realise those requirements. When the system has been built, it has to be tested, using test plans, to make sure it does what was originally intended.

3 Minimal requirements for all documents

Every document must have at least the following items:

4 Document types

In the process of creating our system we use several types of documents. Every type reflects a certain phase in the development. Note that we're not talking about the difference between Word, Excel or text files. This distinction is made for a number of reasons.

In the first place we document because everything that is not written down tends to disappear. You implement a clever trick in a program and two months later you've forgotten how you did it and have to do the same job again, effectively re-inventing the wheel. The same applies to the system you built last year before you left the project: your successor has a hard time understanding your work if nothing is written down.

It may seem faster to make something without writing down what you will make, and that is probably true when you want to try out the effect of typing
printf("Hello World\n");
but when your design gets a little more complicated than this, the time spent writing documentation saves a lot of time that you would otherwise spend on correcting errors, re-inventing the wheel, explaining to others what you made etc.

The most basic way of documenting is just to write down everything that relates to the design. This can be ideas, requirements, drawings, pieces of code, e-mails or references to other resources. Also, when writing software, it is good practice to write considerations and explanations into the code as comments. Programmers often refer to the code when you ask for documentation. Even though the comments in the code are very useful, they are not sufficient for a project that consists of several subsystems.

The types of documents we will use are roughly categorised as follows:

4.1 Requirements document

A requirements document describes all demands on the system. For example, it states what the system will do, under what circumstances it must keep on functioning, how it reacts to external stimuli and how it works together with other subsystems. In some cases it can also be important to specify what the system will not do! Apart from being a useful resource that explains what will be made, a requirements document also serves as a negotiation guideline for the parties involved: when the system is delivered, the person or department that ordered the system can check whether the system lives up to its expectations and the programmer can prove he/she did a good job.

4.1.1 Types of requirements

4.2 Design document

A description of the design is a translation of the requirements into the way the system will be built. In the design document you write down choices, considerations and descriptions of all parts of the system. This can be communication protocols, hardware platform, programming language, software tools and relevant information about the environment.

4.3 Implementation document

The actual implementation of a system is the collection of code, hardware, choice of programming language etc. The implementation document may contain any information relevant to the implementation, even fragments of code. The implementation document is very useful for debugging and maintenance purposes, so every minor detail has to be described in it.

4.4 Test plan

A test plan describes what has to be tested and how. Detailed tests are described, with all relevant circumstances and external stimuli. When other systems are necessary in a test run, these should also be mentioned. A test plan contains a number of tests that each serve to prove the correct functioning of a specific part of the system. For the person performing the tests it must be possible to write down a simple 'pass or fail' for every test in the test plan.

An example of a bad test is this: type in your name and see if anything happens.

An example of a good test is this: type in your name, followed by ENTER. Expected behaviour: the system responds by printing 'hello' on the next line of the screen.

The first test description leaves room for various answers, whereas the second test description can be answered with 'pass' or 'fail'.
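A test description of the second kind translates almost directly into an automated check. The following is a minimal sketch in C, not taken from any real project: respond() is a hypothetical stand-in for the system under test, and the name 'Alice' is made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical system under test: receives one line of input (a name
 * terminated by ENTER) and writes its response into out. */
void respond(const char *input_line, char *out, size_t out_size)
{
    (void)input_line;                    /* the name itself is not used here */
    snprintf(out, out_size, "hello\n");
}

/* Test: type in a name, followed by ENTER.
 * Expected behaviour: the system prints 'hello' on the next line. */
bool test_greeting(void)
{
    char out[64];
    respond("Alice\n", out, sizeof out);
    return strcmp(out, "hello\n") == 0;  /* only two outcomes are possible */
}

/* Turns the outcome into the word the tester writes down. */
const char *verdict(bool passed)
{
    return passed ? "pass" : "fail";
}
```

Calling verdict(test_greeting()) yields the single word that goes next to this test in the test plan. The bad test description cannot be written this way, because it does not state what to compare the response with.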

We can distinguish between module tests and integration tests. In a module test, a subsystem is tested on its own. Often, other subsystems that will be connected to it in the actual system will be 'stubbed', which means that simulations are used.

In an integration test, a module is tested in the actual system in which it will be operational. An integration test is the final test.

The advantage of a module test is that it is often easier to perform and more predictable. With simulated external systems it is also often possible to create better test conditions that really test the limits of your module.
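To make the idea of stubbing concrete, here is a small sketch in C. Everything in it is hypothetical: make_greeting() plays the module under test, name_source_fn is an assumed interface to a neighbouring subsystem, and the two stub functions simulate that subsystem.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed interface to a neighbouring subsystem: fetches a name into buf.
 * Returns false when the subsystem fails. */
typedef bool (*name_source_fn)(char *buf, size_t size);

/* Module under test: builds a greeting from whatever the name source delivers. */
bool make_greeting(name_source_fn get_name, char *out, size_t out_size)
{
    char name[32];
    if (!get_name(name, sizeof name))
        return false;                    /* pass the failure on to the caller */
    snprintf(out, out_size, "hello %s", name);
    return true;
}

/* Stub 1: simulates a working subsystem with a fixed, predictable answer. */
bool stub_name_ok(char *buf, size_t size)
{
    snprintf(buf, size, "%s", "Alice");
    return true;
}

/* Stub 2: simulates a failing subsystem, a condition that is hard to
 * provoke on demand in the real system. */
bool stub_name_fails(char *buf, size_t size)
{
    (void)buf;
    (void)size;
    return false;
}
```

A module test calls make_greeting() once with each stub and checks the result. In the integration test the same module is handed the real subsystem instead of a stub.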

4.5 Test report

A test report describes what has been tested and what the results were. There is also a section dedicated to conclusions.

5 From requirements to implementation: an example

The following is a simple example to demonstrate what kind of descriptions end up in what type of document.

Suppose we have to realise a database containing a number of names and addresses of people. In the requirements document we write down that the project is about a database that has to store names and addresses of people. It also states the maximum number of records and what happens when we reach that maximum number. Other requirements can be that the database has to be implemented on a Mac running OS X, has a response time to queries of less than 1 second and has to be accessible night and day.

In the design document we might state that the database is implemented as a system of linked lists for efficient updates, with a hash table generated for searching. Alternatives are also given, together with the reasons why we chose the linked-list system. We also write down calculations that prove, or at least show why we think, that we can meet the 1-second response time requirement. Because the database has to be 'open' night and day, a way of making back-ups without bringing down the system has to be described as well.
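As an illustration of the kind of calculation meant here, suppose the hash table has m buckets holding n records spread evenly, and one name comparison costs at most t_c. All three values below are assumptions chosen for the example, not measurements, and the estimate covers a lookup that scans one whole bucket chain.

```latex
\[
  \alpha = \frac{n}{m}, \qquad
  T_{\text{lookup}} \approx (1 + \alpha)\, t_c
\]
% Assumed values: n = 100\,000 records, m = 1024 buckets, t_c = 1 microsecond
\[
  \alpha = \frac{100\,000}{1024} \approx 97.7, \qquad
  T_{\text{lookup}} \approx 98.7 \times 1\,\mu\text{s} \approx 0.1\,\text{ms} \ll 1\,\text{s}
\]
```

A real design document would replace the assumed numbers with the record limit from the requirements document and with timings measured on the target machine.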

The implementation document goes into details about the linked list. It describes what programming language we use and how the linked list is implemented. Fragments of code illustrate the way things work. We also have to describe how the hash table is made and how it works together with the linked list.
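A code fragment in such an implementation document could look like the sketch below. It is only one possible reading of the example, written in C. The limits MAX_RECORDS and NUM_BUCKETS, the field sizes, all function names and the choice to update the hash table on every insert are assumptions made for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_RECORDS 100   /* assumed limit; the requirements document sets the real one */
#define NUM_BUCKETS 64    /* assumed hash table size */

struct record {
    char name[64];
    char address[128];
    struct record *next;         /* next record in the list of all records */
    struct record *bucket_next;  /* next record in the same hash bucket */
};

struct database {
    struct record pool[MAX_RECORDS];      /* storage for all records */
    size_t count;                         /* number of records in use */
    struct record *head;                  /* linked list of all records */
    struct record *buckets[NUM_BUCKETS];  /* hash table used for searching */
};

/* djb2 string hash, reduced to a bucket index. */
size_t bucket_of(const char *name)
{
    unsigned long h = 5381;
    while (*name != '\0')
        h = h * 33 + (unsigned char)*name++;
    return (size_t)(h % NUM_BUCKETS);
}

/* Empties the database. */
void db_init(struct database *db)
{
    db->count = 0;
    db->head = NULL;
    for (size_t i = 0; i < NUM_BUCKETS; i++)
        db->buckets[i] = NULL;
}

/* Adds a record. Returns false when the maximum number of records is
 * reached, which is the behaviour assumed here for a full database. */
bool db_insert(struct database *db, const char *name, const char *address)
{
    if (db->count == MAX_RECORDS)
        return false;
    struct record *r = &db->pool[db->count++];
    snprintf(r->name, sizeof r->name, "%s", name);
    snprintf(r->address, sizeof r->address, "%s", address);
    r->next = db->head;                /* constant-time insert at the list head */
    db->head = r;
    size_t b = bucket_of(r->name);
    r->bucket_next = db->buckets[b];   /* link the record into its hash bucket */
    db->buckets[b] = r;
    return true;
}

/* Looks up a record by name via the hash table. Returns NULL if absent. */
const struct record *db_find(const struct database *db, const char *name)
{
    const struct record *r = db->buckets[bucket_of(name)];
    while (r != NULL && strcmp(r->name, name) != 0)
        r = r->bucket_next;
    return r;
}
```

The surrounding text of the implementation document would then explain the choices that the code alone does not show, such as why records are inserted at the head of the list and what a caller should do when db_insert() returns false.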

6 Reviews

Reviews are one of the most useful mechanisms in finding design flaws. The idea is that several people read and judge a document or a design and their comments help improve the design or even point out errors. Reviews are usually done for documents but can also prove useful for code.

In general, a review is done by three or four people, preferably from different projects and with different backgrounds. Reviewing consists of several steps:

  1. Reading
  2. Giving comments, preferably in a review meeting
  3. Reworking
  4. Approving

In step 1, the author distributes the document that has to be reviewed amongst the reviewers. This can be done by e-mail or in hard copy. It is important that everyone gets the exact same copy and that this copy is also used in the review meeting. The reviewers read through the document and write down all errors (even spelling errors!), things they consider unclear, omissions etc.

In step 2, all comments are collected. The preferred way is to organise a 'review meeting' in which all reviewers and the author get together. Page after page, every reviewer gets the opportunity to mention his/her findings and the author makes notes about these. Normally, a session of no more than 2 hours should be sufficient. Detailed discussions are better held in a separate meeting because of the nature of a typical review group: not everything is relevant for everyone.

Step 3, reworking, implies that the author makes a new version of the document with all findings from the review.

After reworking the document, the author sends it to the reviewers again. When all reviewers agree with the new version, the document is approved and the document status is updated.

7 Archiving and version control

Archiving is a way to store your knowledge and be able to retrieve it. This requires an archiving system that helps you store all relevant information in such a way that you can retrieve it from the system much later. In general we have to define the structure for our archive ourselves.

This also requires that the users of the archiving system put their information into the system according to the structure that has been chosen for the archive. The bottom line, of course, is that the users put information into the archive at all.

Another purpose of an archiving system is version control. This is the process of giving version numbers or names to specific releases of your product. With a well set-up archiving system it is possible to retrieve older versions when needed, e.g. for maintenance purposes.

7.1 Archive structure

To set up our archive, we have to agree upon a structure. A starting point can be the following list of directories for every subsystem:

A detailed description of the archiving and version control is found in the manual "Archiving and version control" by Marc Groenewegen. The CVS version control system is described in more detail and with working examples in the manual "CVS introduction" by Marc Groenewegen.

8 Doggie bag

8.1 Document list

This will be a file (text, spreadsheet or such) containing a list of all documents within your project, with corresponding info:

8.2 Document format

I propose using XML as the markup language for documentation. This has several advantages: it is portable across platforms (MS, Mac, UNIX); it can be viewed with a web browser, which makes it easy to create web content; a table of contents as well as chapter and paragraph headers can be generated from it; it is plain text and can therefore be stored efficiently in an archiving system like CVS; and other formats such as TeX or PostScript can be generated from the same XML source for better printouts.

8.2.1 xmldoc and xsltproc for various platforms

XML: www.w3.org/XML
xmldoc: www.andromeda.nl/projects/xmldoc
xsltproc for Mac OS: www.access.ch/ml/software/java

8.3 Document templates

Several templates are available that are ready to be used for requirements, design, implementation and test documents. Just replace the descriptive text and use the structure of the templates to your advantage. If the structure doesn't fit your description, feel free to modify it.

8.4 Done lists

As mentioned before, the first level of documenting is just to write down everything related to the work at hand. A good habit to help you in this process is to keep a record of what you've done every day, or even to write thoughts and details into this record as you're working with them. To keep things manageable, start a new plain text file for every new month. Why plain text? Because it's easy to search in. Why every month? Because if you start a new file every day you'll have too many files in one directory before you know it.

Thus, the proposed structure is as follows:

When writing system documentation, these files often are a good starting point because a lot of information is already in there.
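The habit of keeping one plain text file per month is easy to support with a small helper. The sketch below is in C and rests on assumptions: the file name pattern done-YYYY-MM.txt and the layout of one dated line per entry are invented for this example and are not the structure proposed in this document.

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Builds the name of the done list for the month containing `when`.
 * The pattern "done-YYYY-MM.txt" is an assumed naming convention.
 * Returns the length of the name, or 0 if it does not fit in out. */
size_t done_list_name(const struct tm *when, char *out, size_t out_size)
{
    return strftime(out, out_size, "done-%Y-%m.txt", when);
}

/* Appends one dated entry to that month's done list in directory dir.
 * Returns 0 on success, -1 on failure. */
int done_list_append(const char *dir, const struct tm *when, const char *text)
{
    char name[32], path[512], stamp[16];
    if (done_list_name(when, name, sizeof name) == 0)
        return -1;
    if (strftime(stamp, sizeof stamp, "%Y-%m-%d", when) == 0)
        return -1;
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;                       /* path would have been truncated */
    FILE *f = fopen(path, "a");          /* "a" creates the file in a new month */
    if (f == NULL)
        return -1;
    fprintf(f, "%s  %s\n", stamp, text);
    return fclose(f) == 0 ? 0 : -1;
}
```

A typical call obtains the current date with time() and localtime() and passes the result, together with a directory and a line of text, to done_list_append(). Because the month is part of the file name, a new file starts automatically with the first entry of each month.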

8.5 On-line help

Manual pages. Great! But I won't go into details right now ;-)