Documenting and archiving
Marc Groenewegen (marcg@dinkum.nl)
September 14, 2005
Document Information
Organisation: Hogeschool voor de Kunsten Utrecht (HKU)
Version: 0.3
Status: Proposal
Abstract:
This document provides an introduction to a way of working that
focuses on software design and knowledge preservation for a team
of developers.
1 Introduction
Documenting and archiving are a must for good software design. The
following chapters describe what we need in a mature software project for
describing and maintaining what we make.
1.1 Disclaimer notice
All contents of this document originate from the author. There is however
a fair chance that some content resembles commonly used configuration
management knowledge, such as found within CMM and IEEE. One thing I've
bluntly taken from Tansley and Hayball is their software life cycle
diagram because it elaborates the process so clearly.
1.2 Intended audience
This document is primarily intended to be used by system architects,
developers and project managers. At HKU it serves as reading and reference
material supporting a documenting course. The intention of the author is
that the document will also be useful without additional explanation.
1.3 Why documenting?
Documenting serves a number of purposes. The most obvious reason for
writing down what you make is not to forget the details. But there is
more. Consider the following:
- Design integrity and saving reworking costs
- Reusability
- Maintainability
- Information exchange
- Negotiation
1.3.1 Design integrity
Finding design flaws early is one of the main advantages of writing
down what you are going to make. Especially when your design has been
reviewed by others, there is a good chance that you'll find design errors
before you start implementing them.
Documenting all relevant requirements, details,
considerations, decisions and relationships between subsystems helps in
finding potential design flaws before they are implemented. The cost of
correcting an error grows rapidly with the time it remains undetected:
the later you find an error, the more time it takes to redesign your system
and other systems to correct it. Therefore it is very important to
find design errors as soon as possible. Describing your project before you
start making it and asking people to judge your design will certainly help.
1.3.2 Reusability
A piece of code that works for one purpose might be useful for a different
purpose as well, but if you don't know in detail what it does
under all circumstances, what its inputs and outputs are and how it works,
it is of little use. A piece of documentation describing such details will
help you decide whether to use the product or, for example, make it yourself.
1.3.3 Maintainability
After your design has been finished and taken into production it's very
likely that changes have to be made to it. In many cases, extensions or
functional changes are desired and sometimes changes have to be made
because of failures or errors.
Without documentation it is sometimes impossible to understand all
consequences of a change. When correcting errors, documentation is often
necessary to provide details about the system or the relations between
different subsystems.
1.3.4 Information exchange
Documentation helps in explaining to others what your system does and
how it is implemented. It is of crucial importance when the original
designers leave the project. If nothing has been written down, all their
work has to be done again.
Documentation can also be a source of knowledge for designing
a new system.
1.3.5 Negotiation
When you as a designer are making a system for somebody else, your customer,
you have to agree with your customer on the requirements for the
end product. The requirements documentation serves as the base document
for negotiations and must be made with care.
1.4 Why reviews?
Reviews are one of the most useful mechanisms in finding design flaws.
The idea is that several people read and judge a document or a design
and their comments help improve the design or even reveal errors.
1.5 Why archiving?
In general, a project goes through a number of stages. In every stage,
various things are produced like documentation and pieces of code.
Software grows and changes as the project proceeds. In many cases, the
software should be accessible to more than one designer. This calls for
a central archiving system which gives reading permission to multiple
developers simultaneously and writing permission for a module to only
one designer at a time.
Another important reason for archiving is version control. A certain version
of the software has specific features or bug-fixes. Sometimes it is necessary
to revert to an older version. In such cases, older versions are taken from
the archiving system, together with the proper documentation, tools etc.
for that version.
2 The software life cycle
The software life cycle describes how a (software) project develops from
the first ideas (the concept) into its eventual realisation. For every
phase in this path, the life cycle model (Tansley and Hayball) indicates
which documents are used to validate and conclude each phase.
In the life cycle diagram, the inclined arrows represent the time line.
Starting from the top left, your project goes from a concept to the
implementation, through several stages. Following the arrow from the bottom
to the top right you'll come across several validation and test phases.
The horizontal arrows represent the relation between stages at the same level.
Every phase in the project comes with its own type of documentation. After
the initial ideas have been formulated, it is time to write down the
demands on the system (requirements) in a requirements document.
After that, a design document describes how we will realise those
requirements. When the system has been built, it has to be tested, using
test plans, to make sure it does what was originally intended.
3 Minimal requirements for all documents
Every document must have at least the following items:
- Title page listing the following:
  - Title
  - Author's name
  - Organisation
  - Project name and if appropriate the subsystem name
  - Date
  - Version or revision number
  - Status: draft/proposal/final
- Abstract
- Target audience and mailing list
- (Location or file name in the archive)
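The checklist above lends itself to a small automated check. The following is a minimal sketch in Python, assuming a hypothetical `TitlePage` record whose field names simply mirror the items listed. It is an illustration, not part of any existing tool, and the project name in the example is made up.

```python
from dataclasses import dataclass, fields

# Status values taken from the checklist above.
ALLOWED_STATUS = ("draft", "proposal", "final")


@dataclass
class TitlePage:
    """Hypothetical record mirroring the required items of a document."""
    title: str
    author: str
    organisation: str
    project: str
    date: str
    version: str
    status: str
    abstract: str
    audience: str
    location: str = ""  # optional: location or file name in the archive


def missing_items(page):
    """Return the names of required items that are empty or invalid."""
    problems = []
    for field in fields(page):
        if field.name == "location":
            continue  # optional item, may stay empty
        if not getattr(page, field.name).strip():
            problems.append(field.name)
    status = page.status.strip().lower()
    if status and status not in ALLOWED_STATUS:
        problems.append("status")
    return problems


# Example using the header of this document; the project name is assumed.
page = TitlePage("Documenting and archiving", "Marc Groenewegen", "HKU",
                 "example project", "September 14, 2005", "0.3", "Proposal",
                 "Introduction to a way of working", "developers")
print(missing_items(page))
```

A document would only be accepted into the archive when the returned list is empty.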
4 Document types
In the process of creating our system we use several types of documents.
Every type reflects a certain phase in the development. Note that we're
not talking about the difference between Word-, Excel- or Text-files.
This distinction is made for a number of reasons. In the first place we
document because everything that is not written down tends to disappear.
You implement a clever trick in a program and two months later you've
forgotten how you did it and you have to do the same job again, effectively
re-inventing the wheel.
The same applies to the system you built last year before you left the
project. Your successor has a hard time understanding your work if nothing
is written down.
It may seem faster to make something without writing down what you will
make, and that is probably true when you only want to try out the effect of
typing printf("Hello World\n"); but when your design gets
a little more complicated than this, the time spent on writing documentation
saves a lot of time that you would otherwise spend on correcting errors,
re-inventing the wheel, explaining to others what you made etc.
The most basic way of documenting is just to write down everything that
relates to the design. This can be ideas, requirements, drawings, pieces
of code, e-mails or references to other resources.
Also, when writing software, it is good practice to write considerations
and explanations into the code as comments. Programmers
often refer to the code when you ask for documentation. Even though the
comments in the code are very useful, they are not sufficient for a
project that consists of several subsystems.
The types of documents we will use are roughly categorised as follows:
- Requirements
- Design
- Implementation
- Test plan
- Test report
4.1 Requirements document
A requirements document describes all demands on the system. For example,
it states what the system will do, under what circumstances it must keep on
functioning, how it reacts to external stimuli and how it works together
with other subsystems. In some cases it can also be important to specify
what the system will not do.
Apart from being a useful resource that explains what will be made, a
requirements document also serves as a negotiation guideline for the
parties involved: when the system is delivered, the person or department
that ordered the system can check whether the system lives up to its
expectations and the programmer can prove he/she did a good job.
4.1.1 Types of requirements
- Functional requirements
- Restrictions
- Relations to other systems
- Communication protocols
- Time frame
- What our design does NOT do
- Risks and dependencies
4.2 Design document
A description of the design is a translation of the
requirements into the way the system will be built. In the design document
you write down choices, considerations and descriptions of all parts of the
system. This can be communication protocols, hardware platform, programming
language, software tools and relevant information about the environment.
4.3 Implementation document
The actual implementation of a system is the collection of code, hardware,
choice of programming language etc. The implementation document may contain
any information relevant to the implementation, even fragments of code.
The implementation document is very useful for debugging and maintenance
purposes, so every minor detail has to be described in it.
4.4 Test plan
A test plan describes what has to be tested and how. Detailed tests are
described, with all relevant circumstances and external stimuli.
When other systems are necessary in a test run, these should also be mentioned.
A test plan contains a number of tests that each serve to prove the correct
functioning of a specific part of the system. For the person performing the
tests it must be possible to write down a simple 'pass or fail' for
every test in the test plan.
An example of a bad test is this: type in your name and see if anything
happens.
An example of a good test is this: type in your name, followed by ENTER.
Expected behaviour: the system responds by printing 'hello' on the next
line of the screen.
The first test description leaves room for various answers, whereas the
second test description can be answered with 'pass' or 'fail'.
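The 'good' test above can also be written as an automated check. This is a minimal sketch in Python, assuming a hypothetical function `greet` that stands in for the system under test. The real system and its interface are not described in this document.

```python
import io


def greet(stdin, stdout):
    """Hypothetical system under test: read a name, answer on the next line."""
    stdin.readline()          # the name, terminated by ENTER
    stdout.write("hello\n")   # expected behaviour from the test description


def run_name_test():
    """Return 'pass' or 'fail', as every entry in a test plan requires."""
    stdin = io.StringIO("Marc\n")   # type in a name, followed by ENTER
    stdout = io.StringIO()
    greet(stdin, stdout)
    return "pass" if stdout.getvalue() == "hello\n" else "fail"


print(run_name_test())
```

The bad test ("see if anything happens") cannot be written this way, because it names no expected output to compare against.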
We can distinguish between module tests and integration tests. In a module
test, a subsystem is tested on its own. Often, other subsystems that will
be connected to it in the actual system will be 'stubbed', which means that
simulations are used.
In an integration test, a module is tested in the actual system in which
it will be operational. An integration test is the final test.
The advantage of a module test is that it's often easier to perform,
more predictable and with simulated external systems it's often possible
to create better test circumstances that really test the limits of your
module.
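Stubbing can be illustrated with a small sketch. It assumes a hypothetical module `lookup_address` that depends on a database subsystem. In the module test the database is replaced by a simulation, which can also produce error conditions that are hard to provoke in the actual system. All names and data are made up for the example.

```python
class DatabaseStub:
    """Simulates the database subsystem for a module test (hypothetical)."""

    def __init__(self, records, fail=False):
        self.records = records
        self.fail = fail  # lets the test simulate an unavailable database

    def find(self, name):
        if self.fail:
            raise ConnectionError("simulated database failure")
        return self.records.get(name)


def lookup_address(database, name):
    """Module under test: return an address or a readable error text."""
    try:
        address = database.find(name)
    except ConnectionError:
        return "database unavailable"
    return address if address is not None else "unknown name"


# One normal case and two limit cases, each with a pass/fail outcome.
stub = DatabaseStub({"Marc": "Utrecht"})
assert lookup_address(stub, "Marc") == "Utrecht"
assert lookup_address(stub, "Nobody") == "unknown name"
assert lookup_address(DatabaseStub({}, fail=True), "Marc") == "database unavailable"
```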
4.5 Test report
A test report describes what has been tested and what the results were.
There is also a section dedicated to conclusions.
5 From requirements to implementation: an example
The following is a simple example to demonstrate what kind of descriptions
end up in what type of document.
Suppose we have to realise a database containing a number of names and
addresses of people.
In the requirements document we write down that the project is about a
database that has to store names and addresses of people. It also states
the maximum number of records and what happens when we reach that maximum
number. Other requirements can be that the database has to be implemented
on a Mac running OS-X, has a response time to queries of less than 1 second
and has to be accessible night and day.
In the design document we might state that the database is implemented
by a system of linked lists for efficient updates. A hash-table is
generated for searching. Alternatives are also given, together with the
reasons why we chose the linked-list system.
We also write down calculations that prove or at least show why we think
we can meet the 1-second response time requirement. Also a system of making
back-ups without bringing down the system has to be described because the
database has to be 'open' night and day.
The implementation document goes into details about the linked list. It
describes what programming language we use and how the linked list is
implemented. Fragments of code illustrate the way things work.
We also have to describe how the hash-table is made and how it works
together with the linked list.
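A code fragment in such an implementation document might look like the following. This is a minimal sketch in Python, not the implementation of an actual project. The class names are hypothetical, a Python dictionary plays the role of the hash-table, and raising an error at the maximum number of records is an assumption (the requirements document would state the real behaviour).

```python
class Node:
    """One record in the linked list."""

    def __init__(self, name, address):
        self.name = name
        self.address = address
        self.prev = None
        self.next = None


class AddressBook:
    """Doubly linked list of records plus a hash-table for searching."""

    def __init__(self, max_records):
        self.max_records = max_records  # limit from the requirements document
        self.head = None
        self.index = {}                 # hash-table: name -> node

    def add(self, name, address):
        if name in self.index:
            self.index[name].address = address  # update existing record
            return
        if len(self.index) >= self.max_records:
            raise OverflowError("maximum number of records reached")
        node = Node(name, address)
        node.next = self.head           # insert at the front of the list
        if self.head is not None:
            self.head.prev = node
        self.head = node
        self.index[name] = node

    def remove(self, name):
        node = self.index.pop(name)     # raises KeyError for unknown names
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev

    def find(self, name):
        node = self.index.get(name)
        return node.address if node is not None else None
```

The hash-table gives direct access to a node, so a record can be found or unlinked without walking the list.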
6 Reviews
Reviews are one of the most useful mechanisms in finding design flaws.
The idea is that several people read and judge a document or a design
and their comments help improve the design or even point out errors.
Reviews are usually done for documents but can also prove useful for code.
In general, a review is done by three or four people, preferably from
different projects and with different backgrounds. Reviewing consists of
several steps:
- Reading
- Giving comments, preferably in a review meeting
- Reworking
- Approving
In step 1, the author distributes the document that has to be reviewed
amongst the reviewers. This can be done by e-mail or in hard copy. It
is important that everyone gets the exact same copy and that this copy
is also used in the review meeting.
The reviewers read through the document and write down all errors
(even spelling errors), things they consider unclear, omissions etc.
In step 2, all comments are collected. The preferred way is to organise
a 'review meeting' in which all reviewers and the author get together.
Page by page, every reviewer gets the opportunity to mention his/her
findings and the author makes notes about these. Normally, a session of
no more than 2 hours should be sufficient. Detailed discussions are
better held in a separate meeting because of the nature of a typical
review group: not everything is relevant for everyone.
Step 3, reworking, implies that the author makes a new version of the
document with all findings from the review.
After reworking the document, the author sends it to the reviewers again.
When all reviewers agree with the new version, the document is approved
and the document status is updated.
7 Archiving and version control
Archiving is a way to store your knowledge and be able to retrieve it.
This requires an archiving system that helps you store all relevant
information in such a way that you can retrieve it from the system
much later. In general we have to define the structure for our archive
ourselves.
This also requires that the users of the archiving system put their
information into the system according to the structure that has been
chosen for the archive. The bottom line, of course, is that the users
put information into the archive at all.
Another purpose of an archiving system is version control. This is the
process of giving version numbers or names to specific releases of your
product. With a well set-up archiving system it is possible to retrieve
older versions when needed, e.g. for maintenance purposes.
7.1 Archive structure
To set up our archive, we have to agree upon a structure. A starting point
can be the following list of directories for every subsystem:
- doc - subsystem documentation
- src - subsystem source code
- include - subsystem header files
- config - subsystem configuration scripts
- bin - subsystem binaries
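Creating this directory structure by script keeps it identical for every subsystem. The following is a minimal sketch in Python. The function name and the archive and subsystem names in the usage note are assumptions chosen for illustration.

```python
from pathlib import Path

# Directory names taken from the list above.
SUBSYSTEM_DIRS = ("doc", "src", "include", "config", "bin")


def create_subsystem_tree(archive_root, subsystem):
    """Create the standard directories for one subsystem; return their paths."""
    created = []
    for name in SUBSYSTEM_DIRS:
        path = Path(archive_root) / subsystem / name
        path.mkdir(parents=True, exist_ok=True)  # harmless if it already exists
        created.append(path)
    return created
```

For example, `create_subsystem_tree("archive", "mixer")` would create `archive/mixer/doc`, `archive/mixer/src` and so on.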
A detailed description of the archiving and version control is found in
the manual "Archiving and version control" by Marc Groenewegen.
The CVS version control system is described in more detail and with working
examples in the manual "CVS introduction" by Marc Groenewegen.
8 Doggie bag
8.1 Document list
This will be a file (text, spreadsheet or such) containing a list of all
documents within your project, with corresponding info:
- Title
- Author's name
- Organisation
- Project name and if appropriate the subsystem name
- Date
- Version or revision number
- Status: draft/proposal/final
- Location or file name in the archive
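If the document list is kept as a plain text file, a comma-separated format is one option. This is a minimal sketch in Python using the standard `csv` module. The column names are an assumption derived from the items above, and the example entry is made up.

```python
import csv
import io

# Column names derived from the list above (assumed spelling).
COLUMNS = ("title", "author", "organisation", "project",
           "date", "version", "status", "location")


def write_document_list(documents, stream):
    """Write a header row followed by one CSV row per document."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    for document in documents:
        writer.writerow(document)  # missing keys are written as empty fields


# Example with one hypothetical entry, written to an in-memory buffer.
buffer = io.StringIO()
write_document_list([{
    "title": "Requirements", "author": "A. Author", "organisation": "HKU",
    "project": "example project", "date": "2005-09-14", "version": "0.1",
    "status": "draft", "location": "doc/requirements.xml",
}], buffer)
print(buffer.getvalue())
```

Being plain text, such a file can be stored in the archive and searched like any other document.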
8.2 Document format
I want to propose using XML as the markup language for documentation. This
has several advantages:
- it is portable across platforms (MS, Mac, UNIX)
- it can be viewed with a web browser, thus easily creating web content
- a table of contents as well as chapter and paragraph headers can be
  generated from it
- it is plain text and can therefore be stored efficiently in an archiving
  system like CVS
- other formats such as TeX or PostScript can be generated from the same
  XML source for better printout
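To illustrate how a table of contents can be generated from an XML source, here is a minimal sketch. It uses Python's standard `xml.etree.ElementTree` module rather than an XSLT processor, and the element and attribute names (`document`, `section`, `title`) are assumptions, not a format defined by this document.

```python
import xml.etree.ElementTree as ET

# Hypothetical document structure with nested sections.
SOURCE = """\
<document>
  <section title="Introduction">
    <section title="Intended audience"/>
    <section title="Why documenting"/>
  </section>
  <section title="The software life cycle"/>
</document>
"""


def table_of_contents(element, prefix=""):
    """Return numbered headers for all nested <section> elements."""
    lines = []
    # findall("section") yields direct children only, so recursion
    # is used to number the deeper levels.
    for number, section in enumerate(element.findall("section"), start=1):
        label = f"{prefix}{number}"
        lines.append(f"{label} {section.get('title')}")
        lines.extend(table_of_contents(section, prefix=f"{label}."))
    return lines


print("\n".join(table_of_contents(ET.fromstring(SOURCE))))
```

The section numbers are derived from the nesting, so the author never has to renumber headers by hand.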
8.2.1 xmldoc and xsltproc for various platforms
8.3 Document templates
Several templates are available that are ready to be used for requirements,
design, implementation and test documents. Just replace the descriptive text
and use the structure of the templates to your advantage. If the structure
doesn't fit your description, feel free to modify it.
8.4 Done lists
As mentioned before, the first level of documenting is just to write down
everything related to the work at hand. A good habit to help you in this
process is to keep a record of what you've done every day, or even write
thoughts and details into this record as you're working with them. To keep
things manageable, start a new plain text file for every new month.
Why plain text? Because it's easy to search in. Why every month? Because
if you start a new file every day you'll have too many files in one directory
before you know it.
Thus, the proposed structure is as follows:
When writing system documentation, these files often are a good starting point
because a lot of information is already in there.
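Keeping a done list can be made nearly effortless with a small helper. The following is a minimal sketch in Python. The file naming scheme `done-YYYY-MM.txt` and the line format are assumptions chosen for illustration, not a prescribed layout.

```python
from datetime import date
from pathlib import Path


def done_file(directory, day):
    """Path of the monthly done list, e.g. done-2005-09.txt (assumed naming)."""
    return Path(directory) / f"done-{day:%Y-%m}.txt"


def add_entry(directory, text, day=None):
    """Append a dated line to the done list for the month of `day`."""
    day = day or date.today()
    path = done_file(directory, day)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"{day.isoformat()}  {text}\n")
    return path
```

Calling `add_entry("done", "reviewed the design document")` appends one line to the file for the current month and creates that file when the month is new.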
8.5 On-line help
Manual pages. Great ! But I won't go into details right now ;-)