16 May, 2011 - 18:34 By Dr Ian Pearson of Futurizon

Look after your data, or lose it

The BBC has managed to recover much of the data from their Domesday project in 1986 and put it on the web. Back then, a mere 25 years ago, people were asked to document some aspects of their area and lives, and it was all stored on laserdiscs - we didn't have the web then so it seemed a good idea at the time.

The BBC has usually been pretty good on technology awareness, so it wasn't that they lacked understanding, it is just that technology comes and goes and it isn't always obvious which will survive. At the time, laserdiscs were thought to be a modern storage technique, a large format CD basically, aimed mainly at video storage.


Sadly, the format failed soon after, so few people ever managed to see the results of the work. Then the players died and it was feared the data would be lost. It was then a big task to try to recover the data and store it on the web. So it is on a website now, presumed safe forever, because we'll always have the web. Nope!

The web is a good solution for easy access for most people today, but we won't always use the web. It will fade into history one day. And the servers it uses will die and need to be replaced. The data can easily be copied, but what happens when the software used for photos and videos and text has been through so many updates that the old software won't work on the latest operating systems any more?

If data isn't continuously migrated to new formats, eventually it gets lost. I still have every file I produced up to 1987 on magnetic tape, but that isn't much use to me without access to a tape reader. I still have every email I have ever sent, but it is a long time since I could open the ones that use Microsoft Mail, or the pictures I produced with MacDraw.

Microsoft and Apple still thrive, but files made with their old software don't open on today's computers with today's software. Technology ideas sometimes resurface, so I occasionally want to resurrect my work from the last time around. With effort, it might be possible to trawl the net for adapter software that can read the files and allow them to be opened once again, but that can be very time consuming and tedious even where it is possible, and the value is usually low. So I rarely bother, and most old files are effectively lost forever. I may redraw a diagram from memory, but my memory won't last forever either.

Some paper from thousands of years ago can still be read, albeit often torn and degraded. Most will long since have decomposed and been lost. But some remains. Paper is a superb technology. If it were only invented tomorrow, it would be hailed as one of the biggest breakthroughs of the century. It can be scribbled on in many ways, crumpled up, mistreated, and still be capable of being read easily hundreds of years later without the need for any special equipment.

The trouble with paper is it only works for static images, not for video, audio, or many other data formats. In future we will want to store sensations and emotion too. And that possibility immediately highlights the core of the problem. How do you encode something in a way that others can understand it? It is easy with pictures, not so easy with video and even less so with feelings. So we need internationally agreed standards for storing stuff.

That sounds easy enough, but new standards keep coming, because they have to adapt to new needs. New operating systems, improving telecoms or storage technology, better compression or display techniques - all of these cause new formats to be created.

Backwards compatibility is usually imposed, but eventually standards get so far out of use that the demand to retain compatibility evaporates, and the data is essentially lost. People don't care at the time, but later, when they want to go back and look at very old stuff, they find they no longer can. This problem will persist and grow while new technologies continue to be invented.

Now that we are well aware of the dangers of data loss, it ought to be just a matter of discipline and making sure that data is continually migrated safely to each new generation of technology while it still can be. But that is much easier to say than to do. Human systems are also a problem. It is essential that migration policies are properly handed over as people move on to other jobs and that new people know what is to be managed.

And even if all the data is migrated, it is one thing having data and quite another knowing that you have it, where to find it, what it is, why you have it, and what the context of that data is, e.g. which corporate systems it was intended to be used for. So it is almost as important to retain full cultural integrity as it is to retain the actual data. There are so many ways information can be lost.
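One way to make that kind of record concrete is a simple inventory kept alongside the files. The following is only a minimal sketch in Python, not a recommended archiving system: the function names `build_manifest` and `formats_in_use`, the `notes` dictionary and the "UNDOCUMENTED" marker are all made up for illustration. It lists every file under a folder with its format, size, a checksum to detect silent corruption after a copy, and a free-text note on what the file is for.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def build_manifest(root, notes=None):
    """Describe every file under `root`: where it is, what it is, why we keep it.

    `notes` is a hypothetical dict mapping a relative path to a human-written
    explanation of the file's purpose and context.
    """
    notes = notes or {}
    root = Path(root)
    records = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        records.append({
            "path": rel,
            # The file extension is only a rough guide to the real format.
            "format": path.suffix.lower() or "(none)",
            "bytes": path.stat().st_size,
            # A checksum lets a later copy be verified against the original.
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            # Files nobody has explained are flagged, not silently accepted.
            "context": notes.get(rel, "UNDOCUMENTED"),
        })
    return records


def formats_in_use(records):
    """Count files per format, to show which formats still need migrating."""
    counts = {}
    for record in records:
        counts[record["format"]] = counts.get(record["format"], 0) + 1
    return counts


if __name__ == "__main__":
    # Demonstration on a throwaway folder with two invented files.
    with tempfile.TemporaryDirectory() as folder:
        Path(folder, "report.txt").write_text("quarterly figures")
        Path(folder, "diagram.drw").write_bytes(b"\x00\x01\x02")
        manifest = build_manifest(folder, {"report.txt": "Finance archive"})
        print(json.dumps(manifest, indent=2))
        print(formats_in_use(manifest))
```

The manifest itself is plain JSON text, chosen here on the assumption that plain text is among the formats least likely to become unreadable. It would still need to be migrated and handed over along with the data it describes.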

Some kinds of storage that we use today will undoubtedly become obsolete soon. We can't be certain which, of course, but Blu-ray is an obvious candidate to start with, even if it did outlive HD-DVD. It is already easy to download movies off the net, and internet TV is starting to take off properly. In addition, memory sticks are getting cheaper and higher capacity, and most new TVs have USB ports, so it will become increasingly attractive to ship movies that way even in shops. They will offer much more compactness and portability than DVDs or Blu-ray discs. So however popular Blu-ray is right now, its life expectancy isn't all that great.

Any kind of future storage, such as holographic storage in crystals, or different shapes and sizes of memory stick, or mobile phones, may come and go too. We should never assume any particular solution will last forever. In fact, in almost all cases, it is a very safe bet that it won't be here in 30 years. Paper is one of the few exceptions.

The cloud is all the fashion now too, for very good reason, and may be for many years. But ultimately it comes down to server farms, and these are owned by companies or consortia, and companies can go bust, or business models change, so again there is a danger of losing data unless appropriate strategies are used for distributing and maintaining it. Over years and eventually decades, whole areas of business practices fall by the wayside and data could easily be lost as they do.

And even the web won't be here forever. The web is one particular way of using the internet, and that is one particular way of using networks to connect stuff. In a few years, tiny chips will replace many existing devices, and digital jewellery will become very common. The world will be cluttered with zillions of tiny devices, communicating using short range radio directly with one another.

In that kind of world, the internet is not necessarily a good solution. Other kinds of network would be more appropriate, so if the cloud still exists, it will work very differently. So the lesson is clear. Making sure you don't lose data is an ongoing activity. You can't relax and assume it is taken care of, because all the solutions we use will eventually stop being accessible, or the procedures will be forgotten.

Most of the time the data won't be useful or valuable any more, but sometimes it will be. There are lots of ways of losing it, and once it is lost it is hard to get back and will rarely be worth all the effort needed. And that applies whether it is video of your kids on holiday, or your corporate history.

