Ephemeral Knowledge

Just a few notes on the distribution of knowledge and information in this day and age, inspired by a couple of articles (Slate and Instapundit) I read recently.

The cliche is that there is now more information swimming around libraries and databases than at any time in history. There is, however, also more information being lost to the void than at any time in history. Quite possibly, hundreds of years from now, the turn of the millennium will be regarded as a dark age of information and records, since very little documentation may survive until then.

Some examples: Slate has an article on the ephemerality of email, and how it will hurt historical research. There are copious documents detailing the decision-making processes of World War 2, Vietnam, and so on. There are very few pertaining to the first Gulf War, and perhaps, in a few years, there will be even fewer pertaining to the second Gulf War. Why? The transition from paper to email, and carelessness in saving the email. There’s a lot more email than paper, and most of it consists of off-the-cuff thoughts on policy. The sender and recipient of the email may not think much of this correspondence, and won’t make much effort to save it. A historian trying to reconstruct Wolfowitz’s thought process in January 2003, however, may care about every little piece of email that gets sent to, say, Rumsfeld. It’s not clear if these emails will be saved. Similarly, there are no longer typewritten policy briefs that get typed up by the typing pool (carbon copies and all) and sent out to various departments for comments. Instead, policy briefs are presented in PowerPoint, usually with few archive copies. As one particular example, one of the keys to modern military theory, the OODA loop proposed by John Boyd, was never put down on paper or into a book, as the ideas of Clausewitz or Sun Tzu were (even though literacy and printing have expanded dramatically since their time), but exists only as a series of presentations remembered by their audiences.

This applies not only to newly generated knowledge and information. The transition from paper to electronic media may be painful, and may entail the loss of knowledge. In particular, we have Nicholson Baker’s argument about card catalogs being replaced by computerized catalogs. This may seem to be an unalloyed good: books can be looked up more quickly and efficiently. There is, however, information being lost: the penned notes of generations of librarians. There was also the story of a Norwegian museum, where the director died, and took with him the one password to his files on the archives. This sort of knowledge might be maintained, but it requires concentrated effort and resources that no one seems to want to devote to posterity.

And we won’t get into the issue of information being saved onto obsolete formats and decaying media.

On the optimistic side, we have Glenn Reynolds’s article in Tech Central Station on “horizontal knowledge” and the advent of search engines, and how this would have defied predictions made ten years ago. There are a few implications to something like Google: information may become very distributed, and therefore more difficult to lose. Information that’s widely distributed can now be searched. Information may be better preserved because a company now has a financial incentive to maintain a vast library of general information and trivia. (There was an old joke about doing backups to Google: encode your data, and post it to Usenet in a retrievable fashion. Google will go through the trouble of making backups of your stuff.) The problem with this is that most of the contributors of this horizontal knowledge are hobbyists, not professionals, and what gets posted may not be the interesting details. Or it could be plainly wrong.

Widely disseminated information may also be unexpectedly persistent, even when it’s not immediately apparent that the information has been widely distributed. Enron and Arthur Andersen learned this lesson: email may be deleted from your installation of Outlook, but you also have to delete it on the server, the server’s backup tapes, each recipient’s Outlook, each recipient’s server, etc. And sufficiently persistent investigators will unearth these documents. Similarly, embarrassing photos posted on the Net may persist forever. This perhaps says more about distributed storage than anything else. If the email is sensitive, which I’m sure government email may be, then it may be limited to a small circle of recipients, and is more likely to be inadvertently lost.

So, what do we do about the last case, where email may not be widely distributed, and is only looked at decades later by historians? Are there technological solutions? For email, in organizations where archives are important, all email can be archived off at the SMTP server. This is trivial to do, but there’s a large cost in maintaining these archives that usually isn’t obvious from the start. These archives will also have to be available in some readable format for future generations. Beyond that, I’m not sure.
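
To give a rough idea of how little code the archiving step itself might take, here is a minimal sketch in Python. It assumes the mail server can hand each message to a hook of our own; the function names (archive_message, list_subjects) and the choice of the plain-text mbox format are my assumptions for illustration, not a description of how any particular server does it. The expensive parts mentioned above (storage, migration, keeping the files readable for decades) are not addressed here.

```python
import mailbox
from email.message import EmailMessage


def archive_message(archive_path, message):
    """Append one message to an mbox archive (hypothetical server hook).

    The mbox file is created if it does not exist yet. mbox is plain
    text, which is one (assumed) answer to the "readable format" problem.
    """
    box = mailbox.mbox(archive_path)
    box.lock()  # guard against two deliveries writing at the same time
    try:
        box.add(message)
    finally:
        box.close()  # flushes pending changes and releases the lock


def list_subjects(archive_path):
    """Return the Subject header of every archived message, in order."""
    box = mailbox.mbox(archive_path)
    try:
        return [msg["Subject"] for msg in box]
    finally:
        box.close()


def make_message(sender, recipient, subject, body):
    """Build a simple text message, standing in for real server traffic."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg
```

A hook like this would call archive_message once per message as it passes through the server, so the archive does not depend on the sender or the recipient remembering to save anything.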
