Disappearing knowledge

I saw an interesting article on Slashdot recently about the vanishing of online scientific journals.  The short version is that some people looked at online open-access academic journals and found that, over the last decade or so, a whole bunch of them have essentially disappeared.  Presumably the organizations running them either went out of business or just decided to discontinue them.  And nobody backed them up.

In case it's not already obvious, this is a bad thing.  Academic journals are supposed to be where we publish new advances in human knowledge and understanding.  Of course, not every journal article is a leap forward for humankind.  In fact, the majority of them are tedious crap that nobody cares about, of questionable research quality, or otherwise not really that great.  And since we're talking about open-access journals, rather than top-tier ones like Nature, lower-quality work is probably over-represented in those journals.  So in reality, this is probably not a tragedy for the accumulated wisdom of mankind.  But still, there might have been some good stuff in there that was lost, so it's not good.

To me, this underscores just how transient our digital world is.  We talk about how nothing is ever really deleted from the internet, but that's not even remotely true.  Sure, things that go viral and are copied everywhere will live for a very long time, but an awful lot of content is really just published in one place.  If you're lucky, it might get backed up by the Internet Archive or Google's cache, but for the most part, if that publisher goes away, the content is just gone.

For some content, this is a real tragedy.  Fundamentally, content on the Internet isn't that different from offline content.  Whether it's published on a blog or in a printed magazine, a good article is still a good article.  A touching personal story is no more touching for being recorded on vinyl as opposed to existing as an MP3 file.  I know there's a lot of garbage on the web, but there's also a lot of stuff that has genuine value and meaning to people, and a lot of it is not the super-popular things that get copied everywhere.  It seems a shame for it to just vanish without a trace after a few short years.

I sometimes wonder what anthropologists 5000 years from now will find of our civilization.  We already know that good quality paper can last for centuries.  How long will our digital records last?  And if the media lasts 5000 years, what about the data it contains?  Will anthropologists actually be able to access it?  Or are they going to have to reverse-engineer our current filesystems, document formats, and media formats?  Maybe in 5000 years figuring out the MPEG-4 format from a binary blob on an optical disk will be child's play to the average social science major, who knows?  Or maybe the only thing they'll end up with is the archival-quality print media from our libraries.  But then again, given what the social media landscape looks like, maybe that's just as well...
