Publish or Perish: Data Storage and Civilization


Who do you think of when you think of ancient civilizations? The Romans? The Greeks? China? India? The Egyptians? What about the Scythians, Muisca, Gana or Kerma? You might not recognize this second group so easily because not all of them had a writing system. The same is true, to a lesser extent, for the Etruscans, the Minoans or the inhabitants of Easter Island, who did write, but no one remembers how to read their writing. Even the Egyptians were mysterious until the discovery of the Rosetta Stone. We imagine that an author writing in Etruscan never suspected that no one would be able to read the text in the future; he probably thought he was recording his thoughts for all eternity. Hubris? Maybe, but what about our own documents, which are increasingly stored as bits somewhere?

It was bad enough when you had punch cards and magnetic media. We are sure that some tape formats are no longer convenient to read. Could you read a magnetic bubble cartridge? Would it even be viable after all these years? But the problem is even worse now. Where are your old copies of Hackaday? Where are your emails? “In the cloud” is a cliché, but appropriate. In 1,000 years, there won’t be a Google server, and whatever storage media it uses today will probably be dust, even if anyone who wanted to read it knew how.

Do you know what this is for? (Public domain; from the Walters Art Museum)

And it gets worse. If you see a rock or scroll with squiggles on it, you can assume it’s writing. What if you saw ropes with knots? The Incas used a system like this to record things, and we still don’t know exactly how to read it. What will a future archaeologist think of a flash drive or a hard disk? They’re as unlikely to use something like that as we are to use a strigil, the curved scraper the Romans used to clean themselves. If you saw one without context, you might assume it was a carpentry tool, not a bathroom fixture. Why would our future archaeologists suspect that certain little boxes contain writing, if only you knew how to read them?

Old Media vs Modern Media

At least some of the older media have a chance of surviving. Punched cards and paper tape are probably as sturdy as books. As with a stone tablet, it should be fairly obvious that they contain data, and they are easy to decode, even by hand.

Magnetic media are less certain, however. The oxide coatings on tape won’t last forever, and the magnetic information they hold is even more fragile. Optical media can last, but it’s far from certain that you’d realize there was any data encoded on them. They could be mistaken for art. Tape has the same problem. It would be easy to imagine a future museum displaying magnetic tape as something used in an unknown religious ritual involving shrines with raised floors.

Modern media is likely to be flash-based, and it certainly won’t last forever. It’s even harder to realize that there might be something stored on it. Even now I can see half a dozen USB devices on my desk, half of which aren’t flash drives but don’t look much different.

Then there is all the cloud data. Of course, it really is stored somewhere on a drive (magnetic or flash media). Presumably, if future archaeologists found a buried data center somewhere, they could unlock tons of data, but only if they realized what it was and how to read it.

Encoding Issues

Even today, it can be difficult to read a disk written on one system if you don’t have that system. It has gotten a little easier in some common cases because a few formats are nearly universal, but there are always outliers.

As a thought experiment, imagine that you are a future archaeologist studying 21st century ruins. Your assistant brings you a small black rectangle the size of your thumbnail marked “32 GB, class 10”. First, you need to realize that this is a flash device. Next, you’ll need to figure out how to power it up and send it the right commands over the serial bus to get data out of it.

But the fun is just beginning. Once you have the data, you will need to work out the file system format. Then you can dig into the different file types, each of which is a science project in itself. PDF files? Images and video? Good luck. Imagine if the Egyptians had used a different set of hieroglyphs for each purpose and then subjected them to data compression to minimize redundancy.
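To give a feel for even the easiest step, here is a minimal sketch of how you might guess what a recovered blob of bytes is by looking at its leading "magic" bytes. The function name `guess_format` and the short signature table are our own illustration, not part of any real forensic tool, and the sketch assumes you already know that bytes, ASCII and these formats exist, which is exactly what our archaeologist would not know.

```python
# Hypothetical helper: guess a format from well-known leading "magic" bytes.
# The table is deliberately tiny; real tools know thousands of signatures.
MAGIC_SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x1f\x8b", "gzip stream"),
]


def guess_format(data: bytes) -> str:
    """Return a best guess at what `data` contains, or "unknown"."""
    for magic, name in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return name
    # A 512-byte boot sector conventionally ends with the bytes 0x55 0xAA.
    if len(data) >= 512 and data[510:512] == b"\x55\xaa":
        return "boot sector (possibly MBR or FAT)"
    return "unknown"


if __name__ == "__main__":
    print(guess_format(b"%PDF-1.7\n..."))
    print(guess_format(b"squiggles on a rock"))
```

Knowing the container is only the start. A PDF or ZIP file still hides compressed streams inside, so every layer is another decoding puzzle.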

Real Life

We’re not the only ones thinking about this. The University of Göttingen, for example, manages 5 petabytes of data in an “eternal” archive collected over the past 40 or so years. They say the tapes they use have a lifespan of 20-30 years, but the technology to handle them only lasts 10 years. So they are constantly moving data from one medium to another, which takes about two years. Of course, if that work ever stops, you can assume that in 300 or 400 years there won’t be much chance of recovering the data.

There’s no shortage of services to store your data “forever” in the cloud, but it’s hard to see how they can really guarantee this and what it would mean if it didn’t work. For instance, Ardrive uses the “blockweave” to store data in a distributed fashion, but it’s easy to imagine a number of ways this could be disrupted. As Adam Farquhar, Head of Digital Preservation at the British Library, has said, “If we’re not careful, we’ll know more about the early 20th century than the early 21st.”

Not that paper records are much better. Paper deteriorates. Languages are lost. The famous Library of Alexandria burned down. But stone seems to last. Ironically, we know a lot about Akhenaten, King Tut’s father, because the Egyptians tried to erase him from history by tearing down his monuments. They reused the stones, often as foundations for new construction, and so we found much of his work well preserved.

As we push into more exotic storage media, the problem only gets worse. We’ve read about glass data storage (see video below) and molecular storage at 80 K using liquid nitrogen. None of these will be more obvious or more durable than what we use today. In fact, many of them will make the problem worse.

We can’t tell how serious they are, but the “Billion Year Archive” project sent a quartz disk holding Isaac Asimov’s Foundation trilogy into space in the glove compartment of Elon Musk’s Tesla. They also apparently sent a library to the moon in 2019. However, these libraries use DNA storage, which seems odd given how we struggle to recover ancient DNA today, along with tiny text etched into thin nickel films. On top of that, the probe the lunar library was hitchhiking on crashed, and the library’s survival is in question.

It’s hard, however, to picture our post-apocalyptic archaeologist wandering the moon and realizing the importance of a sheet of metal and a few crystals. This brings us to two interesting questions. First, how could you store data for the distant future so that it is obviously data, survives, and can be understood? The question is a bit like that of messages to extraterrestrials, where it is difficult to know what another being could decode. Without an answer, we could one day become another mysterious “lost civilization”.

The second question is: what if this has happened before? It smacks of crackpot science, but what if an ancient artifact contains encoded information and we don’t even recognize it? Sure, we recognize some of them without knowing what to make of them, like the Inca knots in the video below. Do you have an answer to either of these questions? Leave it in the comments.

[Banner image: “Egyptian hieroglyphs” by Martie Swart]

