Several technology companies are considering using DNA for archival storage purposes.
DNA as a data storage medium – the concept is almost intuitive. If nature entrusts the biopolymer with the information needed to build an organism, why couldn’t it store our digital data? DNA uses four different molecules to encode information, which easily fits into a binary system (see box). So it’s perhaps no surprise that the idea was already put forward decades ago. Physicist Richard Feynman proposed it in his famous 1959 lecture “There’s Plenty of Room at the Bottom,” just six years after the helical structure of DNA was discovered.
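To make the fit between four bases and binary data concrete, here is a minimal Python sketch in which each base carries two bits. The particular mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names `encode` and `decode` are illustrative assumptions, not a scheme described in this article. Practical systems typically add error correction and avoid troublesome sequences such as long runs of the same base.

```python
# Hypothetical 2-bits-per-base mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # four bases per byte
    print(decode(strand))  # round-trips to the original bytes
```

Under this assumed mapping, every byte becomes exactly four bases, so a strand is four times as long in symbols as the data is in bytes.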
Storing data in DNA would have major advantages. As small as a flash memory cell is, a single molecule as a basic storage unit is about as small as it gets. DNA can theoretically store 455 exabytes per gram, so a soda can filled with it could hold all the data in the world. At the current growth rate of data generation, such high-density solutions will soon become attractive, if not unavoidable, according to proponents.
Another strength of DNA is its great stability when properly stored. Like most biomolecules, unprotected DNA is fragile and prone to degradation. But the fact that readable DNA has been recovered from ancient fossils shows that there are ways to preserve it for thousands of years, if not indefinitely.
By comparison, flash devices typically retain data for around 10 years, while magnetic tape – still the go-to solution for long-term digital storage – has a lifespan of 10 to 20 years. The robustness of DNA storage is especially attractive to organizations that need to store large amounts of rarely accessed data, such as movie studios and national archives.
Add to that a low environmental impact in terms of energy and raw material requirements, and you can see why companies would be interested in using DNA for digital data storage. And they are.
Slow and expensive
Such demonstrations are obviously still far from practical applications; even today, DNA sequencing and synthesis techniques are too slow and expensive. Nevertheless, data-intensive companies are now taking this concept very seriously. Members of the DNA Data Storage Alliance (DDSA), founded in 2020, include software maker Microsoft, hard drive makers Western Digital and Seagate, and ICT companies Dell, IBM and Lenovo.
Anticipating further reductions in synthesis and sequencing costs, DDSA argues that DNA data storage will become a competitive solution. In the case of archival storage, part of the savings would come from eliminating the need to regularly transfer data to new storage media to avoid data loss due to hardware deterioration. Another potential cost-saving feature would be the ability to make massive numbers of copies in parallel once the data has been encoded into DNA strands.
The Intelligence Advanced Research Projects Activity (IARPA), the R&D arm of US intelligence agencies, runs a program that aims to write 1 terabyte and read 10 terabytes in 24 hours for $1,000. To give an idea of the current state of the field, the Georgia Tech Research Institute holds the record for writing at 200 megabytes per day, although researchers recently claimed they could speed this up by as much as 100 times.
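A quick calculation shows how far the record is from the IARPA write target. The Python sketch below uses only the figures quoted above. It assumes decimal units (1 terabyte = 10^12 bytes, 1 megabyte = 10^6 bytes) and that the claimed 100-fold speedup would be achieved in full; the variable names are illustrative.

```python
# Figures from the article; decimal units are an assumption.
TARGET_WRITE_BYTES_PER_DAY = 10**12        # 1 terabyte in 24 hours
RECORD_WRITE_BYTES_PER_DAY = 200 * 10**6   # 200 megabytes per day
CLAIMED_SPEEDUP = 100                      # best case claimed by researchers

# How many times faster writing must become to hit the target.
gap_today = TARGET_WRITE_BYTES_PER_DAY // RECORD_WRITE_BYTES_PER_DAY

# Gap that would remain if the full claimed speedup materialized.
gap_after_speedup = gap_today // CLAIMED_SPEEDUP

# Days needed to write 1 terabyte at the current record rate.
days_for_one_terabyte = TARGET_WRITE_BYTES_PER_DAY / RECORD_WRITE_BYTES_PER_DAY

if __name__ == "__main__":
    print(f"Gap to target today: {gap_today}x")
    print(f"Gap after claimed speedup: {gap_after_speedup}x")
    print(f"Days to write 1 TB at record rate: {days_for_one_terabyte:.0f}")
```

On these assumptions, the record falls short of the target by a factor of 5,000, and a full 100-fold speedup would still leave a 50-fold gap.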