Scientists Store on DNA an Operating System, a Movie and a Computer Virus

by Swati Khandelwal

March 03, 2017


Did you know that 1 gram of DNA can store 1,000,000,000 terabytes of data for 1,000+ years?

Just last year, Microsoft purchased 10 million strands of synthetic DNA from Twist Bioscience, a San Francisco DNA synthesis startup, and collaborated with researchers from the University of Washington to explore DNA as a data storage medium.

Now, in the latest experiments, a pair of researchers from Columbia University and the New York Genome Center (NYGC) has come up with a new technique for storing large amounts of data on DNA, and the results are impressive.

The duo successfully stored around 2 MB of data, encoding a total of six files:

A full computer operating system

An 1895 French movie "Arrival of a Train at La Ciotat"

A $50 Amazon gift card

A computer virus

A Pioneer plaque

A 1948 study by information theorist Claude Shannon

The new research (DNA Fountain Enables a Robust and Efficient Storage Architecture), which comes courtesy of Yaniv Erlich and Dina Zielinski, has been published in the journal Science.

But How Did the Researchers Store Digital Data on DNA?

Movie Stored and Retrieved from DNA Molecules

A copy of this 1895 French film, "Arrival of a Train at La Ciotat," was encoded into synthetic DNA molecules and later retrieved using a new coding strategy developed by Yaniv Erlich and Dina Zielinski at Columbia University and the New York Genome Center.

Calling their process a "DNA Fountain," the researchers first compressed all the data into a single master archive and split it into short strings of binary digits, made up of ones and zeros.
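The compress-and-split step can be sketched in a few lines of Python. This is only an illustration: the use of zlib, the 32-byte segment size, the zero-byte padding and the function names are all assumptions made here, not details given in the article.

```python
import zlib

SEGMENT_BYTES = 32  # assumed segment size, chosen for illustration only


def split_into_segments(archive: bytes, size: int = SEGMENT_BYTES) -> list[bytes]:
    """Compress the archive and cut it into equal-length segments."""
    compressed = zlib.compress(archive)
    # Pad with zero bytes so the final segment has the full length.
    padding = (-len(compressed)) % size
    padded = compressed + b"\x00" * padding
    return [padded[i:i + size] for i in range(0, len(padded), size)]


def to_bit_string(segment: bytes) -> str:
    """Render a segment as a string of ones and zeros."""
    return "".join(f"{byte:08b}" for byte in segment)
```

Each 32-byte segment produced this way corresponds to a string of 256 binary digits.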

Next, the duo used an "erasure-correcting algorithm called fountain codes" to randomly package the strings into droplets. Each droplet contains a barcode in its sequence that helps the researchers reassemble the file.
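To give a feel for how a fountain code works, here is a toy Python sketch. Each droplet carries a seed (playing the role of the barcode) and the XOR of a few randomly chosen segments; because the choice is derived from the seed alone, a decoder can work out which segments went into each droplet. The degree distribution, the function names and the simple "peeling" decoder are assumptions made for illustration and are not the researchers' actual implementation.

```python
import random


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def pick_indices(seed: int, n_segments: int) -> list[int]:
    """Derive a droplet's segment choice from its seed alone."""
    rng = random.Random(seed)
    # Toy degree distribution (assumption); real codes tune this carefully.
    degree = min(rng.choice([1, 1, 2, 2, 2, 3, 4]), n_segments)
    return sorted(rng.sample(range(n_segments), degree))


def make_droplet(seed: int, segments: list[bytes]) -> tuple[int, bytes]:
    """Return (seed, payload), where payload is the XOR of the chosen segments."""
    payload = bytes(len(segments[0]))
    for i in pick_indices(seed, len(segments)):
        payload = xor_bytes(payload, segments[i])
    return seed, payload


def decode(droplets, n_segments):
    """Peeling decoder: return all segments, or None if more droplets are needed."""
    recovered = {}
    pending = [(set(pick_indices(seed, n_segments)), payload)
               for seed, payload in droplets]
    progress = True
    while progress and len(recovered) < n_segments:
        progress = False
        still_pending = []
        for indices, payload in pending:
            # Remove every segment we already know from this droplet.
            for i in list(indices):
                if i in recovered:
                    payload = xor_bytes(payload, recovered[i])
                    indices.discard(i)
            if len(indices) == 1:
                # Only one unknown segment left, so the payload is that segment.
                recovered[indices.pop()] = payload
                progress = True
            elif indices:
                still_pending.append((indices, payload))
        pending = still_pending
    if len(recovered) < n_segments:
        return None
    return [recovered[i] for i in range(n_segments)]
```

The useful property is that the decoder does not need any particular droplet, only enough of them, so losing some strands along the way need not lose the file.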

The researchers then "mapped the ones and zeros in each droplet to the four nucleotide bases in DNA: A, G, C and T," and ended up with a digital list of 72,000 DNA strands that contained the encoded data.
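Since there are four bases, each one can stand for two binary digits. The Python sketch below shows one such mapping and its inverse; the specific assignment (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions for illustration, as the article does not say which pairing was used.

```python
# Assumed two-bits-per-base assignment, for illustration only.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Translate bytes into a strand of A/C/G/T, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Translate a strand back into the bytes it encodes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

Under this assumed mapping, the single byte 0x1B (binary 00011011) becomes the strand "ACGT", and every byte takes exactly four bases.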

This list was then sent as a text file to Twist Bioscience, the same DNA synthesis startup from which Microsoft purchased 10 million strands of synthetic DNA last year, which turned the digital information into biological DNA.

"Two weeks later, they received a vial holding a speck of DNA molecules.

To retrieve their files, they used modern sequencing technology to read the DNA strands, followed by software to translate the genetic code back into binary.

They recovered their files with zero errors," the journal reads.

'Highest-Density Data-Storage Device Ever Created'

The researchers believe that DNA is the perfect storage medium, as it is ultra-compact and can last hundreds of thousands of years if kept cool and dry, and suggest this is the "highest-density data-storage device ever created."

The digital universe is growing fast: by 2020 it is expected to contain nearly as many digital bits as there are stars in the universe, reaching 44 zettabytes, or 44 trillion gigabytes.
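The unit conversion is easy to check, and it can be set against the density figure quoted at the top of this article. The short Python sketch below assumes decimal (SI) prefixes and takes the "1,000,000,000 terabytes per gram" figure at face value as a theoretical upper bound, not something demonstrated in this experiment.

```python
GB = 10**9    # bytes in a gigabyte (decimal prefixes assumed)
TB = 10**12   # bytes in a terabyte
ZB = 10**21   # bytes in a zettabyte

digital_universe = 44 * ZB

# 44 zettabytes expressed in gigabytes: 44 * 10**12, i.e. 44 trillion.
gigabytes = digital_universe // GB

# The headline density: 1,000,000,000 TB per gram is exactly 1 ZB per gram.
bytes_per_gram = 1_000_000_000 * TB

# Grams of DNA needed at that theoretical density.
grams_needed = digital_universe / bytes_per_gram
```

At that theoretical density, the entire projected digital universe would fit in roughly 44 grams of DNA.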

So, DNA data storage could help big organizations store an enormous amount of information in a form that can still be read a hundred years from now.

However, cost is still an issue.

The researchers spent around $7,000 to synthesize the 2 MB of data and another $2,000 to read it. These costs should fall over time, but do not expect the technique to go mainstream anytime soon.
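To put those figures in per-megabyte terms, here is a quick back-of-the-envelope calculation in Python. The linear extrapolation to a gigabyte is an assumption made purely for illustration; real pricing would not necessarily scale this way.

```python
SYNTHESIS_USD = 7_000   # cost to write the data, from the article
SEQUENCING_USD = 2_000  # cost to read the data, from the article
DATA_MB = 2             # approximate amount of data stored


def cost_per_mb(total_usd: float, megabytes: float) -> float:
    return total_usd / megabytes


synthesis_per_mb = cost_per_mb(SYNTHESIS_USD, DATA_MB)    # 3500.0
sequencing_per_mb = cost_per_mb(SEQUENCING_USD, DATA_MB)  # 1000.0
total_per_mb = synthesis_per_mb + sequencing_per_mb       # 4500.0

# Naive linear extrapolation (assumption): 1 GB = 1,000 MB.
total_per_gb = total_per_mb * 1_000                       # 4500000.0
```

That works out to about $3,500 per megabyte to write and $1,000 per megabyte to read.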
