Slide 1

Slide 1 text

Pierre Crozet Laboratoire de Biologie Computationnelle et Quantitative Co-founder & Chief Technical Officer Biomemory DNA DRIVE: A new technology for sustainable data storage

Slide 2

Slide 2 text

Descriptive Analytic Synthetic

Slide 3

Slide 3 text

• Synthesis is a powerful method of investigation and understanding • Synthesis and Analysis are complementary SYNTHESIS ANALYSIS Simple Bricks Complex Object

Slide 4

Slide 4 text

“What I cannot create, I do not understand” Richard Feynman 1918-1988 Construction is the ultimate proof of understanding Thanks to the technological advances of the past 40 years, Biology can now construct to understand

Slide 5

Slide 5 text

Definition of Synthetic Biology Synthetic biology is the design and construction of new biological parts, devices, and systems such as enzymes, genetic circuits, and cells OR the redesign of existing biological systems.

Slide 6

Slide 6 text

Synthetic Biology: a formal discipline of biological engineering
Build biological systems to:
• Create artificial systems that have potential biotechnology and health applications
• Address fundamental biological questions with new approaches and concepts
Bray et al. 1995 Nature; Benner et al. 2003 Nature

Slide 7

Slide 7 text

Synthetic Genomes
• Material: biomade fabrics, coloring, textiles, bioplastics, bioconcrete
• Digital: storage, biocomputing
• Cosmetics and perfumes: perfumes, aroma, oils, collagen, squalene, specialty chemistry
• Health: immunotherapy, personalized medicine, stem cells, microbiota, diagnosis, vaccines, drug bioproduction
• Feed & Food: coloring, aroma, preservatives, enzymes, cell agriculture, process engineering
• Environmental transition: biofuels, platform molecules, sustainable development, bioremediation
Images: bioconcrete bricks produced by bacteria; biomade shoes made of synthetic spider silk

Slide 8

Slide 8 text

Crozet et al. 2018 ACS Synth Biol Laboratoire de Biologie Computationnelle et Quantitative UMR7238, LCQB, IBPS, CNRS, Sorbonne Université Synthetic biology Modular cloning toolkit (MoClo) • Genetic reprogramming • Chlamydomonas = new photosynthetic chassis: sustainable synthetic biology • Technologies for DNA manipulation and assembly of long DNA molecules Design and redesign of biological systems 119 biobricks The team Synthetic biology of microalgae Stéphane Lemaire Directeur de Recherche CNRS Pierre Crozet Maître de conférences Sorbonne Université

Slide 9

Slide 9 text

DNA DRIVE: A new technology for sustainable data storage Pierre Crozet Laboratoire de Biologie Computationnelle et Quantitative Co-founder & Chief Technical Officer Biomemory

Slide 10

Slide 10 text

Digital storage: The era of big data and AI
2 booming technologies
• Big data: since 1997
• Artificial intelligence: since 1950, Alan Turing
Convergence → Digital Transformation
Data is the fuel of artificial intelligence → without data, no artificial intelligence
Image: Le Journal du CNRS

Slide 11

Slide 11 text

IDC prediction: Global Datasphere growth = from 45 Zettabytes in 2019 to 175 Zettabytes by 2025 *IDC = International Data Corporation. Digital storage The needs will explode 45 ZB

Slide 12

Slide 12 text

1 ZB = 10^21 bytes IDC prediction: Global Datasphere growth = from 45 Zettabytes in 2019 to 175 Zettabytes by 2025 *IDC = International Data Corporation. Digital storage The needs will explode 45 ZB

Slide 13

Slide 13 text

1 ZB = 10^21 bytes Downloading 1 ZB over a good optical fiber (100 Mbit/s) IDC prediction: Global Datasphere growth = from 45 Zettabytes in 2019 to 175 Zettabytes by 2025 *IDC = International Data Corporation. Digital storage The needs will explode 45 ZB

Slide 14

Slide 14 text

1 ZB = 10^21 bytes Downloading 1 ZB over a good optical fiber (100 Mbit/s) takes 2.5 million years IDC prediction: Global Datasphere growth = from 45 Zettabytes in 2019 to 175 Zettabytes by 2025 *IDC = International Data Corporation. Digital storage The needs will explode 45 ZB
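The download-time figure can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 ZB = 10^21 bytes, 100 Mbit/s = 10^8 bits per second), a constant transfer rate and a 365.25-day year; the function and constant names are illustrative and do not come from the slides.

```python
# Assumptions: decimal units, constant bit rate, 365.25-day year.
ZETTABYTE_BYTES = 10**21
LINK_BITS_PER_SECOND = 100 * 10**6      # "good" optical fiber, 100 Mbit/s
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def download_time_years(n_bytes, bits_per_second):
    """Return the time needed to transfer n_bytes at a constant bit rate, in years."""
    return n_bytes * 8 / bits_per_second / SECONDS_PER_YEAR


years = download_time_years(ZETTABYTE_BYTES, LINK_BITS_PER_SECOND)
print(f"{years / 1e6:.1f} million years")
```

Under these assumptions the result lands at roughly 2.5 million years, in line with the slide.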

Slide 15

Slide 15 text

Digital storage: Growth of the datasphere
← Annual datasphere growth between 2018 and 2025 (IDC report)
1) Health & scientific research: >10 ZB in 2025
• Imaging, genomics, personalized medicine (microbiota)
• Astrophysics, particle physics, -omics, ecology, environmental sciences
2) Industry: 22 ZB in 2025
• Internet of Things (IoT)
• Machines: computers, smartphones, consoles…
• Machine-machine or man-machine communication
• Automation 24/7, autonomous cars (3 TB/h/car)
3) Finance: 10 ZB in 2025
• Data analysis (market dynamics, insurance)
• Security and protection of data
4) Media and entertainment: 6 ZB in 2025
• Virtual reality/augmented reality (VR/AR)
• Video streaming
• Video games
• Social networks

Slide 16

Slide 16 text

Digital storage: W.O.R.N. data?
WORN = Write Once Read Never
>70% of the world's data are archives
WORN data are mainly stored on magnetic storage tapes
Aim: replace the magnetic tapes of cold archives with DNA data storage

Slide 17

Slide 17 text

Digital storage: Current devices
Inside a data center →
Fragile
Lifespan: 5 to 7 years
Image: Xalima

Slide 18

Slide 18 text

Digital storage: Current devices
Bulky
World data centers: 167 km² (1.6 x Paris)
• 1 millionth of the world's land surface
• 2040: 1 thousandth of the world's land surface
175 ZB 23 times (David Reinsel, IDC, Boston USA)
Image: Pixabay

Slide 19

Slide 19 text

Digital storage: Current devices
Energy consuming
Energy consumption: 150 TWh/year
• 2% of the world's electricity consumption
• Annual consumption of data centers = >30 nuclear power plants
Carbon footprint: > world civil aviation
Image: Economie Matin

Slide 20

Slide 20 text

Digital storage: Current devices
ENVIRONMENTAL FOOTPRINT
• Fragile: lifespan 5 to 7 years
• Bulky: world data centers cover 167 km² (1.6 x Paris)
• Energy consuming: 2% of the world's electricity consumption; carbon footprint > civil aviation

Slide 21

Slide 21 text

Digital storage: Current devices
Limited storage
Data storage demand vs. storage capacity: current technologies will not be able to meet storage needs.
• Since 2010, demand has exceeded storage supply.
• 2025: 160-175 zettabytes → in 2025 we will need to store 5 times more data than today.
« If today we are capable of storing about 30% of the information we generate, in only 10 or 12 years we’ll be able to store about 3% […] » Dr Karin Strauss, Microsoft Research
[Chart: data storage demand vs. storage capacity, 2009-2020, in exabytes/zettabytes. Graphic: Statistica]

Slide 22

Slide 22 text

Image: Le Journal du CNRS The digital transformation needs a deep (r)evolution of our digital storage technologies. Digital transformation

Slide 23

Slide 23 text

Digital storage: Another type of storage device?
Another type of data storage device exists in Nature: it was not created by humans, and it has been refined for nearly 4 billion years

Slide 24

Slide 24 text

Digital storage: Another type of storage device?
DNA: 4 letters: A, C, T, G
Density: 50 atoms/letter
Human genome: 6.4 billion pairs of nucleotides per cell => equivalent to 700 MB/cell
3900 billion cells/human => equivalent to 2.7 ZB/human
DNA structure scheme: 1 DNA molecule, 1 nucleotide, phosphate group, deoxyribose
Scale: 1 nm = 10^-9 m. Example: a hair diameter is 50-100 µm (50,000-100,000 nm)
Image: Jeanne Le Peillet

Slide 25

Slide 25 text

Digital storage: Advantages of DNA
Longevity
Durability of storage media (Bornholt et al., 2016): ~5 years, ~5 years, ~15-25 years, several centuries
→ DNA storage: stable for several thousand years
Oldest complete genome: from a mammoth tooth 1.6 million years old
Much more stable than current storage systems

Slide 26

Slide 26 text

Digital storage: Advantages of DNA
Compact
Maximum density: 4.5 × 10^20 bytes/gram of DNA (0.45 ZB) = 450 million TB/gram of DNA
All the world's data on DNA: 45 ZB in 100 g of DNA
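The mass figures on this slide follow directly from the quoted maximum density. The sketch below reproduces that arithmetic under the assumption of decimal units (1 ZB = 10^21 bytes, 1 TB = 10^12 bytes); the helper name is illustrative, and the density is the theoretical maximum quoted on the slide, not a measured value for any real system.

```python
# Density quoted on the slide; units are assumed to be decimal.
DENSITY_BYTES_PER_GRAM = 4.5e20
ZETTABYTE_BYTES = 1e21
TERABYTE_BYTES = 1e12


def dna_mass_grams(n_bytes, density=DENSITY_BYTES_PER_GRAM):
    """Mass of DNA needed to hold n_bytes at the given density."""
    return n_bytes / density


mass_2019 = dna_mass_grams(45 * ZETTABYTE_BYTES)    # datasphere in 2019
mass_2025 = dna_mass_grams(175 * ZETTABYTE_BYTES)   # IDC projection for 2025
tb_per_gram = DENSITY_BYTES_PER_GRAM / TERABYTE_BYTES

print(f"45 ZB  -> {mass_2019:.0f} g of DNA")
print(f"175 ZB -> {mass_2025:.0f} g of DNA")
print(f"{tb_per_gram / 1e6:.0f} million TB per gram")
```

At the same density, the 175 ZB projected for 2025 would still fit in a few hundred grams.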

Slide 27

Slide 27 text

Digital storage: Advantages of DNA
Energy efficient
← Stable at room temperature
Robust technologies for long-term ambient temperature storage of dried DNA by encapsulation.
DNA stability > 50 000 years

Slide 28

Slide 28 text

Digital storage: Advantages of DNA
An environmentally friendly solution for data storage
• Energy efficient: stable at room temperature without energy input
• Sustainable: stable for millennia, no media obsolescence (oldest complete genome: from a mammoth tooth 1.6 million years old)
• Compact

Slide 29

Slide 29 text

State of the art: The early stages of DNA storage
Suggested as early as 1959 by Richard Feynman (Nobel Prize in Physics, 1965) in his classic talk "There's Plenty of Room at the Bottom: An Invitation to Enter a New Field of Physics", given on December 29th 1959 at the annual meeting of the American Physical Society at the California Institute of Technology

Slide 30

Slide 30 text

State of the art: The early stages of DNA storage
1st significant demonstration: 2012, George Church, Harvard, USA
• Chemical synthesis of DNA
• Density: 686 terabytes per mm³
Church et al. 2012 Science 337:1628

Slide 31

Slide 31 text

0 / 1 = binary A / C / G / T = quaternary Encoding Storage on oligo pools

Slide 32

Slide 32 text

0 / 1 = binary A / C / G / T = quaternary 2 bits/base A = 00 C = 01 T = 10 G = 11 Encoding Storage on oligo pools

Slide 33

Slide 33 text

1 bit/base A = C = 0 T = G = 1 0 / 1 = binary A / C / G / T = quaternary 2 bits/base A = 00 C = 01 T = 10 G = 11 Encoding Storage on oligo pools
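The two mappings above can be sketched in a few lines. The tables are taken from the slide; the function names are illustrative, and the rule used to choose between the two bases available for each bit in the 1 bit/base scheme (never repeat the previous base) is an assumption made for this example, not a rule stated in the slides.

```python
# Mapping tables from the slide; helper names are illustrative.
TWO_BIT_ENCODE = {"00": "A", "01": "C", "10": "T", "11": "G"}
TWO_BIT_DECODE = {base: bits for bits, base in TWO_BIT_ENCODE.items()}
ONE_BIT_CHOICES = {"0": "AC", "1": "TG"}            # A = C = 0, T = G = 1
ONE_BIT_DECODE = {"A": "0", "C": "0", "T": "1", "G": "1"}


def to_bits(data: bytes) -> str:
    return "".join(f"{byte:08b}" for byte in data)


def from_bits(bits: str) -> bytes:
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def encode_two_bit(data: bytes) -> str:
    bits = to_bits(data)
    return "".join(TWO_BIT_ENCODE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_two_bit(dna: str) -> bytes:
    return from_bits("".join(TWO_BIT_DECODE[base] for base in dna))


def encode_one_bit(data: bytes) -> str:
    dna = []
    for bit in to_bits(data):
        first, second = ONE_BIT_CHOICES[bit]
        # Assumed tie-break: pick the base that differs from the previous one.
        dna.append(second if dna and dna[-1] == first else first)
    return "".join(dna)


def decode_one_bit(dna: str) -> bytes:
    return from_bits("".join(ONE_BIT_DECODE[base] for base in dna))


print(encode_two_bit(b"A"))   # 4 bases per byte
print(encode_one_bit(b"A"))   # 8 bases per byte
```

The 2 bits/base mapping is twice as dense, while the 1 bit/base mapping leaves a free choice at every position that can be used to avoid problematic sequences such as runs of identical bases.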

Slide 34

Slide 34 text

State of the art: Storage in oligonucleotide pools
Oligos = short fragments of DNA
• Not compatible with living systems
• Obtained by chemical synthesis
• Maximum 200 nucleotides
Church et al., 2012 Science: 650 kB => 54 898 oligonucleotides of 159 nucleotides (A, T, G, C)
Structural scheme:
• Index (19 nt): position information used to put the fragments back in order
• Data (96 nt)
• Common sequence (22 nt): for amplification and sequencing
Steps for storage in oligos:
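The oligo layout above can be checked numerically. The sketch below assumes one 22-nt common sequence at each end of the oligo (an inference from 19 + 96 + 2 × 22 = 159, not something the slide states) and the 1 bit/base encoding; all names are illustrative.

```python
import math

INDEX_NT = 19
DATA_NT = 96
COMMON_NT = 22   # assumed to appear once at each end of the oligo
OLIGO_NT = INDEX_NT + DATA_NT + 2 * COMMON_NT


def oligos_needed(payload_bits, bits_per_nt=1):
    """Number of oligos needed to carry payload_bits in the data blocks."""
    return math.ceil(payload_bits / (DATA_NT * bits_per_nt))


def payload_capacity_bytes(n_oligos, bits_per_nt=1):
    """Total payload carried by n_oligos, in bytes."""
    return n_oligos * DATA_NT * bits_per_nt // 8


def split_payload(bits):
    """Yield (index_bits, data_bits) pairs; the last block is zero-padded."""
    for number, start in enumerate(range(0, len(bits), DATA_NT)):
        block = bits[start:start + DATA_NT].ljust(DATA_NT, "0")
        yield format(number, f"0{INDEX_NT}b"), block


print(OLIGO_NT)                        # total oligo length in nt
print(payload_capacity_bytes(54898))   # bytes carried by 54 898 oligos
print(2**INDEX_NT >= 54898)            # is a 19-bit index wide enough?
```

With these assumptions, 54 898 oligos carry about 659 000 bytes, consistent with the roughly 650 kilobytes quoted, and a 19-bit index is wide enough to address every fragment.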

Slide 35

Slide 35 text

State of the art: 2012-2018
• Oligonucleotides: 85-230 nt
• From 650 kB to 200 MB
• Density: 0.6 to 1.94 bits/nt
nt = nucleotide

Slide 36

Slide 36 text

State of the art Storage on oligo pools Organick et al., 2018 Nature Biotechnol Chemistry Mathematics Computing Mathematics Computing In vitro amplification Physics Chemistry 200 MB on 13.4 million oligonucleotides 2018

Slide 37

Slide 37 text

State of the art: The limits of oligos for DNA storage
Storage on oligonucleotide pools:
- Stored and read in vitro
- Based on mathematics, computing, chemistry and physics
=> Proof of concept of DNA storage
Limitations:
- High cost (>$1000/MB)
- High error rates
- Limited and costly editing and copying
Need for more efficient DNA storage systems
Our vision:
• Exploiting the potential of Nature
• Using biology to remove the limitations of DNA storage
Images: Science et Avenir, Wikipédia

Slide 38

Slide 38 text

DNA in vivo Technologies for manipulating DNA Images : Random42, Neb, Gesundheitsindustrie, Phys DNA reading DNA copy Error correction DNA editing Signal amplification Flash random access

Slide 39

Slide 39 text

DNA in vivo: Technologies for manipulating DNA
Technologies for manipulation of DNA in living organisms:
• DNA reading
• DNA copy
• Error correction
• DNA editing
• Signal amplification
• Flash random access
Synthetic biology: domestication and adaptation → new DNA storage technologies
Images: Random42, Neb, Gesundheitsindustrie, Phys

Slide 40

Slide 40 text

Biomemory strategy: biocompatible and biosafe DNA
Our strategy: storing digital information in molecules that can be manipulated by living organisms: biocompatible DNA
• Genetic information is stored in double-stranded DNA molecules (double helix): linear or circular DNA molecules of large size (kb-Mb), i.e. plasmids and chromosomes
• Plasmids/chromosomes: circular and replicative double-stranded DNA => can be copied biologically at very low cost
• Oligonucleotides are not compatible with living organisms
• Oligonucleotides = single-stranded DNA molecules, much less stable than double-stranded DNA
Figure labels: Chromosome, Plasmids

Slide 41

Slide 41 text

Biomemory strategy: biocompatible and biosafe DNA
Biocompatible DNA must be biosafe
In contrast to chemical synthesis, not all DNA sequences are possible; the sequence must be controlled. The DNA sequence must not be used by the host organism as genetic information: it must not be expressed.
• Limiting RNA production
- Prohibition of transcription initiation sequences
• Limiting protein production
- Prohibition of translation initiation sequences: no start codon
- Addition of translation termination sequences: high frequency of stop codons in all 6 reading frames
• Limiting errors
- Controlled %GC (45-55% GC)
- Prohibition of homopolymers (>3 nt)
• Facilitating DNA manipulation
- Addition or removal of specific sequences: restriction enzymes, recombination sites, …
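Several of the sequence constraints listed above are simple enough to express as checks. The sketch below is a simplified illustration, not the DNA Drive algorithm: it treats ATG as the only start codon, uses the standard stop codons TAA, TAG and TGA, and does not attempt to detect transcription initiation sequences. All function names are hypothetical.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_long_homopolymer(seq, max_run=3):
    """True if any base is repeated more than max_run times in a row."""
    return re.search(r"(.)\1{%d,}" % max_run, seq) is not None


def has_start_codon(seq):
    """True if ATG occurs on either strand (simplifying assumption)."""
    return "ATG" in seq or "ATG" in reverse_complement(seq)


def frames_with_stop(seq):
    """Count how many of the 6 reading frames contain at least one stop codon."""
    count = 0
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            if any(codon in STOP_CODONS for codon in codons):
                count += 1
    return count


def check_biosafe(seq):
    return {
        "gc_ok": 0.45 <= gc_fraction(seq) <= 0.55,
        "homopolymer_ok": not has_long_homopolymer(seq),
        "no_start_codon": not has_start_codon(seq),
        "stops_in_all_frames": frames_with_stop(seq) == 6,
    }


print(check_biosafe("ACGT"))
```

An encoder would typically use such checks as a filter, re-encoding any candidate sequence that fails one of them.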

Slide 42

Slide 42 text

Biomemory strategy: A bio-inspired DNA storage strategy
Original archiving strategy, breaking with existing technologies (not based on oligonucleotides)
• Long replicative fragments: plasmids/chromosomes
• Reading long fragments: 3rd-generation NGS, Oxford Nanopore pocket sequencer
• Biocompatible and biosafe
→ Biocompatible DNA: can be manipulated in vivo (copying, editing…)
→ Biosafe DNA: DNA encrypted for the host organism
Image: Pixabay

Slide 43

Slide 43 text

• Unlimited total capacity • Possibility of new biological approaches in vivo (copying, editing) => Allows copying at very low cost => High redundancy = low error rate • Compatible with all sequencing technologies • Binary file system with organization in physical DNA sectors allowing partitions, allocation table, file/directories, indexes, metadata... • Possibility of compressing files, random access and error correction. • Fully automatable (commercial robots) • Well-adapted for cold storage DNA DRIVE A bio-inspired DNA storage strategy
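The link between high redundancy and a low error rate can be illustrated with a column-wise majority vote over several reads of the same sequence. This is a generic sketch, not the error-correction scheme used by DNA Drive: it assumes reads of equal length that are already aligned and that contain substitution errors only, whereas real sequencing reads also contain insertions and deletions.

```python
from collections import Counter


def consensus(reads):
    """Column-wise majority vote over aligned reads of equal length."""
    if len({len(read) for read in reads}) != 1:
        raise ValueError("reads must be non-empty and of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


# Three noisy copies of the same sequence, each with one substitution error.
reads = ["ACGTTGCA", "ACGATGCA", "TCGTTGCA"]
print(consensus(reads))
```

As long as errors at a given position affect fewer than half of the copies, the vote recovers the original base, which is why cheap biological copying makes redundancy attractive.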

Slide 44

Slide 44 text

Hard Disk Physical organization Video: Animagraff

Slide 45

Slide 45 text

DNA Drive Physical organization

Slide 46

Slide 46 text

DNA Drive Physical organization

Slide 47

Slide 47 text

• Unlimited total capacity • Possibility of new biological approaches in vivo (copying, editing) => Allows copying at very low cost => High redundancy = low error rate • Compatible with all sequencing technologies • Binary file system with organization in physical DNA sectors allowing partitions, allocation table, file/directories, indexes, metadata... • Possibility of compressing files, random access and error correction. • Fully automatable (commercial robots) • Well-adapted for cold storage DNA DRIVE A bio-inspired DNA storage strategy DNA DRIVE Patent EP193062478 01/10/2019 PCT EP2020/077497 01/10/2020

Slide 48

Slide 48 text

Biomemory: The start-up of bio-inspired DNA storage
Three co-founders: Stéphane Lemaire (CSO), Pierre Crozet (CTO), Erfane Arwani (CEO)
Spin-off of Sorbonne Université & CNRS, created in July 2021
The mission of Biomemory is to offer an economically viable DNA storage solution for data centers, with zero or negligible carbon footprint

Slide 49

Slide 49 text

DNA DRIVE: The proof of concept
Two historical texts:
- Déclaration des Droits de l’Homme et du Citoyen (1789), listed in UNESCO's Mémoire du Monde program
- Déclaration des Droits de la Femme et de la Citoyenne (1791, Olympe de Gouges)
• Humanist values
• Equality between men and women
At the interface between science and society
Multidisciplinarity: historians, philosophers, computer scientists, mathematicians, biologists

Slide 50

Slide 50 text

No content

Slide 51

Slide 51 text

Proof of concept DNA DRIVE: Partners
Institutional partners:
• Archives Nationales
• Programme Mémoire du Monde de l'UNESCO
• Sorbonne Université & CNRS
Industrial partners:
• Twist Bioscience (San Francisco, USA): world leader in DNA synthesis
• Imagene (Bordeaux, Génopole): encapsulation of DNA in metallic capsules
• Biomemory: aims to develop and commercialize the DNA DRIVE technology

Slide 52

Slide 52 text

• Encoding of the file on DNA • Encapsulation • Several reads: 100% fidelity DNA DRIVE The proof of concept A capsule contains more than 100 billion copies of the file on DNA

Slide 53

Slide 53 text

DNA DRIVE The proof of concept • Encoding of the file on DNA • Encapsulation • Several reads: 100% fidelity

Slide 54

Slide 54 text

DNA DRIVE The proof of concept

Slide 55

Slide 55 text

Proof of concept: DNA DRIVE
Déclaration des droits de l’Homme et du Citoyen
1. Encoding: digital compression + DNA Drive algorithm
Binary file: 01000001100100101010000100001100000011010100011000110010000000000011110111011010
DNA sequence: CGCAACCGTCCGACTAGCTAAACGCATAACGTCCGACCCAACCCAAGTTGAGTTAGGAGA
2. Writing: synthesis
3. Assembly
Biocompatible DNA: CGCAACCGTCCGACTAGCTAAACGCAACGTCAACAAGTCTCGCAAGTAACGTCCGACCCAACCCAAGTTGAGTTAGGAGA
4. Selection & amplification (bacteria transformation)
5. DNA extraction
6. Stock
7. Read: sequencing
8. Decoding: DNA Drive algorithm + digital decompression
Antoine Danon, Jeanne Le Peillet
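The purely digital ends of this pipeline (step 1 and step 8) can be sketched as a round trip. The DNA Drive algorithm itself is not described in the slides, so the example below swaps in a plain 2 bits/base mapping (A = 00, C = 01, T = 10, G = 11) as a stand-in and uses zlib for the compression step; both choices are assumptions, and the wet-lab steps 2 to 7 are not modelled.

```python
import zlib

# Stand-in mapping, NOT the DNA Drive algorithm: string index = 2-bit value.
BASES = "ACTG"   # A = 00, C = 01, T = 10, G = 11


def encode(data: bytes) -> str:
    """Step 1 (sketch): compress, then write each byte as 4 bases."""
    packed = zlib.compress(data)
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in packed for shift in (6, 4, 2, 0)
    )


def decode(dna: str) -> bytes:
    """Step 8 (sketch): read bases back into bytes, then decompress."""
    values = [BASES.index(base) for base in dna]
    packed = bytes(
        (values[i] << 6) | (values[i + 1] << 4) | (values[i + 2] << 2) | values[i + 3]
        for i in range(0, len(values), 4)
    )
    return zlib.decompress(packed)


text = "Déclaration des droits de l'Homme et du Citoyen".encode("utf-8")
dna = encode(text)
print(len(text), "bytes ->", len(dna), "bases")
print(decode(dna) == text)
```

A real encoder would also have to satisfy the biosafety constraints (GC content, homopolymers, start and stop codons), which this direct mapping ignores.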

Slide 56

Slide 56 text

Proof of concept DNA DRIVE: World premiere
Official registration of the DNA-encoded texts at Archives Nationales
World premiere for a public institution
Press conference at Hôtel de Soubise, Paris, Tuesday 23 November 2021
Image: Wikipédia

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

FOUNDERS MEMBERS Delivering Industry Technology Roadmap Create standards

Slide 59

Slide 59 text

Biomemory: Which part of the process are we managing?
Legend: Digital files, Others, Biomemory
1. Encoding: digital compression + DNA Drive algorithm
Binary file: 01000001100100101010000100001100000011010100011000110010000000000011110111011010
DNA sequence: CGCAACCGTCCGACTAGCTAAACGCATAACGTCCGACCCAACCCAAGTTGAGTTAGGAGA
2. Writing: synthesis
3. Assembly
Biocompatible DNA: CGCAACCGTCCGACTAGCTAAACGCAACGTCAACAAGTCTCGCAAGTAACGTCCGACCCAACCCAAGTTGAGTTAGGAGA
4. Selection & amplification (bacteria transformation)
5. DNA extraction
6. Stock
7. Read: sequencing
8. Decoding: DNA Drive algorithm + digital decompression
Antoine Danon, Jeanne Le Peillet

Slide 60

Slide 60 text

Biomemory: Our use cases timeline
• Offline Data Preservation: DNA backup of crypto assets designed to resist extreme conditions and be very durable.
• Marking & Authentication: traceability and authentication solutions. Currently talking with a world leader in security inks for banknotes and IDs.
• Offsite Cloud Storage: archive and backup solutions for data centers.
• Storage for the French state

Slide 61

Slide 61 text

Biomemory: Our vision
A rackable DNA Data Storage Server
• Autonomous read/write
• Exabyte scale
• $1/terabyte
• Removable DNA Drive cartridges
• Removable DNA ink cartridges
• 4U rackable server for existing DCs
• No biological expert on site
Dimensions: 17.7 cm, 48.3 cm, 35.5 cm

Slide 62

Slide 62 text

• the decree of abolition of slavery in 1848 currently exhibited at the hôtel de Soubise • the ordinance of 1944 which granted women the right to vote to be exhibited this fall • the Badinter law of 1981 which abolished the death penalty exhibited next spring The Révolution goes on… 100% biobased proprietary Biomemory technology

Slide 63

Slide 63 text

No content