1918-1988
Construction is the ultimate proof of understanding.
Thanks to the technological advances of the past 40 years, biology can now construct in order to understand.
construction of new biological parts, devices, and systems such as enzymes, genetic circuits, and cells OR the redesign of existing biological systems.
Synthetic Biology: a formal discipline of biological engineering
build biological systems to:
• Create artificial systems that have potential biotechnology and health applications
• Address fundamental biological questions with new approaches and concepts
The team: Synthetic biology of microalgae
Computationnelle et Quantitative UMR7238, LCQB, IBPS, CNRS, Sorbonne Université
Stéphane Lemaire, Research Director, CNRS
Pierre Crozet, Associate Professor, Sorbonne Université
Synthetic biology: design and redesign of biological systems
• Modular cloning toolkit (MoClo), 119 biobricks
• Genetic reprogramming
• Chlamydomonas = new photosynthetic chassis: sustainable synthetic biology
• Technologies for DNA manipulation and assembly of long DNA molecules
Digital storage: The era of big data and AI
Artificial intelligence: since 1950, Alan Turing
Convergence => Digital Transformation
Data is the fuel of artificial intelligence => without data = no artificial intelligence
Image: Le Journal du CNRS
Digital storage: The needs will explode
fiber (100 Mbit/s)
45 ZB: 2.5 million years
IDC* prediction: Global Datasphere growth from 45 zettabytes (ZB) in 2019 to 175 zettabytes by 2025
*IDC = International Data Corporation.
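To put the IDC figures in perspective, the growth from 45 ZB (2019) to 175 ZB (2025) can be turned into an average yearly growth rate. This is only a back-of-the-envelope sketch: it assumes steady compound growth over the six years, which the slide does not state, and the function name is illustrative.

```python
def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Average yearly growth rate, assuming steady compound growth (an assumption)."""
    return (end / start) ** (1 / years) - 1

# Figures quoted from the IDC prediction: 45 ZB in 2019, 175 ZB by 2025.
rate = compound_annual_growth(45, 175, 2025 - 2019)
print(f"Average growth of the datasphere: {rate:.1%} per year")
```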
Digital storage: Growth of the datasphere
1) Health & scientific research: >10 ZB in 2025
   Imaging, genomics, personalized medicine (microbiota)
   Astrophysics, particle physics, -omics, ecology, environmental sciences
2) Industry: 22 ZB in 2025
   Internet of Things (IoT)
   Machines: computers, smartphones, consoles…
   Machine-to-machine or human-machine communication
   24/7 automation, autonomous cars (3 TB/h/car)
3) Finance: 10 ZB in 2025
   Data analysis (market dynamics, insurance)
   Security and protection of data
4) Media and entertainment: 6 ZB in 2025
   Virtual reality/augmented reality (VR/AR)
   Video streaming
   Video games
   Social networks
Digital storage: Current devices
Bulky
USA)
World data centers: 167 km² (1.6 x Paris), 1 millionth of the world's land surface
2040: 1 thousandth of the world's land surface
Digital storage: Current devices
Energy consuming
electric consumption
Carbon footprint: > world civil aviation
Annual consumption of data centers = >30 nuclear power plants
Image: Economie Matin
Digital storage: Current devices
ENVIRONMENTAL FOOTPRINT
• Carbon footprint > civil aviation
• World data centers: 167 km² (1.6 x Paris)
• Lifespan: 5 to 7 years
Digital storage: Current devices
Limited storage
of the information we generate, in only 10 or 12 years we’ll be able to store about 3% […] » Dr Karin Strauss, Microsoft Research
=> In 2025 we will need to store 5 times more data than today. Since 2010, demand has exceeded storage supply.
Data storage demand vs storage capacity: current technologies will not be able to meet storage needs.
[Graphic (Statistica): data storage demand vs storage capacity, 2009 to 2020, in zettabytes; labeled values 10 800 and 14 800 exabytes; 2025: 160-175 zettabytes]
Digital storage: Another type of storage device? DNA
nucleotides per cell => equivalent to 700 MB/cell
3900 billion cells/human => equivalent to 2.7 ZB/human
DNA: 4 letters: A, C, T, G
Density: 50 atoms/letter
DNA structure scheme: 1 DNA molecule, 1 nucleotide (phosphate group, deoxyribose)
Scale: 1 nm = 10⁻⁹ m. Example: a hair diameter is 50-100 µm
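The per-human figure follows directly from the two numbers on the slide. A minimal sketch of the arithmetic, assuming decimal units (1 MB = 10^6 bytes, 1 ZB = 10^21 bytes) and the ideal information content of a 4-letter alphabet; the variable names are illustrative.

```python
import math

# Ideal information content of one letter of a 4-letter alphabet (A, C, T, G).
bits_per_letter = math.log2(4)

# Figures quoted on the slide.
BYTES_PER_CELL = 700e6      # 700 MB per cell
CELLS_PER_HUMAN = 3900e9    # 3900 billion cells per human
ZETTABYTE = 1e21            # decimal zettabyte (assumption)

total_zb = BYTES_PER_CELL * CELLS_PER_HUMAN / ZETTABYTE
print(f"{bits_per_letter:.0f} bits per letter, {total_zb:.2f} ZB per human")
```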
Digital storage: Advantages of DNA
Longevity
Durability: ~5 years, ~5 years, ~15-25 years, several centuries
=> DNA storage: much more stable than current storage systems
Oldest complete genome: from a mammoth tooth 1.6 million years old.
Digital storage: Advantages of DNA
for millennia
• Compact
• Sustainable
• An environmentally friendly solution for data storage
• No media obsolescence
Oldest complete genome: from a mammoth tooth 1.6 million years old
DNA storage
Suggested as early as 1959 by Richard Feynman (Nobel Prize in Physics, 1965) in his talk "There's Plenty of Room at the Bottom: An Invitation to Enter a New Field of Physics", given on December 29th, 1959 at the annual meeting of the American Physical Society at the California Institute of Technology.
First significant demonstration: 2012, George Church, Harvard, USA
• Chemical synthesis of DNA
• Density: 686 terabytes per mm³
Church et al. 2012 Science 337:1628
State of the art: Storage in oligonucleotide pools
DNA
• Not compatible with living systems
• Obtained by chemical synthesis
• Maximum 200 nucleotides
650 kB => 54 898 oligonucleotides of 159 nucleotides (A, T, G, C)
Structural scheme:
• Index (19 nt): information about localization in order to arrange fragments
• Data (96 nt)
• Common sequence (22 nt): for amplification and sequencing
Steps for storage in oligos:
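The oligonucleotide layout above can be illustrated with a small sketch. It assumes one bit per base (0 written as A or C, 1 as G or T), which is consistent with the numbers on the slide: 54 898 oligos x 96 bits is about 659 kB. It also assumes that the 22-nt common sequence flanks both ends, since 22 + 19 + 96 + 22 = 159. The two flanking sequences and all function names are hypothetical placeholders, not the published ones.

```python
INDEX_NT, DATA_NT, COMMON_NT = 19, 96, 22

# Hypothetical 22-nt flanking sequences (placeholders, not the published primers).
COMMON_5 = ("ACGT" * 6)[:COMMON_NT]
COMMON_3 = ("TGCA" * 6)[:COMMON_NT]

def bits_to_bases(bits: str) -> str:
    """One bit per base: 0 -> A or C, 1 -> G or T, avoiding immediate repeats."""
    out = []
    for bit in bits:
        options = "AC" if bit == "0" else "GT"
        # Take the second option if the first would repeat the previous base.
        out.append(options[1] if out and out[-1] == options[0] else options[0])
    return "".join(out)

def bases_to_bits(seq: str) -> str:
    return "".join("0" if base in "AC" else "1" for base in seq)

def build_oligos(payload: bytes) -> list[str]:
    """Split the payload into 96-bit blocks, each with a 19-bit index."""
    bits = "".join(f"{byte:08b}" for byte in payload)
    oligos = []
    for index, start in enumerate(range(0, len(bits), DATA_NT)):
        block = bits[start:start + DATA_NT].ljust(DATA_NT, "0")  # zero-pad last block
        address = f"{index:0{INDEX_NT}b}"
        oligos.append(COMMON_5 + bits_to_bases(address) + bits_to_bases(block) + COMMON_3)
    return oligos

def read_oligos(oligos: list[str], n_bytes: int) -> bytes:
    """Reorder the blocks using their index, whatever the order of the pool."""
    blocks = {}
    for oligo in oligos:
        core = oligo[COMMON_NT:-COMMON_NT]
        blocks[int(bases_to_bits(core[:INDEX_NT]), 2)] = bases_to_bits(core[INDEX_NT:])
    bits = "".join(blocks[i] for i in sorted(blocks))
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))

payload = b"DNA storage in oligonucleotide pools"
pool = build_oligos(payload)
print(len(pool), "oligos of", len(pool[0]), "nt")
```

Because every oligo carries its own index, the pool can be read in any order, which is what the "information about localization" field is for.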
storage
Storage on oligonucleotide pools:
- Stored and read in vitro
- Based on mathematics, computing, chemistry and physics
=> Proof of concept of DNA storage
Limitations:
- High cost (>1000 $/MB)
- High error rates
- Limited and costly editing and copying
Need for more efficient DNA storage systems
Our vision:
• Exploiting the potential of Nature
• Using biology to remove the limitations of DNA storage
Images: Science et Avenir, Wikipédia
of DNA in living organisms:
• DNA reading
• DNA copy
• Error correction
• DNA editing
• Signal amplification
• Flash random access
Synthetic biology
Domestication and adaptation
New technologies of DNA storage
Images: Random42, Neb, Gesundheitsindustrie, Phys
Biomemory strategy: biocompatible and biosafe DNA
manipulated by living organisms: biocompatible DNA
• Genetic information is stored in double-stranded DNA molecules (double helix)
• Genetic information: linear or circular DNA molecules of large size (kb-Mb): plasmids, chromosomes
• Plasmids/chromosomes: circular and replicative double-stranded DNA => can be copied biologically at very low cost
• Oligonucleotides are not compatible with living organisms
• Oligonucleotides = single-stranded DNA molecules, much less stable than double-stranded DNA
Biomemory strategy: biocompatible and biosafe DNA
not all DNA sequences are possible, the sequence must be controlled:
The DNA sequence must not be used by the host organism as genetic information; this sequence must not be expressed.
• Limiting RNA production
  - Prohibition of transcription initiation sequences
• Limiting protein production
  - Prohibition of translation initiation sequences: no start codon
  - Addition of translation termination sequences: high frequency of stop codons in all 6 reading frames
• Limiting errors
  - Controlled %GC (45-55% GC)
  - Prohibition of homopolymers (>3 nt)
• Facilitating DNA manipulation
  - Addition or removal of specific sequences: restriction enzymes, recombination sites, …
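Several of the constraints listed above are simple enough to check programmatically. The sketch below is only an illustration, not the DNA DRIVE algorithm. It assumes the start codon is ATG and the stop codons are TAA, TAG and TGA (standard genetic code), and that both strands must be checked. It does not test transcription initiation sequences, which depend on the host organism. All function names are hypothetical.

```python
import re

START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def gc_fraction(seq: str) -> float:
    return (seq.count("G") + seq.count("C")) / len(seq)

def max_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated letter."""
    return max(len(m.group(0)) for m in re.finditer(r"A+|C+|G+|T+", seq))

def stops_per_frame(seq: str) -> list[int]:
    """Stop codon counts in the 6 reading frames (3 per strand)."""
    counts = []
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            codons = (strand[i:i + 3] for i in range(frame, len(strand) - 2, 3))
            counts.append(sum(codon in STOP_CODONS for codon in codons))
    return counts

def check_sequence(seq: str) -> dict[str, bool]:
    return {
        "gc_ok": 0.45 <= gc_fraction(seq) <= 0.55,           # 45-55% GC
        "homopolymer_ok": max_homopolymer(seq) <= 3,          # no run longer than 3 nt
        "no_start_codon": START_CODON not in seq
        and START_CODON not in reverse_complement(seq),
        "stop_in_every_frame": all(n > 0 for n in stops_per_frame(seq)),
    }

# Example fragment taken from the proof-of-concept slide.
print(check_sequence("CGCAACCGTCCGACTAGCTA"))
```

A real encoder would have to build these constraints into the encoding itself rather than filter sequences afterwards; the check is shown separately only to make each rule explicit.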
Biomemory strategy: A bio-inspired DNA storage strategy
oligonucleotides)
• Long replicative fragments: plasmids/chromosomes
• Reading long fragments: 3rd-generation NGS, Oxford Nanopore pocket sequencer
• Biocompatible and biosafe
  => Biocompatible DNA: can be manipulated in vivo (copying, editing…)
  => Biosafe DNA: DNA encrypted for the host organism
Image: Pixabay
DNA DRIVE: A bio-inspired DNA storage strategy
in vivo (copying, editing)
=> Allows copying at very low cost
=> High redundancy = low error rate
• Compatible with all sequencing technologies
• Binary file system organized in physical DNA sectors, allowing partitions, allocation table, files/directories, indexes, metadata...
• Possibility of compressing files, random access and error correction
• Fully automatable (commercial robots)
• Well adapted for cold storage
DNA DRIVE patent: EP193062478 (01/10/2019), PCT EP2020/077497 (01/10/2020)
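The link between redundancy and error rate can be illustrated with a majority vote over several reads of the same molecule. This is a generic sketch, not the DNA DRIVE error-correction scheme. It assumes reads that are already aligned and of equal length, so only substitution errors are handled, and the function name is illustrative.

```python
from collections import Counter

def consensus(reads: list[str]) -> str:
    """Majority vote at each position across aligned, equal-length reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

# Five reads of the same sequence, three of them with one substitution error.
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAA", "TCGTAC"]
print(consensus(reads))
```

As long as errors at a given position stay in the minority, adding copies makes the consensus more reliable, which is why cheap biological copying matters.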
Biomemory
Spin-off of Sorbonne Université & CNRS created in July 2021
Three co-founders
Pierre Crozet, CTO
Erfane Arwani, CEO
The mission of Biomemory is to offer an economically viable DNA storage solution for data centers, with zero or negligible carbon footprint.
DNA DRIVE: The proof of concept
At the interface between science and society
du Citoyen (1789), registered in the UNESCO Memory of the World programme
- Déclaration des Droits de la Femme et de la Citoyenne (1791, Olympe de Gouges)
• Humanist values
• Equality between men and women
Multidisciplinarity: historians, philosophers, computer scientists, mathematicians, biologists
Biomemory
Biomemory aims at developing and commercializing the DNA DRIVE technology
UNESCO Memory of the World Programme
Sorbonne Université & CNRS
Industrial partners:
• Twist Bioscience (San Francisco, USA): world leader in DNA synthesis
• Imagene (Bordeaux, Génopole): encapsulation of DNA in metallic capsules
Proof of concept: DNA DRIVE
Déclaration des droits de l’Homme et du Citoyen
1. Encoding: digital compression + DNA Drive algorithm
   Binary file: 010000011001001010100001000 011000000110101000110001100 10000000000011110111011010
   DNA sequence: CGCAACCGTCCGACTAGCTA AACGCATAACGTCCGACCCA ACCCAAGTTGAGTTAGGAGA
4. Selection & amplification (bacteria transformation)
5. DNA extraction
6. Stock
7. Read: sequencing
8. Decoding: DNA Drive algorithm + digital decompression
Antoine Danon, Jeanne Le Peillet
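The two digital ends of this pipeline (step 1, encoding, and step 8, decoding) can be sketched as a round trip. The DNA Drive algorithm itself is not described here, so the sketch swaps in a naive mapping of 2 bits per nucleotide after zlib compression. That mapping is an assumption made purely for illustration and does not enforce the biosafety constraints of the real encoding; the function names are hypothetical.

```python
import zlib

BASES = "ACGT"  # naive mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T (assumption)

def encode(data: bytes) -> str:
    """Step 1 (sketch): digital compression, then bits to nucleotides."""
    compressed = zlib.compress(data)
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in compressed for shift in (6, 4, 2, 0)
    )

def decode(sequence: str) -> bytes:
    """Step 8 (sketch): nucleotides to bits, then digital decompression."""
    values = [BASES.index(base) for base in sequence]
    compressed = bytes(
        (values[i] << 6) | (values[i + 1] << 4) | (values[i + 2] << 2) | values[i + 3]
        for i in range(0, len(values), 4)
    )
    return zlib.decompress(compressed)

text = "Déclaration des droits de l'Homme et du Citoyen".encode("utf-8")
sequence = encode(text)
print(sequence[:40], "...")
print(decode(sequence).decode("utf-8"))
```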
Proof of concept DNA DRIVE: World premiere
premiere for a public institution
Press conference at Hôtel de Soubise, Paris, Tuesday 23 November 2021
Image: Wikipédia
Biomemory: Which part of the process are we managing?
Digital files
1. Encoding: digital compression + DNA Drive algorithm
   Binary file: 01000001100100101010000100001 10000001101010001100011001000 0000000011110111011010
   DNA sequence: CGCAACCGTCCGACTAGCTAAA CGCATAACGTCCGACCCAACCC AAGTTGAGTTAGGAGA
4. Selection & amplification (bacteria transformation)
5. DNA extraction
6. Stock
7. Read: sequencing
8. Decoding: DNA Drive algorithm + digital decompression
Antoine Danon, Jeanne Le Peillet
Legend: Others, Biomemory
Biomemory: Our use cases timeline
• Marking & Authentication
  resist extreme conditions and be very durable.
  Traceability and authentication solutions.
  Currently talking with a world leader in security inks for banknotes and IDs.
• Offsite Cloud Storage
  Archive and backup solutions for data centers.
• Storage for the French state
• Read/Write
• Exabyte scale
• $1/terabyte
• Removable DNA Drive cartridges
• Removable DNA ink cartridges
• 4U rackable server for existing data centers
• No biological expert on site
Dimensions: 17.7 cm, 48.3 cm, 35.5 cm
The Révolution goes on…
exhibited at the hôtel de Soubise
• the ordinance of 1944, which granted women the right to vote, to be exhibited this fall
• the Badinter law of 1981, which abolished the death penalty, to be exhibited next spring
100% biobased proprietary Biomemory technology