Biomemory - IT Press Tour #45 - Paris, France - Sept. 2022

The IT Press Tour

September 07, 2022

Transcript

  1. DNA DRIVE: A new technology for sustainable data storage

    Pierre Crozet, Laboratoire de Biologie Computationnelle et Quantitative; Co-founder & Chief Technical Officer, Biomemory
  2. • Synthesis is a powerful method of investigation and understanding

    • Synthesis and analysis are complementary
    Diagram labels: SYNTHESIS, ANALYSIS, Simple Bricks, Complex Object
  3. “What I cannot create, I do not understand” (Richard Feynman, 1918-1988)

    Construction is the ultimate proof of understanding.
    Thanks to the technological advances of the past 40 years, biology can now construct to understand.
  4. Definition of Synthetic Biology

    Synthetic biology is the design and construction of new biological parts, devices, and systems (such as enzymes, genetic circuits, and cells), or the redesign of existing biological systems.
  5. Synthetic Biology: A formal discipline of biological engineering

    Build biological systems to:
    • Create artificial systems that have potential biotechnology and health applications
    • Fundamental biological questions with new approaches and concepts
    References: Bray et al. 1995 Nature; Benner et al. 2003 Nature
  6. Synthetic Genomes

    • Material: biomade fabrics, coloring, textiles, bioplastics, bioconcrete
    • Digital: storage, biocomputing
    • Cosmetics and perfumes: perfumes, aroma, oils, collagen, squalene, specialty chemistry
    • Health: immunotherapy, personalized medicine, stem cells, microbiota, diagnosis, vaccines, drug bioproduction
    • Feed & Food: coloring, aroma, preservatives, enzymes, cell agriculture, process engineering
    • Environmental transition: biofuels, platform molecules, sustainable development, bioremediation
    Images: bioconcrete bricks produced by bacteria; biomade shoes made of synthetic spider silk
  7. Laboratoire de Biologie Computationnelle et Quantitative (UMR7238, LCQB, IBPS, CNRS, Sorbonne Université)

    The team: Synthetic biology of microalgae
    • Stéphane Lemaire, CNRS Research Director (Directeur de Recherche)
    • Pierre Crozet, Associate Professor (Maître de conférences), Sorbonne Université
    Synthetic biology: design and redesign of biological systems
    • Genetic reprogramming
    • Chlamydomonas = new photosynthetic chassis: sustainable synthetic biology
    • Technologies for DNA manipulation and assembly of long DNA molecules
    Modular cloning toolkit (MoClo): 119 biobricks (Crozet et al. 2018 ACS Synth Biol)
  8. DNA DRIVE: A new technology for sustainable data storage

    Pierre Crozet, Laboratoire de Biologie Computationnelle et Quantitative; Co-founder & Chief Technical Officer, Biomemory
  9. Digital storage: The era of big data and AI

    2 booming technologies:
    • Big data: since 1997
    • Artificial intelligence: since 1950, Alan Turing
    Convergence → Digital Transformation
    Data is the fuel of artificial intelligence → without data, no artificial intelligence
    Image: Le Journal du CNRS
  10. Digital storage: The needs will explode

    IDC* prediction: Global Datasphere growth from 45 zettabytes (ZB) in 2019 to 175 zettabytes by 2025
    *IDC = International Data Corporation
  11. Digital storage: The needs will explode

    IDC* prediction: Global Datasphere growth from 45 zettabytes (ZB) in 2019 to 175 zettabytes by 2025
    *IDC = International Data Corporation
    1 ZB = 10^21 bytes
  12. Digital storage: The needs will explode

    IDC* prediction: Global Datasphere growth from 45 zettabytes (ZB) in 2019 to 175 zettabytes by 2025
    *IDC = International Data Corporation
    1 ZB = 10^21 bytes
    Downloading 1 ZB over a good optical fiber (100 Mbit/s)
  13. Digital storage: The needs will explode

    IDC* prediction: Global Datasphere growth from 45 zettabytes (ZB) in 2019 to 175 zettabytes by 2025
    *IDC = International Data Corporation
    1 ZB = 10^21 bytes
    Downloading 1 ZB over a good optical fiber (100 Mbit/s): 2.5 million years
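
The 2.5 million year figure can be checked with a short calculation. The sketch below assumes decimal units, a 365.25-day year and a link that sustains its full nominal rate; none of these assumptions are stated on the slide.

```python
# Sketch: time needed to download 1 ZB over a 100 Mbit/s link.
# Assumptions (not from the slide): decimal units, 365.25-day year,
# link running at its full nominal rate with no protocol overhead.

ZETTABYTE_BYTES = 10**21             # 1 ZB = 10^21 bytes (from the slide)
LINK_BITS_PER_SECOND = 100 * 10**6   # 100 Mbit/s (from the slide)
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def download_years(size_bytes: int, bits_per_second: int) -> float:
    """Return the transfer time in years for a given size and link speed."""
    seconds = size_bytes * 8 / bits_per_second
    return seconds / SECONDS_PER_YEAR


years = download_years(ZETTABYTE_BYTES, LINK_BITS_PER_SECOND)
print(f"{years / 1e6:.2f} million years")
```

Under these assumptions the result is roughly 2.5 million years, in line with the slide.
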
  14. Digital storage: Growth of the datasphere

    Annual datasphere growth between 2018 and 2025 (IDC report), by sector:
    1) Health & scientific research: >10 ZB in 2025. Imaging, genomics, personalized medicine (microbiota); astrophysics, particle physics, -omics, ecology, environmental sciences
    2) Industry: 22 ZB in 2025. Internet of Things (IoT); machines (computers, smartphones, consoles…); machine-machine or man-machine communication; round-the-clock automation, autonomous cars (3 TB/h/car)
    3) Finance: 10 ZB in 2025. Data analysis (market dynamics, insurance); security and protection of data
    4) Media and entertainment: 6 ZB in 2025. Virtual reality/augmented reality (VR/AR), video streaming, video games, social networks
  15. Digital storage: W.O.R.N. data?

    WORN = Write Once Read Never
    >70% of the world's data are WORN archives, which are mainly stored on magnetic storage tapes
    Aim: replace the magnetic tapes of cold archives with DNA data storage
  16. Digital storage: Current devices

    Fragile
    Lifespan: 5 to 7 years
    Image: Xalima (inside a data center)
  17. Digital storage: Current devices

    Bulky
    175 ZB: 23 times (David Reinsel, IDC, Boston, USA)
    World data centers: 167 km² (1.6 × Paris), 1 millionth of the world's land surface
    2040: 1 thousandth of the world's land surface
    Image: Pixabay
  18. Digital storage: Current devices

    Energy consuming
    Energy consumption: 150 TWh/year, 2% of the world's electricity consumption
    Annual consumption of data centers = >30 nuclear power plants
    Carbon footprint: > world civil aviation
    Image: Economie Matin
  19. Digital storage: Current devices

    ENVIRONMENTAL FOOTPRINT
    • Fragile: lifespan of 5 to 7 years
    • Bulky: world data centers cover 167 km² (1.6 × Paris)
    • Energy consuming: 2% of the world's electricity consumption; carbon footprint > civil aviation
  20. Digital storage: Current devices

    Limited storage
    « If today we are capable of storing about 30% of the information we generate, in only 10 or 12 years we’ll be able to store about 3% […] » (Dr Karin Strauss, Microsoft Research)
    → In 2025 we will need to store 5 times more data than today.
    Data storage demand vs. storage capacity: current technologies will not be able to meet storage needs. Since 2010, demand has exceeded storage supply.
    Chart (Graphic: Statistica), 2009 to 2020. Labels: 10 800 exabytes, 14 800 exabytes; 10, 20, 30 and 40 zettabytes; 2025: 160-175 zettabytes
  21. Digital transformation

    The digital transformation needs a deep (r)evolution of our digital storage technologies.
    Image: Le Journal du CNRS
  22. Digital storage: Another type of storage device?

    Another type of data storage device exists in Nature, which was not created by humans and has been improved for nearly 4 billion years.
  23. Digital storage: Another type of storage device?

    DNA: 4 letters (A, C, T, G). Density: 50 atoms/letter
    Human genome: 6.4 billion pairs of nucleotides per cell ⇒ equivalent to 700 MB/cell
    3900 billion cells/human ⇒ equivalent to 2.7 ZB/human
    DNA structure scheme (labels): 1 DNA molecule, 1 nucleotide, phosphate group, ribose
    Scale: 1 nm = 10^-9 m. For comparison, a hair diameter is 50-100 µm (50 000-100 000 nm)
    Image: Jeanne Le Peillet
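
The per-human figure follows from multiplying the two numbers given on the slide. A minimal check, assuming decimal units (1 MB = 10^6 bytes, 1 ZB = 10^21 bytes):

```python
# Sketch: check of the "2.7 ZB/human" figure using the slide's own inputs.
# Assumption (not from the slide): decimal units for MB and ZB.

MB = 10**6
ZB = 10**21

bytes_per_cell = 700 * MB        # "equivalent to 700 MB/cell" (from the slide)
cells_per_human = 3900 * 10**9   # "3900 billion cells/human" (from the slide)

bytes_per_human = bytes_per_cell * cells_per_human
print(f"{bytes_per_human / ZB:.2f} ZB per human")
```
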
  24. Digital storage: Advantages of DNA

    Longevity
    DNA storage → much more stable than current storage systems
    Durability (Bornholt et al., 2016): ~5 years, ~5 years, ~15-25 years, several centuries
    Stable for several thousand years
    Oldest complete genome: from a mammoth tooth 1.6 million years old.
  25. Digital storage: Advantages of DNA

    Compact
    Maximum density: 4.5 × 10^20 bytes/gram of DNA (0.45 ZB) = 450 million TB/gram of DNA
    All the world's data on DNA: 45 ZB ⇒ 100 g of DNA
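
The density figures on this slide are mutually consistent, as the following sketch shows. It assumes decimal units (1 TB = 10^12 bytes, 1 ZB = 10^21 bytes) and treats the quoted maximum density as achievable in practice, which is an idealization.

```python
# Sketch: unit conversions behind the density slide.
# Assumptions (not from the slide): decimal units; the maximum density
# is taken at face value, with no overhead for indexing or redundancy.

TB = 10**12
ZB = 10**21

DENSITY_BYTES_PER_GRAM = 45 * 10**19   # 4.5 x 10^20 bytes/gram (from the slide)
WORLD_DATA_BYTES = 45 * ZB             # 45 ZB (from the slide)

tb_per_gram = DENSITY_BYTES_PER_GRAM // TB
zb_per_gram = DENSITY_BYTES_PER_GRAM / ZB
grams_for_world_data = WORLD_DATA_BYTES / DENSITY_BYTES_PER_GRAM

print(f"{tb_per_gram:,} TB per gram")
print(f"{zb_per_gram} ZB per gram")
print(f"{grams_for_world_data} g of DNA for 45 ZB")
```
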
  26. Digital storage: Advantages of DNA

    Energy efficient
    Stable at room temperature
    Robust technologies for long-term ambient-temperature storage of dried DNA by encapsulation.
    DNA stability > 50 000 years
  27. Digital storage: Advantages of DNA

    An environmentally friendly solution for data storage
    • Energy efficient: stable at room temperature without energy input
    • Sustainable: stable for millennia, no media obsolescence. Oldest complete genome: from a mammoth tooth 1.6 million years old
    • Compact
  28. State of the art: The early stages of DNA storage

    Suggested as early as 1959 by Richard Feynman (Nobel Prize in Physics, 1965) in his talk "There's Plenty of Room at the Bottom: An Invitation to Enter a New Field of Physics", a classic talk given on December 29th, 1959 at the annual meeting of the American Physical Society at the California Institute of Technology.
  29. State of the art: The early stages of DNA storage

    First significant demonstration: 2012, George Church, Harvard, USA
    Chemical synthesis of DNA
    Density: 686 terabytes per mm³
    Church et al. 2012 Science 337:1628
  30. Encoding: Storage on oligo pools

    0 / 1 = binary
    A / C / G / T = quaternary
  31. Encoding: Storage on oligo pools

    0 / 1 = binary
    A / C / G / T = quaternary
    2 bits/base: A = 00, C = 01, T = 10, G = 11
  32. Encoding: Storage on oligo pools

    0 / 1 = binary
    A / C / G / T = quaternary
    2 bits/base: A = 00, C = 01, T = 10, G = 11
    1 bit/base: A = C = 0, T = G = 1
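
The two mappings can be sketched in a few lines of Python. The bit-to-base tables come from the slide; the function names are hypothetical, and the rule used in the 1 bit/base encoder for choosing between the two candidate bases (pick the one that differs from the previous base) is an assumption made here to show why the redundancy is useful, not something stated in the talk.

```python
# Sketch of the two encodings shown on the slide.
# From the slide: 2 bits/base (A=00, C=01, T=10, G=11) and
# 1 bit/base (A=C=0, T=G=1).
# Assumption: in the 1 bit/base scheme, the base is chosen so that it
# never repeats the previous base.

TWO_BIT = {"00": "A", "01": "C", "10": "T", "11": "G"}
TWO_BIT_DECODE = {base: bits for bits, base in TWO_BIT.items()}
ONE_BIT_CHOICES = {"0": "AC", "1": "TG"}
ONE_BIT_DECODE = {"A": "0", "C": "0", "T": "1", "G": "1"}


def bytes_to_bits(data: bytes) -> str:
    return "".join(f"{byte:08b}" for byte in data)


def bits_to_bytes(bits: str) -> bytes:
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def encode_2bit(data: bytes) -> str:
    bits = bytes_to_bits(data)
    return "".join(TWO_BIT[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_2bit(dna: str) -> bytes:
    return bits_to_bytes("".join(TWO_BIT_DECODE[base] for base in dna))


def encode_1bit(data: bytes) -> str:
    bases = []
    for bit in bytes_to_bits(data):
        first, second = ONE_BIT_CHOICES[bit]
        # Take the alternative base when the first choice would repeat.
        bases.append(second if bases and bases[-1] == first else first)
    return "".join(bases)


def decode_1bit(dna: str) -> bytes:
    return bits_to_bytes("".join(ONE_BIT_DECODE[base] for base in dna))


message = b"DNA"
print(encode_2bit(message))
print(encode_1bit(message))
```

The 2 bits/base form is twice as dense, while the 1 bit/base form leaves a free choice at every position that can be spent on avoiding sequences that are hard to synthesize or read.
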
  33. State of the art: Storage in oligonucleotide pools

    Church et al., 2012 Science
    Oligos = short fragments of DNA
    • Not compatible with living systems
    • Obtained by chemical synthesis
    • Maximum 200 nucleotides
    Steps for storage in oligos: 650 kB => 54 898 oligonucleotides of 159 nucleotides (A, T, G, C)
    Structural scheme:
    • Index (19 nt): information about localization, used to put the fragments back in order
    • Data (96 nt)
    • Common sequence (22 nt): for amplification and sequencing
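
The oligo layout can be checked arithmetically. The sketch below assumes that the 22 nt common sequence appears at both ends of each oligo (19 + 96 + 22 alone gives 137 nt, not 159) and that the data block carries 1 bit per nucleotide; both are assumptions rather than statements from the slide. The helper `split_into_blocks` is a hypothetical illustration of how a bitstream is cut into indexed blocks.

```python
# Sketch: arithmetic behind the oligo layout on the slide.
# Assumptions (not from the slide): the 22 nt common sequence flanks
# both ends of the oligo, and the data block stores 1 bit per nucleotide.

INDEX_NT = 19          # from the slide
DATA_NT = 96           # from the slide
COMMON_NT = 22         # from the slide
COMMON_COPIES = 2      # assumption: one copy at each end
OLIGO_COUNT = 54_898   # from the slide
BITS_PER_DATA_NT = 1   # assumption

oligo_length = INDEX_NT + DATA_NT + COMMON_COPIES * COMMON_NT
payload_bytes = OLIGO_COUNT * DATA_NT * BITS_PER_DATA_NT // 8
net_density = DATA_NT * BITS_PER_DATA_NT / oligo_length


def split_into_blocks(bits: str, block_size: int = DATA_NT) -> list[tuple[int, str]]:
    """Pair each fixed-size block of bits with its index (last block zero-padded)."""
    blocks = []
    for start in range(0, len(bits), block_size):
        block = bits[start:start + block_size].ljust(block_size, "0")
        blocks.append((start // block_size, block))
    return blocks


print(oligo_length, "nt per oligo")
print(payload_bytes, "bytes of payload")
print(f"{net_density:.2f} bits/nt net density")
```

Under these assumptions the oligo length comes out at 159 nt, the payload at about 659 000 bytes (close to the quoted 650 kB), and the net density at about 0.6 bits/nt.
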
  34. State of the art 2012-2018

    • Oligonucleotides: 85-230 nt
    • From 650 kB to 200 MB
    • Density: 0.6 to 1.94 bits/nt
    nt = nucleotide
  35. State of the art: Storage on oligo pools

    2018: 200 MB on 13.4 million oligonucleotides (Organick et al., 2018 Nature Biotechnol)
    Workflow labels: Chemistry, Mathematics, Computing, Mathematics, Computing, In vitro amplification, Physics, Chemistry
  36. State of the art: The limits of oligos for DNA storage

    Storage on oligonucleotide pools:
    - Stored and read in vitro
    - Based on mathematics, computing, chemistry and physics
    → Proof of concept of DNA storage
    Limitations:
    - High cost (>1000 $/MB)
    - High error rates
    - Limited and costly editing and copying
    Need for more efficient DNA storage systems
    Our vision:
    - Exploiting the potential of Nature
    - Using biology to remove the limitations of DNA storage
    Images: Science et Avenir, Wikipédia
  37. DNA in vivo: Technologies for manipulating DNA

    DNA reading, DNA copy, error correction, DNA editing, signal amplification, flash random access
    Images: Random42, Neb, Gesundheitsindustrie, Phys
  38. DNA in vivo: Technologies for manipulating DNA

    Technologies for manipulation of DNA in living organisms: DNA reading, DNA copy, error correction, DNA editing, signal amplification, flash random access
    Synthetic biology: domestication and adaptation → new technologies of DNA storage
    Images: Random42, Neb, Gesundheitsindustrie, Phys
  39. Biomemory strategy: biocompatible and biosafe DNA

    Our strategy: storing digital information in molecules that can be manipulated by living organisms, i.e. biocompatible DNA
    • Genetic information is stored in double-stranded DNA molecules (double helix): linear or circular DNA molecules of large size (kb-Mb), i.e. plasmids and chromosomes
    • Plasmids/chromosomes: circular and replicative double-stranded DNA => can be copied biologically at very low cost
    • Oligonucleotides are not compatible with living organisms
    • Oligonucleotides = single-stranded DNA molecules, much less stable than double-stranded DNA
    Figure labels: Chromosome, Plasmids
  40. Biomemory strategy: biocompatible and biosafe DNA

    Biocompatible DNA must be biosafe
    In contrast to chemical synthesis, not all DNA sequences are possible; the sequence must be controlled. The DNA sequence must not be used by the host organism as genetic information: it must not be expressed.
    • Limiting RNA production
      - Prohibition of transcription initiation sequences
    • Limiting protein production
      - Prohibition of translation initiation sequences (no start codon)
      - Addition of translation termination sequences: high frequency of stop codons in all 6 reading frames
    • Limiting errors
      - Controlled %GC (45-55% GC)
      - Prohibition of homopolymers (>3 nt)
    • Facilitating DNA manipulation
      - Addition or removal of specific sequences: restriction enzymes, recombination sites, …
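
Several of these constraints can be expressed as simple checks on a candidate sequence. The following is a minimal sketch, not Biomemory's actual validation: all function names are hypothetical, "start codon" is assumed to mean ATG on either strand, stop codons are assumed to be TAA, TAG and TGA, and transcription initiation sequences are not checked at all because they depend on the host organism.

```python
# Sketch of sequence checks inspired by the constraints on the slide.
# Assumptions (not from the slide): start codon = ATG on either strand;
# stop codons = TAA, TAG, TGA; promoter screening is out of scope.
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_long_homopolymer(seq: str, max_run: int = 3) -> bool:
    """True if any base repeats more than max_run times in a row."""
    return re.search(r"(.)\1{%d,}" % max_run, seq) is not None


def has_start_codon(seq: str) -> bool:
    return "ATG" in seq or "ATG" in reverse_complement(seq)


def frames_with_stop(seq: str) -> int:
    """Count how many of the 6 reading frames contain a stop codon."""
    count = 0
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            if any(codon in STOP_CODONS for codon in codons):
                count += 1
    return count


def check_sequence(seq: str) -> dict[str, bool]:
    return {
        "gc_in_range": 0.45 <= gc_fraction(seq) <= 0.55,
        "no_long_homopolymer": not has_long_homopolymer(seq),
        "no_start_codon": not has_start_codon(seq),
        "stop_in_all_frames": frames_with_stop(seq) == 6,
    }


print(check_sequence("CGCAACCGTCCGACTAGCTAAACGCAACGTCAACAAGTC"))
```

An encoder built on such checks would reject or re-encode any block that fails one of them, which is one reason the usable density stays below the theoretical 2 bits per base.
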
  41. Biomemory strategy: A bio-inspired DNA storage strategy

    Original archiving strategy, breaking with existing technologies (not based on oligonucleotides)
    • Long replicative fragments: plasmids/chromosomes
    • Reading long fragments: 3rd-generation NGS, Oxford Nanopore pocket sequencer
    • Biocompatible and biosafe
      → Biocompatible DNA: can be manipulated in vivo (copying, editing…)
      → Biosafe DNA: DNA encrypted for the host organism
    Image: Pixabay
  42. DNA DRIVE: A bio-inspired DNA storage strategy

    • Unlimited total capacity
    • Possibility of new biological approaches in vivo (copying, editing)
      => Allows copying at very low cost
      => High redundancy = low error rate
    • Compatible with all sequencing technologies
    • Binary file system organized in physical DNA sectors, allowing partitions, allocation table, files/directories, indexes, metadata...
    • Possibility of compressing files, random access and error correction
    • Fully automatable (commercial robots)
    • Well adapted for cold storage
  43. DNA DRIVE: A bio-inspired DNA storage strategy

    • Unlimited total capacity
    • Possibility of new biological approaches in vivo (copying, editing)
      => Allows copying at very low cost
      => High redundancy = low error rate
    • Compatible with all sequencing technologies
    • Binary file system organized in physical DNA sectors, allowing partitions, allocation table, files/directories, indexes, metadata...
    • Possibility of compressing files, random access and error correction
    • Fully automatable (commercial robots)
    • Well adapted for cold storage
    DNA DRIVE patents: EP193062478 (01/10/2019), PCT EP2020/077497 (01/10/2020)
  44. Biomemory: The start-up of bio-inspired DNA storage

    Three co-founders: Stéphane Lemaire (CSO), Pierre Crozet (CTO), Erfane Arwani (CEO)
    Spin-off of Sorbonne Université & CNRS, created in July 2021
    The mission of Biomemory is to offer an economically viable DNA storage solution for data centers, with a zero or negligible carbon footprint.
  45. DNA DRIVE: The proof of concept

    At the interface between science and society
    Two historical texts:
    - Déclaration des Droits de l’Homme et du Citoyen (1789), registered in the UNESCO Memory of the World (Mémoire du Monde) programme
    - Déclaration des Droits de la Femme et de la Citoyenne (1791, Olympe de Gouges)
    • Humanist values
    • Equality between men and women
    Multidisciplinarity: historians, philosophers, computer scientists, mathematicians, biologists
  46. Proof of concept DNA DRIVE: Partners

    Institutional partners:
    - Archives Nationales
    - Programme Mémoire du Monde de l'UNESCO
    - Sorbonne Université & CNRS
    Industrial partners:
    - Twist Bioscience (San Francisco, USA): world leader in DNA synthesis
    - Imagene (Bordeaux, Génopole): encapsulation of DNA in metallic capsules
    - Biomemory: aims at developing and commercializing the DNA DRIVE technology
  47. DNA DRIVE: The proof of concept

    • Encoding of the file on DNA
    • Encapsulation
    • Several reads: 100% fidelity
    A capsule contains more than 100 billion copies of the file on DNA
  48. DNA DRIVE: The proof of concept

    • Encoding of the file on DNA
    • Encapsulation
    • Several reads: 100% fidelity
  49. Proof of concept DNA DRIVE

    Example text: Déclaration des droits de l’Homme et du Citoyen
    1. Encoding: digital compression + DNA Drive algorithm (binary file → DNA sequence)
    2. Writing: synthesis
    3. Assembly
    4. Selection & amplification (bacteria transformation)
    5. DNA extraction
    6. Stock
    7. Read: sequencing
    8. Decoding: DNA Drive algorithm + digital decompression
    Binary file (excerpt): 010000011001001010100001000 011000000110101000110001100 10000000000011110111011010
    DNA sequence (excerpt): CGCAACCGTCCGACTAGCTA AACGCATAACGTCCGACCCA ACCCAAGTTGAGTTAGGAGA
    Biocompatible DNA (excerpt): CGCAACCGTCCGACTAGCTAAACGCAA CGTCAACAAGTCTCGCAAGTAACGTCC GACCCAACCCAAGTTGAGTTAGGAGA
    Antoine Danon, Jeanne Le Peillet
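
Steps 1 and 8 (encoding and decoding) can be illustrated with a round trip in software. The DNA Drive algorithm itself is not described in the talk, so the sketch below substitutes the plain 2 bits/base mapping from the earlier encoding slides and uses zlib for the compression step; both choices are stand-ins, and the function names are hypothetical. The wet-lab steps 2 to 7 are not modeled.

```python
# Sketch of the encode/decode round trip (steps 1 and 8 only).
# Assumptions: zlib stands in for "digital compression", and the plain
# 2 bits/base mapping (A=00, C=01, T=10, G=11) stands in for the
# DNA Drive algorithm, which is not described in the talk.
import zlib

BASES = "ACTG"  # position in the string = 2-bit value: A=0, C=1, T=2, G=3


def encode(text: str) -> str:
    """Compress the text, then map every pair of bits to one base."""
    compressed = zlib.compress(text.encode("utf-8"))
    bases = []
    for byte in compressed:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)


def decode(dna: str) -> str:
    """Map bases back to bytes, then decompress."""
    values = [BASES.index(base) for base in dna]
    data = bytes(
        (values[i] << 6) | (values[i + 1] << 4) | (values[i + 2] << 2) | values[i + 3]
        for i in range(0, len(values), 4)
    )
    return zlib.decompress(data).decode("utf-8")


article = "Les hommes naissent et demeurent libres et égaux en droits."
dna = encode(article)
print(dna[:40], "...")
print(decode(dna) == article)
```

A real encoder would additionally have to enforce the biosafety and error-limiting constraints listed on the biocompatible DNA slide, which this direct mapping does not do.
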
  50. Proof of concept DNA DRIVE: World premiere

    Official registration of the DNA-encoded texts at the Archives Nationales: a world premiere for a public institution
    Press conference at Hôtel de Soubise, Paris, Tuesday 23 November 2021
    Image: Wikipédia
  51. Biomemory: Which part of the process are we managing?

    1. Encoding: digital compression + DNA Drive algorithm (binary file → DNA sequence)
    2. Writing: synthesis
    3. Assembly
    4. Selection & amplification (bacteria transformation)
    5. DNA extraction
    6. Stock
    7. Read: sequencing
    8. Decoding: DNA Drive algorithm + digital decompression
    Legend: Digital files, Others, Biomemory
    Binary file (excerpt): 01000001100100101010000100001 10000001101010001100011001000 0000000011110111011010
    DNA sequence (excerpt): CGCAACCGTCCGACTAGCTAAA CGCATAACGTCCGACCCAACCC AAGTTGAGTTAGGAGA
    Biocompatible DNA (excerpt): CGCAACCGTCCGACTAGCTAAACGCAACG TCAACAAGTCTCGCAAGTAACGTCCGACC CAACCCAAGTTGAGTTAGGAGA
    Antoine Danon, Jeanne Le Peillet
  52. Biomemory: Our use cases timeline

    • Offline Data Preservation: DNA backup of crypto assets designed to resist extreme conditions and be very durable.
    • Marking & Authentication: traceability and authentication solutions. Currently talking with a world leader in security inks for banknotes and IDs.
    • Offsite Cloud Storage: archive and backup solutions for data centers.
    • Storage for the French state
  53. Biomemory: Our vision

    A rackable DNA data storage server
    • Autonomous read/write
    • Exabyte scale
    • $1/terabyte
    • Removable DNA Drive cartridges
    • Removable DNA ink cartridges
    • 4U rackable server for existing data centers (17.7 cm, 48.3 cm, 35.5 cm)
    • No biological expert on site
  54. The Révolution goes on…

    • the decree abolishing slavery in 1848, currently exhibited at the Hôtel de Soubise
    • the ordinance of 1944, which granted women the right to vote, to be exhibited this fall
    • the Badinter law of 1981, which abolished the death penalty, to be exhibited next spring
    100% biobased proprietary Biomemory technology