
New deals on data - Generating open knowledge based on closed data

Talk held at "Blockchain for Science 2018" (Berlin).

Konrad Förstner

November 05, 2018

Transcript

  1. New deals on data – Generating open knowledge based on closed data. Konrad U. Förstner, ZB MED – Information Center for Life Sciences, Cologne, Germany & TH Köln, Cologne, Germany. November 5th, 2018, Blockchain for Science Con
  2. Disclaimer: I have no connection to any of the companies that will be mentioned here. I present my perspective as a bioinformatician and open science enthusiast. https://www.flickr.com/photos/redjar/113823307/ – CC-BY by flickr user redjar
  3. Open [data|source|*] should be the default in science. This is simply good scientific practice. https://www.flickr.com/photos/subcircle/500995147 – CC-BY by flickr user subcircle
  4. There are cases where privacy might be a higher good than openness. Certain data should not be linked to individuals. https://commons.wikimedia.org/wiki/File:Masks_in_Venice.jpg CC-BY by Wikipedia user Rasevic
  5. Having access to such data from a large population would significantly help research and extend our medical knowledge. https://de.wikipedia.org/wiki/Datei:Crowd_at_Knebworth_House_-_Rolling_Stones_1976.jpg CC-BY by Wikimedia Commons Ibirapuera
  6. On the other hand, the data can be misused for systematic discrimination driven by political, ideological and commercial interests. https://www.flickr.com/photos/22394551@N03/2226095398 CC-BY by flickr user viZZZual.com
  7. We have a moral dilemma: protect individual rights or push scientific progress. https://commons.wikimedia.org/wiki/File:Apothecary%27s_balance_with... CC-BY by Wikimedia Commons user Fæ
  8. Similar dilemmas from other research domains • Financial data of organisations • Energy consumption records of devices • Location data of vehicles https://commons.wikimedia.org/wiki/File:Apothecary%27s_balance_with... CC-BY by Wikimedia Commons user Fæ
  9. Can we do research based on black-boxed data that is at least reproducible? https://commons.wikimedia.org/wiki/File:Eiserne_Truhe_Museum_Senftenberg.jpg PD
  10. Or can we at least use the data to generate hypotheses that can then be tested with complementary methods? https://commons.wikimedia.org/wiki/File:Eiserne_Truhe_Museum_Senftenberg.jpg PD
  11. Genomics England • Aims to hold 100,000 full genomes • Data processing in closed data centers • Only results leave the center via an "airlock" https://de.wikipedia.org/wiki/Datei:Crowd_at_Knebworth_House_-_Rolling_Stones_1976.jpg CC-BY by Wikimedia Commons Ibirapuera
  12. Personal Health Train (PHT) • Data stations ("FAIRports") • Trains – workflows that can work on the data provided to them https://de.wikipedia.org/wiki/Datei:Crowd_at_Knebworth_House_-_Rolling_Stones_1976.jpg CC-BY by Wikimedia Commons Ibirapuera
  13. (This slide was modified for online deposition. Simply click on the link below; it is a news article that describes how 23andMe and others are selling genomic data to the pharma industry.) https://www.businessinsider.de/dna-testing-delete-your-data-23andme-ancestry-2018-7
  14. Promises of blockchain-based, decentralized data marketplaces • Owners have control over their data and can stay anonymous • Standardisation of data • People can be incentivized to share their data • Traceability (especially interesting for pharmaceutical companies) https://www.flickr.com/photos/katerha/4592429363 – CC-BY by flickr user katerha
  15. Concepts of underlying solutions • Fully Homomorphic Encryption (FHE) • Multi-party Computation (MPC) • Trusted Execution Environments (TEE) such as Intel SGX https://unsplash.com/@toddquackenbush?photo=IClZBVw5W5A - PD
  16. General-purpose blockchain-based solutions • Ocean Protocol • Enigma (secret contracts) • Ekiden protocol (Oasis Labs) • OpenMind https://unsplash.com/@toddquackenbush?photo=IClZBVw5W5A - PD
  17. Blockchain-based solutions for healthcare data • Nebula (by George Church) • Longenesis • Luna DNA • phrOS (Personal Health Record Operating System) • EncrypGen https://unsplash.com/@toddquackenbush?photo=IClZBVw5W5A - PD
  18. Currently a lot of white papers are available – nothing openly testable. https://www.flickr.com/photos/subcircle/500995147 – CC-BY by flickr user subcircle
  19. High risk – you won’t get your genome back once it has leaked. https://www.flickr.com/photos/subcircle/500995147 – CC-BY by flickr user subcircle
  20. Implications for the data owner/seller might not be clear – education needed. https://www.flickr.com/photos/subcircle/500995147 – CC-BY by flickr user subcircle
  21. Data stored off-chain = outsourcing of one important problem (suggestions like Dropbox were mentioned – IMO quite a bad idea) https://www.flickr.com/photos/subcircle/500995147 – CC-BY by flickr user subcircle
  22. How can we prevent people from making false statements in surveys in order to become more interesting for data consumers? https://www.flickr.com/photos/subcircle/500995147 – CC-BY by flickr user subcircle
  23. Bottom line: Very promising, but a long and hard way to go. https://www.flickr.com/photos/subcircle/500995147 – CC-BY by flickr user subcircle
  24. What are your questions? konrad.foerstner.org / @konradfoerstner, zbmed.de / @ZB_MED, th-koeln.de / @th_koeln https://www.flickr.com/photos/nateone/3768979925/ – CC-BY by flickr user nateone
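
Illustration (not part of the talk): slide 15 lists Multi-party Computation (MPC) as one of the underlying concepts. The sketch below shows one simple MPC building block, additive secret sharing, used to compute a sum over private values without any party seeing another party's raw input. It is a toy simulation under stated assumptions: all parties run in one process, everyone follows the protocol honestly, and the names `make_shares`, `secure_sum`, `PRIME` and the sample counts are hypothetical. It is not taken from any of the projects named on the slides.

```python
import secrets

# Modulus for share arithmetic (an illustrative choice, not a recommendation).
PRIME = 2**61 - 1


def make_shares(value, n_parties):
    """Split value into n_parties additive shares that sum to value mod PRIME."""
    # The first n-1 shares are uniformly random; the last one fixes the sum.
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_values):
    """Simulate a sum in which each party only handles shares, not raw inputs."""
    n = len(private_values)
    # Each data owner splits its value and hands one share to every party.
    distributed = [make_shares(v, n) for v in private_values]
    # Party j adds up the j-th share received from every owner.
    partial_sums = [sum(row[j] for row in distributed) % PRIME for j in range(n)]
    # Only the partial sums are published; combined, they yield just the total.
    return sum(partial_sums) % PRIME


if __name__ == "__main__":
    # Hypothetical example: three people each hold a private count.
    counts = [12, 7, 30]
    print(secure_sum(counts))  # prints 49
```

A single share is a uniformly random number and reveals nothing on its own; only the combination of all partial sums discloses the total. Real deployments additionally need secure channels between parties and protection against participants who deviate from the protocol.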