Ric Roberts: Data Platforms and the Data Value Chain

Swirrl
June 15, 2017

This deck supports Ric's talk on the wider context of data publishing: where the act of publishing sits within it, what it entails and the advantages it brings. For all talks from the day, see http://power-of-data-2017.swirrl.com/

Transcript

  1. Data Platforms
    and the
    Data Value Chain
    CTO, @RicRoberts

  2. data

  3. data
    You are here.

  4. profit!

  5. profit!
    You want to be here.

  6. profit!
    Examples
    • Deciding the best place to put a new school
    • Benchmarking hospitals
    • Working out the impact of poor air quality on health
    • Calculating the cost of increased flood risk

  7. [Slide without text]

  8. data
    ?
    profit!

  9. data
    collect
    clean
    curate

  10. Examples
    • Surveys of various sorts
    • Administrative systems
    • Sensors e.g. detecting river flow
    • Social media
    data
    collect
    clean
    curate

  11. data
    ?
    profit!

  12. data
    use
    profit!

  13. use
    • exploring
    • filtering
    • aggregating
    • downloading
    • exporting
    • analysing (data science!)
    • generating reports (xls, pdf, doc, ppt)
    • using it in interactive apps or visualisations
    • sharing results
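
Two of the uses listed above, filtering and aggregating, can be sketched in a few lines. This is only an illustration: the observations, region names and figures are invented for the example and do not come from the talk.

```python
from collections import defaultdict

# Hypothetical observations: (region, year, hospital admissions).
observations = [
    ("North", 2015, 120),
    ("North", 2016, 135),
    ("South", 2015, 90),
    ("South", 2016, 110),
]

# Filtering: keep only the rows for one year.
rows_2016 = [row for row in observations if row[1] == 2016]

# Aggregating: total admissions per region across all years.
totals = defaultdict(int)
for region, _, admissions in observations:
    totals[region] += admissions

print(rows_2016)
print(dict(totals))
```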

  14. data
    ?
    use
    profit!

  15. data
    connect
    use
    profit!

  16. connect
    • A common set of names for the things in the data.
    • A shared, documented and understood model of the data.
    • An agreed set of technologies for communicating and
    manipulating the data (standards!).
    • The data needs to be in a place people can get to it, in a
    relevant format (with a licence).

  17. https://www.flickr.com/photos/kewl/7006904747

  18. connect
    In computing, linked data is a method of publishing
    structured data so that it can be interlinked and become
    more useful through semantic queries. It builds upon
    standard Web technologies such as HTTP, RDF and URIs, but
    rather than using them to serve web pages for human
    readers, it extends them to share information in a way that
    can be read automatically by computers. This enables data
    from different sources to be connected and queried.
    — Wikipedia
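
The idea in the quote above can be sketched without any RDF tooling. The example below is a minimal illustration, not a real triple store or SPARQL engine: data is held as (subject, predicate, object) triples, things are named with URIs, and two sources become joinable because they reuse the same URIs. Every URI under example.org, the property names and the figures are assumptions made up for the sketch.

```python
# Hypothetical URIs throughout: example.org is a placeholder domain.
AREA = "http://example.org/id/area/"
PROP = "http://example.org/def/"

# Source A: a statistics publisher describes population by area.
source_a = {
    (AREA + "E1", PROP + "population", 1200),
    (AREA + "E2", PROP + "population", 800),
}

# Source B: another body describes air quality for the same areas,
# reusing the same area URIs, which is what makes the data linkable.
source_b = {
    (AREA + "E1", PROP + "airQualityIndex", 7),
    (AREA + "E2", PROP + "airQualityIndex", 3),
}


def match(graph, subject=None, predicate=None, obj=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in graph
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Merging the two sources is a set union, because identifiers are shared.
merged = source_a | source_b


def people_exposed(graph, min_index):
    """Sum the population of areas whose air quality index >= min_index."""
    total = 0
    for area, _, index in match(graph, predicate=PROP + "airQualityIndex"):
        if index >= min_index:
            for _, _, pop in match(graph, subject=area,
                                   predicate=PROP + "population"):
                total += pop
    return total


print(people_exposed(merged, 5))
```

In practice a graph store and a query language such as SPARQL would do this pattern matching; the sketch only shows why shared names make the join possible.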

  19. connect
    • A common set of names for the things in the data.
    • A shared, documented and understood model of the data.
    • An agreed set of technologies for communicating and
    manipulating the data (standards!).
    • The data needs to be in a place people can get to it, in a
    relevant format (with a licence).

  20. connect
    • A common set of names for the things in the data.
    • A shared, documented and understood model of the data.
    • An agreed set of technologies for communicating and
    manipulating the data (standards!).
    • The data needs to be in a place people can get to it, in a
    relevant format (with a licence).

  21. connect
    • A common set of names for the things in the data.
    • A shared, documented and understood model of the data.
    • An agreed set of technologies for communicating and
    manipulating the data (standards!).
    • The data needs to be in a place people can get to it, in a
    relevant format (with a licence).

  22. connect
    • A common set of names for the things in the data.
    • A shared, documented and understood model of the data.
    • An agreed set of technologies for communicating and
    manipulating the data (standards!).
    • The data needs to be in a place people can get to it, in a
    relevant format (with a licence).

  23. https://www.flickr.com/photos/iwannt/8596885627

  24. data
    ?
    connect
    use
    profit!

  25. data
    publish
    connect
    use
    profit!

  26. swirrl.com

  27. publish
    An (RDF) Graph Store
    Apache Jena

  28. publish
    Extract, Transform, Load (ETL)
    grafter.org github.com/swirrl/grafter
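
As a rough picture of what an ETL step does when publishing data as triples, here is a minimal sketch. It is not Grafter's API (Grafter is a separate tool and is not used here): the CSV content, the example.org URIs and the function names `extract`, `transform` and `load` are all assumptions for illustration, and a plain Python set stands in for the graph store.

```python
import csv
import io

# Hypothetical input: a small CSV as a publisher might receive it.
RAW_CSV = """area_code,year,population
E1,2016,1200
E2,2016,800
"""

BASE = "http://example.org/"  # placeholder domain, an assumption


def extract(text):
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: turn each row into (subject, predicate, object) triples."""
    triples = []
    for row in rows:
        obs = f"{BASE}data/population/{row['area_code']}/{row['year']}"
        triples.append((obs, BASE + "def/area",
                        BASE + "id/area/" + row["area_code"]))
        triples.append((obs, BASE + "def/year", int(row["year"])))
        triples.append((obs, BASE + "def/population", int(row["population"])))
    return triples


def load(triples, store):
    """Load: add triples to the store (a set standing in for a graph store)."""
    store.update(triples)
    return store


store = load(transform(extract(RAW_CSV)), set())
print(len(store))
```

Keeping the three stages as separate functions means the transform can be checked on its own before anything reaches the store.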

  29. publish
    Drafting and publication workflow

  30. publish
    A User Interface

  31. publish
    APIs

  32. [Slide without text]

  33. data
    publish
    connect
    use
    profit!
    collect
    clean
    curate
    What’s limiting the effectiveness of
    this value chain?
    • Cottage industry of skilled individuals
    • Data preparation does not always consider the bigger picture
    • Those bearing the costs != those reaping the benefits
    • Availability of skilled data analysts
    • Lack of guidance and standardisation

  34. data
    publish
    connect
    use
    profit!
    collect
    clean
    curate

  35. Data Platforms
    and the
    Data Value Chain
    CTO, @RicRoberts
