
Serverless Data


Introducing Serverless Data: datasets that can be reused without the developer having to set up a server. Developers, however, also want the flexibility to build any kind of feature. Do data publishers then have to set up every possible querying API? I don’t believe so: hypermedia data fragments to the rescue!

Download the full data dump of addresses in Flanders from: https://swarmstoraccountprod.blob.core.windows.net/oslo-ttl-dumps-production/adresregister.tar

Check out the autocompletion with fragments demo at http://193.190.127.164/site/index.html (if this link is broken, inform me: we’re going to give it a new home)

The specification: https://github.com/pietercolpaert/TREE

Full paper with benchmarks: http://pieter.pm/icwe2020-autocompletion/

Pieter Colpaert

February 05, 2020



Transcript

  1. Serverless data
    Pieter Colpaert
    #osloinspireert
    2020-02-05


  2. Serverless data
    Removing the need to set up a server when
    reusing a dataset


  3. What is the ultimate API for a base registry*?
    * Important source of authoritative governmental identifiers
    E.g., the officially registered addresses, or the IDs for companies and organizations


  4. Dataset
    Very specific API*
    Publisher
    3rd party
    Awesome
    application
    * I’m looking at you:
    WMS, GraphQL, WFS, SPARQL,
    Cypher, ES, HTSQL, DJP, …


  5. Dataset
    Very specific API
    3rd parties
    Awesome
    application
    Very specific API
    Awesome
    application
    ...
    ...
    Publisher


  6. Can you prepare APIs for all datasets
    for any use case?


  7. Base registry of addresses in Flanders (best-in-class example):
    Doesn’t even have an API for a simple typeahead


  8. Data dump
    Very specific API
    3rd parties
    Awesome
    application
    Very specific API
    Awesome
    application
    ...
    ...
    We had to copy a dump and
    create our own API
    Publisher


  9. Data dump
    Very specific API
    3rd parties
    Awesome
    application
    Very specific API
    Awesome
    application
    ...
    ...
    Welcome to replication hell
    Publisher


  10. Solution?
    We must put the effort in the right place.


  11. 3rd parties
    Serverless
    application
    Generic data fragments API
    Serverless
    application
    ...
    Dataset
    Query SDK Query SDK Query SDK
    Publisher


  12. The Flemish address registry: how to
    Dump
    Start populating first
    document until full...
    Raw data dump: https://data.vlaanderen.be/dumps
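
The "populate the first document until it is full" step could be sketched roughly as follows. This is a minimal illustration, not the code used for the Flemish address registry: the `Fragment` class, the `add` function, and the rule of descending one character at a time are assumptions made for the example. Only the page size of 25 comes from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List

PAGE_SIZE = 25  # the slides show 25 elements per document


@dataclass
class Fragment:
    """One document of the search tree (hypothetical structure)."""
    prefix: str
    members: List[str] = field(default_factory=list)
    children: Dict[str, "Fragment"] = field(default_factory=dict)


def add(root: Fragment, name: str, page_size: int = PAGE_SIZE) -> None:
    """Store `name` in the first document on its prefix path that has room."""
    key = name.lower()
    node = root
    # Descend while the current document is full and the name still has
    # characters left to extend the prefix with.
    while len(node.members) >= page_size and len(key) > len(node.prefix):
        next_prefix = key[: len(node.prefix) + 1]
        node = node.children.setdefault(next_prefix, Fragment(next_prefix))
    # If the name is exhausted, the document simply overflows (a simplification).
    node.members.append(name)


def build(names: List[str], page_size: int = PAGE_SIZE) -> Fragment:
    """Build the whole tree from a list of names taken from the dump."""
    root = Fragment(prefix="")
    for name in names:
        add(root, name, page_size)
    return root
```

Each `Fragment` would then be written out as one static document, with one link per entry in `children`, so that hosting it requires nothing more than a file server.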


  13. DocumentRoot
    First 25 elements

    DocA
    25 elements that start with
    an A
    DocC
    Doc…
    DocAbdij
    DocAlbert
    DocAppe
    DocB
    Documenting a search tree with links


  14. Autocompleting Albert Street?
    Multiple HTTP requests needed
    DocumentRoot
    First 25 elements

    DocA
    25 elements that start with
    an A
    DocC
    Doc…
    DocAbdij
    DocAlbert
    DocAppe
    DocB
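
On the client side, the "multiple HTTP requests" can be sketched as a traversal that only follows links whose prefix is compatible with what the user typed. The document shape (`members`, `relations`, `prefix`, `node`), the `autocomplete` function, and the example data are assumptions for illustration. They are a simplified stand-in, not the actual serialization of the TREE specification. The `fetch` parameter is any callable that returns a parsed document for a URL, so the sketch runs without a network.

```python
from typing import Callable, Dict, List, Tuple


def autocomplete(query: str, root_url: str, fetch: Callable[[str], Dict],
                 limit: int = 10) -> Tuple[List[str], int]:
    """Return matching names and the number of documents fetched."""
    q = query.lower()
    results: List[str] = []
    seen = set()
    queue = [root_url]
    requests = 0
    while queue and len(results) < limit:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        doc = fetch(url)  # one HTTP GET in a real client
        requests += 1
        results += [m for m in doc["members"] if m.lower().startswith(q)]
        for relation in doc["relations"]:
            prefix = relation["prefix"]
            # Follow the link if the subtree may hold matches: either the
            # query extends the prefix, or the prefix extends the query.
            if q.startswith(prefix) or prefix.startswith(q):
                queue.append(relation["node"])
    return results[:limit], requests


# Hypothetical in-memory documents standing in for hosted files.
EXAMPLE_DOCS = {
    "root": {"members": ["Aalststraat", "Bakkerstraat"],
             "relations": [{"prefix": "a", "node": "a"},
                           {"prefix": "b", "node": "b"}]},
    "a": {"members": ["Abdijstraat", "Akkerstraat"],
          "relations": [{"prefix": "al", "node": "al"},
                        {"prefix": "ap", "node": "ap"}]},
    "al": {"members": ["Albertstraat", "Albert I-laan"], "relations": []},
    "ap": {"members": ["Appelstraat"], "relations": []},
    "b": {"members": ["Beekstraat"], "relations": []},
}
```

With this example data, `autocomplete("Albert", "root", EXAMPLE_DOCS.__getitem__)` visits `root`, `a`, and `al` and skips the `b` and `ap` documents entirely. Because the documents are static, each of those requests can be served from an HTTP cache.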


  15. Becomes possible with an incredibly easy-to-host API
    Generic API spec: github.com/pietercolpaert/TREE
    Same idea can be applied on
    - Geospatial search trees
    - Looking up data in a time interval
    - Graph patterns
    - …
    Full text search
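
To illustrate how the same idea might carry over to a time interval, the sketch below decides whether a link is worth following when the client looks for members dated within `[start, end)`. The relation labels `GreaterThanOrEqualTo` and `LessThan`, the dictionary shape, and the `may_contain` function are hypothetical names chosen for this example. Consult the TREE specification for the actual vocabulary.

```python
from datetime import date
from typing import Dict


def may_contain(relation: Dict, start: date, end: date) -> bool:
    """Can the linked document hold members t with start <= t < end?"""
    kind, value = relation["type"], relation["value"]
    if kind == "GreaterThanOrEqualTo":
        # Every member behind this link satisfies t >= value.
        return end > value
    if kind == "LessThan":
        # Every member behind this link satisfies t < value.
        return start < value
    # Unknown relation type: follow the link so no results are missed.
    return True
```

A client would call `may_contain` for every link in a document and fetch only those for which it returns `True`, just as the prefix check prunes links in the autocompletion case.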


  16. 3rd parties
    Serverless
    application
    Generic data fragments API
    Serverless
    application
    Legacy API
    Data publisher
    dataset
    Query SDK Query SDK
    Legacy
    application
    But… Didn’t you just create
    yet another API instead of
    the ultimate one?
    Publisher


  17. 3rd parties
    Serverless
    application
    Generic data fragments API
    Serverless
    application
    Legacy API
    Data publisher
    dataset
    Query SDK Query SDK
    Query SDK
    Legacy
    application
    Docker
    container
    In time, CPU consumption of legacy
    APIs can be moved to 3rd parties
    Publisher


  18. Combine the right building blocks,
    and build the ultimate – and only – API
    for your base registry.
    https://pietercolpaert.be
