Serverless Data

We introduce Serverless Data: datasets that a developer can reuse without having to set up a server. At the same time, that developer wants the flexibility to build any kind of feature. Do data publishers then have to set up every possible querying API? I don’t believe so: hypermedia data fragments to the rescue!

Download the full data dump of addresses in Flanders from: https://swarmstoraccountprod.blob.core.windows.net/oslo-ttl-dumps-production/adresregister.tar

Check out the autocompletion with fragments demo at http://193.190.127.164/site/index.html (if this link is broken, inform me: we’re going to give it a new home)

The specification: https://github.com/pietercolpaert/TREE

Full paper with benchmarks: http://pieter.pm/icwe2020-autocompletion/

Pieter Colpaert

February 05, 2020

Transcript

  1. Serverless data Pieter Colpaert #osloinspireert 2020-02-05

  2. Serverless data: Removing the need to set up a server when reusing a dataset
  3. What is the ultimate API for a base registry*? (*An important source of authoritative governmental identifiers, e.g., the officially registered addresses, or the IDs for companies and organizations.)
  4. Diagram labels: Publisher; Dataset; Very specific API*; 3rd party; Awesome application. (*I’m looking at you: WMS, GraphQL, WFS, SPARQL, Cypher, ES, HTSQL, DJP, …)
  5. Diagram labels: Publisher; Dataset; 3rd parties; Very specific API, Awesome application; Very specific API, Awesome application; ...
  6. Can you prepare APIs for all datasets for any use case?
  7. Base registry of addresses in Flanders (best-in-class example): it doesn’t even have an API for a simple typeahead
  8. Diagram labels: Publisher; Data dump; 3rd parties; Very specific API, Awesome application; Very specific API, Awesome application; ... Caption: We had to copy a dump and create our own API
  9. Diagram labels: Publisher; Data dump; 3rd parties; Very specific API, Awesome application; Very specific API, Awesome application; ... Caption: Welcome to the replication hell
  10. Solution? We must look at putting the right effort in the right place.
  11. Diagram labels: Publisher; Dataset; Generic data fragments API; 3rd parties; Query SDK, Serverless application; Query SDK, Serverless application; Query SDK, ...
  12. The Flemish address registry: how to. Dump: start populating the first document until full... Raw data dump: https://data.vlaanderen.be/dumps
  13. Documenting a search tree with links. Diagram: DocumentRoot (first 25 elements, …) links to DocA (25 elements that start with an A), DocB, DocC, Doc…; DocA links to DocAbdij, DocAlbert, DocAppe
  14. Autocompleting Albert Street? Multiple HTTP requests needed. Diagram: DocumentRoot (first 25 elements, …) links to DocA (25 elements that start with an A), DocB, DocC, Doc…; DocA links to DocAbdij, DocAlbert, DocAppe
  15. Demo

  16. Becomes possible with an incredibly easy-to-host API. Generic API spec: github.com/pietercolpaert/TREE. The same idea can be applied to: geospatial search trees; looking up data in a time interval; graph patterns; … Full text search
  17. Diagram labels: Publisher; Data publisher dataset; Generic data fragments API; Legacy API; 3rd parties; Query SDK, Serverless application; Query SDK, Serverless application; Legacy application. Caption: But… Didn’t you just create yet another API instead of the ultimate one?
  18. Diagram labels: Publisher; Data publisher dataset; Generic data fragments API; Legacy API; 3rd parties; Query SDK, Serverless application; Query SDK, Serverless application; Query SDK, Docker container, Legacy application. Caption: In time, CPU consumption of legacy APIs can be moved to 3rd parties
  19. Combine the right building blocks, and build the ultimate, and only, API for your base registry. https://pietercolpaert.be
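
The search tree from slides 12 to 14 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the publisher's implementation and not the vocabulary of the TREE specification: the function names `build_fragments` and `autocomplete`, the street names, and the page size of 2 (the slides use 25) are all made up for the example, and a dictionary lookup stands in for each HTTP GET of a document.

```python
from collections import defaultdict

# Hypothetical sample data; the real dump holds the addresses of Flanders.
STREETS = ["Abdijstraat", "Akkerweg", "Albertlaan", "Albertstraat",
           "Appelstraat", "Beekstraat", "Cederlaan"]


def build_fragments(names, page_size=25):
    """Split names into linked documents keyed by prefix ('' is the root)."""
    by_key = {name.lower(): name for name in names}  # case-insensitive keys
    fragments = {}

    def build(prefix, keys):
        # Populate this document until it is full ...
        page, overflow = keys[:page_size], keys[page_size:]
        # ... then push the rest into child documents with a longer prefix.
        children = defaultdict(list)
        for key in overflow:
            children[key[: len(prefix) + 1]].append(key)
        fragments[prefix] = {
            "members": [by_key[k] for k in page],
            "links": sorted(children),  # prefixes of the linked documents
        }
        for child_prefix, child_keys in children.items():
            build(child_prefix, child_keys)

    build("", sorted(by_key))
    return fragments


def autocomplete(fragments, query, fetch_log=None):
    """Client side: follow only the links that can still contain matches."""
    query = query.lower()
    results, to_visit = [], [""]
    while to_visit:
        prefix = to_visit.pop()
        document = fragments[prefix]  # stands in for one HTTP GET
        if fetch_log is not None:
            fetch_log.append(prefix)
        results += [m for m in document["members"]
                    if m.lower().startswith(query)]
        for link in document["links"]:
            # A link is relevant if its prefix and the query are compatible.
            if link.startswith(query) or query.startswith(link):
                to_visit.append(link)
    return sorted(results, key=str.lower)


if __name__ == "__main__":
    fragments = build_fragments(STREETS, page_size=2)
    fetched = []
    print(autocomplete(fragments, "albert", fetched))  # ['Albertlaan', 'Albertstraat']
    print(fetched)  # ['', 'a'] : the root document, then the document for "a"
```

The publisher only runs `build_fragments` once and hosts the resulting documents as static files; all query logic lives in the client, which is why autocompleting a street name takes multiple requests but needs no server-side API.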