Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Offline-First: Architektur-Patterns & Implementierung für APIs & Backends

Offline-First: Architektur-Patterns & Implementierung für APIs & Backends

"Keine Internetverbindung" – ein Satz, den wir alle aus vielen Apps kennen, sei es Desktopanwendung oder Smartphone App. Oftmals sind die Clients auf dem jeweiligen Zielgerät installiert, bedienen sich aber einer externen Datenquelle, bspw. in Form eines HTTP APIs. Wie man sein Backend um eine offline-fähige Architektur erweitert, um es bspw. von einer SPA/PWA nutzen zu können, zeigen Thomas Hilzendegen und Manuel Rauber von Thinktecture in diesem Workshop. Denn bei einer offline-fähigen Implementierung sind nicht nur Lese-, sondern vor allem auch Schreibzugriffe wichtig. Sprich, die Daten müssen in beide Richtungen synchronisiert werden können, sobald die Internetverbindung wiederhergestellt wurde. Allgemein werden Themen im API Design, u.a. Synchronisationsstrategien von Anwendungs- und großen Binärdaten, Konfliktmanagement oder Datenbankarchitektur – sowohl für neue als auch bestehende Anwendungen – diskutiert. Am Beispiel einer PWA mit einem Backend in Java als auch in .NET Core werden Client und Server zur Umsetzung der besprochenen Konzepte und Patterns gezeigt.

GitHub: https://github.com/thinktecture/thinktecture-boardist

Manuel Rauber

December 11, 2019
Tweet

More Decks by Manuel Rauber

Other Decks in Programming

Transcript

  1. Thomas Hilzendegen
    @hilzendegent
    Consultant
    Manuel Rauber
    @ManuelRauber
    Consultant
    Architektur-Patterns & Implementierung
    Offline First APIs & Backends

    View Slide

  2. Thomas Hilzendegen
    Speakers
    Manuel Rauber
    [email protected]
    @hilzendegent
    [email protected]
    @manuelrauber
    https://manuel-rauber.com
    Microsoft MVP

    View Slide

  3. Timetable
    Time Doing
    09:00 - 10:30 Part I
    10:30 - 11:00 ☕
    11:00 - 12:30 Part II

    View Slide

  4. • Progressive Web App
    • Offline availability
    • Offline synchronization
    • Change tracking
    • Security considerations
    • Pitfalls
    Agenda

    View Slide

  5. Intro

    View Slide

  6. Offline… but why?

    View Slide

  7. • Just because the phone has a connection,
    e.g. Edge, it does not mean we are online
    • Depends on the use case which connection
    quality is required to determine, if the
    app is reliable online
    • Connection quality could be measured
    by the time an exclusive request takes
    • Duration < 150 ms: online
    • Duration !>=150 ms: bad connection
    What does “offline” mean?

    View Slide

  8. • A lot of reasons to be offline
    • Traveling, Train, Flights (bad or no signal)
    • Server is not available
    • Routing problems
    • Roaming
    • Costs
    Motivation

    View Slide

  9. Demo Application

    View Slide

  10. • A little application to manage boardgames
    • Angular, .NET Core, MS SQL Server
    • Brownfield application, started as a pure online application
    • As typical for any good demo: no security
    • Available on Azure: https://tt-boardist.azurewebsites.net
    • GitHub: https://github.com/thinktecture/thinktecture-boardist
    Demo Application “Thinktecture Boardist”

    View Slide

  11. • Backend
    • .NET Core 3.0
    • SQL Server
    • Frontend
    • Angular 8.2.13
    Demo Application Specs

    View Slide

  12. Demo Application Database Scheme

    View Slide

  13. Live Demo

    View Slide

  14. Now let’s go offline!

    View Slide

  15. Does it work in flight mode?
    NOPE!

    View Slide

  16. Progressive Web App

    View Slide

  17. Live Demo

    View Slide

  18. Does it work offline now?
    STILL NOPE!

    View Slide

  19. Real offline availability

    View Slide

  20. • ServiceWorker is only able to take data offline which has been requested by
    the application
    • If all the URLs are known beforehand, the ServiceWorker could cache them
    all
    • Data which was not requested, is not available offline (no real offline
    synchronisation)
    • But what about … the Background Sync API?
    What’s the problem?

    View Slide

  21. • Name is misleading
    • Does not offer any data synchronisation possibilities
    • Just raises an event periodically or after a delay (optionally based on the
    network type)
    • It’s totally up to the developer what to do in the onsync event
    • Since the sync is done in the ServiceWorker, the page can be closed, the
    sync will be fulfilled anyway
    • It does not help anything with syncing your actual data!
    Background Sync API

    View Slide

  22. Background Sync API

    View Slide

  23. • Cookies (not meant for large data or binaries)
    • Web Storage like session storage or local storage (not meant for large data
    or binaries)
    • IndexedDB
    • Cache Storage (based on request/response)
    Storage capabilities

    View Slide

  24. User can clear anything
    at any time by
    “Remove temporary internet files”
    or “Clear website data”

    View Slide

  25. • Key-value database of the browser
    • Stores data permanently
    • ServiceWorker and Web App share access to the same IndexedDB
    • Possibility of scenarios, where the ServiceWorker (or Web App) stores
    synchronised data in the IndexedDB and the Web App reads the data
    • Like everything based on the origin
    IndexedDB

    View Slide

  26. IndexedDB Availability

    View Slide

  27. • The standard API of IndexedDB is inconvenient to use (lots of callback)
    • Dexie.js is a minimalistic wrapper for IndexedDB
    • Operations are promise-based
    • Near native performance (even for bulk inserts)
    • Open Source @ GitHub: https://github.com/dfahlander/Dexie.js
    IndexedDB API

    View Slide

  28. Offline Sync

    View Slide

  29. • Offline Sync means to download all data available to client into a persistent
    offline storage, without the user having to explicitly request the data
    • Depending on the scenario, client can do CRUD on the offline data
    • Data will be synced back to the server, whenever a connection is possible
    Offline Sync Basics

    View Slide

  30. • Online/offline recognition
    • Conflict management
    • Binary data
    • Local changes
    • Deleting data
    • Update interval (incoming new data)
    • Error handling
    • Primary key generation
    Offline Sync Challenges

    View Slide

  31. • Having Edge may not mean that we are online
    • Being online could depend on several factors
    • Connection quality (Edge, 3G, 4G, 5G)
    • Connection speed (latency to your backend)
    • Reachability of backend systems (database, 3rd party system, storages)
    Offline Sync Challenges - Online/offline recognition

    View Slide

  32. • Client needs to be online for write operations
    • Locks the data, so no other client can overwrite it
    • Data stays locked, until the client either saves or discards changes
    • Last One (Write) Wins
    • Visual conflict management (diffing like in Git, SVN, etc.)
    Offline Sync Challenges - Conflict management

    View Slide

  33. • Downloading large binary data may not be possible on all platforms
    • A browser does not have a big enough storage for storing the data
    • Native apps (Cordova, Electron) could be a solution to directly access the
    platform’s file system
    • Uploading large binary data
    • Server needs to support chunk based upload with connection interruptions
    • Client needs permanent access to the binary file being uploaded, which
    could be hard in a browser-only scenario
    Offline Sync Challenges - Binary data

    View Slide

  34. • User opens an edit form and makes changes to data
    • Incoming sync from server would change the data the user is editing
    • Depending on your use-case, decide what to do
    • Inform the user about the data change?
    • Show a visual diff?
    • Override the local changed data?
    • “Do nothing” and override the server data on next sync
    Offline Sync Challenges - Local changes

    View Slide

  35. • Mark “deleted” data with a flag only
    • Use a trigger to keep the deleted ID in a separate table
    • Client needs some info about “deleted” data
    • Simple list of IDs “deleted” since the last sync
    Offline Sync Challenges - Deleting data

    View Slide

  36. • Decide when you need to sync which data
    • Not all data needs to be synced in the same interval
    • An additional real time connection (WebSocket) could be established
    • The server can send a signal when some data needs to be synchronised
    Offline Sync Challenges - Update interval

    View Slide

  37. • Show it, don’t hide them!
    • The user needs to know that something has occurred which shouldn’t
    • If you can’t sync to the client anymore, you may need to re-sync everything
    • Depending on your use case, you may want to write all data in one
    transaction on the server to not have partial data written into the database
    Offline Sync Challenges - Error handling

    View Slide

  38. Primary key generation

    View Slide

  39. • Inserted offline data needs to be available in the client immediately (e.g. the
    users reloads the application)
    • Behaves like “real synced data”, even if it was not written to server yet
    • Client is able to route to offline inserted data
    • Therefor it needs some kind of primary key/id to route to
    • Primary key can be generated on server-side and on client-side
    Primary key generation - challenge

    View Slide

  40. • Server is responsible for generating the primary key
    • Client will generate a temporary offline primary key
    • After syncing, the client needs to replace the temporary offline key with the
    server generated primary key
    • Allows to use sequential GUIDs on the server side (MSSQL)
    Primary key generation - server-side

    View Slide

  41. • Client is responsible for generating the primary key (GUIDs only)
    • Server inserts the client generated key into the database as the primary key
    • Server additionally needs to have a clustered key
    • Otherwise inserting a client non-sequential GUID will lead to a
    reclustering of the database, every time a client syncs new data
    Primary key generation - client-side

    View Slide

  42. Change Tracking

    View Slide

  43. • Possibilities to determine the client’s current state
    • Use highest rowversion within result
    • Determine by result set or additional query
    • Result of @@DBTS
    • Returns the last used rowversion of the database
    • Result of MIN_ACTIVE_ROWVERSION()
    • Returns the lowest (active) rowversion of the database
    Change Tracking - MS SQL Server’s ROWVERSION

    View Slide

  44. • If you do a million updates a second, every second, the timestamp will wrap
    around in about 585000 years
    • Could be used for concurrency checks as well (see Conflict Management)
    • Before writing data into a row, you could check if it still as the client’s
    saved row version
    • If yes: data was not modified, saving is easily possible
    • If no: data was modified, decide per use case what to do!
    Change Tracking - MS SQL Server’s ROWVERSION

    View Slide

  45. Change Tracking - Highest ROWVERSION within result
    Transaction #1
    ID ROWVERSION
    1 0x01
    2 0x02
    … …
    41 0x41
    Transaction #2
    ID ROWVERSION
    1 0x01
    2 0x02
    … …
    41 0x41
    42 0x42 43 0x43
    SELECT MAX(ROWVERSION) FROM TABLE
    Client’s state
    Last known rowversion 41
    43
    NOT RELIABLE!

    View Slide

  46. • A later started transaction but committed first results in a queryable
    rowversion higher than the pending ones
    • Data will be lost (never seen by the client’s delta request)
    Change Tracking - Highest ROWVERSION within result

    View Slide

  47. Change Tracking - Query @@DBTS
    Transaction #1
    ID ROWVERSION
    1 0x01
    2 0x02
    … …
    41 0x41
    Transaction #2
    ID ROWVERSION
    1 0x01
    2 0x02
    … …
    41 0x41
    42 0x42 43 0x43
    SELECT @@DBTS
    Client’s state
    Last known rowversion 41
    42
    43
    NOT RELIABLE!

    View Slide

  48. • Value is database global
    • Running transaction updates the value instantly (before commit or rollback)
    • Data will be lost (never seen by the client’s delta request)
    Change Tracking - Query @@DBTS

    View Slide

  49. Change Tracking - Query MIN_ACTIVE_ROWVERSION()
    Transaction #1
    ID ROWVERSION
    1 0x01
    2 0x02
    … …
    41 0x41
    Transaction #2
    ID ROWVERSION
    1 0x01
    2 0x02
    … …
    41 0x41
    42 0x42 43 0x43
    SELECT MIN_ACTIVE_ROWVERSION()
    Client’s state
    Last known rowversion 41
    42
    44
    RELIABLE!

    View Slide

  50. • Value is database global
    • Running transaction updates the value instantly (before commit or rollback)
    • Data will be safe
    • Needs to be compared with greater or equal
    • May query data multiple times
    Change Tracking - Query MIN_ACTIVE_ROWVERSION()

    View Slide

  51. Backend Preparation

    View Slide

  52. • All syncable entities need to have rowversion column
    • rowversion is updated by MS SQL Server automatically whenever the row
    is changed (created & updated)
    • For deleted entities
    • Either set a IsDeleted flag to true (never delete any rows physically)
    • Or save the deleted IDs of the entities somewhere else (by trigger)
    Backend Preparation - MS SQL Server

    View Slide

  53. • Use equivalents of rowversion and triggers
    • Manual implement mechanism in business logic (error-prone!)
    • Update tracking column manually by incrementing a database global
    number (during one transaction!)
    • Will be very hard für multi-row updates/inserts
    • Manual implement mechanism in triggers (if available)
    Backend Preparation - Other Database Systems

    View Slide

  54. Frontend Preparation

    View Slide

  55. • Choose storage area for data (e.g. IndexedDB)
    • Write all the code
    • Periodic data synchronization
    • Binary synchronization when data changes
    • Tracking of timestamps
    Frontend Preparation

    View Slide

  56. Security

    View Slide

  57. Typical token-based security
    Security - API level
    Browser
    Identity
    Provider
    Web API

    View Slide

  58. • User is only able to see data based on his security level
    • Permissions
    • Roles
    • Policies
    • What happens, if the user permission change is leading to different data
    visible to the user?
    • What about data, which he does not see due to rights, but is connected to
    other data?
    • Remove frontend data after logout
    Security - Data Level

    View Slide

  59. Pitfalls

    View Slide

  60. • De-normalize the relational data (document style)
    • Results in multiple rowversion for one entry (use most recent one)
    • Multiplies the data (more traffic)
    • Consistent data for one entry
    • Keep the relations up to the frontend
    • Explicit rowversion for each entry
    • Partial consistency (related data may not be synced yet)
    Relational Data

    View Slide

  61. • Keep some relational data and de-normalize some of it
    • De-normalize many-to-many relations
    • Needs trigger or business logic to change main entry’s rowversion when
    relation changes
    • Partial consistency (related data may not be synced yet)
    Relational Data

    View Slide

  62. • PWA helps taking offline the application, but not the data
    • Online != Edge is available
    • Offline data may be “temporary” (“Remove temporary internet files”)
    • Use MIN_ACTIVE_ROWVERSION() and “greater or equal than” operator
    • Think about data level security (permissions, roles, etc.)
    • Think about conflict management
    • Depends heavily on your use case
    Summary

    View Slide

  63. And… now?
    • Slides: https://speakerdeck.com/manuelrauber
    • Repository: https://github.com/thinktecture/thinktecture-boardist

    View Slide