Data Layer Philosophy

Matthew Bentley
August 22, 2024

Ever wondered what a data layer is? This deck (printed to PDF to include notes) was delivered at Brighton SEO, MeasureFest in October 2024. It covers the underlying business drivers for creating a data layer, which are vital for deciding its architecture, design and maintenance: what it should look like, what it should contain, how it should be built, and who should own and maintain it.

These key decisions need to be made upfront, before any development starts, but they are often missed in data layer projects that make the mistake of treating the data layer as a purely technical exercise. The result is a failure to realise the full benefit of the data layer: poor data, technical debt and inefficiency, broken processes, overworked and disgruntled teams, and a lack of trust in the data, ultimately resulting in increased cost and lost revenue opportunities.

Use this deck as a starting point to avoid these pitfalls. I hope it helps!

Transcript

  1. Data Layer Philosophy
     Matt Bentley @mattdoesdata @Matts_At_Work Speakerdeck.com/mattsatwork24

     1. Here today because I’m passionate about helping companies to deliver value from their first-party data.
     2. You can do that if you’ve got consistent, quality data across your whole estate. This lets you join data together, find those nuggets and supercharge your marketing, make your customer journeys seamless, or fix that bug that’s causing users to drop out!
     3. You get consistent, quality digital data by deploying a DATA LAYER!!
     4. In my 15 years working in digital data, I’ve been lucky enough to design data architecture and data layer schemas for some of the biggest companies in the world.

     You’re gonna learn:
     • WHAT IS A DATA LAYER & DO I NEED ONE
     • HOW TO APPROACH A DATA LAYER PROJECT
     • WHO SHOULD BUILD AND MAINTAIN THE DATA LAYER
     • HOW TO MAKE YOUR DATA LAYER PROJECT A SUCCESS
  2. Not to say these definitions are wrong, but…
     But they don’t really explain: what is a data layer for?
     • JS object = output of the DL (and only one possible output: poor old native mobile apps / IoT?)
     • Surely the schema definition (what the DL contains; how it works) = more interesting
     By only describing the end-point / output of the data layer process, they ignore:
     • The core principles that underpin the architecture
     • Why the data layer is how it is (what needs does it meet)
     • How it needs to be built (which feeds into who is to build it, in turn cycling back to what, or whose, needs it meets)
     Car analogy: STARTING POINT: it’s got to be green, sporty, in-car entertainment… oops, we forgot the wheels!
  3. So do you need one?
     If you have lots of data transformations across lots of platforms that need to deliver information in a consistent way, then yes.
     If you have a logged-in area, products to display, a checkout etc., and there might be multiple technologies (e.g. storefront on a different platform to checkout; app vs web), then you need a DL to deliver that data consistently. Otherwise, you’re not going to be able to join that data together and track performance effectively:
     • Multi-billion electrical retailer – fully featured mega data layer pulling data from multiple backend systems that has to be linked together to tell the story of the customer journey
     • Sister company (20 mil) – ecommerce only, because their storefront and checkout were on different platforms
     • 100 billion food / bev company with 1000s of sites – no data layer, BUT looking into one for consistent promotion tracking to understand marketing performance cross site, brand and region
  4. DL projects fail because they treat it as a technical exercise rather than focusing on what it's meant to deliver:
     • Data consistency (cross platform)
     • Data quality
     It does this through:
     • Simplification / efficiency (i.e. easy to deploy without cocking it up)
     It does this to deliver REVENUE & COST SAVINGS
  5. BASIC LEVEL - human interaction is simple
     • Input (I say something to you)
     • Decision (you think and decide on a response)
     • Output (you respond)
  6. DEVICE INTERFACES REFLECT THIS – TAP, DECIDE WHAT TO SHOW (OFTEN SIMPLE LOGIC, OFTEN OFF-DEVICE), PROVIDE A VIEW
     DATA LAYERS ONLY REALLY NEED TO REFLECT 2 OF THESE – INPUT (tap) + OUTPUT (view)
     EVERYTHING ELSE IS JUST CONTEXT!
     We run the data layer for that complex electrical retailer OFF THESE 2 EVENTS… lots of context to be sure, but at its core…
     * Cards on the table: + 3 business-level events (e.g. describing products, orders, baskets)
     Example: the checkout vs bespoke tag example – when you use the data layer, if you do it right, it JUST WORKS
  7. CONVERSATION between the customer and a business… IN A STANDARDISED WAY
     If that project / feature you’re building is intended to run on any kind of digital device, it MUST use that device’s interface: SOME COMBINATION OF INPUT / OUTPUT + CONTEXTUAL DATA
     • That chat widget doesn’t need special “chat” events
     • Logged-in account-level journeys don’t need special account-level events
     STANDARDISE AND SIMPLIFY PHILOSOPHY feeds into the next level of architectural consideration…
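A minimal sketch of the input / output + context idea from the notes above. The names (`InteractionEvent`, `kind`, `context`, and the two helper functions) are assumptions made for illustration; the deck does not prescribe field names.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Literal

# Hypothetical model: the deck names the two core events (input, output)
# but not their fields; everything below is an illustrative assumption.
EventKind = Literal["input", "output"]


@dataclass(frozen=True)
class InteractionEvent:
    kind: EventKind  # "input" (e.g. a tap) or "output" (e.g. a view)
    context: Dict[str, Any] = field(default_factory=dict)  # everything else


def chat_message_sent(text_length: int) -> InteractionEvent:
    # A chat widget needs no special "chat" event: it is an input plus context.
    return InteractionEvent(
        kind="input",
        context={"component": "chat_widget", "action": "send", "text_length": text_length},
    )


def checkout_viewed(step: str) -> InteractionEvent:
    # A checkout step is an output (a view) plus context.
    return InteractionEvent(kind="output", context={"component": "checkout", "step": step})
```

Both journeys produce the same record shape, so anything reading the events only has to understand one structure.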
  8. THE DATA LAYER IS A RESOURCING SOLUTION
     Built in a way to get the right work to the right people (STRENGTHS / WEAKNESSES)
     Example problem: 4 tech x 100 data points x 20 mapping x 4 consent groups = 32,000 transformations
     THESE TRANSFORMATIONS WILL STILL EXIST EVEN WITH A DATA LAYER… so who should do them?
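The arithmetic behind the example problem, as a quick check. The four figures come from the notes above; the variable names are illustrative only.

```python
# Figures from the example problem; variable names are illustrative.
technologies = 4
data_points = 100
mappings = 20
consent_groups = 4

transformations = technologies * data_points * mappings * consent_groups
print(transformations)  # 32000
```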
  9. Business stakeholders, Implementation, iOS, Android, CMS, React
     Implementation team (historically: understand data collection, architecture, tag management, analytics et al.):
     1. LOW RESOURCE – they can’t do your 32,000 transformations in the TMS
     2. BUT - horizontal matrix covering many teams (good!!)
     3. THEREFORE: RESPONSIBLE: Translate business requirements to define and maintain the data layer schema
     4. BUT: Lack of understanding of dev teams (strengths / weaknesses)
     Development team(s) (historically: build and maintain multiple platforms (web, app, IoT etc.), with a wide range of responsibilities; data may not be one):
     1. HIGH RESOURCE – they CAN do your 32,000 transformations
     2. BUT - vertical matrix (differing product owners + responsibilities); lots of individuals; mixed skill sets and skill levels
  10. Development team(s), continued:
      3. Has skillsets to standardise / automate (good!!)
      4. BUT: not data experts; this will NOT change (unreasonable to expect it to: other responsibilities, high turnover due to volume etc.)
      5. THEREFORE: RESPONSIBLE: Build and maintain the data layer as part of maintaining the platform, ensuring it meets defined standards
      THEREFORE DATA LAYER HAS TO BE DESIGNED IN A WAY THAT IS EASY FOR DEVS TO BUILD AND MAINTAIN CONSISTENTLY
      Path of least resistance – MAKE IMPLEMENTATION TEAM UNDERSTAND DEV PROCESSES (only 1-2 people instead of 100s / 1000s)
      SIMPLE = CONSISTENT; CONSISTENT = CROSS PLATFORM + DATA QUALITY
  11. 1. Remove anything bespoke (human error) – standardisation allows for automation, which in turn drives consistency and quality. Your project is not special or unique if it is going to be delivered via any kind of digital interface! It doesn’t require unique events
      2. Rigorous acceptance criteria for anything new to be added to the data layer
  12. 1. Keep things as simple as possible - again, simple is easy to automate, which drives consistency and quality
      2. Make the data layer event-driven (platform agnostic again: not tied to a browser event for a browser that may not exist, e.g. in a native app)
      3. Tie data objects to events (rigidly controlled) – the same event ALWAYS has the same schema (added advantage: easier to QA too!)
      4. Make it centralised and delivered from the server = consistent cross platform – as little on-platform re-factoring as possible
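One way to picture "the same event ALWAYS has the same schema" is a central registry that refuses any payload whose keys differ from those registered for the event. This is only a sketch: the event names, field names and the `build_event` helper are hypothetical, not taken from the deck.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical registry: event and field names are illustrative assumptions.
EVENT_SCHEMAS: Dict[str, FrozenSet[str]] = {
    "input": frozenset({"component", "action"}),
    "output": frozenset({"component", "view"}),
}


def build_event(name: str, data: Dict[str, Any]) -> Dict[str, Any]:
    """Return an event record, rejecting unknown events and any payload
    whose keys differ from the schema registered for that event."""
    if name not in EVENT_SCHEMAS:
        raise ValueError(f"unknown event: {name!r}")
    expected = EVENT_SCHEMAS[name]
    actual = frozenset(data)
    if actual != expected:
        missing = sorted(expected - actual)
        extra = sorted(actual - expected)
        raise ValueError(f"{name}: missing={missing} extra={extra}")
    return {"event": name, "data": dict(data)}
```

Because the check lives in one place, a web build and a native app build that both call it cannot drift apart on what an event contains.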
  13. {
        "$ref": "#/definitions/ProductEvents",
        "$schema": "http://json-schema.org/schema#",
        "definitions": {
          "ProductEvents": {
            "additionalProperties": false,
            "description": "Indicates that an event has occurred",
            …
      1. Make it modular (same principles as modern development: a suite of centralised, re-usable data components, aggregated into the overarching data layer AUTOMATICALLY in a centrally managed way = automatic and globally consistent)
      2. (CLICK) Bake the modules into underpinning platform components (hard upfront but makes it automated) - e.g. this content block can be re-used anywhere on the platform; the same data object will appear wherever the component appears AND the same mechanism of processing that object into the overall data layer (because it is centralised)
  14. 1. Make the data layer granular:
         1. Removes tech debt from devs (e.g. if the one who did the processing for you leaves, what do you do?)
         2. Allows more freedom to format the data how you want and manage that within the implementation team, who also own the schema
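To illustrate the granularity point: if developers emit raw values only, any formatting for downstream tools can live with the implementation team. The record shape, field names and the two "vendor" formats below are assumptions made up for this sketch.

```python
from typing import Any, Dict

# Hypothetical granular record as developers might emit it: raw values only,
# no vendor-specific formatting. Field names are assumptions.
granular_product = {"sku": "ABC-123", "price_minor_units": 1999, "currency": "GBP"}


def to_vendor_a(product: Dict[str, Any]) -> Dict[str, Any]:
    # Owned by the implementation team: decimal price string.
    units = product["price_minor_units"]
    return {"id": product["sku"], "price": f"{units // 100}.{units % 100:02d}"}


def to_vendor_b(product: Dict[str, Any]) -> Dict[str, Any]:
    # Owned by the implementation team: combined price-and-currency label.
    units = product["price_minor_units"]
    return {"item": product["sku"], "value": f"{units / 100:.2f} {product['currency']}"}
```

If a downstream format changes, only these functions change; the data the developers emit stays the same.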
  15. 1. Automate validation – get that schema out of Excel and put it to work!
         1. Example here is a JSON schema
         2. There are publicly available validation scripts
         3. It will return errors for your devs, allowing them to validate their own work without the low-resource implementation team being a bottleneck
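A sketch of what automated validation can look like. It checks only a small subset of JSON Schema (`type`, `required`, `properties`, `additionalProperties: false`) using the standard library, so it is not a full validator; in practice a published validator (for Python, the `jsonschema` package is one) would be the likelier choice. The schema is a made-up example, not the one shown in the deck.

```python
from typing import Any, Dict, List

# Hypothetical schema for illustration; NOT the deck's (truncated) schema.
PRODUCT_EVENT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "additionalProperties": False,
    "required": ["event", "sku"],
    "properties": {
        "event": {"type": "string"},
        "sku": {"type": "string"},
        "quantity": {"type": "integer"},
    },
}

_TYPES = {"string": str, "integer": int, "object": dict}


def _matches(instance: Any, type_name: str) -> bool:
    # bool is a subclass of int in Python, so exclude it for "integer".
    if type_name == "integer":
        return isinstance(instance, int) and not isinstance(instance, bool)
    return isinstance(instance, _TYPES[type_name])


def validate(instance: Any, schema: Dict[str, Any]) -> List[str]:
    """Check a small subset of JSON Schema and return readable errors."""
    expected = schema.get("type")
    if expected and not _matches(instance, expected):
        return [f"expected {expected}, got {type(instance).__name__}"]
    if not isinstance(instance, dict):
        return []
    errors: List[str] = []
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in instance:
            errors.append(f"missing required property: {key}")
    for key, value in instance.items():
        if key in props:
            errors.extend(f"{key}: {e}" for e in validate(value, props[key]))
        elif schema.get("additionalProperties") is False:
            errors.append(f"unexpected property: {key}")
    return errors
```

A developer can run this against their own output, for example in a unit test or CI step, and read the returned messages without waiting on the implementation team.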
  16. Data Layer Philosophy
      Matt Bentley @mattdoesdata @Matts_At_Work Speakerdeck.com/mattsatwork24
      • WHAT IS A DATA LAYER & DO I NEED ONE
      • HOW TO APPROACH A DATA LAYER PROJECT
      • WHO SHOULD BUILD AND MAINTAIN THE DATA LAYER
      • HOW TO MAKE YOUR DATA LAYER PROJECT A SUCCESS
      Here’s my X handle
      Here’s my LinkedIn profile
      And here’s where you can find the deck