Bol.com

Marketing OGZ
September 15, 2023


Transcript

  1. Big Data Expo 2023: How to accelerate your analytics team with analytics engineering
     Zhou Su, Analytics Engineer, zhou.su@xebia.com
     Loek Botman, Analytics Engineer
  2. Logistic Services Analytics
     • Logistic Services: warehousing & delivery services to sellers on our platform
     • Analytics: business decisions backed by data
  3. ‘Transform’ is bigger than you think
     • Data modeling
     • Data testing
     • Source freshness testing
     • Data timeliness
     • Data documentation & definitions
     • Implementing right to be forgotten
     • Code reusability
     • Code linting
     • Data accessibility
     • Development vs production environments
     • …
  4. Data Engineer, Analytics Engineer, Data Analyst
     • Data Engineer: supplies data to the data warehouse; tech-focused; strong programming component
     • Analytics Engineer: brings engineering principles to the field of analytics; business- & tech-focused; data warehousing; SQL fluency
     • Data Analyst: gets insights from the data; business-focused; communication & stakeholder management skills; data visualization
  5. Before and after Q4 2020: in Q4 2020 we started rethinking our analytics workflow
     Before:
     • 01 Scattered ETL: AtScale cubes, BQ scripts, etc. spread over separate locations and tools
     • 02 Hardly any testing: testing in scripts was tedious and mostly manual
     • 03 Lack of documentation: both on our data products and on conventions, pipelines and stack
     • 04 Development: even with great care, hard to control progress and roll back if necessary
     After:
     • 01 Centralized ETL: all transformations in one project
     • 02 Testing, testing, testing! Testing has become part of our definition of done
     • 03 Documentation in Git/BQ: documentation is part of our definition of done and is exposed
     • 04 CI/CD: easy to develop and automate tests before going to production
  6. We have a development environment next to our production environment
     [Diagram: DEV (project lsa-dev) and PRO (project lsa); datasets <username> and logistic_services; each dataset is filled by a dbt run]
  7. Data warehouse architecture
     • Sources
     • Bases: recasted, renamed
     • Marts: entities
     • Common: enriched entity
     • Exposures: dashboards, authorized views
     • One big table: for analysis
  8. Some practical tips
     • Start with the technical stuff
     • Continue with more focus on data modeling and conventions
     • Think about centralization vs. decentralization
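
The development/production split on slide 6 could look roughly like the Python sketch below. It is not the speakers' actual setup. It assumes one reading of the diagram: development runs write to a per-user dataset in project `lsa-dev`, and production runs write to the `logistic_services` dataset in project `lsa`. The names `Target` and `resolve_target` are hypothetical helpers made up for this illustration and are not part of dbt or BigQuery.

```python
from dataclasses import dataclass
from typing import Optional

# Project and dataset names are taken from slide 6; how they map to
# environments is an assumption of this sketch.
DEV_PROJECT = "lsa-dev"
PRO_PROJECT = "lsa"
PRO_DATASET = "logistic_services"


@dataclass(frozen=True)
class Target:
    """Hypothetical description of where a run writes its tables."""
    project: str
    dataset: str

    def table(self, name: str) -> str:
        # Fully qualified name in project.dataset.table form.
        return f"{self.project}.{self.dataset}.{name}"


def resolve_target(env: str, username: Optional[str] = None) -> Target:
    """Pick the project and dataset for an environment (assumed mapping)."""
    if env == "dev":
        if not username:
            raise ValueError("a username is required for a personal dev dataset")
        # Each developer builds into their own dataset, away from production.
        return Target(DEV_PROJECT, username)
    if env == "pro":
        return Target(PRO_PROJECT, PRO_DATASET)
    raise ValueError(f"unknown environment: {env!r}")
```

With this mapping, `resolve_target("dev", "alice").table("orders")` yields `lsa-dev.alice.orders`, while `resolve_target("pro").table("orders")` yields `lsa.logistic_services.orders`, so the same model name never collides across environments. The username `alice` and the table `orders` are placeholders.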