Slide 1

Slide 1 text

Grip on Enza Zaden's decentralized data platform
Bart Rentenaar, Cor Zuurmond
Big Data Expo, September 12th 2023

Slide 2

Slide 2 text

Enza Zaden: Breeding to feed the world
❑ Enza Zaden is one of the world's leading vegetable breeding companies.
❑ We develop vegetable varieties in more than 30 international and local crops.
❑ Our seeds are produced and sold all over the world.
❑ Our total range contains around 1,200 vegetable varieties.

Slide 3

Slide 3 text

Enza Zaden: Facts & Figures
❑ #6 seed breeding company
❑ Independent family business (3rd generation)
❑ Founded 85 years ago
❑ 2,500+ employees worldwide
❑ Located in 25 countries
❑ Turnover: € 400+ million
❑ € 100+ million invested in R&D annually
“480 million people eat our vegetables daily, worldwide”

Slide 4

Slide 4 text

Enza’s data strategy focuses on four related data programs
• DATA MANAGEMENT: FIT-FOR-PURPOSE DATA
• DATA ENGINEERING: SCALABLE, ROBUST AND COMPLIANT DATA PLATFORM
• DATA MONETIZATION: ACCELERATE PERFORMANCE AND CONTRIBUTE TO ENZA STRATEGY
• DATA SAVVINESS: DATA DRIVEN MINDSET
Diagram labels: FOUNDATION, VALUE
data strategy

Slide 5

Slide 5 text

How to bring data & analytics as close to the business as possible, in a well-governed setup?
• Governance: How we need to (re)organize ourselves
• Architecture: What our technology landscape should look like
• Data Platform: Which Analytics tools & services we will use for which purposes
• Data Management: How we manage high quality and trusted data
• Security: How we protect our crown jewels
• People: How we grow and nurture a data-savvy culture
data platform

Slide 6

Slide 6 text

Breeding generates a lot of data. Mastering this data is key.
Diagram labels: PHENOTYPE, ENVIRONMENTAL, GENOTYPE
data platform

Slide 7

Slide 7 text

There’s a balance between a uniform way of working and self-service
Well-governed … … mandate to operate
data platform

Slide 8

Slide 8 text

“Well-governed mandate to operate” improves and simplifies execution by autonomous, decentralized DevOps teams
Well-governed …
✔ Overarching governance
✔ Centrally facilitated services
✔ Clear guidelines
✔ Secure
✔ Transparent
… Mandate to operate
✔ Decentralized data handling and analytics
✔ Autonomy through self-service tooling
✔ Business / use case driven
Diagram labels: Central data hub, Business Function
How we need to (re)organize ourselves
data platform

Slide 9

Slide 9 text

Overarching governance: Fit within organization
SMART DATA PLATFORM
One data platform supporting all data and analytics use cases. The platform design is approved by an enterprise architect and IT to ensure fit within the whole organization. A central team implements the platform, which ensures that the approved design decisions are implemented correctly.
Diagram labels: Data, Analytics & Reporting SO; Data, Analytics & Reporting M&S; Data, Analytics & Reporting Staff; Digital Phenotyping & Climate Sensors Bioinformatics Systems Biology & Breeding AI Breeding Analytics Seed AI; Artificial Intelligence; Business Intelligence
well-governed…

Slide 10

Slide 10 text

Overarching governance: central data management
Overarching governance is enforced in the central data hub by:
• Controlling which data contracts are accepted
• Granting read and write access
• Cataloguing data
• Profiling data
• Verifying data quality
Diagram labels: Central data hub, In, Out
well-governed…
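
The governance checks on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `DataContract`, `CentralHub`, the spoke names and the example dataset are hypothetical and do not describe Enza Zaden's actual platform. The sketch only shows how a hub might refuse writes without an accepted contract, restrict who can read and write, and verify incoming rows against the contract's schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract: who may write/read a dataset, and its schema."""
    dataset: str
    writer: str
    readers: frozenset
    schema: dict  # column name -> expected Python type


@dataclass
class CentralHub:
    """Toy stand-in for a central data hub that enforces accepted contracts."""
    contracts: dict = field(default_factory=dict)
    storage: dict = field(default_factory=dict)

    def accept(self, contract: DataContract) -> None:
        # Governance step: only accepted contracts are registered.
        self.contracts[contract.dataset] = contract
        self.storage.setdefault(contract.dataset, [])

    def write(self, spoke: str, dataset: str, rows: list) -> None:
        contract = self.contracts.get(dataset)
        if contract is None:
            raise PermissionError(f"no accepted contract for {dataset!r}")
        if spoke != contract.writer:
            raise PermissionError(f"{spoke!r} may not write {dataset!r}")
        for row in rows:
            # Data quality step: every row must match the contract schema.
            if set(row) != set(contract.schema):
                raise ValueError(f"unexpected columns: {sorted(row)}")
            for column, expected in contract.schema.items():
                if not isinstance(row[column], expected):
                    raise ValueError(f"{column!r} is not {expected.__name__}")
        # Rows are stored only after the whole batch passed validation.
        self.storage[dataset].extend(rows)

    def read(self, spoke: str, dataset: str) -> list:
        contract = self.contracts.get(dataset)
        if contract is None or spoke not in contract.readers:
            raise PermissionError(f"{spoke!r} may not read {dataset!r}")
        return list(self.storage[dataset])


# Example usage with invented names.
hub = CentralHub()
hub.accept(DataContract(
    dataset="trial_scores",
    writer="ingest_spoke",
    readers=frozenset({"analytics_spoke"}),
    schema={"plot_id": str, "score": float},
))
hub.write("ingest_spoke", "trial_scores", [{"plot_id": "A1", "score": 7.5}])
```

In this sketch a rejected batch leaves the stored data untouched, because validation runs over all rows before anything is appended.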

Slide 11

Slide 11 text

Centrally facilitated services: hub-spoke
Diagram labels: Local database, Landing hub, (Global) Database, (Global) Database, Ingest spoke, Central Data hub, PUSH, SDP workspace, Analytics spoke, Report / dashboard, Data Science spoke, Transformation spoke, PULL, DAR workspace, (Global) Database, PULL, API, AI Model, PULL
Two ownership models appear in the diagram:
• Infrastructure and application code are both managed by a central team.
• Infrastructure is managed by a central team and application code by a business function.
well-governed…

Slide 12

Slide 12 text

Centrally facilitated services: Templates and Self-service
The SDP team provides clear services and guidance to the spokes on which services to choose and how to work with data & analytics.
Diagram labels: SDP team, Infrastructure as code templates, Service 1, Service 2, Service 3, Service 4, Etc., Self-service, Spoke 1, Spoke 2, Etc., Write code, Request infra, Deploy infra, Cloud infrastructure
well-governed…
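
As a rough illustration of the template and self-service idea on this slide, the Python sketch below renders a centrally maintained template for a spoke that requests infrastructure. The catalogue contents, the naming pattern and the function `request_infra` are all hypothetical; the slide does not say which infrastructure-as-code tool or naming convention is used.

```python
# Hypothetical catalogue of templates maintained by the central team.
# Service names and naming patterns are invented for this sketch.
CATALOGUE = {
    "storage": {"type": "storage", "name": "st-{spoke}-{env}"},
    "database": {"type": "database", "name": "db-{spoke}-{env}"},
}


def request_infra(spoke: str, service: str, env: str = "dev") -> dict:
    """Render a central template for a spoke; unknown services are rejected."""
    if service not in CATALOGUE:
        raise KeyError(f"{service!r} is not in the service catalogue")
    template = CATALOGUE[service]
    # Fill the placeholders so every spoke gets uniformly named resources.
    resource = {key: value.format(spoke=spoke, env=env)
                for key, value in template.items()}
    resource["owner"] = spoke
    return resource


print(request_infra("analytics", "storage"))
```

The point of the design is that a spoke chooses from a fixed catalogue and supplies only its own parameters, so the resulting resources stay uniform across spokes.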

Slide 13

Slide 13 text

Clear guidelines: competence center
• Documentation
• Training and community of practice
• Assistance and office hours
• Shared common dimensions and datasets
well-governed…

Slide 14

Slide 14 text

Clear guidelines: layered approach
We apply a layered approach to guidelines. The following levels are defined:
• A policy is enforced: you must comply. The SDP team enforces the policies. For example, security policies such as network routing and user access management.
• A standard should be complied with, or the deviation must be explained. The SDP team provides standards by default, but business functions may opt out. They take responsibility when they deviate from a standard. For example, default cloud services are available, but a business function may choose another service.
• A best practice is recommended but optional. The SDP team leads by example by always applying best practices to its own services. For example, use Azure Data Factory as the data integration service for structured and semi-structured data sources.
Levels: Policy, Standard, Best practice. A guideline moves up a level if it proves to be generic and useful.
well-governed…
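
The three guideline levels on this slide can be sketched as a small decision function. This is a minimal illustration, not the SDP team's tooling: the names `Level`, `evaluate` and `promote`, and the returned status strings, are assumptions made for the example.

```python
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    BEST_PRACTICE = 1  # recommended, optional
    STANDARD = 2       # comply, or explain the deviation
    POLICY = 3         # enforced, no opt-out


def evaluate(level: Level, complies: bool,
             explanation: Optional[str] = None) -> str:
    """Return a status for a team's (non-)compliance with a guideline."""
    if complies:
        return "ok"
    if level is Level.POLICY:
        return "blocked"  # policies are enforced
    if level is Level.STANDARD:
        # Deviating from a standard is allowed, but must be explained.
        return "deviation accepted" if explanation else "explanation required"
    return "ok"  # best practices are optional


def promote(level: Level, generic_and_useful: bool) -> Level:
    """Move a guideline one level up if it has proven generic and useful."""
    if generic_and_useful and level is not Level.POLICY:
        return Level(level + 1)
    return level


print(evaluate(Level.STANDARD, complies=False,
               explanation="use case needs another service"))
```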

Slide 15

Slide 15 text

Secure: a layered model
Security is centrally enforced by the data platform team.
• Networking: All network access is centrally controlled on a cloud landing zone.
• Auditing changes: Changes are signed and timestamped through git.
• User access management: Groups are granted least privileges. Only read access on production resources (not on data).
• Privileged identity management: Approve and audit temporary access to production resources.
• Data access management: Manage who can access which data.
well-governed…
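
Two of the layers on this slide, read-only access by default and approved, audited, temporary elevation, can be sketched as follows. This is a toy model under stated assumptions: the class `AccessControl`, its method names, the 60-minute default and the rule that users cannot approve their own request are all invented for illustration and are not taken from the slide or from any cloud provider's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class AccessControl:
    """Toy model: read-only by default, time-boxed approved elevation."""
    elevations: dict = field(default_factory=dict)  # user -> expiry time
    audit_log: list = field(default_factory=list)

    def approve_elevation(self, user: str, approver: str,
                          now: datetime, minutes: int = 60) -> None:
        # Assumption for this sketch: a user cannot approve their own request.
        if approver == user:
            raise PermissionError("self-approval is not allowed")
        self.elevations[user] = now + timedelta(minutes=minutes)
        # Every approval is recorded so that it can be audited later.
        self.audit_log.append((now, user, approver, "elevation approved"))

    def is_allowed(self, user: str, action: str, now: datetime) -> bool:
        # Least privilege: reading production resources is always allowed,
        # anything else needs an unexpired elevation.
        if action == "read":
            return True
        expiry = self.elevations.get(user)
        return expiry is not None and now < expiry


ac = AccessControl()
start = datetime(2023, 9, 12, 9, 0)
ac.approve_elevation("developer", "team_lead", start, minutes=30)
print(ac.is_allowed("developer", "write", start + timedelta(minutes=10)))
```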

Slide 16

Slide 16 text

Transparent
Everything as code and reader access to everything
well-governed…

Slide 17

Slide 17 text

Decentralized data handling and analytics
A more mature team builds more advanced applications that require more advanced tooling. In the most extreme case, a team takes ownership of the infrastructure. The application team can use any service it requires, provided it obtains approval from the enterprise architect and IT.
Diagram labels: Applications, Access to more services, Reporting, Increased freedom to operate, Business intelligence, Data integration, Advanced analytics
mandate to operate

Slide 18

Slide 18 text

Business and use case driven
Vision on how to work with data: Bring data & analytics as close to the business as possible, in a well-governed setup
Jonathan Dijkslag, Global Manager Data Insights & Data Innovation, Enza Zaden
mandate to operate

Slide 19

Slide 19 text

Appendices

Slide 20

Slide 20 text

How do we balance freedom to operate and overall efficiency?
A decentralized data platform gives data teams the freedom to offer data products to their data consumers. A downside of a decentralized approach is potential chaos and a proliferation of technologies. How does Enza ensure that data teams do not each follow their own path so far that the company as a whole becomes less effective?
• What preconditions do we want to enforce to increase the uniform way of working while at the same time retaining flexibility for the data teams?
• What measures do we take to guarantee the quality of the service?
• How can we effectively manage data ownership, data access controls and security across the different data domains?
• How do we deal with differences in data maturity within and between the data teams?
keep the balance

Slide 21

Slide 21 text

Our toolset (BI) DATA MANAGEMENT STORAGE EXTRACT & LOAD TRANSFORM REPORT DEVELOPMENT INFRASTRUCTURE data platform

Slide 22

Slide 22 text

Maturity levels
Spokes: Analytics spoke, Data Science spoke
Diagram labels: REPORTING, TRANSFORMATION (ADVANCED), INGEST, DATA SCIENCE, TRANSFORMATION (SIMPLE), INGEST, TRANSFORMATION (ADVANCED), TRANSFORMATION (SIMPLE), REPORTING, DATA SCIENCE
data platform

Slide 23

Slide 23 text

DATA PRODUCTS
• REPORT
• ANALYSES
• AI MODEL
• DATA QUALITY MEASUREMENTS
• PROCESS MONITORING
• DATASET
DATA CONSUMERS, EXAMPLES:
• DATA ANALYSTS
• DATA SCIENTISTS
• MANAGEMENT
• FINANCIAL CONTROLLERS
• BIO INFORMATION TECHNICIANS
• BREEDERS
• LAB TECHNICIANS
• MARKETERS
• SALES MANAGERS
data platform