In this talk, we introduce Hadron, a project to build a new platform at Die Mobiliar for running all BI, ML, analytics, and AI workloads. The platform is built on the Databricks Lakehouse architecture and hosted on Azure, using multiple Databricks Workspaces and Unity Catalog. The project must adhere to our company's predefined cloud architecture for application deployment on Azure. Data on the platform is organized using the Data Mesh / Data Product approach. We began the project in February 2022 and have made significant progress in building the infrastructure, which now allows large-scale data products to be deployed on the platform.
Our talk will cover the following topics:
Introduction: We’ll discuss our goals for the new platform and why we chose the Databricks Lakehouse approach.
The key components of the Lakehouse architecture: Unity Catalog, Databricks Workspaces, and access control management.
Overview of the cloud architecture within our company and how it informs the infrastructure of the platform.
Infrastructure overview: how we organize Databricks Workspaces and Azure Data Lake Storage accounts, and how the storage is networked with the workspaces to ensure secure access.
Automation: We’ll explain how we use GitLab Pipelines and Terraform, as well as Databricks APIs, to automate the deployment of resources and manage access controls.
Key decisions made during the architecture design and why they were important.
Challenges we encountered during the project and how we overcame them.
Current status of the infrastructure: what data is available and who is currently using the platform.
Next steps: where we plan to go from here.
HANSJÖRG WINGEIER, IT Architect @ Die Mobiliar
MATHIAS HERZOG, Cloud Consultant @ peakscale.ch