ABCS25: Building and Operating a Productive AI Application for Swiss Government on Azure by Tobias Kluge

Description
What does it take to bring a real-world AI application into production for a public sector client with strict compliance, complex stakeholder landscapes, and evolving requirements?

In this session, we’ll take you behind the scenes of delivering and operating a production-grade AI solution for a Swiss government department. From the first line of code to stable operations in Azure, you'll learn how we built and shipped a secure, maintainable, and scalable application using Terraform, Azure Landing Zones, and modern DevOps practices.

We’ll dive into:

- Architecting and deploying AI workloads on Azure with reproducibility and compliance in mind
- Handling data management, access control, and incident response in regulated environments
- Navigating stakeholder complexity through iterative workshops and rapid feedback loops
- Real-world failures, security incidents, and how we improved resilience and observability
- Lessons learned in managing environments, pipelines, and infrastructure using Infrastructure as Code (IaC)

This talk is ideal for cloud architects, developers, and IT pros who want to learn from practical experience delivering AI systems in high-trust environments.

Speaker

Tobias Kluge
Mr. AI, Nexplore AG

Tobias is a seasoned developer, software architect and AI engineer with 20 years of experience. Currently, he serves as "Mr. AI" at Nexplore AG, where he is responsible for integrating AI technologies to boost performance and efficiency. His role also involves developing AI-related offerings for the Swiss market and managing customer projects with a results-driven approach.

linkedin.com/in/tobiaskluge/
tobiaskluge.com (blog)
incratec.com (company)

sessionize.com/tobiaskluge (public speaker profile)

Transcript

  1. About Tobias Kluge
    • AI & Cloud Strategist & Solution Consultant
    • Co-Organizer of AI Meetup Bern, Leader of the «Cloud» interest group of the Digital Impact Network Bern
    • Led the Go-To-Market for AI offerings at Nexplore AG & managed AI projects since 2023
    • Now running my own company incratec (again)
    • Solution Architect for Custom AI Projects and Azure Cloud
    • Implementing AI for SMB from A to Z
    • Personal mission: support and grow the local AI community in Bern
  2. Application | ML & AI model | Data | Infrastructure | Project & Admin
    Kick-Off | Concept & PoC | MVP | Go-Live 1.0 | Optimize | Operations
  3. User Problem Manuals Check Lists Release Notes Support Application Data Check Lists Application Information Application HELP!
  4. Agile & regular workshops: training, feedback, alignment & «selling» to our stakeholders
    • How are we training the data?
    • Is the data sent to the US?
    • Why not model X?
    • How secure is this?
    • ChatGPT is much better…
    • Finetuning – what is this?
    • Why is the answer missing?
    • The answer must be more polite.
    • We must make sure that our data is not stored and used for training.
  5. AI Landing Zone (multi-application)
    • Architecture Best Practices for Azure OpenAI Service (WAF)
    • Baseline OpenAI End-to-End Chat Reference Architecture - Azure Architecture Center
    • aka.ms/ai-gateway | AI gateway capabilities in Azure API Management
  6. Azure OpenAI & AI Foundry: PayGo GPT-4o in CH
    • Data, privacy, and security for Azure OpenAI Service
    • Abuse monitoring
    • Content filtering
    • Red-teaming
    • Responsible AI
    • PayGo vs PTU Pricing
    • Deployment: Global vs Regional
    • Terraform Module
  7. AI & Data – development & evaluation process
    • Data *, AI model, Application, Solution
    • Development process
    • Evaluation: Expert questions & answers *, User feedback *, Performance & quality
    • Tuning knob (Stellschraube): Adjustment, Evaluation, Release
    • Feature 1: Adjustment, Evaluation, Release
    • Feature 2: Adjustment, Evaluation, Release
    • Feature n: Adjustment, Evaluation, Release
    * Requires domain experts
  8. AI & Data – measuring quality
    Our Metrics (Phase 1)
    - What to measure?
    - Finding the correct data vs. writing the proper answer for the user question
    - Metrics e.g.
      - Faithfulness
      - Answer Relevancy
      - Semantic Similarity
      - Factual Correctness
    Used libraries: https://docs.ragas.io/en/latest/concepts/metrics/overview/
    Our Data Sets
    • AI-Engineer questions/answers
    • Expert questions/answers
    • Support department data
    • Generated questions/answers with GenAI
    • User questions, expert reviewed answers
    • Multi-language questions (DE, FR, IT)
  9. Data: wanted – so use GenAI
    • GenAI
    • Product requirements & specification
    • Additional content (e.g. release notes)
  10. Ops

  11. Operation
    • Application: Software; Requirements; Code; Release; Version; Monitor; Sec
    • ML & AI model: Model selection; LLM; Trad ML; Evaluate, optimize & (pre-)train «custom» AI model; Lifecycle; FinOps; Model drift
    • Data: Exploration & validation; Cleaning; Training data; Eval data; Gold data; Knowledge data; Data drift; User Behavior
    • Infrastructure: Infrastructure Architecture; IaC; FinOps; SecOps; IaC Modules
    • Project & Admin: Acceptance (Abnahme); Schuban; ISDS; Call-off (Abruf)
    • Kick-Off | Concept | MVP | Go-Live 1.0 | Optimize | Operations
  12. Summary
    • START SMALL, MEASURE AND CONTINUOUSLY IMPROVE THE PROCESS AND SYSTEM
    • EDUCATE THE CUSTOMER, TEAM AND COMPANY
    • AI IS NOT (YET) THE ANSWER
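
Slide 8 separates two questions: did the application find the correct data, and did it write the proper answer for the user question? The following is a minimal Python sketch of that split, combined with the adjustment, evaluation, release cycle from slide 7. It is not the speaker's implementation and does not use the Ragas API. The `EvalCase` fields, the token-overlap F1 (a crude stand-in for metrics such as semantic similarity), and the `release_allowed` gate with its `tolerance` are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EvalCase:
    """One evaluation item (hypothetical structure, not from the talk)."""
    question: str
    expected_source: str          # document id an expert marked as correct
    expected_answer: str          # expert-reviewed reference answer
    retrieved_sources: List[str]  # ids returned by the retrieval step
    generated_answer: str         # answer produced by the application


def retrieval_hit(case: EvalCase) -> float:
    """'Finding the correct data': 1.0 if the expected source was retrieved."""
    return 1.0 if case.expected_source in case.retrieved_sources else 0.0


def token_f1(reference: str, candidate: str) -> float:
    """'Writing the proper answer': token-overlap F1 against the reference.

    Assumption: a simple lexical stand-in; real setups would use an
    embedding- or LLM-based metric instead.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    overlap = sum((Counter(ref) & Counter(cand)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(cases: List[EvalCase]) -> Dict[str, float]:
    """Average both scores separately so failures can be attributed."""
    n = len(cases)
    return {
        "retrieval_hit_rate": sum(retrieval_hit(c) for c in cases) / n,
        "answer_f1": sum(
            token_f1(c.expected_answer, c.generated_answer) for c in cases
        ) / n,
    }


def release_allowed(candidate: Dict[str, float],
                    baseline: Dict[str, float],
                    tolerance: float = 0.02) -> bool:
    """Release gate: no metric may fall more than `tolerance` below baseline."""
    return all(candidate[name] >= baseline[name] - tolerance for name in baseline)


# Two made-up cases: one fully correct, one with a retrieval miss.
cases = [
    EvalCase("When is the form due?", "doc-1", "the form is due in march",
             ["doc-1", "doc-7"], "the form is due in march"),
    EvalCase("What does renewal cost?", "doc-2", "renewal costs 50 francs",
             ["doc-9"], "renewal is free"),
]
scores = evaluate(cases)
print(scores)
print(release_allowed(scores, {"retrieval_hit_rate": 0.5, "answer_f1": 0.6}))
```

Keeping the two scores apart shows whether the next adjustment should target retrieval or answer generation, and the gate gives each release a simple comparison against the previous evaluation run.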