Cloud-Native Generative AI mit Fermyon Serverless AI

by Thorsten Hans

Slide 1

Slide 1 text

Cloud-Native Generative AI mit Fermyon Serverless AI
Thorsten Hans
@ThorstenHans

Slide 2

Slide 2 text

Cloud-Native Business Applications Day

Time | Title | Speaker
09:00 – 10:00 | Cloud-Native-all-the-Things: Definition, Praktiken und Patterns | Christian Weyer, Thorsten Hans
10:30 – 11:30 | Containerbasierte Entwicklung für .NET-Entwickler | Tobias Fenster
12:00 – 13:00 | Cloud-Native Generative AI mit Fermyon Serverless AI | Thorsten Hans
15:15 – 16:15 | Cloud-Native Microservices, on-premises oder in der Cloud – mit Dapr | Christian Weyer
16:45 – 17:45 | Mega Mergers: Cloud-Native-Architekturen mit Containern und WebAssembly | Thorsten Hans

Slide 3

Slide 3 text

Thorsten Hans
Consultant @ Thinktecture
Microsoft MVP | Docker Captain
#Azure #Containers #CloudNative #Wasm
[email protected]
thinktecture.com
thorsten-hans.com
@ThorstenHans

Slide 4

Slide 4 text

Agenda
• Intro
• What is Fermyon Spin
• Serverless AI with Fermyon Cloud
• Conclusion

Slide 6

Slide 6 text

WebAssembly will change the way we architect distributed systems in the future

Slide 7

Slide 7 text

Why?

Slide 8

Slide 8 text

Intro
Why will Wasm have such a big impact?
• Cloud-vendor interest:
  • They can fit more apps on a compute resource than they can today
  • Wasm and WASI give them a strict security and isolation model
  • Wasm workloads are far smaller than the alternatives
  • They can scale to zero thanks to super-fast bootstrapping (< 1 ms)

Slide 9

Slide 9 text

Intro
Why will Wasm have such a big impact?
• Developer interest:
  • They can use any language that compiles to wasm32-wasi
  • They can ship just the app
  • They can reduce cloud spending
  • The same workloads will be cheaper because they consume fewer resources and execute faster

Slide 10

Slide 10 text

Wasm on the server relates to containers in the same way containers related to virtual machines 10+ years ago

Slide 11

Slide 11 text

Agenda
• Intro
• What is Fermyon Spin
• Serverless AI with Fermyon Cloud
• Conclusion

Slide 13

Slide 13 text

Intro
Let’s get everybody on track! 🦀
Fermyon Spin is:
• A serverless runtime built using Wasm, WASI, and the WebAssembly Component Model (leveraging wasmtime internally)
• A collection of SDKs for many popular languages
• Highly focused developer tooling

Slide 14

Slide 14 text

Demo
Dive into Fermyon Spin

Slide 15

Slide 15 text

Agenda
• Intro
• What is Fermyon Spin
• Serverless AI with Fermyon Cloud
• Conclusion

Slide 17

Slide 17 text

Serverless AI with Fermyon Cloud
Give application developers sophisticated generative AI capabilities with no ops, while maintaining developer productivity

Slide 18

Slide 18 text

Serverless AI with Fermyon Cloud
• Fermyon Serverless AI empowers developers to use AI inferencing in their apps with no additional setup
• Inferencing capabilities are encapsulated in two methods
• Local developer story (independent of your hardware / operating system)
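
The "two methods" idea can be sketched as a small interface: one call for text inference and one for embeddings. The sketch below is plain Python and does not use the Spin SDK. The names `AiBackend`, `EchoBackend`, `infer`, `generate_embeddings`, and `summarize` are hypothetical, the model label `llama2-chat` is only illustrative, and the stub backend stands in for a real model so the example runs on any machine.

```python
from typing import Protocol


class AiBackend(Protocol):
    """Hypothetical two-method surface: text inference and embeddings."""

    def infer(self, model: str, prompt: str) -> str: ...

    def generate_embeddings(self, model: str, texts: list[str]) -> list[list[float]]: ...


class EchoBackend:
    """Stub backend for local runs; it does not call any real model."""

    def infer(self, model: str, prompt: str) -> str:
        return f"[{model}] echo: {prompt}"

    def generate_embeddings(self, model: str, texts: list[str]) -> list[list[float]]:
        # Toy 2-dimensional "embedding": character count and word count.
        return [[float(len(t)), float(len(t.split()))] for t in texts]


def summarize(backend: AiBackend, text: str) -> str:
    """Application code depends only on the two-method interface."""
    # "llama2-chat" is an illustrative model label, not a confirmed identifier.
    return backend.infer("llama2-chat", f"Summarize: {text}")


if __name__ == "__main__":
    backend = EchoBackend()
    print(summarize(backend, "Wasm on the server"))
    print(backend.generate_embeddings("all-minilm-l6-v2", ["hello world"]))
```

Because `summarize` only sees the interface, the same application code could run against a local stub during development and a hosted model in production, which is the point of the local developer story.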

Slide 19

Slide 19 text

Serverless AI with Fermyon Cloud
• Frictionless AI offering by Fermyon
• Run inferencing against LLMs (currently Llama 2 and CodeLlama, in their 13B-parameter variants) with no ops
• Generate sentence embeddings (all-minilm-l6-v2) for your data using a no-ops vector database
• Support for additional models coming soon
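
Sentence embeddings are typically used for similarity search: texts are turned into vectors, and the vectors closest to a query vector are returned. The sketch below shows that ranking step with cosine similarity in plain Python. It is an illustration under assumptions, not Fermyon's implementation: the helper names `cosine_similarity` and `rank` are hypothetical, and the three-dimensional vectors are hand-written stand-ins for real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: list[float], docs: dict[str, list[float]]) -> list[str]:
    """Return document ids ordered from most to least similar to the query."""
    return sorted(
        docs,
        key=lambda doc_id: cosine_similarity(query, docs[doc_id]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hand-written toy vectors; real all-minilm-l6-v2 embeddings have 384 dimensions.
    docs = {
        "spin": [0.9, 0.1, 0.0],
        "dapr": [0.1, 0.9, 0.0],
        "llama": [0.0, 0.2, 0.9],
    }
    print(rank([1.0, 0.0, 0.0], docs))
```

In a real application the vectors would come from the embedding model and be stored in a vector database, which would perform this ranking at scale.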

Slide 20

Slide 20 text

Demo
Hello Serverless AI

Slide 21

Slide 21 text

Demo
Deploy the Spin application to Fermyon Cloud to speed it up 🚀 and use GPU power

Slide 22

Slide 22 text

Agenda
• Intro
• What is Fermyon Spin
• Serverless AI with Fermyon Cloud
• Conclusion

Slide 24

Slide 24 text

Conclusion
• With Spin, Fermyon demonstrates how WebAssembly will change the way we build software for the next wave of cloud computing
• Serverless AI allows application developers to add generative AI capabilities to their apps in “no time”
• Although we have access to Llama 2 and CodeLlama now, we will see more models in Fermyon Cloud soon

Slide 25

Slide 25 text

Thanks for your attention @ThorstenHans @Thinktecture