Cloud-Native Generative AI with Fermyon Serverless AI

Generative artificial intelligence (Gen AI) lets us make applications smarter. But how do you actually integrate AI into a cloud-native application? What options are there? And why is the young tech startup Fermyon getting so much hype right now with Serverless AI? Thorsten Hans answers these and other questions in his session and shows how you can implement simple scenarios in no time with Fermyon Serverless AI and Llama 2.

Thorsten Hans

September 28, 2023

Transcript

  1. Cloud-Native Generative AI with Fermyon Serverless AI
    Thorsten Hans
    @ThorstenHans

  2. Cloud-Native Business Applications Day
    Time | Title | Speaker
    09:00 – 10:00 | Cloud-Native-all-the-Things: Definition, Practices, and Patterns | Christian Weyer, Thorsten Hans
    10:30 – 11:30 | Container-Based Development for .NET Developers | Tobias Fenster
    12:00 – 13:00 | Cloud-Native Generative AI with Fermyon Serverless AI | Thorsten Hans
    15:15 – 16:15 | Cloud-Native Microservices, On-Premises or in the Cloud, with Dapr | Christian Weyer
    16:45 – 17:45 | Mega Mergers: Cloud-Native Architectures with Containers and WebAssembly | Thorsten Hans

  3. Consultant @ Thinktecture
    #Azure #Containers
    #CloudNative #Wasm
    thinktecture.com
    thorsten-hans.com
    @ThorstenHans
    Microsoft MVP | Docker Captain
    Thorsten Hans

  4. • Intro
    • What is Fermyon Spin
    • Serverless AI with Fermyon Cloud
    • Conclusion
    Agenda

  5. • Intro
    • What is Fermyon Spin
    • Serverless AI with Fermyon Cloud
    • Conclusion
    Agenda

  6. WebAssembly will change the way we architect
    distributed systems in the future

  7. • Cloud-vendor interest:
    • They can put more apps on a compute resource than they can today
    • Wasm and WASI give them a strict security and isolation model
    • Wasm workloads are way smaller than everything else
    • They can scale to zero thanks to super-fast bootstrapping (< 1 ms)
    Intro
    Why will Wasm have such a big impact?

  8. • Developer interest:
    • They can use any language that compiles to wasm32-wasi
    • They can ship just the app
    • They can reduce cloud spending
    • The same workloads will be cheaper because they consume fewer
    resources and execute faster
    Intro
    Why will Wasm have such a big impact?

  9. Wasm on the server relates to containers in the same
    way containers related to
    virtual machines 10+ years ago

  10. • Intro
    • What is Fermyon Spin
    • Serverless AI with Fermyon Cloud
    • Conclusion
    Agenda

  11. • Intro
    • What is Fermyon Spin
    • Serverless AI with Fermyon Cloud
    • Conclusion
    Agenda

  12. Fermyon Spin is:
    • A serverless runtime built using Wasm, WASI, and the
    WebAssembly Component Model (leveraging Wasmtime internally)
    • A collection of SDKs for many popular languages
    • Highly focused developer tooling
    Intro
    Let’s get everybody on track!
    🦀

  13. Dive into Fermyon Spin
    Demo

  14. • Intro
    • What is Fermyon Spin
    • Serverless AI with Fermyon Cloud
    • Conclusion
    Agenda

  15. • Intro
    • What is Fermyon Spin
    • Serverless AI with Fermyon Cloud
    • Conclusion
    Agenda

  16. Give application developers sophisticated
    generative AI capabilities
    with no ops, while maintaining developer productivity
    Serverless AI with Fermyon Cloud

  17. • Fermyon Serverless AI empowers developers to use AI inferencing in
    their apps with no additional setup
    • Encapsulates inferencing capabilities in two methods
    • Local developer story (independent of your hardware / operating
    system)
    Serverless AI with Fermyon Cloud
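
The "two methods" idea can be sketched as a small interface: presumably one call for text inferencing and one for embeddings, matching the two capabilities the deck lists. The sketch below is a hypothetical illustration in Python, not the Spin SDK: `InferencingBackend`, `FakeBackend`, `summarize`, and the model name `llama2-chat` are assumptions made for this example. The deterministic fake backend also hints at what a hardware-independent local developer story could look like.

```python
from typing import List, Protocol


class InferencingBackend(Protocol):
    """Hypothetical two-method surface; the real SDK names may differ."""

    def infer(self, model: str, prompt: str) -> str: ...

    def generate_embeddings(self, model: str, texts: List[str]) -> List[List[float]]: ...


class FakeBackend:
    """Deterministic stand-in so the sketch runs without a model or a GPU."""

    def infer(self, model: str, prompt: str) -> str:
        # Echo the prompt instead of calling an LLM.
        return f"[{model}] echo: {prompt}"

    def generate_embeddings(self, model: str, texts: List[str]) -> List[List[float]]:
        # Toy 3-dimensional "embedding": length, vowel count, word count.
        return [
            [
                float(len(t)),
                float(sum(c in "aeiou" for c in t.lower())),
                float(len(t.split())),
            ]
            for t in texts
        ]


def summarize(backend: InferencingBackend, text: str) -> str:
    """Application code depends only on the two-method interface."""
    prompt = f"Summarize in one sentence: {text}"
    return backend.infer("llama2-chat", prompt)  # model name is an assumption
```

Because `summarize` only sees the interface, the same application code could run against a local stand-in during development and against a hosted model after deployment.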

  18. • Frictionless AI offering by Fermyon
    • Execute inferencing against LLMs (currently Llama 2 and CodeLlama
    in their 13B-parameter variants) with no ops
    • Generate sentence embeddings (all-minilm-l6-v2) for your data and
    store them in a no-ops vector database
    • Support for additional models coming soon
    Serverless AI with Fermyon Cloud
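
To make the embeddings bullet more concrete: once texts are turned into vectors, a vector store typically answers "which stored items are closest to this query?" using a similarity measure such as cosine similarity. The following is a minimal in-memory sketch of that lookup, assuming toy 3-dimensional vectors in place of real all-minilm-l6-v2 embeddings (which have 384 dimensions). The function names and the dictionary-based store are hypothetical and do not reflect how Fermyon Cloud implements its database.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(
    query: List[float], store: Dict[str, List[float]], k: int = 1
) -> List[Tuple[str, float]]:
    """Return the k stored keys most similar to the query, best first."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy stand-ins for sentence embeddings (hypothetical values).
store = {
    "spin": [1.0, 0.0, 0.0],
    "llama": [0.0, 1.0, 0.0],
    "wasm": [0.9, 0.1, 0.0],
}
nearest = top_k([1.0, 0.0, 0.0], store, k=2)
```

In a real application the query vector would come from the same embedding model as the stored vectors, since similarities are only meaningful within one embedding space.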

  19. Hello Serverless AI
    Demo

  20. Deploy the Spin application to Fermyon Cloud
    to speed it up 🚀 and use GPU power
    Demo

  21. • Intro
    • What is Fermyon Spin
    • Serverless AI with Fermyon Cloud
    • Conclusion
    Agenda

  22. • Intro
    • What is Fermyon Spin
    • Serverless AI with Fermyon Cloud
    • Conclusion
    Agenda

  23. • With Spin, Fermyon demonstrates how WebAssembly will change the way we build
    software for the next wave of cloud computing
    • Serverless AI allows application developers to add generative AI capabilities to their
    apps in "no time"
    • Although we only have access to Llama 2 and CodeLlama now, we will see more models in
    Fermyon Cloud soon
    Conclusion

  24. Thanks for your attention
    @ThorstenHans @Thinktecture
