More and more developers want to integrate Generative AI features into their applications. Until now, this path has almost always led to the cloud—but it doesn't have to be that way! There are currently several promising approaches to running AI models directly on the user's device: Hugging Face, for example, offers Transformers.js, which lets machine learning models run directly in the browser. The W3C's Web Neural Network API (WebNN), still in the specification phase, will give such models access to the device's Neural Processing Unit (NPU), allowing Large Language Models (LLMs) or Stable Diffusion models to run efficiently in the browser.

The advantages of these approaches are obvious: locally executed AI models remain available offline, user data never leaves the device, and thanks to open-source models, all of this comes free of charge. Of course, the model must first be transferred to the user's device, and that device must be sufficiently powerful.

In this talk, Christian Liebel, Thinktecture's representative at the W3C, will present these approaches for making your single-page app smarter. We will discuss use cases and examine the advantages and disadvantages of each solution.