
Offline Chatbots in Web using PromptAPI


Simar Preet Singh

December 27, 2025

Transcript

  1. ~whoami
     1. Software Engineer with more than 8 years of experience.
     2. Working at Redaptive Inc.
     3. Community Organiser at GDG Jalandhar.
     4. Expertise in Angular, React, Ionic, Capacitor, Node.js, Express.js, Firebase, MongoDB, etc.
     5. Do a lot of experimenting with Gen AI.
     6. Love contributing to the dev community.
  2. Agenda
     1. AI in your Browser
     2. Chrome's on-device AI: Prompt API
     3. Putting it to work
     4. On-Device vs the Cloud: Trade-offs
     5. The Future of Web AI
     6. Demo
     7. Q&A
  3. AI Inference is Moving from the Cloud to the Client.
     Traditionally, AI features required sending user data to cloud servers for processing. The browser's built-in AI, accessed via a small set of JavaScript APIs, runs models directly on the device, changing the fundamentals of how we build intelligent applications.
  4. The Four Pillars of the On-Device Advantage
     Running AI on the client unlocks significant advantages for both developers and users.
     1. Privacy & Security: User data never leaves the device. Essential for sensitive content like drafts or personal information.
     2. Responsiveness & Reliability: Low latency and offline functionality. Performance is consistent and independent of network conditions.
     3. Ease of Development: A simple JavaScript API that doesn't require expertise in machine learning. No complex backend setup or MLOps required.
     4. Zero Cost: Inference is free. No server costs or API fees, enabling unlimited iteration and scalability.
  5. Meet Gemini Nano: Your Client-Side Foundation Model
     Gemini Nano is a highly efficient large language model from Google, designed specifically for on-device tasks. It delivers high performance without sending data to external servers.
  6. Your Development Environment Checklist
     Built-in AI is available for local development on 'localhost' and via an Origin Trial. Here's what you and your users need.
     1. Chrome: Version 138+ (Canary recommended).
     2. Hardware: At least 22 GB of free storage space, 16 GB+ of RAM, and a GPU with more than 4 GB of VRAM.
     3. Flags: chrome://flags/#optimization-guide-on-device-model and chrome://flags/#prompt-api-for-gemini-nano-multimodal-input
     4. TypeScript typings: For easier development, use the @types/dom-chromium-ai npm package.
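The checklist above can also be verified at runtime. A minimal sketch, assuming the availability states documented for recent Chrome ('unavailable', 'downloadable', 'downloading', 'available') and the `monitor`/`downloadprogress` download hook; verify both against your Chrome version:

```javascript
// Feature-detect the Prompt API and make sure the Gemini Nano model is ready.
// Returns a session, or null when the API or model is unavailable on this device.
async function ensureModelReady() {
  if (!('LanguageModel' in globalThis)) {
    return null; // This browser does not expose the Prompt API at all.
  }
  const state = await LanguageModel.availability();
  if (state === 'unavailable') {
    return null; // Hardware or policy prevents running the model.
  }
  // For 'downloadable' / 'downloading', create() triggers or joins the model
  // download; the monitor reports progress (e.loaded is a 0..1 fraction).
  return LanguageModel.create({
    monitor(m) {
      m.addEventListener('downloadprogress', (e) => {
        console.log(`Gemini Nano download: ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
}
```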
  7. The Core API Workflow: A Simple Three-Step Interaction
     Interacting with the Prompt API follows a clear and logical pattern that will be familiar to any web developer.
     1. Create Session: LanguageModel.create() initialises the model with context, parameters, and expected input types.
     2. Prompt Model: session.prompt() or session.promptStreaming() sends the user's request, either awaiting a full response or streaming it chunk by chunk.
     3. Destroy Session: session.destroy() cleans up and frees resources when the session is no longer needed.
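The three steps can be wrapped in a small helper so the session is always freed, even when a prompt throws. A sketch, not the deck's code: `LanguageModel` is Chrome's built-in global, and the stub below stands in for it so the pattern itself runs anywhere:

```javascript
// The create → prompt → destroy lifecycle as a reusable helper.
// Outside Chrome, a minimal stub plays the role of the LanguageModel global
// so the lifecycle pattern can be exercised anywhere (illustration only).
const LM = globalThis.LanguageModel ?? {
  async create() {
    return {
      async prompt(text) { return `echo: ${text}`; }, // stub reply
      destroy() {},
    };
  },
};

async function withSession(options, fn) {
  const session = await LM.create(options); // 1. Create Session
  try {
    return await fn(session);              // 2. Prompt Model
  } finally {
    session.destroy();                     // 3. Destroy Session, always
  }
}

// Usage:
// const reply = await withSession(
//   { systemPrompt: 'You are a concise assistant.' },
//   (session) => session.prompt('Say hi'),
// );
```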
  8. Code Deep Dive: Creating and Prompting a Session
     First, check for API availability. Then, initialise a session with a systemPrompt to define its persona and context before sending the user's prompt.

     // 1. Check if the API is available and ready
     if ((await LanguageModel.availability()) === 'unavailable') {
       // Handle gracefully: AI not available
     }

     // 2. Create a session with a systemPrompt for context
     // (`todos` is your app's to-do array)
     const session = await LanguageModel.create({
       systemPrompt: `You are a helpful assistant for a to-do list app.
     This is the list in JSON: ${JSON.stringify(todos)}`
     });

     // 3. Prompt for a single, complete response
     const result = await session.prompt('How many open to-dos do I have?');
     console.log(result);
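For chat UIs you usually want the streaming variant instead of waiting for the full answer. A hedged sketch: in Chrome, session.promptStreaming() returns a ReadableStream of text chunks, which recent Chrome versions make async-iterable; the stub session here is only so the consumption loop can run outside the browser:

```javascript
// Consume a streamed reply chunk by chunk, rendering as text is generated.
async function streamReply(session, question, onChunk) {
  let full = '';
  for await (const chunk of session.promptStreaming(question)) {
    full += chunk;  // accumulate the complete answer
    onChunk(chunk); // e.g. append the chunk to the chat bubble
  }
  return full;
}

// Stub with the same shape as a Prompt API session, for illustration only.
const stubSession = {
  async *promptStreaming() {
    yield 'Hello, ';
    yield 'world!';
  },
};
```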
  9. Multimodal Prompt API: It doesn't just read, it sees and hears you as well
     The Prompt API isn't limited to text. It supports multimodal inputs. Your new AI can critique your terrible drawings and transcribe your rambling voice notes. All locally, all privately.

     Image Input:
     const session = await LanguageModel.create({
       expectedInputs: [{ type: 'image' }]
     });
     const result = await session.prompt([
       'Describe this image for alt text: ',
       { type: 'image', content: imageBitmap }
     ]);

     Audio Input:
     const session = await LanguageModel.create({
       expectedInputs: [{ type: 'audio' }]
     });
     const result = await session.prompt([
       'Transcribe this audio: ',
       { type: 'audio', content: audioBuffer }
     ]);
  10. When and What: Using the Right Approach for the Right Job
     Each class of task has a natural home; on-device and cloud models complement each other.

     On-Device AI: privacy-sensitive tasks, low-latency needs. For your deep, personal conversations.
     1. Summarizing on-page text.
     2. Generating blog titles and drafts.
     3. Text classification.
     4. Simple Q&A.

     Cloud-Based AI: massive world knowledge, heavy computation. For planning your next vacation to Jaipur.
     1. Complex research.
     2. Global trend analysis.
     3. Generating a novel from a one-word prompt.
  11. Limitations of the Prompt API: Everything Has Its Quirks
     Let's be real, everything is not perfect all the time.

     Inconsistent Output: Sometimes you get a single word, sometimes a novel. You may need to add a workaround to ensure the output is always in a correct format (like valid JSON).
     Smaller Brain: It's smart, but it's smaller than a server-based model. It's critical to provide sufficient context within the prompt to get good results.
     Token Limits: It has a finite context window. You need to manage sessions and tokens (`inputUsage`, `inputQuota`) to keep the conversation on track.
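Managing the context window can be sketched like this. Treat the property names (inputUsage, inputQuota, as in the Prompt API spec at the time of writing) and the rough 4-characters-per-token estimate as assumptions to feature-detect and tune in real code:

```javascript
// Guard a prompt against overflowing the session's finite context window.
function estimateTokens(text) {
  return Math.ceil(text.length / 4); // crude heuristic, not the real tokenizer
}

async function promptWithinQuota(session, text) {
  // inputQuota - inputUsage = tokens still available in this session
  // (assumed property names from the Prompt API spec).
  const remaining = session.inputQuota - session.inputUsage;
  if (estimateTokens(text) > remaining) {
    // Context window exhausted: summarise the history and start a fresh
    // session rather than silently truncating the conversation.
    throw new Error('Prompt would exceed the session context window');
  }
  return session.prompt(text);
}
```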
  12. The future is a healthy hybrid relationship!
     1. On-device AI offers private, fast, and free alternatives to many AI tasks.
     2. It's a healthier relationship for your app and your users.
     3. The smart strategy is hybrid: use the right AI for the right job.
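The hybrid strategy reduces to a small routing decision. A sketch under stated assumptions: `cloudFallback` is a hypothetical callback you supply (e.g. a fetch() to your own server's LLM endpoint), and the on-device branch uses Chrome's LanguageModel global when present:

```javascript
// Prefer the private, zero-cost on-device model when it is available,
// otherwise fall back to the cloud for heavy computation or world knowledge.
async function hybridPrompt(text, cloudFallback) {
  const localReady =
    'LanguageModel' in globalThis &&
    (await LanguageModel.availability()) === 'available';
  if (localReady) {
    const session = await LanguageModel.create();
    try {
      return await session.prompt(text); // private, offline-capable path
    } finally {
      session.destroy();
    }
  }
  return cloudFallback(text); // hypothetical cloud path supplied by the caller
}
```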