Upgrade to Pro — share decks privately, control downloads, hide ads and more …

AI agents: Fine grained Voice-control for your app

AI agents: Fine grained Voice-control for your app

AI agents are everywhere, for a good reason! In this session, Max Marschall presents how you can augment your existing web app with client-side agents to control it via voice input, effectively improving accessibility for everyone and creating an assistant for your users. It will be able to navigate to data sets, jump to other pages, press buttons, and more. The capabilities & limits are set by us developers, controlling what is exposed to the LLM in terms of functionality and state.
Join Max to hear about the pitfalls and proven practices to reach a stable result for your next gen web app.

Max Schulte

November 12, 2024
Tweet

More Decks by Max Schulte

Other Decks in Programming

Transcript

  1. Artificial Intelligence Machine Learning Data Science NLP A Deep Learning

    Generative AI Large Language Models Image / Video Generation
  2. Bla bla bla … “Augmenting an App with an assistant,

    that helps the user based on displayed context. Allowing an assistant to act for me in that context.” Intention
  3. Intention Bla bla bla … “Augmenting an App with an

    assistant, that helps the user based on displayed context. Allowing an assistant to act for me in that context.” Problem: Integration GenAI Metainformation Tool calls Leaking secrets System integration Solutions AST parsing TS Decorators Client-side "agents" Barebones Angular OpenAI
  4. User Prompt • Voice input • Text input • Image

    input • Unstructured • Universal interface • Natural language as first-class input
  5. • Prompts • RAG • Sanitising • Guards • Vector-Database

    • Tool selection • Additional data (Processing)
  6. • System prompt • Base information • Rules • Intention

    • Output format Processing You are an friendly helpfull assistent proficient in creating and managing forms. You will receive information on the state of an app. If you feel there a … System Prompt https://docs.anthropic.com/en/release-notes/system-prompts#oct-22nd-2024 You are a data analyst API capable of sentiment analysis that responds in JSON. The JSON schema should include { "sentiment_analysis": { "sentiment": "string (positive, negative, neutral)", "confidence_score": "number (0-1)" # Include additional fields as required } https://console.groq.com/docs/text-chat Prompts
  7. • State • Additional information • Generation examples Context /**

    Additional Information: basic list of static pages shown in the side navigation. */ [ { title: 'Contributions', icon: 'question_answer', link: ['/contributions'] }, { title: 'Conferences', icon: 'podium', link: ['/conferences'] }, { title: 'Speakers', icon: 'person', link: ['/speakers'] }, { title: 'Collections', icon: 'category', link: ['/collections'] }, ];
  8. Tools & Functions /** * Accepts a resolved sitemap entry

    and navigates to it. * * e.g. path: /collections/:id * ":id" is replaced by an actual entity id. */ public navigate(resolvedPath: string): void { this.router.navigateByUrl('/' + resolvedPath); }
  9. Tools & Functions /** * Accepts a resolved sitemap entry

    and navigates to it. * * e.g. path: /collections/:id * ":id" is replaced by an actual entity id. */ @AiTool public navigate(resolvedPath: string): void { this.router.navigateByUrl('/' + resolvedPath); } { "type": "function", "function": { "name": "navigate", "description": "Accepts a resolved sitemap entry …, "parameters": { "type": "object", "properties": { "resolvedPath": { "title": "resolvedPath", "description": "", "type": "string" } } }, "required": ["resolvedPath"], } } AST parsing
  10. Tools & Functions { "HeaderFormField": { "description": "Header shown in

    a form. The label is… "type": "object", "properties": { "type": { "description": "Type of form field.", "type": "string", "const": "title" }, … "required": { "description": "Whether the field is required… "type": "boolean" }, … }, "required": ["key", "label", "type"] }, /** Base type for all form field types. */ interface BaseFormField { /** Type of form field. */ type: FormFieldType; /** Description of the Field. What it is and what values should be there. */ description?: string; /** Form field label shown to the user. */ label: string; /** Whether the field is required. Defaults to false. */ required?: boolean; /** Key to store the value under. */ key: string; } /** Header shown in a form. The label is used as its text. */ interface HeaderFormField extends BaseFormField { type: FormFieldType.Title; }
  11. Tools & Functions /** * Creates form fields for a

    provided list of formField definition. */ @AiTool createFormFields(fields: FormField[]): FormField[] { return fields; } /** Union of all supported form field types. */ export type FormField = | HeaderFormField | SubheaderFormFiel | … { "type": "function", "className": "FormBuilderService", "classDocumentation": "", "function": { "name": "createFormFields", "title": "createFormFields", "type": "function", "description": "Creates form fields for a provided list… "parameters": { "type": "object", "properties": { "fields": { "title": "fields", "description": "Union of all supported form… "anyOf": [ { "$ref": "#/definitions/HeaderFormField" }, …
  12. Large Language Model • Transformer Neural Network • Context &

    relation aware • Pattern matching & recognition • “Natural” language understanding • Generating / creating content • Summarisation
  13. Large Language Model • It cannot make decisions • It

    cannot understand non-linear content • It cannot draw conclusions
  14. Agents /** * Automates a process based on a user

    query. * @param content - The query from the user. */ async automate(content: string): Promise<void> { // get the current available tools and application state const { tools, state } = this.createToolState(); const message: ChatCompletionMessageParam = { role: 'user', content, }; // make backend request, choices that the model makes // -> messages and tool calls const choices = await this.aiBackend.automate([message], tools, state, trace); if (!choices) { … handle error } // loop and automate, handle llm choice internally for (const choice of choices) { this.handleChoice(choice, [message], tools, state, trace); } }
  15. Agents /** * Automates a process based on a user

    query. * @param content - The query from the user. */ async automate(content: string): Promise<void>
  16. /** * Automates a process based on a user query.

    * @param content - The query from the user. */ async automate(content: string): Promise<void> Agents /** * Handles a choice made by the AI model. * @param choice - The choice made by the model. * @param chat - The current chat history. * @param tools - Available tools for the AI. * @param state - Current application state. * @param trace - Langfuse trace client for logging. */ private async handleChoice( choice: Choice, chat: ChatCompletionMessageParam[], tools: Tool[], state?: any, trace?: LangfuseTraceClient, ): Promise<void> { // handle tool call, can be one or more if (choice.finish_reason === 'tool_calls') { // display information to the user this.createToolMessage(choice).forEach(message => this.message$.next(message)); // make the actual function call and create a message from it const toolResult = this.handleToolCalls(choice); // respond to the API with made calls if (toolResult) { chat.push(choice.message, ...toolResult); (await this.aiBackend.automate(chat, tools, state, trace)).map(choice => this.handleChoice(choice, chat, tools, state, trace), ); } } else { this.message$.next({ role: choice.message.role, message: choice.message.content ?? 'NO MESSAGE', });
  17. /** * Automates a process based on a user query.

    * @param content - The query from the user. */ async automate(content: string): Promise<void> Agents /** * Handles a choice made by the AI model. * @param choice - The choice made by the model. * @param chat - The current chat history. * @param tools - Available tools for the AI. * @param state - Current application state. * @param trace - Langfuse trace client for logging. */ private async handleChoice( choice: Choice, chat: ChatCompletionMessageParam[], tools: Tool[], state?: any, trace?: LangfuseTraceClient, ): Promise<void> { // handle tool call, can be one or more if (choice.finish_reason === 'tool_calls') { // display information to the user this.createToolMessage(choice).forEach(message => this.message$.next(message)); // make the actual function call and create a message from it const toolResult = this.handleToolCalls(choice); // respond to the API with made calls if (toolResult) { chat.push(choice.message, ...toolResult); (await this.aiBackend.automate(chat, tools, state, trace)).map(choice => this.handleChoice(choice, chat, tools, state, trace), ); } } else { this.message$.next({ role: choice.message.role, message: choice.message.content ?? 'NO MESSAGE', }); /** * Handles tool calls made by the AI model. * @param param0 - Object containing the message with tool calls. * @returns An array of tool message params. */ private handleToolCalls({ message }: Pick<Choice, 'message'>): ChatCompletionToolMessageParam[] { // iterate through tool calls in choices. return message.tool_calls!.map(call => { // find the matching instance for this tool const { name, guid } = this.splitFunctionName(call.function.name); const toolInstance = this.findToolInstance(name, guid); let content = ''; // call the instance function with params const instance = toolInstance?.instance as any; if (instance && name in instance) { const toolFunction = instance[name]; content = JSON.stringify( toolFunction.call(instance, ...Object.values(JSON.parse(call.function.arguments))) ?? '', ); // message user about tool call results … } return { content, tool_call_id: call.id, role: 'tool', name: call.function.name, }; }); }
  18. Agents /** * Automates a process based on a user

    query. * @param content - The query from the user. */ async automate(content: string): Promise<void> /** * Handles a choice made by the AI model. * @param choice - The choice made by the model. * @param chat - The current chat history. * @param tools - Available tools for the AI. * @param state - Current application state. * @param trace - Langfuse trace client for logging. */ private async handleChoice( choice: Choice, chat: ChatCompletionMessageParam[], tools: Tool[], state?: any, trace?: LangfuseTraceClient, ): Promise<void> { // handle tool call, can be one or more if (choice.finish_reason === 'tool_calls') { // display information to the user this.createToolMessage(choice).forEach(message => this.message$.next(message)); // make the actual function call and create a message from it const toolResult = this.handleToolCalls(choice); // respond to the API with made calls if (toolResult) { chat.push(choice.message, ...toolResult); (await this.aiBackend.automate(chat, tools, state, trace)).map(choice => this.handleChoice(choice, chat, tools, state, trace), ); } } else { this.message$.next({ role: choice.message.role, message: choice.message.content ?? 'NO MESSAGE', }); /** * Handles tool calls made by the AI model. * @param param0 - Object containing the message with tool calls. * @returns An array of tool message params. */ private handleToolCalls({ message }: Pick<Choice, 'message'>): ChatCompletionToolMessageParam[] { // iterate through tool calls in choices. return message.tool_calls!.map(call => { // find the matching instance for this tool const { name, guid } = this.splitFunctionName(call.function.name); const toolInstance = this.findToolInstance(name, guid); let content = ''; // call the instance function with params const instance = toolInstance?.instance as any; if (instance && name in instance) { const toolFunction = instance[name]; content = JSON.stringify( toolFunction.call(instance, ...Object.values(JSON.parse(call.function.arguments))) ?? '', ); // message user about tool call results … } return { content, tool_call_id: call.id, role: 'tool', name: call.function.name, }; }); }
  19. Agents /** * Automates a process based on a user

    query. * @param content - The query from the user. */ async automate(content: string): Promise<void> /** * Handles a choice made by the AI model. * @param choice - The choice made by the model. * @param chat - The current chat history. * @param tools - Available tools for the AI. * @param state - Current application state. * @param trace - Langfuse trace client for logging. */ private async handleChoice( choice: Choice, chat: ChatCompletionMessageParam[], tools: Tool[], state?: any, trace?: LangfuseTraceClient, ): Promise<void> { // handle tool call, can be one or more if (choice.finish_reason === 'tool_calls') { // display information to the user this.createToolMessage(choice).forEach(message => this.message$.next(message)); // make the actual function call and create a message from it const toolResult = this.handleToolCalls(choice); // respond to the API with made calls if (toolResult) { chat.push(choice.message, ...toolResult); (await this.aiBackend.automate(chat, tools, state, trace)).map(choice => this.handleChoice(choice, chat, tools, state, trace), ); } } else { this.message$.next({ role: choice.message.role, message: choice.message.content ?? 'NO MESSAGE', }); /** * Handles tool calls made by the AI model. * @param param0 - Object containing the message with tool calls. * @returns An array of tool message params. */ private handleToolCalls({ message }: Pick<Choice, 'message'>): ChatCompletionToolMessageParam[] { // iterate through tool calls in choices. return message.tool_calls!.map(call => { // find the matching instance for this tool const { name, guid } = this.splitFunctionName(call.function.name); const toolInstance = this.findToolInstance(name, guid); let content = ''; // call the instance function with params const instance = toolInstance?.instance as any; if (instance && name in instance) { const toolFunction = instance[name]; content = JSON.stringify( toolFunction.call(instance, ...Object.values(JSON.parse(call.function.arguments))) ?? '', ); // message user about tool call results … } return { content, tool_call_id: call.id, role: 'tool', name: call.function.name, }; }); }
  20. /** * Automates a process based on a user query.

    * @param content - The query from the user. */ async automate(content: string): Promise<void> Agents /** * Handles a choice made by the AI model. * @param choice - The choice made by the model. * @param chat - The current chat history. * @param tools - Available tools for the AI. * @param state - Current application state. * @param trace - Langfuse trace client for logging. */ private async handleChoice( choice: Choice, chat: ChatCompletionMessageParam[], tools: Tool[], state?: any, trace?: LangfuseTraceClient, ): Promise<void> { // handle tool call, can be one or more if (choice.finish_reason === 'tool_calls') { // display information to the user this.createToolMessage(choice).forEach(message => this.message$.next(message)); // make the actual function call and create a message from it const toolResult = this.handleToolCalls(choice); // respond to the API with made calls if (toolResult) { chat.push(choice.message, ...toolResult); (await this.aiBackend.automate(chat, tools, state, trace)).map(choice => this.handleChoice(choice, chat, tools, state, trace), ); } } else { this.message$.next({ role: choice.message.role, message: choice.message.content ?? 'NO MESSAGE', });
  21. Agents private async handleChoice( … ): Promise<void> { if (choice.finish_reason

    === 'tool_calls') { // display information to the user this.createToolMessage(choice).forEach(message => this.message$.next(message)); // make the actual function call and create a message from it const toolResult = this.handleToolCalls(choice); // respond to the API with made calls if (toolResult) { chat.push(choice.message, ...toolResult); (await this.aiBackend.automate(chat, tools, state, trace)).map(choice => this.handleChoice(choice, chat, tools, state, trace), ); } } else { … show result message to user } } } /** * Automates a process based on a user query. * @param content - The query from the user. */ async automate(content: string): Promise<void>
  22. Agents private async handleChoice( … ): Promise<void> { if (choice.finish_reason

    === 'tool_calls') { // display information to the user this.createToolMessage(choice).forEach(message => this.message$.next(message)); // make the actual function call and create a message from it const toolResult = this.handleToolCalls(choice); // respond to the API with made calls if (toolResult) { chat.push(choice.message, ...toolResult); (await this.aiBackend.automate(chat, tools, state, trace)).map(choice => this.handleChoice(choice, chat, tools, state, trace), ); } } else { … show result message to user } } } /** * Automates a process based on a user query. * @param content - The query from the user. */ async automate(content: string): Promise<void>
  23. Agents /** * Handles a choice made by the AI

    model. */ private async handleChoice(…): Promise<void> /** * Automates a process based on a user query. * @param content - The query from the user. */ async automate(content: string): Promise<void> /** * Handles tool calls made by the AI model. * @param param0 - Object containing the message with tool calls. * @returns An array of tool message params. */ private handleToolCalls({ message }: Pick<Choice, 'message'>): ChatCompletionToolMessageParam[] {
  24. Tools from Codebase /** Base type for all form field

    types. */ interface BaseFormField { /** Type of form field. */ type: FormFieldType; /** Description of the Field. What it is and what values should be there. */ description?: string; /** Form field label shown to the user. */ label: string; /** Whether the field is required. Defaults to false. */ required?: boolean; /** Key to store the value under. */ key: string; } /** Header shown in a form. The label is used as its text. */ interface HeaderFormField extends BaseFormField { type: FormFieldType.Title; } { "HeaderFormField": { "description": "Header shown in a form. The label is… "type": "object", "properties": { "type": { "description": "Type of form field.", "type": "string", "const": "title" }, … "required": { "description": "Whether the field is required… "type": "boolean" }, … }, "required": ["key", "label", "type"] }, AST Parsing
  25. Tools from Codebase /** * Retrieve a list of entries

    for a specific type. */ @AiTool public getCollection(collectionType: CollectionType): object[] { switch (collectionType.type) { case 'contributions': return this.entryStore.contributions(); case 'speakers': return this.entryStore.speakers(); case 'collections': return []; case 'conferences': return this.entryStore.conferences(); default: return []; } } { "type": "function", "classDocumentation": "", "function": { "title": "getCollection", "description": "Retrieve a list of entries for …”, "parameters": { "title": "collectionType", "description": "", "type": "object", "properties": { "type": { "description": "Type of different data sets…” "enum": [ “collections", "conferences", "contributions", "speakers" ], "type": "string" } }, "required": [ "type" AST Parsing
  26. @Directive({ selector: '[aiClick]', standalone: true, }) export class AiClickDirective {

    private element = inject(ElementRef); metaInfo = input.required<string>({ alias: 'aiClick' }); guid = input<string>(crypto.randomUUID()); constructor() { registerInstance(() => ({ guid: this.guid(), instance: this, state: this.metaInfo(), className: 'AiClickDirective', metaInfo: this.metaInfo(), })); } @AiTool click(): void { this.element.nativeElement.click(); } } Tools from Codebase
  27. export class AiClickDirective { … constructor() { registerInstance(() => ({

    guid: this.guid(), instance: this, state: this.metaInfo(), className: 'AiClickDirective', metaInfo: this.metaInfo(), })); } @AiTool click(): void { this.element.nativeElement.click(); } } Tools from Codebase
  28. export class AiClickDirective { … constructor() { registerInstance(() => …);

    } … } export function registerInstance(info: () => RegisterInfo) { const aiService = inject(AiService); const destroyRef = inject(DestroyRef); aiService.registerInstance(info); destroyRef.onDestroy(() => aiService.removeInstance(info)); } Tools from Codebase
  29. export class AiClickDirective { … constructor() { registerInstance(() => …);

    } … } export function registerInstance(info: () => RegisterInfo) { const aiService = inject(AiService); const destroyRef = inject(DestroyRef); aiService.registerInstance(info); destroyRef.onDestroy(() => aiService.removeInstance(info)); } private handleToolCalls({ message }: Pick<Choice, 'message'>): ChatCompletionToolMessageParam[] { // iterate through tool calls in choices. return message.tool_calls!.map(call => { // find the matching instance for this tool const { name, guid } = this.splitFunctionName(call.function.name); const toolInstance = this.findToolInstance(name, guid); let content = ''; // call the instance function with params const instance = toolInstance?.instance as any; if (instance && name in instance) { const toolFunction = instance[name]; content = JSON.stringify( toolFunction.call(instance, ...Object.values(JSON.parse(call.function.arguments))) ?? '', ); } … }); } Tools from Codebase
  30. export class AiClickDirective { … constructor() { registerInstance(() => …);

    } … } export function registerInstance(info: () => RegisterInfo) { const aiService = inject(AiService); const destroyRef = inject(DestroyRef); aiService.registerInstance(info); destroyRef.onDestroy(() => aiService.removeInstance(info)); } /** * Finds a tool instance based on the function name and optional GUID. * … */ private findToolInstance(name: string, guid?: string): RegisterInfo | undefined { const className = this.tools.find(tool => tool.function.name === name)?.className; return this.registeredInstances.find(info => { const { className: infoClassName, guid: infoGuid } = info(); return className === infoClassName && guid === infoGuid; })?.(); } Tools from Codebase
  31. • “Talking to fast learning & smart junior" • No

    default assumptions • Documentation is key • Examples are great Development Mindset
  32. • Avoid auto-commit • Enable feedback • Highlight probability •

    Clear entry point into AI workflows • Progress / generation indicators • Suggest common actions UI & UX
  33. • Client-side for PoC • Easy transition to server-side •

    Write docs for dummies • Unambiguous tools • Unambiguous context • An Agent is just a loop LLMs do not actually call functions - you do! Lessons Learned