Five principles for building generative AI products

@arfon Product Club Five principles for building generative AI products

(Generative) AI is here…
Some working assumptions…
• The people around you are likely using it: whether it’s ChatGPT or GitHub Copilot, these are technologies people are using.
• There is a “there” there: generative AI can be genuinely useful when applied to the right problems.
• Moore’s law will continue to hold: models will likely become more capable, costs will reduce, more will be possible for less.
• People are going to try all sorts of crazy sh*t: just check your favourite tech news site.
• Product managers at GitHub should have opinions: and I’m sharing mine with you today.

Some further framing
Some working assumptions…
• LLMs are remarkably versatile: they can answer questions, summarise content, generate code, understand sentiment.
• They are changing how people work: the cost of exploring an idea is trending towards zero.
• There will be much more code in the future: including code written by many people who currently don’t write any (lower barrier to entry).
• Lots of tools (and models) to experiment with: GitHub Copilot, Azure AI, OpenAI, AWS Bedrock, Hugging Face, Replicate.
• There’s real work to do to build responsibly: please learn about responsible AI development. github/rai is a good place to start.

No decisions

Copilot in the ‘right seat’
AIs make mistakes, so building assistive experiences is significantly safer than building ones that take action directly. Copilot is designed to help you make better, faster decisions, not make decisions for you.

Design for failure

What is the cost of getting it wrong?
• If an error occurs, is the user likely to spot it?
• What is the cost to the user if they don't?
• If detected, how easy is it to dismiss an error and move on?
• Can *you* measure a good or bad outcome?

A quick aside about… AI/LLM hallucinations †
• An inherent characteristic of LLMs: they are *always* hallucinating, it’s just that sometimes those hallucinations are useful.
• Hallucinations are unavoidable: but they can be reduced through a variety of techniques including tuning, prompting and context.
• Mild hallucinations are very common: design as if they occur roughly 50% of the time. Severe ones are less common, especially with grounding.
• Lots of techniques and tools exist for mitigating them: this is an active area of research and development.
• Did I say they are unavoidable? Responsible development means only building systems where hallucinations aren’t dealbreakers.

Grounded in reality

Example of optimising the output of large language models
There are many, many options to improve model outputs. Retrieval-Augmented Generation is probably the best ‘bang for your buck’. Most use cases and customers *do not* require custom models.

Retrieval-Augmented Generation (diagram): the user’s request becomes prompt + user query; related information is found in sources such as Blackbird, Git/Spokes and the GitHub API/graph; the LLM then receives prompt + user query + retrieved context.

Retrieval-Augmented Generation †
• Lots of different ways to do the ‘retrieval’: keyword search, vector search, other ‘similarity-type’ searches.
• Essentially giving the model relevant input context: this steers the model outputs towards better/more relevant answers.
• Combined with links/citations can increase utility: allow users to follow relevant links and decide which of them are most relevant.
• Allows base models to ‘work’ for specialised areas: for example, if the topic being discussed isn’t ‘knowable’ from public information.
• Typically much cheaper than other optimisations: e.g., fine-tuned (or completely custom) models.
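
As a rough illustration of the retrieval flow described above, here is a minimal Python sketch that uses keyword search for the retrieval step. The `Doc` type, the example.com URLs and the helper names are hypothetical and not from the talk, and the call to the LLM itself is deliberately left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Doc:
    url: str   # hypothetical link, shown to the user as a citation
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: list[Doc], k: int = 2) -> list[Doc]:
    """Keyword search: rank docs by how many words they share with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(d.text)), d) for d in docs]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in matches[:k]]


def build_prompt(system_prompt: str, query: str, context: list[Doc]) -> str:
    """Assemble prompt + retrieved context + user query, numbering each source."""
    sources = "\n".join(
        f"[{i}] {d.url}\n{d.text}" for i, d in enumerate(context, 1)
    )
    return f"{system_prompt}\n\nContext:\n{sources}\n\nQuestion: {query}"


# Made-up corpus standing in for whatever the retrieval step searches.
DOCS = [
    Doc("https://example.com/docs/deploy",
        "Deploy the service with the deploy script after tests pass."),
    Doc("https://example.com/docs/oncall",
        "The on-call rotation is listed in the team handbook."),
    Doc("https://example.com/docs/tests",
        "Run the test suite before you deploy."),
]

if __name__ == "__main__":
    question = "How do I deploy the service?"
    hits = retrieve(question, DOCS)
    print(build_prompt("Answer using only the context and cite sources.",
                       question, hits))
```

Numbering the sources in the prompt is one simple way to let the model's answer point back at links the user can follow. Swapping `retrieve` for a vector search would leave `build_prompt` unchanged.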

Invite only

How can I help you today?
LLMs can power remarkable conversational experiences (e.g., natural language to source code). *But remember*, small hallucinations are very common. Only surprise if you can delight.

Explain like it’s me

Leveraging the LLM’s ability to personalise responses
• ELI5: Explain Like I’m Five.
• ELIKL: Explain Like I Know Lots.
• ELIAGPTDKMR: Explain Like I’m A Go Programmer That Doesn’t Know Much Ruby.
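
One way to picture this kind of personalisation is a prompt template that takes the audience as a parameter. The sketch below is only an assumption about how it might be done: the function name, the audience descriptions and the prompt wording are all hypothetical, and it only builds the prompt text without calling any model.

```python
# Hypothetical mapping from the acronyms above to audience descriptions.
AUDIENCES = {
    "ELI5": "a five-year-old",
    "ELIKL": "someone who knows lots about the topic",
    "ELIAGPTDKMR": "a Go programmer who doesn't know much Ruby",
}


def explain_like(audience: str, question: str) -> str:
    """Build a prompt asking for an explanation pitched at the given audience."""
    return (
        f"Explain the following to {audience}. "
        "Match their vocabulary and skip what they already know.\n\n"
        f"{question}"
    )


if __name__ == "__main__":
    print(explain_like(AUDIENCES["ELIAGPTDKMR"],
                       "What does attr_accessor do?"))
```

In a real product the audience description would presumably come from what is already known about the user rather than from a fixed table.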

No decisions • Design for failure • Grounded in reality • Invite only • Explain like it’s me

Thanks! [email protected] Come say hi in: #copilot • #copilot-api • #copilot-skills • #copilot-core-productivity

Arfon Smith
