apidays Australia 2023 - How We Built Our Generative AI Assistant: New Relic Grok, Peter Marelas, New Relic

apidays
October 24, 2023

apidays Australia 2023 - Platforms, Products, and People: The Power of APIs
October 11 & 12, 2023
https://www.apidays.global/australia/

How We Built Our Generative AI Assistant: New Relic Grok
Peter Marelas, Chief Architect and Head of Technical Specialists, APAC at New Relic

------

Check out our conferences at https://www.apidays.global/

Do you want to sponsor or talk at one of our conferences?
https://apidays.typeform.com/to/ILJeAaV8

Learn more on APIscene, the global media made by the community for the community:
https://www.apiscene.io

Explore the API ecosystem with the API Landscape:
https://apilandscape.apiscene.io/

Transcript

  1. How we built our Generative AI assistant: New Relic Grok. Peter Marelas, Chief Architect, APJ, New Relic.
  2. The New Relic cloud observability platform. Collect: applications, web, mobile, cloud, IoT, etc. Store: filter, enrich and build relationships (system, software, users, topology maps, etc.). Visualise: real-time dashboards, service maps, query builders, curated experiences. Analyse: correlation, causal analysis, trends, anomaly detection, real-time alerting, health indicators. No team silos (one pricing model for ubiquity and scale), no data silos (one purpose-built telemetry data cloud), no tool silos (all monitoring and security tools in one connected experience). Capabilities for infrastructure, security, DevOps, web, AI/ML, mobile, network and SRE teams span APM, infrastructure, Kubernetes, synthetics, serverless, model performance, network, browser, mobile, distributed tracing, log management and AIOps, full-stack observability on the Telemetry Data Platform.
  3. Motivation: the peak of the hype cycle creates customer expectations. (* Gartner Hype Cycle for Artificial Intelligence, 2023)
  4. Grok has 4 specific skills:
     Skill                                                      | Common NL prefix     | Tool             | Source of knowledge
     Answer questions about New Relic                           | How do I …           | NL 2 Docs        | New Relic documentation
     Answer questions about the user's data                     | What … / How many …  | NL 2 NRQL        | NRDB
     Check for problems or anomalies in the user's environment  | Are …                | NL 2 Anomalies   | NRDB
     Interpret the user's dashboards                            | What is …            | NL 2 Dashboards  | Dashboard definition, NRDB
  5. How does Grok decide which skill (tool) to use? For example, given "What is my transaction count?", the LLM is asked to pick the right tool for the instruction from a description of each tool: NL 2 Docs, NL 2 NRQL, NL 2 Dashboards or NL 2 Anomalies (here it picks NL 2 NRQL). A sketch of this routing step follows below.
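
A minimal sketch of that routing step, assuming a generic `complete` function that sends a prompt to an LLM and returns its text; the prompt wording and the `route_request` helper are illustrative, not New Relic's implementation.

```python
# Hypothetical sketch of LLM-based tool routing (not New Relic's actual code).
from typing import Callable

TOOLS = {
    "NL 2 Docs": "Answers questions about New Relic itself, e.g. 'How do I ...'",
    "NL 2 NRQL": "Answers questions about the user's own data, e.g. 'What ...', 'How many ...'",
    "NL 2 Anomalies": "Checks for problems or anomalies in the user's environment, e.g. 'Are ...'",
    "NL 2 Dashboards": "Interprets the user's dashboards, e.g. 'What is ...'",
}

def route_request(question: str, complete: Callable[[str], str]) -> str:
    """Ask the LLM to pick exactly one tool for the user's instruction."""
    tool_descriptions = "\n".join(f"- {name}: {desc}" for name, desc in TOOLS.items())
    prompt = (
        "Pick the single best tool for the user's request. "
        "Reply with the tool name only.\n"
        f"Available tools:\n{tool_descriptions}\n"
        f"User request: {question}"
    )
    choice = complete(prompt).strip()
    # Fall back to documentation search if the model replies with an unknown name.
    return choice if choice in TOOLS else "NL 2 Docs"
```
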
  6. How Grok processes NL 2 NRQL requests, e.g. "What is my transaction count?": ask the LLM to pick the most relevant tables for the user's question; get the schema for those tables as metadata; retrieve similar examples of question/NRQL pairs from a vector database; combine the metadata, examples and user's question into a prompt; ask the LLM to generate NRQL from the prompt; validate that the query is syntactically correct (if not, ask the LLM to correct it); execute the NRQL; pass the result back to the LLM to render a natural-text response; render a chart and natural-language response to the user. A sketch of this loop follows below.
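
A condensed sketch of the NL 2 NRQL loop above, following the shape the slide outlines; every helper (`pick_tables`, `get_schemas`, `find_similar_examples`, `validate_nrql`, `run_nrql`) is a stand-in stub, not a real New Relic API.

```python
# Hypothetical sketch of the NL 2 NRQL flow described on the slide.
from typing import Callable, Optional

def pick_tables(question: str, complete: Callable[[str], str]) -> list[str]:
    """LLM picks the most relevant tables; stubbed here."""
    return complete(f"Which event tables are relevant to: {question}?").split(",")

def get_schemas(tables: list[str]) -> str:
    return "\n".join(f"{t}: <schema placeholder>" for t in tables)

def find_similar_examples(question: str) -> str:
    return "<similar question/NRQL pairs from the vector database>"

def validate_nrql(nrql: str) -> Optional[str]:
    """Return an error message if the query looks invalid, else None; stubbed."""
    return None if nrql.upper().startswith(("SELECT", "FROM")) else "query must start with SELECT or FROM"

def run_nrql(nrql: str) -> list[dict]:
    return [{"count": 1234}]  # placeholder result from NRDB

def answer_with_nrql(question: str, complete: Callable[[str], str]) -> dict:
    schemas = get_schemas(pick_tables(question, complete))
    examples = find_similar_examples(question)
    prompt = f"{schemas}\n{examples}\nUser question: {question}\nNRQL query:"
    nrql = complete(prompt)
    for _ in range(2):                       # bounded correction loop
        error = validate_nrql(nrql)
        if error is None:
            break
        nrql = complete(f"Fix this NRQL query.\nQuery: {nrql}\nError: {error}")
    rows = run_nrql(nrql)
    summary = complete(f"Describe this result in plain English: {rows}")
    return {"nrql": nrql, "rows": rows, "summary": summary}
```
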
  7. How Grok processes NL 2 Docs requests, e.g. "How do I …?": convert the question to embeddings using the LLM; search the vector database for similar embeddings; extract the text passages associated with those embeddings; generate a prompt containing the question and the relevant passages; pass the prompt to the LLM to render a natural response from the passages; render the response to the user. This is in-context learning with Retrieval Augmented Generation (RAG); a sketch follows below.
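
A minimal, self-contained sketch of the NL 2 Docs RAG flow. The `embed` function is a toy stand-in for a real embedding model (the deck elsewhere mentions 1536-dimension embeddings), and the in-memory list stands in for the vector database.

```python
# Illustrative RAG sketch of the NL 2 Docs flow; not New Relic's implementation.
import hashlib
import math

def embed(text: str, dims: int = 32) -> list[float]:
    """Toy stand-in for a real embedding model (e.g. a 1536-dim model)."""
    digest = hashlib.sha256(text.lower().encode()).digest()
    return [b / 255.0 for b in digest[:dims]]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# "Vector database": passages indexed by their embeddings.
PASSAGES = ["To install the agent, run ...", "NRQL supports SINCE and UNTIL ..."]
INDEX = [(embed(p), p) for p in PASSAGES]

def answer_from_docs(question: str, complete) -> str:
    query_vec = embed(question)                                   # question -> embedding
    top = sorted(INDEX, key=lambda item: cosine(query_vec, item[0]), reverse=True)[:2]
    context = "\n".join(passage for _, passage in top)            # extract matching passages
    prompt = f"Answer using only these passages:\n{context}\nQuestion: {question}"
    return complete(prompt)                                       # LLM renders the answer
```
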
  8. What we want from an AI assistant: natural-language instructions in, deterministic output out, backed by specific knowledge DBs, specific rule interpreters and general output formats.
  9. What foundational LLMs offer instead: natural-language instructions in, but creative output out, backed by generic knowledge DBs, generic rule interpreters and generic output formats, rather than the specific knowledge DBs, specific rule interpreters and general output formats we want.
  10. Foundational LLM + Retrieval Augmented Generation closes the gap: natural-language instructions combined with a specific knowledge DB and specific rules produce deterministic output, instead of the creative output a foundational LLM gives over generic knowledge DBs, generic rule interpreters and generic output formats.
  11. What questions do our users want to ask? In a user study, 79% said they wanted to learn something about a capability or get insights from their own dataset.
  12. Finding the right prompts (prompt engineering): ongoing refinement for edge cases; add examples to the prompt (few-shot); add rules to the prompt; a feedback mechanism; a robust test harness; ROUGE and BERT scores; a 2nd LLM to assess quality. The NL 2 NRQL prompt combines a system instruction, rules, examples and the user's question, for example:

    <Context information>: You are an AI assistant specialized in translating user questions into New Relic Query Language (NRQL), with no knowledge of SQL. Given a user's question, information about the user, descriptions of event schemas, and examples of questions and answers, your task is to generate an appropriate NRQL query. The provided event schemas contain only the most relevant ones and you need to use only one. In the context of New Relic, an entity is a basic data reporting element, such as an application, host, or database service; each entity has a unique Guid, which is a base64-encoded unique identifier; if a user references an entity by its Guid, you should use it in the NRQL you generate, but if an entity Guid is not explicitly referenced, you should not use one in the query that you generate. The wording of the question should tell you whether the user wants totals or data over a time interval. Use the TIMESERIES clause in the NRQL query that you generate only if the user requests data over time or per day/hour. Otherwise, do not use it.

    <How to select time range in NRQL queries>: Every NRQL query should contain a SINCE and may contain an UNTIL clause, as this is the only viable way to select a time range in NRQL. If the SINCE clause is not used, the query uses the last 1 hour of data by default, but you should always use the SINCE clause in the query you generate, and if the time range is not explicitly specified, use SINCE 1 hour ago.

    <Examples of valid NRQL queries with time range selections>:
    <User question>: How many transactions happened today?
    <NRQL query>: SELECT count(*) FROM Transaction SINCE TODAY
    <User question>: How many transactions happened on 25th of April?
    <NRQL query>: FROM Transaction SELECT count(*) SINCE '2023-04-25 00:00:00' UNTIL '2023-04-25 23:59:59'
    <User question>: How many transactions happened in the previous calendar week?
    <NRQL query>: FROM Transaction SELECT COUNT(*) SINCE LAST WEEK UNTIL THIS WEEK
    <User question>: How many transactions happened on Monday?
    <NRQL query>: FROM Transaction SELECT count(*) SINCE MONDAY UNTIL TUESDAY
    <User question>: How many transactions per day occurred this year until 10 days ago?
    <NRQL query>: FROM Transaction SELECT count(*) SINCE THIS YEAR UNTIL 10 days ago TIMESERIES 1 day

    A sketch of assembling such a prompt follows below.
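
A sketch of assembling the prompt from the four parts the slide labels (system instruction, rules, examples, user question); the `build_nl2nrql_prompt` name, the `<Event schemas>` tag and the trimmed wording are assumptions made for brevity.

```python
# Illustrative few-shot prompt assembly for NL 2 NRQL.
SYSTEM_INSTRUCTION = (
    "You are an AI assistant specialized in translating user questions into "
    "New Relic Query Language (NRQL), with no knowledge of SQL."
)
RULES = (
    "Every NRQL query should contain a SINCE clause; if the time range is not "
    "explicitly specified, use SINCE 1 hour ago. Use TIMESERIES only if the "
    "user asks for data over time."
)
EXAMPLES = [
    ("How many transactions happened today?",
     "SELECT count(*) FROM Transaction SINCE TODAY"),
    ("How many transactions happened in the previous calendar week?",
     "FROM Transaction SELECT COUNT(*) SINCE LAST WEEK UNTIL THIS WEEK"),
]

def build_nl2nrql_prompt(question: str, schemas: str) -> str:
    shots = "\n".join(
        f"<User question>: {q}\n<NRQL query>: {a}" for q, a in EXAMPLES
    )
    return (
        f"<Context information>: {SYSTEM_INSTRUCTION}\n"
        f"<Event schemas>: {schemas}\n"
        f"<How to select time range in NRQL queries>: {RULES}\n"
        f"<Examples of valid NRQL queries with time range selections>:\n{shots}\n"
        f"<User question>: {question}\n<NRQL query>:"
    )
```
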
  13. Performance: high variance; track time to first token, time to intermediate token and time to last token; intermediate messages; distribute requests; cache some answers (a cache sketch follows below).
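
A sketch of the "cache some answers" tactic: an in-memory cache keyed by the normalized question so repeated questions skip the LLM round trip. The normalization and TTL are assumptions.

```python
# Illustrative answer cache; not New Relic's implementation.
import hashlib
import time

CACHE: dict[str, tuple[float, str]] = {}
TTL_SECONDS = 3600

def cached_answer(question: str, answer_fn) -> str:
    # Normalize whitespace and case before hashing so trivial variants hit the cache.
    key = hashlib.sha256(" ".join(question.lower().split()).encode()).hexdigest()
    hit = CACHE.get(key)
    if hit and time.time() - hit[0] < TTL_SECONDS:
        return hit[1]                      # serve a cached answer, skip the LLM
    answer = answer_fn(question)
    CACHE[key] = (time.time(), answer)
    return answer
```
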
  14. Cost of the LLM (Microsoft Azure OpenAI Service):
      Model            | Prompt (1000 tokens) | Completion (1000 tokens)
      GPT-4            | $0.03                | $0.06
      Ada (embeddings) | $0.0001              |

      Query   | Avg prompt tokens | Avg completion tokens | Avg cost per e2e request
      NL2Docs | 3016              | 568                   | $0.13
      NL2NRQL | 6516              | 118                   | $0.20

      New Relic Grok users | Daily docs requests | Daily NL2NRQL requests | Monthly cost
      1 user               | 5                   | 5                      | $49
      100 users            | 500                 | 500                    | $4,900
      10,000 users         | 50,000              | 50,000                 | $490,000

      The arithmetic is reproduced in the sketch below.
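
The arithmetic behind the table, as a sketch. It assumes GPT-4 pricing of $0.03 per 1,000 prompt tokens and $0.06 per 1,000 completion tokens, which is what the per-request costs in the table imply.

```python
# Back-of-the-envelope reproduction of the cost table on this slide.
PROMPT_PRICE, COMPLETION_PRICE = 0.03 / 1000, 0.06 / 1000   # $ per token (assumed GPT-4 rates)

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    return prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE

nl2docs = request_cost(3016, 568)    # roughly $0.12-0.13
nl2nrql = request_cost(6516, 118)    # roughly $0.20

# 5 docs + 5 NRQL requests per user per day, ~30 days in a month.
monthly_per_user = 30 * 5 * (nl2docs + nl2nrql)
print(f"NL2Docs ${nl2docs:.2f}, NL2NRQL ${nl2nrql:.2f}, per user/month ${monthly_per_user:.0f}")
# Scales linearly: ~$4,900 for 100 users, ~$490,000 for 10,000 users.
```
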
  15. What's next for Grok? Improve the quality of responses; experiment with our own models; develop a new skill (NL2Config).
  16. Peter Marelas, Chief Architect, APJ. https://www.linkedin.com/in/peter-marelas
  17. Deprecation of LLMs: a robust test harness; ROUGE (quantifies the overlap of words between generated output and reference text); BERTScore (semantic similarity); or use GPT-4 to evaluate (at a cost). An evaluation sketch follows below.
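
A sketch of scoring generated output against a reference with the two metrics the slide names, using the open-source rouge-score and bert-score packages; using these particular packages is an assumption, the slide only names the metrics.

```python
# Illustrative regression check for generated answers against references,
# useful before switching or upgrading the underlying LLM.
from rouge_score import rouge_scorer
from bert_score import score as bert_score

def evaluate(prediction: str, reference: str) -> dict:
    # ROUGE: word overlap between generated output and reference text.
    rouge = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
    rouge_f1 = rouge.score(reference, prediction)["rougeL"].fmeasure

    # BERTScore: semantic similarity from contextual embeddings.
    _, _, f1 = bert_score([prediction], [reference], lang="en")
    return {"rougeL_f1": rouge_f1, "bertscore_f1": float(f1[0])}

print(evaluate("SELECT count(*) FROM Transaction SINCE TODAY",
               "FROM Transaction SELECT count(*) SINCE TODAY"))
```
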
  18. LLM rate limits: a 40,000 token-per-minute limit divided by roughly 5,200 tokens per request allows about 7 requests/min. Mitigations: use multiple endpoints; queue requests; distribute requests; limit max completion tokens (they count towards the token-per-minute limit). A token-budget sketch follows below.
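
A sketch of a token-per-minute budget matching the slide's arithmetic (40,000 TPM over ~5,200 tokens per request is roughly 7 requests/min); the sliding-window implementation is an assumption, not New Relic's queueing code.

```python
# Illustrative token-per-minute throttle for one LLM endpoint.
import time
from collections import deque

class TokenBudget:
    def __init__(self, tokens_per_minute: int = 40_000):
        self.limit = tokens_per_minute
        self.window: deque[tuple[float, int]] = deque()  # (timestamp, tokens used)

    def acquire(self, tokens: int) -> None:
        """Block until `tokens` fit inside the last 60 seconds of usage."""
        while True:
            now = time.monotonic()
            while self.window and now - self.window[0][0] > 60:
                self.window.popleft()                    # drop usage older than a minute
            used = sum(t for _, t in self.window)
            if used + tokens <= self.limit:
                self.window.append((now, tokens))
                return
            time.sleep(1)  # wait for the oldest requests to fall out of the window

budget = TokenBudget()
budget.acquire(5_200)  # an average Grok request; about 7 of these fit per minute
```
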
  19. LLM context length limits: GPT-4 has an 8,192-token context length; prompt + completion must stay within the context length to avoid hallucinations; transform the prompt to save tokens: remove extra spaces, remove pronouns, convert JSON to CSV (a sketch follows below).
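
A sketch of two of the prompt-shrinking transforms the slide lists: collapsing whitespace and flattening JSON records to CSV so field names are not repeated per record. The function names are illustrative.

```python
# Illustrative prompt compression helpers.
import csv
import io
import json
import re

def squeeze_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()

def json_records_to_csv(records_json: str) -> str:
    """Flatten a JSON array of objects to CSV: keys appear once, in the header."""
    records = json.loads(records_json)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue().strip()

rows = '[{"app": "checkout", "errors": 12}, {"app": "search", "errors": 3}]'
print(json_records_to_csv(rows))
# app,errors
# checkout,12
# search,3
```
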
  20. LLMs have no knowledge after 2021: use in-context learning and pass the question plus relevant docs; answers are only as good as the algorithm used to find the relevant docs; cross-encoder re-ranking improves it (a sketch follows below).
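
A sketch of cross-encoder re-ranking, using the sentence-transformers library and a public MS MARCO cross-encoder model as assumed stand-ins; the slide names the technique, not the tooling. A first-pass vector search supplies candidates, then the cross-encoder scores each (question, passage) pair jointly.

```python
# Illustrative cross-encoder re-ranking of retrieved passages.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(question: str, candidates: list[str], top_k: int = 3) -> list[str]:
    # Score every (question, passage) pair jointly, then keep the best top_k.
    scores = reranker.predict([(question, passage) for passage in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [passage for passage, _ in ranked[:top_k]]

docs = ["How to install the Python agent ...", "NRQL time range clauses ...",
        "Alert policy configuration ..."]
print(rerank("How do I set a time range in NRQL?", docs, top_k=2))
```
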
  21. How are similar documents and examples found? Indexing: documents are split into passages, each passage is converted to a text embedding (a 1536-dimension vector, e.g. [0.354, 0.234, …, 0.87]) and stored in a vector DB, indexed by embedding. Search: the search text is converted to an embedding the same way and the vector DB returns the top-K matches by maximal marginal relevance (a sketch follows below).
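
A sketch of top-K selection by maximal marginal relevance, the criterion the slide names: each pick balances similarity to the query against redundancy with passages already selected. A real vector database would typically do this server-side over the stored 1536-dimension embeddings; this plain-Python version is illustrative.

```python
# Illustrative maximal marginal relevance (MMR) selection over embeddings.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mmr(query_vec, candidates, k: int = 3, lam: float = 0.7) -> list[int]:
    """Return indices of k candidates that are relevant to the query yet diverse."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i: int) -> float:
            relevance = cosine(query_vec, candidates[i])
            redundancy = max((cosine(candidates[i], candidates[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

query = [1.0, 0.0]
cands = [[0.9, 0.1], [0.95, 0.05], [0.1, 0.9]]
print(mmr(query, cands, k=2))  # indices of the 2 selected passages
```
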
  22. What tools do we use? An LLM + a vector database + LLM logic.