Q&A Session: Graphing IATI Aid Activity Data for LLM-Powered Chatbots

Jennifer Reif, Neo4j
@JMHReif • github.com/JMHReif • jmhreif.com • linkedin.com/in/jmhreif
Photo by Igor Omilaev on Unsplash

Who Am I?
• Neo4j Developer Advocate
• Java/JVM technologies
• Conference speaker
• Technical blog writer
• All-around geek

Using LLMs
BEWARE: hallucinations!
• Send user question to LLM
• Positives:
  • Natural language response
  • Broad knowledge (crawl the internet)
• Negatives:
  • Might not have latest data
  • Can hallucinate when unsure

IATI
Format, options, etc.
• Need subscription (free tier available)
• Single data sets very flat, non-graph
• API provides most value
• API has learning curve (query params)
• XML, JSON, or CSV responses

    <iati-activity last-updated-datetime="2023-11-20T07:19:47+02:00" xml:lang="en" default-currency="USD" humanitarian="0" hierarchy="1">
      <iati-identifier>SE-0-SE-6-7600051401-BIH-16061</iati-identifier>
      <reporting-org ref="SE-0" type="10" secondary-reporter="0">
        <narrative>Sweden</narrative>
      </reporting-org>
      <title>
        <narrative>Music high schools Sw-BiH</narrative>
        <narrative xml:lang="sv">Musikhögskolor Sw-BiH</narrative>
      </title>
      <description type="1">
        <narrative>Cooperation between the Music Academy in Sarajevo and the Royal Music High School of Sthlm, the support concerns a visit with the aim to plan the cooperation during 1999.</narrative>
        <narrative xml:lang="sv">Samarbete mellan Kungliga Musikhögskolan i Stockholm och Musikakademin i Sarajevo, bidraget avser främst en planeringsresa för samarbetet under 1999.</narrative>
      </description>

https://developer.iatistandard.org/api-details#api=datastore&operation=query
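
To make the "flat, non-graph" point concrete, here is a minimal Python sketch that pulls a few fields out of one activity record like the one above. It uses only the standard library; the helper names (`narrative`, `summarize`) and the shape of the returned dictionary are illustrative assumptions, not part of the talk or of the IATI tooling.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the reserved xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Trimmed copy of the record shown on the slide.
SAMPLE = """<iati-activity xml:lang="en" default-currency="USD" humanitarian="0" hierarchy="1">
  <iati-identifier>SE-0-SE-6-7600051401-BIH-16061</iati-identifier>
  <reporting-org ref="SE-0" type="10" secondary-reporter="0">
    <narrative>Sweden</narrative>
  </reporting-org>
  <title>
    <narrative>Music high schools Sw-BiH</narrative>
    <narrative xml:lang="sv">Musikhögskolor Sw-BiH</narrative>
  </title>
</iati-activity>"""


def narrative(parent, lang=None):
    """Return the first <narrative> text whose xml:lang equals `lang`.

    lang=None selects the narrative without an explicit xml:lang,
    i.e. the one in the activity's default language.
    """
    for node in parent.findall("narrative"):
        if node.get(XML_LANG) == lang:
            return (node.text or "").strip()
    return None


def summarize(activity_xml):
    """Flatten one activity into a dict (hypothetical shape for illustration)."""
    activity = ET.fromstring(activity_xml)
    org = activity.find("reporting-org")
    title = activity.find("title")
    return {
        "id": activity.findtext("iati-identifier"),
        "reporting_org_ref": org.get("ref"),
        "reporting_org_name": narrative(org),
        "title": narrative(title),
        "title_sv": narrative(title, "sv"),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Each record carries its organization inline, so the same organization is repeated across many activities; that repetition is what the graph model later collapses into shared nodes.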

Neo4j
Schema-free
• Schema-flexible
• Makes refactoring easier and faster
• Queries with Cypher
• APOC utility library, for the win!
• Construct Cypher statements to create data according to your model

Import
Headers = 🤕
• Cloud DBaaS (Aura) blocks procedures accepting headers
• Local dbs or alternate hosting required (not Aura)
• APOC = 🛟
  • apoc.load.xml(url, '', {headers: {abc: blah, def: blah2}})
  • apoc.load.jsonParams(url, {abc: blah, def: blah2}, null)
https://neo4j.com/docs/apoc/current/overview/apoc.load/
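
As a rough illustration of passing headers to APOC, the Python sketch below assembles the request URL and a parameterized `apoc.load.xml` call. The endpoint URL, the `reporting_org_ref` query field, and the `Ocp-Apim-Subscription-Key` header name are assumptions made for the example and should be checked against the IATI API documentation; `build_load_xml_call` is a hypothetical helper, not something from the talk.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the IATI Datastore API docs.
BASE_URL = "https://api.iatistandard.org/datastore/activity/iati"


def build_load_xml_call(base_url, query_params, headers):
    """Return (cypher, params) for an apoc.load.xml call that sends HTTP headers.

    The URL and headers travel as query parameters, so the subscription
    key never has to be pasted into the Cypher text itself.
    """
    url = f"{base_url}?{urlencode(query_params)}"
    cypher = (
        "CALL apoc.load.xml($url, '', {headers: $headers}) "
        "YIELD value "
        "RETURN value"
    )
    return cypher, {"url": url, "headers": headers}


if __name__ == "__main__":
    cypher, params = build_load_xml_call(
        BASE_URL,
        {"q": "reporting_org_ref:SE-0", "rows": 5},   # assumed field name
        {"Ocp-Apim-Subscription-Key": "<your-key>"},  # assumed header name
    )
    print(cypher)
    print(params["url"])
```

The returned pair can be handed to whichever Neo4j client is in use; as the slide notes, this only works on a local or self-hosted database where the procedure is allowed to accept headers.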

Data Import
Draft the data model
• Nodes: Activity, Organization, Sector, …?
• Relationships: this is the value!

Import statement outline
• Construct the URL and headers
• UNWIND list of activities (and related properties)
• Create (MERGE) each activity node
• UNWIND list of organizations
• Create (MERGE) each org node
• Create relationship: Activity<-Organization
• UNWIND list of sectors
• Create (MERGE) each sector node
• Create relationship: Activity->Sector
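
The outline above can be mimicked in memory to see what MERGE buys you. This is a toy Python sketch, not the actual import: the input dictionaries are a hypothetical, already-parsed shape, the second activity and both sector entries are made-up sample data, and `merge_node` only approximates MERGE semantics (create if absent, otherwise reuse).

```python
def merge_node(graph, label, key, props):
    """Create the node only if (label, key) is new, like MERGE with on-create props."""
    node_id = (label, key)
    if node_id not in graph["nodes"]:
        graph["nodes"][node_id] = dict(props)
    return node_id


def import_activities(activities):
    """Walk the same steps as the outline: activities, then orgs, then sectors."""
    graph = {"nodes": {}, "rels": set()}  # a set keeps relationships unique
    for activity in activities:                      # UNWIND activities
        act = merge_node(graph, "Activity", activity["id"],
                         {"title": activity["title"]})
        for org in activity["orgs"]:                 # UNWIND organizations
            org_node = merge_node(graph, "Organization", org["ref"],
                                  {"name": org["name"]})
            graph["rels"].add((org_node, "PARTICIPATES_IN", act))
        for sector in activity["sectors"]:           # UNWIND sectors
            sec_node = merge_node(graph, "Sector", sector["code"],
                                  {"name": sector["name"]})
            graph["rels"].add((act, "OCCURS_IN", sec_node))
    return graph


# First activity uses the id/title/org from the slide; everything else is invented.
SAMPLE_ACTIVITIES = [
    {"id": "SE-0-SE-6-7600051401-BIH-16061", "title": "Music high schools Sw-BiH",
     "orgs": [{"ref": "SE-0", "name": "Sweden"}],
     "sectors": [{"code": "X1", "name": "Example sector"}]},
    {"id": "EXAMPLE-ACTIVITY-2", "title": "Made-up second activity",
     "orgs": [{"ref": "SE-0", "name": "Sweden"}],
     "sectors": [{"code": "X1", "name": "Example sector"}]},
]

if __name__ == "__main__":
    graph = import_activities(SAMPLE_ACTIVITIES)
    print(len(graph["nodes"]), "nodes,", len(graph["rels"]), "relationships")
```

Two flat records that each repeat the same organization and sector end up sharing one Organization node and one Sector node, which is where the relationships start to carry value.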

// Outline only: each "…" marks steps elided on the slide
WITH url
CALL apoc.load.xml(url, "", {headers: {<headers>}}) YIELD value
…
UNWIND activities AS activity
WITH activity, activity._children AS details
…
UNWIND titles, descriptions
…
// Activity node, keyed on the IATI identifier
CALL apoc.merge.node(["Activity"], {id: id._text}, {title: title._text, description: descr._text}) YIELD node AS actNode
WITH details, actNode
UNWIND orgs AS org
…
CALL apoc.merge.node(["Organization"], {ref: org.ref}, {name: name._text}) YIELD node AS partOrg
…
MERGE (actNode)<-[r:PARTICIPATES_IN]-(partOrg)
…
UNWIND sectors AS sector
…
CALL apoc.merge.node(["Sector"], {code: sector.code}, {name: name._text}) YIELD node AS secNode
…
// Activity->Sector, matching the outline
MERGE (actNode)-[r2:OCCURS_IN]->(secNode)
RETURN *
https://neo4j.com/labs/apoc/4.1/overview/apoc.merge/apoc.merge.node/

RAG
Retrieval Augmented Generation
• Can incorporate recent data
• Reduce hallucinations
• Use LLM to style results as natural language
• Option: use Neo4j as a vector database
  • Vector indexes available
  • Possible chunking for large amounts of text
  • Retrieve entities based on similarity search
• Prompt engineering may or may not help further
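
To show what "retrieve entities based on similarity search" means, here is a small Python sketch of cosine-similarity ranking. It is a brute-force stand-in, assuming tiny hand-made 3-dimensional vectors; a real setup would use model-generated embeddings and a Neo4j vector index rather than a Python loop. The document ids and vectors are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, docs, k=2):
    """Return the k (score, doc_id) pairs most similar to the query, best first."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in docs.items()]
    return sorted(scored, reverse=True)[:k]


# Toy "embeddings"; real ones have hundreds or thousands of dimensions.
DOCS = {
    "music-education": [0.9, 0.1, 0.0],
    "water-supply": [0.0, 0.2, 0.9],
    "teacher-training": [0.7, 0.3, 0.1],
}

if __name__ == "__main__":
    for score, doc_id in top_k([1.0, 0.0, 0.0], DOCS):
        print(f"{doc_id}: {score:.3f}")
```

The query vector points mostly along the first dimension, so the two documents that also load on that dimension rank ahead of the unrelated one.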

Chatbot architecture
Courtesy: Tomaz Bratanic
https://neo4j.com/developer-blog/knowledge-graph-based-chatbot-with-gpt-3-and-neo4j/

Chatbot architecture
Courtesy: Tomaz Bratanic
https://neo4j.com/developer-blog/context-aware-knowledge-graph-chatbot-with-gpt-4-and-neo4j/

RAG steps (with vector)
User->LLM->Neo4j->LLM->User
• Put data into Neo4j
• Create embeddings (OpenAI or other model)
• Save those embeddings as vectors in Neo4j
• LLM creates vector for user question
• Use Neo4j vector index to find similar documents
• Returns similar documents to user
https://neo4j.com/developer-blog/building-educational-chatbot-neo4j/
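
The query-time half of these steps might be wired together roughly as follows. This is a sketch under assumptions: `embed`, `search`, and `generate` are hypothetical callables you would supply (an embedding model, a function that runs Cypher against Neo4j, and an LLM call), the index name `activity_embeddings` and the `title`/`description` properties are placeholders, and the Cypher relies on the `db.index.vector.queryNodes` procedure available in recent Neo4j 5 releases.

```python
# Cypher for the similarity lookup; index and property names are placeholders.
VECTOR_QUERY = (
    "CALL db.index.vector.queryNodes($index, $k, $embedding) "
    "YIELD node, score "
    "RETURN node.title AS title, node.description AS description, score"
)


def build_prompt(question, documents):
    """Put the retrieved documents in front of the question for the LLM."""
    context = "\n".join(
        f"- {doc['title']}: {doc['description']}" for doc in documents
    )
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


def answer(question, embed, search, generate, k=3):
    """User -> embedding -> Neo4j vector search -> LLM -> user."""
    embedding = embed(question)                       # vector for the question
    documents = search(                               # similar nodes from Neo4j
        VECTOR_QUERY,
        {"index": "activity_embeddings", "k": k, "embedding": embedding},
    )
    return generate(build_prompt(question, documents))  # natural-language reply
```

Injecting the three callables keeps the flow testable without an API key or a running database; in practice `search` would wrap a Neo4j driver session and return each record as a dictionary.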

Resources
• Neo4j GraphAcademy: self-paced courses (2 new LLM courses!)
• IATI: playground
• Neo4j APOC: load xml
• Embeddings: OpenAI docs
• Neo4j Vectors: search index docs
• Chatbot examples:
  • Blog post: Knowledge-graph based chatbot
  • Blog post: Context-aware chatbot
  • Blog post: Educational chatbot
