Hallucination-free zone:
LLMs + Graph Databases got your back!
Photo by fabio on Unsplash
Jennifer Reif
jennifer.reif@neo4j.com
@JMHReif
github.com/JMHReif
jmhreif.com
linkedin.com/in/jmhreif
Slide 2
Who is Jennifer Reif?
Developer Advocate, Neo4j
• Continuous learner
• Conference speaker
• Tech blogger
• Other: geek
Slide 3
What is a hallucination?
Slide 4
Generative Artificial Intelligence
Artificial intelligence capable of generating text, images, or other data
using generative models, often in response to prompts.
Generative AI models learn the patterns and structure of their input
training data and then generate new data that has similar
characteristics.
https://en.wikipedia.org/wiki/Generative_artificial_intelligence
Slide 5
How do hallucinations happen?
LLM limitations
• Lacking most recent data
• Not always natural language
• Language complexities, sarcasm, emotion
• No sources
• Hallucinations / Temperature
• IP, bias, privacy
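The temperature bullet above can be made concrete with a small sketch. It assumes the common approach of dividing the model's raw token scores by a temperature before turning them into probabilities; the scores and the function name `softmax_with_temperature` are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.5)  # low temperature
hot = softmax_with_temperature(logits, 2.0)   # high temperature
```

With the lower temperature, most of the probability sits on the top-scoring token. With the higher temperature, the less likely tokens get a larger share, which makes surprising (and sometimes wrong) output more likely.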
Slide 6
“Simply add an LLM” doesn’t
work…
Slide 7
Strategies to improve LLM accuracy
• Custom model
• Fine-tuning / Few-shot learning
• Retrieval Augmented Generation (RAG)
• All of these involve giving an LLM specific data, whether through training or at query time!
Slide 8
RAG
Pull data from external data sources
• Retrieval
• Data retrieved from database
• Augmented
• Augments response with facts
• Generation
• Response in natural language
Vector databases
Example: Pinecone, 2019 (not the first)
• Pros:
• Fast retrieval of high-dimensional data
• Similarity searches based on vectors (not specific values)
• Cons:
• Requires a lot of infrastructure and power
• Cannot store much outside vector+metadata
Slide 11
Graph databases
Example: Neo4j, 2007
• Pros:
• Flexible, agile data model
• Relationships stored with entities (JOIN operations visual)
• Cons:
• Storing relationships creates some write overhead
• Little or no benefit for sparsely connected data
Slide 12
What is a graph?
Slide 13
What is a graph?
Leonhard Euler - graph theory
• Started with a math problem
• Looking for a “better way” to handle certain problems
Seven Bridges of Königsberg problem. Leonhard Euler, 1735
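Euler's argument for the bridges problem fits in a few lines of code. The sketch below models the four land masses as nodes and the seven bridges as edges, then counts how many bridges touch each land mass. The node names and helper functions are illustrative, and the check assumes the graph is connected (which it is here).

```python
from collections import Counter

# Each pair is one bridge between two land masses (names are illustrative).
bridges = [
    ("island", "north_bank"), ("island", "north_bank"),
    ("island", "south_bank"), ("island", "south_bank"),
    ("island", "east_land"),
    ("north_bank", "east_land"),
    ("south_bank", "east_land"),
]

def degrees(edges):
    """Count how many bridges touch each land mass."""
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return count

def has_euler_walk(edges):
    """For a connected graph, a walk that uses every edge exactly once
    exists only when 0 or 2 nodes have an odd number of edges."""
    odd = [node for node, d in degrees(edges).items() if d % 2 == 1]
    return len(odd) in (0, 2)
```

All four land masses end up with an odd number of bridges, so `has_euler_walk(bridges)` is false: no walk crosses every bridge exactly once.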
Slide 14
What does it solve?
Data problems
• Documented path (not just data)
• Answering how and why
• Understanding/finding hidden connections
• Find alternates, impacts, etc.
• Graphs add context + meaning
Slide 15
Use cases
• Recommendations
• Social
• Supply chain
• Fraud detection
• GenAI (grounding/RAG)
• Many more!
Slide 16
Data storage with relationships!
TL;DR
• Stores relationships with entities
• Produces faster read queries (for JOINs)
• Easily connect multiple entities together
• Mimic real-world data organization
Slide 17
Let’s build one…
Slide 18
Book domain
• Find authors with reviews for multiple books
• Find similar users based on reviews of books and related authors
Nodes
• Represent objects or entities
• Can be labeled
• May have properties
Book: title: “Star Wars”, isbn: 9756165498
Author: name: “George Lucas”, avgRating: 4.72
Review: rating: 4.2, reviewText: “Blah”, votes: 17
Slide 21
Relationships
• Must have a type (label)
• Must have a direction
• May have properties
Book: title: “Star Wars”, isbn: 9756165498
Author: name: “George Lucas”, avgRating: 4.72
Review: rating: 4.2, reviewText: “Blah”, votes: 17
Relationship types: AUTHORED, WRITTEN_FOR
Relationship property: date_added: “Sun Jan 03”
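The node and relationship rules above can be sketched as plain Python data structures. This is a minimal illustration, not how Neo4j stores data. The `Node` and `Relationship` classes and the `neighbors` helper are hypothetical. The relationship directions (Author to Book, Review to Book), the placement of `avgRating` on Author, and the placement of `date_added` on `WRITTEN_FOR` are assumptions read from the slide labels.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                       # nodes can be labeled
    properties: dict = field(default_factory=dict)   # nodes may have properties

@dataclass
class Relationship:
    type: str     # a relationship must have a type
    start: Node   # ...and a direction: start -> end
    end: Node
    properties: dict = field(default_factory=dict)   # optional properties

book = Node("Book", {"title": "Star Wars", "isbn": 9756165498})
author = Node("Author", {"name": "George Lucas", "avgRating": 4.72})
review = Node("Review", {"rating": 4.2, "reviewText": "Blah", "votes": 17})

relationships = [
    Relationship("AUTHORED", author, book),
    # Assumption: date_added belongs to WRITTEN_FOR.
    Relationship("WRITTEN_FOR", review, book, {"date_added": "Sun Jan 03"}),
]

def neighbors(node, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return [r.end for r in relationships
            if r.start is node and r.type == rel_type]
```

For example, `neighbors(author, "AUTHORED")` returns the Star Wars book node, while `neighbors(book, "AUTHORED")` returns an empty list because the relationship points the other way.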
Slide 22
Applying RAG to an LLM
Slide 23
LLMs take text, not databases
And they have a context window limit
Slide 24
What is a vector?
Mathematical realm
• Line in space
• Has length and direction
[Diagram: a vector drawn against horizontal and vertical axes]
Slide 25
Vectors in the physical realm
https://www.mathsisfun.com/algebra/vectors.html
Slide 26
Vector arithmetic
C = a + b
[Diagram in three panels: 1) a and b, 2) a and b, 3) a + b]
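As a small worked example of vector addition, the sketch below adds two 2-dimensional vectors component by component and measures the length of the result. The values of `a` and `b` and the helper names are chosen purely for illustration.

```python
import math

def add(a, b):
    """Add two vectors component by component."""
    return tuple(x + y for x, y in zip(a, b))

def length(v):
    """Length (magnitude) of a 2-dimensional vector."""
    return math.hypot(v[0], v[1])

a = (2, 1)
b = (1, 3)
c = add(a, b)  # (2 + 1, 1 + 3)
```

Here `c` is `(3, 4)`, and `length(c)` is `5.0`.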
Slide 27
Vectors in the technical realm
Kings and Queens
king − man + woman ≈ queen
[Diagram in three panels: 1) king, man, woman; 2) king, man, woman; 3) queen?]
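The king and queen arithmetic can be tried on toy vectors. The sketch below uses hand-made 3-dimensional vectors whose dimensions loosely stand for royalty, masculine, and feminine. Real embeddings are learned rather than hand-written and have far more dimensions, so the values and the `nearest` helper are illustrative assumptions only.

```python
import math

# Hand-made toy vectors: (royalty, masculine, feminine). Not real embeddings.
words = {
    "king":     (1.0, 1.0, 0.0),
    "queen":    (1.0, 0.0, 1.0),
    "man":      (0.0, 1.0, 0.0),
    "woman":    (0.0, 0.0, 1.0),
    "princess": (0.6, 0.0, 1.0),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(target, exclude):
    """Word whose vector is most similar to target, skipping the inputs."""
    candidates = {w: v for w, v in words.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

# king - man + woman, computed component by component.
target = tuple(k - m + w for k, m, w in
               zip(words["king"], words["man"], words["woman"]))
result = nearest(target, exclude={"king", "man", "woman"})
```

With these toy values, `result` is `"queen"`. With real embeddings the match is approximate rather than exact, which is why the slide writes ≈.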
Slide 28
Vectors to compare things
What makes things similar?
[Diagram axes: Length, Width]
Slide 29
Embeddings
Convert data to a point in space
• Series of numbers
• 100s or 1000s of dimensions
• Dimension = interesting feature / characteristic
Slide 30
LLMs take text, not databases
Vectors
Slide 31
How do we search the vectors?
Similarity search
• Expensive queries (compare to every vector)
• Approximate nearest neighbor (k-ANN)
• Example: Library
• Book classification - genre vs location of plot
• Smaller search set = shorter retrieval time!
Photo by Martin Adams on Unsplash
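The library example can be sketched in code. The first function below compares the query to every stored vector. The second narrows the search to one "shelf" (a genre) before comparing. This is a simplified stand-in for the idea behind approximate nearest neighbor search, not a real ANN index; the book titles, genres, 3-dimensional vectors, and function names are all made up for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# Hypothetical catalog; real embeddings have 100s or 1000s of dimensions.
books = [
    {"title": "Space Opera",     "genre": "sci-fi",  "vector": (0.9, 0.1, 0.1)},
    {"title": "Robot Story",     "genre": "sci-fi",  "vector": (0.8, 0.3, 0.1)},
    {"title": "Cozy Mystery",    "genre": "mystery", "vector": (0.1, 0.9, 0.2)},
    {"title": "Courtroom Drama", "genre": "mystery", "vector": (0.2, 0.8, 0.4)},
]

def top_k(query, candidates, k):
    """Exhaustive search: score every candidate, keep the k most similar."""
    ranked = sorted(candidates,
                    key=lambda b: cosine(query, b["vector"]),
                    reverse=True)
    return [b["title"] for b in ranked[:k]]

def top_k_in_genre(query, genre, k):
    """Narrow the search set to one shelf first, then search only that shelf."""
    shelf = [b for b in books if b["genre"] == genre]
    return top_k(query, shelf, k)

query = (1.0, 0.2, 0.1)
everything = top_k(query, books, 2)
one_shelf = top_k_in_genre(query, "sci-fi", 1)
```

The shelf search does half as many comparisons here. The trade-off is that a good match sitting on a different shelf would be missed, which is the "approximate" part.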
Slide 32
Provide prompt + context to LLM
Vectors Text
Slide 33
RAG architecture
• Retrieval
• Data retrieved from database
• Augmented
• Augments response with facts
• Generation
• Response in natural language
[Diagram labels: User, Prompt, Chat API, Database Search, Database, Relevant Results / Documents, Prompt + Relevant Information, LLM API, LLM, Response; steps numbered 1 to 3]
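The three RAG steps can be sketched as three small functions. To stay self-contained, this sketch swaps in a simple word-overlap ranking where a real setup would use a vector or graph database, and it takes the LLM as a plain function argument instead of calling a real API. All function names and the sample documents are hypothetical.

```python
def retrieve(question, documents, k=2):
    """Retrieval: rank documents by how many words they share with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(question, context):
    """Augmented: put the retrieved facts in front of the question."""
    facts = "\n".join(f"- {c}" for c in context)
    return f"Answer using only these facts:\n{facts}\n\nQuestion: {question}"

def answer(question, documents, llm):
    """Generation: send prompt + relevant information to the model."""
    context = retrieve(question, documents)
    return llm(build_prompt(question, context))

documents = [
    "George Lucas authored Star Wars.",
    "The review rated Star Wars 4.2.",
    "Graph databases store relationships.",
]
question = "Who authored Star Wars?"
prompt = build_prompt(question, retrieve(question, documents, k=1))
```

In use, `answer(question, documents, llm)` would be called with whatever function wraps the real chat API. Keeping the retrieved documents next to the question is also what lets you trace where an answer came from.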
Slide 34
Agentic Workflow
Architecture
• Uses “agents”/tools
• LLM determines next step
• Which tool/external source should be called
• Uses result from tool as context
[Diagram labels: User, Prompt, Chat API, Tool, Source info, Relevant Results / Documents, Prompt + Relevant Information, LLM API, LLM, Response; steps numbered 1 to 3]
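The agentic flow can be sketched as a small loop: pick a tool, call it, and pass the result to the model as context. In a real agent the LLM itself decides which tool to call; here a simple keyword rule stands in for that decision so the example runs on its own. The tool functions, the `TOOLS` registry, and `choose_tool` are all hypothetical.

```python
def search_books(query):
    """Hypothetical tool: look up books."""
    return f"books matching '{query}'"

def search_reviews(query):
    """Hypothetical tool: look up reviews."""
    return f"reviews matching '{query}'"

TOOLS = {"books": search_books, "reviews": search_reviews}

def choose_tool(prompt):
    """Stand-in for the LLM's decision about which tool to call next."""
    return "reviews" if "review" in prompt.lower() else "books"

def run_agent(prompt, llm):
    """Pick a tool, call it, and use its result as context for the model."""
    tool_name = choose_tool(prompt)
    result = TOOLS[tool_name](prompt)
    return llm(f"Context from {tool_name} tool: {result}\n\nQuestion: {prompt}")
```

Passing the model in as the `llm` argument keeps the sketch testable without a real API. A fuller agent would repeat this loop until the model decides it has enough context to answer.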
Slide 35
Nothing is a silver bullet
An LLM has, in a sense, a mind of its own
• Can’t guarantee a consistent answer
• Prompt engineering
• Context window limits
Slide 36
How much value can RAG add?
Slide 37
Explainable AI
With RAG + LLM
• How did the LLM get this answer?
• Graphs:
• Generate LLM response from knowledge in database
• Set security rules in graph (what’s viewable)
• Retrieve extra data connected to similar entities
Photo by No Revisions on Unsplash
Slide 38
Demo!
Our data model
Slide 41
Resources
• GitHub repository (today’s code): github.com/JMHReif/springai-goodreads
• GraphAcademy LLM courses: graphacademy.neo4j.com/categories/llms/
• Docs for Spring AI: docs.spring.io/spring-ai/reference/api/vectordbs/neo4j.html
• NODES 2024: neo4j.com/nodes2024/agenda