Hallucination-Free Zone: LLMs + Graph Databases got your back!

Hallucinations are model outputs that are contextually plausible and coherent but incorrect or fabricated.
Large Language Models (LLMs) can provide answers that sound realistic to almost any question, even if those answers are entirely made up. With a graph database, you can anchor an LLM in reality and mitigate the risk of it generating false information or exposing sensitive data to unauthorized users. This helps prevent the model from producing inaccurate responses and makes the outcome more reliable and secure.
A graph database uses graph structures with nodes, edges, and properties to represent and store data. This makes it efficient to query and analyze relationships in interconnected datasets, and it is commonly used for applications such as knowledge graphs, fraud detection, and supply chains.
This presentation will show you the benefits of graph databases over regular databases and how to use AI tools to eliminate LLM hallucinations, enforce security, and improve accuracy. We will also discuss why a vector index can provide better, smarter, faster results than a pure vector database.

Jennifer Reif

February 22, 2024

Transcript

  1. Hallucination-free zone: LLMs + Graph Databases got your back!
    Photo by fabio on Unsplash
    Jennifer Reif [email protected] @JMHReif github.com/JMHReif jmhreif.com linkedin.com/in/jmhreif
  2. Who is Jennifer Reif?
    Developer Advocate, Neo4j
    • Continuous learner
    • Conference speaker
    • Tech blogger
    • Other: geek
  3. Relational databases (RDBMS)
    Example: Oracle, 1979
    • Pros:
      • Slices of data easily assembled with queries
      • Low data duplication (unique rows)
    • Cons:
      • Relationships assembled with JOINs (NF)
      • Strict model
  4. Document databases
    Example: MongoDB, 2007
    • Pros:
      • Handles varied (unstructured) data more easily
      • Groups related info in a single entity
    • Cons:
      • Little flexibility (relationships pre-baked in doc)
      • Data duplication
    "customer" : {
      "id" : "123",
      "firstName" : "Jane",
      "lastName" : "Doe",
      "DOB" : "03/12/1989",
      "department" : "Engineering",
      "phoneNumbers" : [
        { "type" : "office", "number" : "650-123-4567" },
        { "type" : "cell", "number" : "650-321-7654" }
      ],
      "title" : "Director of QA"
    }
  5. Vector databases
    Example: Pinecone, 2019 (not the first)
    • Pros:
      • Fast retrieval of high-dimensional data
      • Similarity searches based on vectors (not specific values)
    • Cons:
      • Requires a lot of infrastructure and power
      • Cannot store much outside vector + metadata
  6. Graph databases
    Example: Neo4j, 2007
    • Pros:
      • Flexible, agile data model
      • Relationships stored with entities (JOIN operations visual)
    • Cons:
      • Storing relationships creates some write overhead
      • No benefit over relational for sparsely connected data
  7. What is a graph?
    Leonhard Euler - graph theory
    • Started with a math problem
    • Looking for a “better way” to handle certain problems
    [Image: Seven Bridges of Königsberg problem. Leonhard Euler, 1735]
  8. What does it solve?
    Data problems
    • Documented path (not just data)
    • Answering how and why
    • Understanding/finding hidden connections
    • Find alternates, impacts, etc.
    • Graphs add context + meaning
  9. Use cases
    • Recommendations
    • Social
    • Supply chain
    • Fraud detection
    • GenAI (grounding/RAG)
    • Many more!
  10. Data storage with relationships!
    TL;DR
    • Stores relationships with entities
    • Produces faster read queries (for JOINs)
    • Easily connects multiple entities together
    • Mimics real-world data organization
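The "faster read queries (for JOINs)" point can be illustrated with a minimal sketch. It assumes a toy in-memory layout: the table names, IDs, and the second book are hypothetical, and a real relational database would use indexes rather than the naive scan shown here. The idea is only that a join matches keys at query time, while a graph follows references stored with the entity.

```python
# Relational style: the relationship lives in a separate join table
# and is reassembled at query time by matching keys.
books_table = [
    {"id": 10, "title": "Star Wars"},
    {"id": 11, "title": "Hypothetical Sequel"},  # made-up second book
]
author_book = [(1, 10), (1, 11)]  # (author_id, book_id) pairs


def titles_by_join(author_id):
    """Scan the join table, then look up each matching book row."""
    return [
        book["title"]
        for (a_id, b_id) in author_book
        if a_id == author_id
        for book in books_table
        if book["id"] == b_id
    ]


# Graph style: the author node holds direct references to its books.
star_wars = {"title": "Star Wars"}
sequel = {"title": "Hypothetical Sequel"}
lucas = {"name": "George Lucas", "AUTHORED": [star_wars, sequel]}


def titles_by_traversal(author):
    """Follow the stored relationships; no key matching needed."""
    return [book["title"] for book in author["AUTHORED"]]


join_result = titles_by_join(1)
traversal_result = titles_by_traversal(lucas)
print(join_result == traversal_result)
```

Both functions return the same titles; the difference is where the work happens (at query time versus at write time, when the relationship is stored).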
  11. Book domain
    • Find authors with reviews for multiple books
    • Find similar users based on reviews of books and related authors
  12. Nodes
    • Represent objects or entities
    • Can be labeled
    • May have properties
    [Diagram: Book and Author nodes (title: “Star Wars”, isbn: 9756165498, name: “George Lucas”, avgRating: 4.72) and a Review node (rating: 4.2, reviewText: “Blah”, votes: 17)]
  13. Relationships
    • Must have a type (label)
    • Must have a direction
    • May have properties
    [Diagram: the Book, Author, and Review nodes from the previous slide, now connected by AUTHORED and WRITTEN_FOR relationships, with the relationship property date_added: “Sun Jan 03”]
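The node and relationship rules above (labels, optional properties, a required type and direction) can be sketched as a tiny in-memory property graph. This is an illustration only, not how Neo4j stores data. The second book and second review are hypothetical, and the direction of WRITTEN_FOR (Review to Book) and the placement of date_added on it are my reading of the flattened slide text. The query at the end answers the book-domain question "find authors with reviews for multiple books".

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity hashing so nodes can go in sets
class Node:
    label: str                                    # e.g. "Book", "Author"
    properties: dict = field(default_factory=dict)


@dataclass(eq=False)
class Relationship:
    start: Node                                   # direction: start -> end
    type: str                                     # required relationship type
    end: Node
    properties: dict = field(default_factory=dict)


lucas = Node("Author", {"name": "George Lucas"})
star_wars = Node("Book", {"title": "Star Wars", "isbn": 9756165498})
sequel = Node("Book", {"title": "Hypothetical Sequel"})       # made up
review_1 = Node("Review", {"rating": 4.2, "reviewText": "Blah", "votes": 17})
review_2 = Node("Review", {"rating": 3.9, "reviewText": "Fine"})  # made up

relationships = [
    Relationship(lucas, "AUTHORED", star_wars),
    Relationship(lucas, "AUTHORED", sequel),
    # Assumption: date_added belongs to WRITTEN_FOR.
    Relationship(review_1, "WRITTEN_FOR", star_wars, {"date_added": "Sun Jan 03"}),
    Relationship(review_2, "WRITTEN_FOR", sequel),
]


def authors_with_multiple_reviewed_books(rels):
    """Names of authors who wrote more than one book that has a review."""
    reviewed = {r.end for r in rels if r.type == "WRITTEN_FOR"}
    books_by_author = defaultdict(set)
    for r in rels:
        if r.type == "AUTHORED" and r.end in reviewed:
            books_by_author[r.start].add(r.end)
    return [a.properties["name"] for a, books in books_by_author.items() if len(books) > 1]


result = authors_with_multiple_reviewed_books(relationships)
print(result)
```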
  14. Example
    Kings and Queens
    king − man + woman ≈ queen
    [Diagram in three steps: (1) vectors for king, man, and woman; (2) the same vectors rearranged; (3) the result, labeled “queen?”]
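The arithmetic on this slide can be tried on toy vectors. The sketch below assumes hand-picked two-dimensional values (roughly "royalty" and "masculinity"); real embeddings are learned by a model and have hundreds or thousands of dimensions, so treat this purely as an illustration of the idea.

```python
import math

# Hand-picked toy vectors: (royalty, masculinity). Not real embeddings.
words = {
    "king": (0.9, 0.9),
    "man": (0.1, 0.9),
    "woman": (0.1, 0.1),
    "queen": (0.9, 0.1),
    "apple": (0.0, 0.5),   # unrelated distractor
}


def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# king - man + woman, computed per dimension.
target = tuple(
    k - m + w for k, m, w in zip(words["king"], words["man"], words["woman"])
)

# As is common practice, the three input words are excluded from the search.
candidates = {w: v for w, v in words.items() if w not in ("king", "man", "woman")}
nearest = max(candidates, key=lambda w: cosine(target, candidates[w]))
print(nearest)  # queen
```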
  15. What are vector embeddings?
    Convert something to a point in space
    • Same concepts, applied to data formats
    • 100s or 1000s of dimensions
    • Dimension = interesting feature/characteristic
  16. Vector index
    Why index?
    • Queries become expensive
    • Need to compare every vector to query
    • Indexes = speed
    • Jump right to where you need (like index in a book)
    • Approximate nearest neighbor (k-ANN)
    • e.g. 20 closest vectors to this one
  17. Vector index
    Example: Library
    • Categorizing books by author or genre
    • Embeddings can hold more complex information
    • Further categories:
      • “gender of main character”
      • “main location of plot”
    • Indexing can retrieve a smaller portion of all available vectors
    • Reducing retrieval time!
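To make the "compare every vector" versus "jump right to where you need" contrast concrete, here is a minimal sketch of a bucket-style index, in the spirit of the library shelves example. It is not the structure Neo4j or any particular vector database uses (production indexes rely on more elaborate designs, for example graph-based ones such as HNSW). The cluster centers, the data, and the query are all made up.

```python
import math
import random

rng = random.Random(0)

# Four hypothetical "shelves" (cluster centers) with 50 vectors near each.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
vectors = [
    (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
    for cx, cy in centers
    for _ in range(50)
]


def nearest_center(v):
    return min(range(len(centers)), key=lambda i: math.dist(v, centers[i]))


# Build the index once: file each vector id under its nearest center.
index = {i: [] for i in range(len(centers))}
for vid, v in enumerate(vectors):
    index[nearest_center(v)].append(vid)


def rank(query, ids, k):
    return sorted(ids, key=lambda vid: (math.dist(query, vectors[vid]), vid))[:k]


def brute_force(query, k):
    """Compare the query with every stored vector."""
    return rank(query, range(len(vectors)), k), len(vectors)


def indexed_search(query, k):
    """Only compare against the bucket closest to the query (approximate)."""
    bucket = index[nearest_center(query)]
    return rank(query, bucket, k), len(bucket)


query = (9.5, 0.2)
exact, exact_cost = brute_force(query, 3)
approx, approx_cost = indexed_search(query, 3)
print(exact_cost, approx_cost)  # 200 50
```

With clusters this well separated, the bucket search returns the same three neighbors as the full scan while examining a quarter of the vectors. On messier data a true neighbor can sit in an adjacent bucket, which is why such searches are called approximate.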
  18. Neo4j Vector Search
    What’s the value?
    • Lets you store structured + unstructured data side-by-side
    • Other vector dbs only store unstructured data
    • Power is in the entities connected to the vector search results
    • Connected = extra, relevant context
    [Diagram: Vector Index (Similarity Search: find similar documents) + Graph Structure (Pattern Matching: find related information) = Knowledge Graph (combine for more accurate results within a relevant context)]
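The combination described here, a similarity search followed by a hop to connected entities, can be sketched with plain dictionaries. Everything below is a stand-in: the three-dimensional embeddings are hand-written toy values, the second book and author are hypothetical, and a real setup would run the similarity step against a vector index and the second step as a graph pattern match.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))


# Review nodes carry their (toy) embedding as a property.
reviews = {
    "r1": {"text": "Epic space battles and a farm boy hero", "embedding": [0.9, 0.1, 0.0]},
    "r2": {"text": "A slow courtroom drama", "embedding": [0.0, 0.2, 0.9]},
    "r3": {"text": "Starships, rebels and an evil empire", "embedding": [0.8, 0.3, 0.1]},
}
books = {"b1": {"title": "Star Wars"}, "b2": {"title": "Hypothetical Legal Thriller"}}
authors = {"a1": {"name": "George Lucas"}, "a2": {"name": "Hypothetical Author"}}

written_for = {"r1": "b1", "r2": "b2", "r3": "b1"}  # Review -> Book
author_of = {"b1": "a1", "b2": "a2"}                # Book -> its Author


def similar_reviews(query_embedding, k):
    """Step 1, similarity search: the k reviews closest to the query."""
    ranked = sorted(
        reviews,
        key=lambda rid: cosine(query_embedding, reviews[rid]["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def with_graph_context(review_ids):
    """Step 2, graph traversal: attach the connected book and author."""
    context = []
    for rid in review_ids:
        book_id = written_for[rid]
        context.append({
            "review": reviews[rid]["text"],
            "book": books[book_id]["title"],
            "author": authors[author_of[book_id]]["name"],
        })
    return context


hits = similar_reviews([1.0, 0.2, 0.0], k=2)
context = with_graph_context(hits)
print(context)
```

The vector step alone returns only review text; the traversal adds the book and author that a pure vector store would not know about.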
  19. Data + Vectors
    How do I get vectors for existing data?
    • Generate some vector embeddings
    • Happens externally, several models available
    • Store embedding as property on a node
  20. RAG
    Pull data from external data source
    • Retrieval
      • Data retrieved from database
    • Augmented
      • Augments response with facts
    • Generation
      • Response in natural language
    [Diagram: RAG flow in three numbered steps between User, Chat API, Database, and LLM API/LLM: prompt, database search returning relevant results/documents, then prompt + relevant information sent to the LLM, which produces the response]
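The three RAG stages can be laid out as three small functions. This is a sketch under loud assumptions: retrieval is a naive word-overlap ranking standing in for the database search, `stub_llm` is a placeholder that merely echoes the first fact instead of calling a real LLM API, and the documents are invented for the example.

```python
import re


def tokens(text):
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, documents, k=2):
    """Retrieval: rank stored facts by word overlap with the question."""
    q = tokens(question)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]


def augment(question, facts):
    """Augmented: put the retrieved facts into the prompt."""
    fact_lines = "\n".join(f"- {fact}" for fact in facts)
    return f"Answer using only these facts:\n{fact_lines}\n\nQuestion: {question}"


def stub_llm(prompt):
    """Placeholder for a real LLM call: echoes the first supplied fact."""
    first_fact = prompt.splitlines()[1][2:]
    return f"Based on the data: {first_fact}"


def generate(prompt, llm):
    """Generation: the LLM turns the augmented prompt into an answer."""
    return llm(prompt)


documents = [
    "George Lucas authored Star Wars",
    "A review of Star Wars has 17 votes",
    "Jane Doe is Director of QA",
]

question = "Who authored Star Wars?"
facts = retrieve(question, documents)
prompt = augment(question, facts)
answer = generate(prompt, stub_llm)
print(answer)
```

Swapping `stub_llm` for a function that calls an actual model keeps the rest of the flow unchanged, which is the point of separating the three stages.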
  21. Explainable AI
    With RAG + LLM
    • How did the LLM get this answer?
    • Graphs:
      • Generate LLM response from knowledge in database
      • Set security rules in graph (what’s viewable)
      • Retrieve extra data connected to similar entities
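One way to picture "set security rules (what's viewable)" together with "how did the LLM get this answer?" is to filter retrieved facts by the caller's role and hand back the source IDs alongside the context. The sketch below is hypothetical throughout: the `roles` property, the role names, and the fact IDs are invented for illustration and do not represent Neo4j's access-control features.

```python
# Hypothetical facts retrieved from the graph, each tagged with who may see it.
facts = [
    {"id": "review-1", "text": "A review of Star Wars has 17 votes",
     "roles": {"reader", "admin"}},
    {"id": "customer-123", "text": "Jane Doe's cell number is 650-321-7654",
     "roles": {"admin"}},
]


def viewable(all_facts, role):
    """Keep only the facts this role is allowed to see."""
    return [f for f in all_facts if role in f["roles"]]


def context_with_sources(all_facts, role):
    """Build the LLM context and record which nodes it came from."""
    allowed = viewable(all_facts, role)
    return {
        "context": [f["text"] for f in allowed],
        "sources": [f["id"] for f in allowed],   # supports "why this answer?"
    }


reader_view = context_with_sources(facts, "reader")
admin_view = context_with_sources(facts, "admin")
print(reader_view["sources"], admin_view["sources"])
```

Because the filter runs before the prompt is built, data the user may not view never reaches the LLM, and the `sources` list lets the answer be traced back to specific nodes.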
  22. Resources
    • GitHub repository (today’s code): github.com/JMHReif/springai-goodreads
    • GraphAcademy LLM courses: graphacademy.neo4j.com/categories/llms/
    • Docs for Spring AI: docs.spring.io/spring-ai/reference/api/vectordbs/neo4j.html
    • Docs for OpenAI embeddings: platform.openai.com/docs/guides/embeddings