

Pass or Play: What does GenAI mean for the Java developer?

GenAI, LLM, and other buzzwords are everywhere. The sea of acronyms can feel like (and sometimes actually is) a black box for the complex logic and processes that underpin them. Does/should a Java developer care? In this session, we’ll explore how these technologies operate and cover many of the technical terms that go along with them, such as hallucinations, grounding, and more. We will look at the abilities GenAI can bring to technical solutions, alongside some of the struggles. Live-code examples will show how Java developers can utilize GenAI and help determine whether it is worth the hype. Come see if you should pass or play on GenAI.

Accompanying code: https://github.com/JMHReif/springai-goodreads

Jennifer Reif

May 14, 2024


Transcript

  1. Pass or Play: What does GenAI mean for the Java developer?
    Jennifer Reif [email protected] @JMHReif github.com/JMHReif jmhreif.com linkedin.com/in/jmhreif
  2. Who is Jennifer Reif? Developer Advocate, Neo4j
    • Continuous learner
    • Conference speaker
    • Tech blogger
    • Other: geek
  3. [Word cloud] AI, Vector, RAG, LLM, Algorithm, Chaining, DS, Entity resolution, Knowledge graph, ML, NLP, GenAI, Hallucination, Embedding, k-ANN, Cosine similarity, Euclidean distance, Fine-tune, Few-shot, Grounding, Model, Prompt, Semantic search, Similarity, Temperature, Tokens, Natural language, ChatGPT, Chatbot, Context window
    Photo by Matt Walsh on Unsplash
  4. Generative Artificial Intelligence
    Artificial intelligence capable of generating text, images, or other data using generative models, often in response to prompts. Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.
    https://en.wikipedia.org/wiki/Generative_artificial_intelligence
  5. General info… About LLMs
    • Lots of data
    • Answers based on probabilities
    • Training takes tons of hardware, money, time
    • Models
      • Different providers / companies train their own
  6. ChatGPT: Public interface to LLM
    • Chat Generative Pre-trained Transformer
    • OpenAI, Nov 2022
    • Natural language response
    • Predict next word
    • Feedback / reward to rank responses
    • Use cases: professional, personal, everything!
  7. Is it all that? Worth the hype?
    • General information
    • Public domain knowledge
    • Historical data
    • Creative / arts
    • Human assistant
    • Task delegation
    Photo by Igor Omilaev on Unsplash
  8. LLM issues
    • Lacking most recent data
    • Not always natural language
    • Language complexities, sarcasm, emotion
    • No sources
    • Hallucinations / Temperature
    • IP, bias, privacy
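The "Temperature" bullet above refers to how the model's next-token scores are rescaled before one token is sampled. A minimal sketch of that rescaling, assuming the common softmax-with-temperature formulation; the scores and the class name `TemperatureDemo` are made up for illustration and do not come from any real model or from the talk's code:

```java
import java.util.Arrays;

class TemperatureDemo {

    // Softmax with temperature: divide each score by the temperature, then normalize.
    // Low temperature sharpens the distribution; high temperature flattens it.
    static double[] softmax(double[] logits, double temperature) {
        if (temperature <= 0) {
            throw new IllegalArgumentException("temperature must be > 0");
        }
        // Subtract the max score first for numerical stability.
        double max = Arrays.stream(logits).max().orElseThrow();
        double[] probs = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            probs[i] = Math.exp((logits[i] - max) / temperature);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.1}; // hypothetical scores for three candidate tokens
        for (double t : new double[] {0.2, 1.0, 2.0}) {
            System.out.printf("T=%.1f -> %s%n", t, Arrays.toString(softmax(logits, t)));
        }
    }
}
```

At a low temperature nearly all probability lands on the top-scoring token, so answers repeat more reliably; at a higher temperature the less likely tokens are picked more often, which adds variety and also raises the chance of an off-base answer.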
  9. RAG: Pull data from external data sources
    • Retrieval
      • Data retrieved from database
    • Augmented
      • Augments response with facts
    • Generation
      • Response in natural language
    [Diagram, steps 1-3. Labels: User, Chat API, Database, LLM API, LLM; Prompt, Database Search, Relevant Results / Documents, Prompt + Relevant Information, Response]
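The three RAG steps above can be sketched in plain Java. This is an illustration under stated assumptions, not the talk's Spring AI code: the "database" is an in-memory list, retrieval is a naive keyword match standing in for a real search, and the LLM is a hypothetical `Function<String, String>` so nothing here calls a real API.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class RagSketch {

    // Step 1 (Retrieval): a naive keyword match stands in for a real database search.
    static List<String> retrieve(List<String> documents, String question, int limit) {
        String[] terms = question.toLowerCase().split("\\W+");
        return documents.stream()
                .filter(doc -> {
                    String lower = doc.toLowerCase();
                    for (String term : terms) {
                        // Skip very short words so "is" or "me" do not match everything.
                        if (term.length() > 3 && lower.contains(term)) {
                            return true;
                        }
                    }
                    return false;
                })
                .limit(limit)
                .collect(Collectors.toList());
    }

    // Step 2 (Augmented): put the retrieved facts into the prompt next to the question.
    static String buildPrompt(String question, List<String> context) {
        return "Answer using only the context below. If the context is not enough, say so.\n"
                + "Context:\n- " + String.join("\n- ", context) + "\n"
                + "Question: " + question;
    }

    // Step 3 (Generation): hand the augmented prompt to the model (hypothetical stand-in).
    static String answer(String question, List<String> documents, Function<String, String> llm) {
        return llm.apply(buildPrompt(question, retrieve(documents, question, 3)));
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
                "Neo4j is a graph database.",
                "Pinecone is a vector database.");
        // Fake model that simply echoes the prompt it received.
        Function<String, String> fakeLlm = prompt -> "[model would answer from]\n" + prompt;
        System.out.println(answer("Tell me about Neo4j", docs, fakeLlm));
    }
}
```

Keeping the model behind a plain function makes the retrieval and prompt-building steps easy to test without network access; in a real application a chat client would take the place of `fakeLlm`.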
  10. Explainable AI: With RAG + LLM
    • How did the LLM get this answer?
    • Grounding LLM answer with verified data
    Photo by No Revisions on Unsplash
  11. Example: Kings and Queens
    king − man + woman ≈ queen
    [Diagram: three numbered panels plotting the vectors for king, man, and woman; the third panel adds "queen?"]
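The king − man + woman ≈ queen example can be worked through with toy numbers. The sketch below assumes hand-made two-dimensional vectors (read them as [royalty, femininity]); real embeddings have hundreds or thousands of learned dimensions, and every value and name here is invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class WordAnalogy {

    // Cosine similarity: 1.0 means the vectors point in the same direction.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Computes a - b + c, then returns the closest remaining word by cosine similarity.
    static String analogy(Map<String, double[]> vectors, String a, String b, String c) {
        double[] va = vectors.get(a), vb = vectors.get(b), vc = vectors.get(c);
        double[] target = new double[va.length];
        for (int i = 0; i < va.length; i++) {
            target[i] = va[i] - vb[i] + vc[i];
        }
        Set<String> inputs = Set.of(a, b, c); // the input words are not valid answers
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> e : vectors.entrySet()) {
            if (inputs.contains(e.getKey())) {
                continue;
            }
            double score = cosine(target, e.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up 2-D vectors: [royalty, femininity].
        Map<String, double[]> vectors = new LinkedHashMap<>();
        vectors.put("king", new double[] {0.9, 0.1});
        vectors.put("man", new double[] {0.1, 0.1});
        vectors.put("woman", new double[] {0.1, 0.9});
        vectors.put("queen", new double[] {0.9, 0.9});
        vectors.put("prince", new double[] {0.8, 0.1});
        System.out.println(analogy(vectors, "king", "man", "woman")); // prints "queen"
    }
}
```

Subtracting "man" removes the low-femininity component and adding "woman" puts a high one back, while the royalty component is untouched, so the result lands next to "queen" rather than "prince".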
  12. Embeddings: Convert data to a point in space
    • Series of numbers
    • 100s or 1000s of dimensions
    • Dimension = interesting feature / characteristic
  13. Similarity search: Vector indexes
    • Expensive queries (compare to every vector)
    • Approximate nearest neighbor (k-ANN)
    • Example: Library
      • Book classification - author vs location of plot
    • Smaller search set = smaller retrieval time!
    Photo by Martin Adams on Unsplash
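To make "compare to every vector" concrete, here is a sketch of an exact, brute-force nearest-neighbor search using cosine similarity. It is the expensive baseline that a vector index with approximate search (k-ANN) is meant to avoid, not an implementation of k-ANN itself. The book labels and three-dimensional vectors are made up for illustration.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class BruteForceSearch {

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Scores the query against EVERY stored vector, then keeps the k most similar.
    // Cost grows linearly with the number of vectors and their dimensions.
    static List<String> topK(Map<String, double[]> vectors, double[] query, int k) {
        return vectors.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> cosine(e.getValue(), query)).reversed())
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical embeddings; a real model would produce far more dimensions.
        Map<String, double[]> books = new LinkedHashMap<>();
        books.put("space-opera", new double[] {0.9, 0.1, 0.2});
        books.put("robot-mystery", new double[] {0.8, 0.2, 0.1});
        books.put("regency-romance", new double[] {0.1, 0.9, 0.3});
        double[] query = {1.0, 0.0, 0.1};
        System.out.println(topK(books, query, 2)); // prints [space-opera, robot-mystery]
    }
}
```

An approximate index trades a little accuracy for speed by only scoring a small candidate set, which is the "smaller search set = smaller retrieval time" point on the slide.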
  14. Vector databases
    Example: Pinecone, 2019 (not first)
    • Pros:
      • Fast retrieval of high-dimensional data
      • Similarity searches based on vectors (not specific values)
    • Cons:
      • Requires a lot of infrastructure and power
      • Cannot store much outside vector + metadata
  15. Graph databases
    Example: Neo4j, 2007
    • Pros:
      • Flexible, agile data model
      • Relationships stored with entities (visual JOINs)
    • Cons:
      • Write overhead to store relationships
      • No benefit for low-connected data (relational)
  16. (Knowledge) Graphs in GenAI: Benefits
    • Generate LLM response from knowledge in database
    • Security rules in database (what’s viewable)
    • Structured + unstructured data side-by-side
      • Vector dbs only store unstructured
    • Retrieve connected entities
      • Connected = extra, relevant context
    [Diagram labels: Similarity Search (Vector Index): find similar documents. Pattern Matching (Graph Structure): find related information. Knowledge Graph: combine for more accurate results within a relevant context.]
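The "retrieve connected entities" idea can be sketched as a second step after similarity search: start from the documents the vector index returned, then follow stored relationships to collect extra context. The sketch assumes an in-memory adjacency map in place of a graph database; the node labels and the class name `GraphContext` are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GraphContext {

    // For each similarity-search hit, add the hit plus everything directly connected to it.
    // A LinkedHashSet keeps insertion order and drops duplicates shared by several hits.
    static Set<String> expand(List<String> hits, Map<String, List<String>> relationships) {
        Set<String> context = new LinkedHashSet<>();
        for (String hit : hits) {
            context.add(hit);
            context.addAll(relationships.getOrDefault(hit, List.of()));
        }
        return context;
    }

    public static void main(String[] args) {
        // Hypothetical graph: reviews point at the book they describe, books at their author.
        Map<String, List<String>> relationships = Map.of(
                "review-1", List.of("book-A"),
                "review-2", List.of("book-A"),
                "book-A", List.of("author-X"));
        // Pretend the vector index returned these two reviews as most similar.
        List<String> hits = List.of("review-1", "review-2");
        System.out.println(expand(hits, relationships)); // prints [review-1, book-A, review-2]
    }
}
```

This only follows one hop; in a graph database the same expansion would be expressed as a pattern-matching query that can traverse further, for example from a review to its book and on to the author.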
  17. Nothing is a silver bullet
    An LLM has (of sorts) a mind of its own
    • Can’t guarantee a consistent answer
    • Prompt engineering
    • Context window limits
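One practical consequence of context window limits is that retrieved text has to fit within a token budget before it goes into the prompt. A rough sketch, assuming chunks arrive already ranked by relevance and approximating a token as one whitespace-separated word; real tokenizers count differently, so treat `roughTokenCount` as a stand-in.

```java
import java.util.ArrayList;
import java.util.List;

class ContextBudget {

    // Very rough token estimate: one token per whitespace-separated word (an assumption).
    static int roughTokenCount(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Keeps chunks in ranked order until the next one would exceed the budget.
    static List<String> fit(List<String> rankedChunks, int maxTokens) {
        List<String> kept = new ArrayList<>();
        int used = 0;
        for (String chunk : rankedChunks) {
            int cost = roughTokenCount(chunk);
            if (used + cost > maxTokens) {
                break; // stop here so lower-ranked chunks never displace higher-ranked ones
            }
            kept.add(chunk);
            used += cost;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> chunks = List.of(
                "one two three",
                "four five",
                "six seven eight nine");
        System.out.println(fit(chunks, 5)); // prints [one two three, four five]
    }
}
```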
  18. Resources
    • GitHub repository (today’s code): github.com/JMHReif/springai-goodreads
    • GraphAcademy LLM courses: graphacademy.neo4j.com/categories/llms/
    • Docs for Spring AI: docs.spring.io/spring-ai/reference/api/vectordbs/neo4j.html
    • NODES 2024: dev.neo4j.com/nodes24