LLM in enterprise market

Marketing OGZ
September 15, 2023

Transcript

  1. ©2022 Databricks Inc. — All rights reserved
    LLMs in the enterprise market
    Victor van den Broek
    Big Data Expo Utrecht - 12 September 2023

  2. Victor van den Broek
    Solutions Architect @ Databricks
    - Databricks since 2023
    - Focus on Dutch & Belgian public sector and Financial services
    - 15 years of ‘data experience’ as DE, DS, DA, PO…
    - linkedin.com/in/victorvdb

  3. ©2023 Databricks Inc. — All rights reserved
    The Lakehouse Company
    - $3B in investment
    - 5000+ global employees
    - $1B+ in revenue
    - Inventor and pioneer of the data lakehouse
    - Gartner-recognized Leader: Database Management Systems; Data Science and Machine Learning Platforms
    - Creator of

  4. LLM 101

  5. Generative AI, LLMs and Foundation Models
    Artificial Intelligence (AI)
    Multidisciplinary field of computer science that aims to create systems capable of emulating human intelligence
    Machine Learning (ML)
    Learn from existing data and make predictions without being explicitly programmed
    Deep Learning (DL)
    Use artificial neural networks to learn from data
    Generative AI
    Subfield of AI focusing on generating new data (images, text, audio, code, ...)
    LLM
    Models trained on massive datasets to achieve advanced language processing capabilities
    Foundation Models (GPT-4, BARD, MPT-7B, …)
    LLMs which can serve as the base for a wide range of applications

  6. LLMs are not that new
    Why should I care now?
    Accuracy and effectiveness have hit a tipping point
    • Many new use cases are unlocked!
    • Accessible by all.
    Readily available data and tooling
    • Large datasets.
    • Open-sourced model options.
    • Requires powerful GPUs, but these are available in the cloud.

  7. Language Models are everywhere…
    Machine Translation, Text Summarization, Chatbots & Conversational Interfaces

  8. What is a language model?
    Finds the most likely next word in a sequence
    Avocados are …
    Stochastic Parrot
    Green, Fruit, Delicious, Luxurious
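
The "most likely next word" idea can be illustrated with a toy bigram counter. This is only a sketch: the corpus and the function names are invented for illustration, and a real LLM uses a neural network over far more context than a single preceding word.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, invented for illustration only.
CORPUS = [
    "avocados are green",
    "avocados are delicious",
    "avocados are green",
    "limes are green",
]

def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers

def next_word_distribution(followers, word):
    """Return {candidate: probability} for the word that follows `word`."""
    counts = followers.get(word.lower(), Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}

def most_likely_next(followers, word):
    """Pick the candidate with the highest probability, or None if unseen."""
    dist = next_word_distribution(followers, word)
    return max(dist, key=dist.get) if dist else None

model = train_bigrams(CORPUS)
print(next_word_distribution(model, "are"))
print(most_likely_next(model, "are"))
```

In this toy corpus, "green" follows "are" three times out of four, so it is the predicted continuation.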

  9. I might read about 700 books in my lifetime

  10. A large LM may be trained on 10,000,000 book equivalents
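
For a rough sense of scale, the figure on this slide can be compared with the roughly 700 books in a lifetime from the previous slide. Both numbers come from the slides; the variable names are only illustrative.

```python
BOOKS_PER_LIFETIME = 700        # figure from the previous slide
BOOK_EQUIVALENTS = 10_000_000   # figure from this slide

# How many human reading lifetimes the training data corresponds to.
lifetimes = BOOK_EQUIVALENTS / BOOKS_PER_LIFETIME
print(f"{lifetimes:,.0f} reading lifetimes")
```

That is on the order of fourteen thousand reading lifetimes.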

  11. ©2022 Databricks Inc. — All rights reserved 11
    - Faster software development
    - More users can leverage AI
    - More use cases
    - Reduce development cost
    - Reduce monotonous tasks

  12. How do you get to an enterprise deployment?

  13. Stages: Data Collection and Preparation; Use Existing Model or Build Your Own; Model Serving and Monitoring
    DATA PLATFORM
    UNITY CATALOG: Datasets, Models, Applications

  14. LLM level 0
    Plain foundational models
    “Everyone” has done this: go to ChatGPT and ask questions without much engineering.
    Typical enterprise use cases:
    - Text summarization
    - Text classification
    - Generic coding assistants

  15. ©2022 Databricks Inc. — All rights reserved

  16. Open source models are an alternative to Azure OpenAI

  17. Stages: Data Collection and Preparation; Use Existing Model or Build Your Own; Model Serving and Monitoring
    DATA PLATFORM
    UNITY CATALOG: Datasets, Models, Applications
    Components: Curated AI Models; Model Serving optimized for LLMs; MLflow AI Gateway; Lakehouse Monitoring
    LLM levels covered: Plain LLM

  18. LLM level 1
    Prompt engineering
    Add contextual information to the prompt to give the model specific information pertaining to the question.
    Typical enterprise use cases:
    - Customer service chatbots
    - Specific coding assistants
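
Adding context to a prompt can be as simple as string templating. The sketch below assumes a hypothetical `build_prompt` helper and invented customer-service facts; the exact wording of the template is a design choice, not something prescribed by the talk.

```python
def build_prompt(question, context_snippets):
    """Prepend contextual information to the user's question."""
    context = "\n".join(f"- {s}" for s in context_snippets)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Hypothetical customer-service facts, invented for illustration.
snippets = [
    "Orders can be returned within 30 days.",
    "Refunds are paid to the original payment method.",
]
prompt = build_prompt("Can I return my order after two weeks?", snippets)
print(prompt)
```

The resulting string is what would be sent to the model instead of the bare question.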

  19. ©2022 Databricks Inc. — All rights reserved 19

  20. Stages: Data Collection and Preparation; Use Existing Model or Build Your Own; Model Serving and Monitoring
    DATA PLATFORM
    UNITY CATALOG: Datasets, Models, Applications
    Components: Feature Serving; Curated AI Models; Model Serving optimized for LLMs; Lakehouse Monitoring; MLflow AI Gateway; MLflow Evaluation
    LLM levels covered: Plain LLM; Simple prompt engineering

  21. LLM level 2
    Fine tuning
    Using the data you have available, you can fine tune LLMs to fit your use case. The methodology differs depending on whether the model is open source or closed source.
    Either way, it requires data specific to your use case and engineering capabilities: humans and hardware!
    Typical enterprise use cases:
    - LLM fine tuned to answer questions in a specialist area (e.g. legal, medical)
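
Whatever the fine-tuning method, the use-case-specific data first has to be put into a training format. The sketch below covers only that preparation step and assumes a prompt/completion record schema written as JSON Lines; the schema, the helper names and the example questions are assumptions for illustration, and the format a given fine-tuning tool expects may differ.

```python
import json

# Hypothetical question/answer pairs from a specialist (legal) domain.
examples = [
    {"question": "What is a force majeure clause?",
     "answer": "A clause that frees parties from liability for events outside their control."},
    {"question": "What does 'indemnify' mean?",
     "answer": "To compensate another party for a loss or damage."},
]

def to_training_records(pairs):
    """Turn Q/A pairs into prompt/completion records (assumed schema)."""
    return [
        {"prompt": f"Question: {p['question']}\nAnswer:",
         "completion": " " + p["answer"]}
        for p in pairs
    ]

def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

records = to_training_records(examples)
jsonl = to_jsonl(records)
print(jsonl)
```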

  22. Stages: Data Collection and Preparation; Use Existing Model or Build Your Own; Model Serving and Monitoring
    DATA PLATFORM
    UNITY CATALOG: Datasets, Models, Applications
    Components: Feature Serving; Curated AI Models; AutoML for LLM training; Model Serving optimized for LLMs; Lakehouse Monitoring; MLflow AI Gateway; MLflow Evaluation
    LLM levels covered: Plain LLM; Simple prompt engineering; Fine tuning

  23. LLM level 3
    Retrieval Augmented Generation
    Encode all relevant data you have with an LLM and store the resulting vectors in a vector database. Then retrieve the data most relevant to each question and insert it into the prompt.
    This is basically prompt engineering on steroids, but it requires you to encode all your data up front and to keep using the same LLM to encode questions as well.
    Typical enterprise use cases:
    - LLM answering questions about specifics in documents, such as purchase orders and contracts
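
The encode, retrieve and insert steps can be sketched end to end in a few lines. Everything here is a stand-in: `embed` is a toy bag-of-words counter rather than a real embedding model, the "vector database" is an in-memory list, and the documents are invented for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().replace("?", "").replace(".", "").split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical documents, invented for illustration.
DOCUMENTS = [
    "Purchase order 1001 covers 40 laptops for the Utrecht office.",
    "The maintenance contract renews every year on 1 March.",
    "Purchase order 1002 covers 15 monitors for the Brussels office.",
]

# "Vector database": each document stored next to its vector.
INDEX = [(doc, embed(doc)) for doc in DOCUMENTS]

def retrieve(question, k=1):
    """Encode the question with the same embed() and return the top-k documents."""
    q_vec = embed(question)
    ranked = sorted(INDEX, key=lambda item: cosine(q_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_rag_prompt(question, k=1):
    """Insert the retrieved documents into the prompt as context."""
    context = "\n".join(retrieve(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(build_rag_prompt("When does the maintenance contract renew?"))
```

Note that the question is encoded with the same function as the documents, which mirrors the slide's point about continuing to use the same model for both.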

  24. Stages: Data Collection and Preparation; Use Existing Model or Build Your Own; Model Serving and Monitoring
    DATA PLATFORM
    UNITY CATALOG: Datasets, Models, Applications
    Components: Vector Search; Feature Serving; Curated AI Models; AutoML for LLM training; Model Serving optimized for LLMs; Lakehouse Monitoring; MLflow AI Gateway; MLflow Evaluation
    LLM levels covered: Plain LLM; Simple prompt engineering; Fine tuning; Retrieval Augmented Generation

  25. RAG vs Fine-Tuning
    Generic answers with specific knowledge vs specific answers

  26. RAG vs Fine-Tuning
    Generic answers with specific knowledge vs specific answers

  27. RAG vs Fine-Tuning
    Generic answers with specific knowledge vs specific answers

  28. LLM level 4
    Training your own model from scratch
    If all else fails, or you have specific governance / IP / risk requirements, then training a model from scratch becomes an option.
    However, this is both very difficult and very expensive, and there are currently very few enterprise use cases in which this is the solution.
    If you are one of them, you will know ;-)

  29. Stages: Data Collection and Preparation; Use Existing Model or Build Your Own; Model Serving and Monitoring
    DATA PLATFORM
    UNITY CATALOG: Datasets, Models, Applications
    Components: Vector Search; Feature Serving; Curated AI Models; AutoML for LLM training; Model Serving optimized for LLMs; Lakehouse Monitoring; MLflow AI Gateway; MLflow Evaluation
    LLM levels covered: Plain LLM; Simple prompt engineering; Fine tuning; Retrieval Augmented Generation; Training from scratch

  30. Generative AI Fundamentals Course
    Earn your badge today and share your accomplishment on LinkedIn or your résumé
    Build foundational knowledge of generative AI, including large language models (LLMs), with this free training course.
    ➔ Welcome and Introduction to the Course
    ➔ Introducing Generative AI
    ➔ Finding Success With Generative AI
    ➔ Assessing Potential Risks and Challenges
    Available on Databricks.com

  31. 23 November | Beurs van Berlage
    Register now