LLM in enterprise market

Slide 1

Slide 1 text

Slide 2

Slide 2 text

©2022 Databricks Inc. — All rights reserved

Victor van den Broek
Solutions Architect @ Databricks
- Databricks since 2023
- Focus on Dutch & Belgian public sector and Financial services
- 15 years of ‘data experience’ as DE, DS, DA, PO…
- linkedin.com/in/victorvdb

Slide 3

Slide 3 text

©2023 Databricks Inc. — All rights reserved

The Lakehouse Company
- $3B in investment
- 5000+ global employees
- $1B+ in revenue
- Inventor and pioneer of the data lakehouse
- Gartner-recognized Leader: Database Management Systems; Data Science and Machine Learning Platforms
- Creator of

Slide 4

Slide 4 text

Slide 5

Slide 5 text

Generative AI, LLMs and Foundation Models

- Artificial Intelligence (AI): Multidisciplinary field of computer science that aims to create systems capable of emulating human intelligence
- Machine Learning (ML): Learn from existing data and make predictions without being explicitly programmed
- Deep Learning (DL): Use artificial neural networks to learn from data
- Generative AI: Subfield of AI focusing on generating new data (images, text, audio, code, ...)
- LLM: Models trained on massive datasets to achieve advanced language processing capabilities
- Foundation Models (GPT-4, BARD, MPT-7B, …): LLMs which can serve as the base for a wide range of applications

Slide 6

Slide 6 text

LLMs are not that new. Why should I care now?

Accuracy and effectiveness have hit a tipping point
• Many new use cases are unlocked!
• Accessible by all.

Readily available data and tooling
• Large datasets.
• Open-sourced model options.
• Powerful GPUs are required, but they are available in the cloud.

Slide 7

Slide 7 text

Slide 8

Slide 8 text

Slide 9

Slide 9 text

Slide 10

Slide 10 text

Slide 11

Slide 11 text

Slide 12

Slide 12 text

Slide 13

Slide 13 text

[Diagram: LLM workflow on the data platform]
Data Collection and Preparation | Use Existing Model or Build Your Own | Model Serving and Monitoring
DATA PLATFORM
UNITY CATALOG: Datasets, Models, Applications

Slide 14

Slide 14 text

LLM level 0: Plain foundational models

“Everyone” has done this: go to ChatGPT and ask questions without much engineering.

Typical enterprise use cases:
- Text summarization
- Text classification
- Generic coding assistants

Slide 15

Slide 15 text

Slide 16

Slide 16 text

Slide 17

Slide 17 text

[Diagram: LLM workflow on the data platform]
Data Collection and Preparation | Use Existing Model or Build Your Own | Model Serving and Monitoring
DATA PLATFORM
UNITY CATALOG: Datasets, Models, Applications
Platform components: Curated AI Models, Model Serving optimized for LLMs, MLflow AI Gateway, Lakehouse Monitoring
LLM levels: Plain LLM

Slide 18

Slide 18 text

LLM level 1: Prompt engineering

Add contextual information in the prompt, to give the model specific information pertaining to the question.

Typical enterprise use cases:
- Customer service chatbots
- Specific coding assistants
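
A minimal sketch of what adding contextual information to a prompt can look like. The function name build_prompt, the instruction wording, and the customer-service facts are all made up for illustration; the resulting string would be sent to whichever model you use.

```python
def build_prompt(question: str, context_snippets: list[str]) -> str:
    """Assemble a prompt that places contextual information before the question."""
    context_block = "\n".join(f"- {snippet}" for snippet in context_snippets)
    return (
        "Answer the question using only the context below.\n"
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical customer-service facts, invented for this example.
context = [
    "Orders can be returned within 30 days of delivery.",
    "Refunds are paid to the original payment method.",
]
prompt = build_prompt("Can I return an order after two weeks?", context)
print(prompt)
```

The model itself is unchanged here; only the text it receives carries the extra information.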

Slide 19

Slide 19 text

Slide 20

Slide 20 text

Slide 21

Slide 21 text

LLM level 2: Fine tuning

Using data you have available, you can fine-tune LLMs to fit your use case. The methodology differs depending on whether the model is open source or closed source. Either way, it requires data specific to your use case and engineering capabilities: humans and hardware!

Typical enterprise use cases:
- LLM fine-tuned to answer questions in a specialist area (e.g. legal, medical)
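
A rough sketch of the data preparation side of fine tuning: turning question/answer pairs into JSON Lines, a format many fine-tuning tools accept. The field names "prompt" and "response", the helper to_jsonl, and the example records are assumptions for illustration; the exact schema depends on the tool or provider you fine-tune with.

```python
import json

# Hypothetical use-case-specific examples, invented for this sketch.
examples = [
    {"prompt": "What is a notice period?", "response": "The time between giving notice and the end of a contract."},
    {"prompt": "What is an NDA?", "response": "A non-disclosure agreement that restricts sharing of information."},
    {"prompt": "   ", "response": "This record has no usable prompt."},
]


def to_jsonl(records: list[dict]) -> str:
    """Serialize prompt/response pairs as JSON Lines, skipping empty records."""
    lines = []
    for record in records:
        prompt = record["prompt"].strip()
        response = record["response"].strip()
        if not prompt or not response:
            continue  # incomplete pairs add nothing to training
        lines.append(json.dumps({"prompt": prompt, "response": response}))
    return "\n".join(lines)


training_data = to_jsonl(examples)
print(training_data)
```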

Slide 22

Slide 22 text

[Diagram: LLM workflow on the data platform]
Data Collection and Preparation | Use Existing Model or Build Your Own | Model Serving and Monitoring
DATA PLATFORM
UNITY CATALOG: Datasets, Models, Applications
Platform components: Feature Serving, Curated AI Models, AutoML for LLM training, Model Serving optimized for LLMs, Lakehouse Monitoring, MLflow AI Gateway, MLflow Evaluation
LLM levels: Plain LLM, Simple prompt engineering, Fine tuning

Slide 23

Slide 23 text

LLM level 3: Retrieval Augmented Generation

Encode all relevant data you have with an LLM and store it in a vector database. Then retrieve the most relevant data for each question and insert it into the prompt. This is basically prompt engineering on steroids, but it requires you to encode all your data up front and to keep using that same LLM to encode the questions as well.

Typical enterprise use cases:
- LLM answering questions about specifics in documents, such as purchase orders and contracts
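
A toy sketch of the encode, retrieve, insert-into-prompt loop. The embed function below is a plain word-count stand-in, not an LLM encoder, and the "vector database" is an in-memory list; the documents and all function names are invented for illustration. In practice both would be replaced by a real embedding model and a vector store.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for an LLM encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical documents, invented for this example.
documents = [
    "Purchase order 1001 covers 20 laptops for the finance team.",
    "The maintenance contract with the supplier ends in December.",
    "Purchase order 1002 covers office chairs for the new building.",
]

# Stand-in for a vector database: every document is encoded once, up front.
index = [(embed(doc), doc) for doc in documents]


def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the question."""
    query = embed(question)  # questions use the same encoder as the documents
    ranked = sorted(index, key=lambda item: cosine(query, item[0]), reverse=True)
    return [doc for _, doc in ranked[:k]]


def build_rag_prompt(question: str, k: int = 1) -> str:
    """Insert the retrieved documents into the prompt ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


question = "When does the maintenance contract end?"
rag_prompt = build_rag_prompt(question)
print(rag_prompt)
```

Because documents and questions must live in the same vector space, switching the encoder later means re-encoding everything in the index.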

Slide 24

Slide 24 text

[Diagram: LLM workflow on the data platform]
Data Collection and Preparation | Use Existing Model or Build Your Own | Model Serving and Monitoring
DATA PLATFORM
UNITY CATALOG: Datasets, Models, Applications
Platform components: Vector Search, Feature Serving, Curated AI Models, AutoML for LLM training, Model Serving optimized for LLMs, Lakehouse Monitoring, MLflow AI Gateway, MLflow Evaluation
LLM levels: Plain LLM, Simple prompt engineering, Fine tuning, Retrieval Augmented Generation

Slide 25

Slide 25 text

Slide 26

Slide 26 text

Slide 27

Slide 27 text

Slide 28

Slide 28 text

LLM level 4: Training your own model from scratch

If all else fails, or you have specific governance / IP / risk requirements, then training a model from scratch becomes an option. However, this is both very difficult and very expensive, and there are currently very few enterprise use cases in which this is the solution. If you are one of them, you will know ;-)

Slide 29

Slide 29 text

Slide 30

Slide 30 text

Generative AI Fundamentals Course

Build foundational knowledge of generative AI, including large language models (LLMs), with this free training course.
➔ Welcome and Introduction to the Course
➔ Introducing Generative AI
➔ Finding Success With Generative AI
➔ Assessing Potential Risks and Challenges

Earn your badge today and share your accomplishment on LinkedIn or your résumé.
Available on Databricks.com

Slide 31

Slide 31 text