Generative AI

Confidential - Adapted for (company name) - Version 1.0
Generative AI
Vereniging Informatici Defensie - October 5, 2023
Midjourney - "a symposium for ministry of defense staff, with drones in attendance, sci-fi movie style"
Ivo Jansch [email protected] @ijansch

DISCLAIMER
By the time we finish this talk it's probably already outdated
Midjourney - "people attending a presentation in an auditorium. There's a time machine phonebooth from Dr Who in the corner"

Chapter 1: How does Generative AI work?
Midjourney - "a brain of a robot, full of neurons. Science fiction style"

2005 - 2015: Rise of Deep Learning
Introducing layers into the field of machine learning:
pixels -> grouped pixels -> edges -> green hues -> … -> a cat -> a cat staring -> a cat staring at you -> a cat at a waterhole in the jungle

2017: Invention of the Transformer Architecture
Has nothing to do with The Transformers.
Has everything to do with a new, scalable architecture that created the ability to process large quantities of data and, most importantly:
- Self-supervised learning
- Self-attention

Understanding attention
The cat drank from the waterhole until it was full
The cat drank from the waterhole until it was empty
In the first sentence "it" refers to the cat; in the second, "it" refers to the waterhole.
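The idea of attention can be sketched in a few lines. This is a minimal illustration of scaled dot-product attention weights, not the talk's own code: the two-dimensional vectors and the two query vectors are made-up values chosen so that a "full" context leans towards the cat and an "empty" context leans towards the waterhole.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over all keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

# Hypothetical key vectors: axis 0 ~ "is an animal", axis 1 ~ "holds water".
words = ["cat", "waterhole"]
keys = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical query vectors for "it" in the two example sentences.
query_it_full = [2.0, 0.5]    # "... until it was full"
query_it_empty = [0.5, 2.0]   # "... until it was empty"

weights_full = attention_weights(query_it_full, keys)
weights_empty = attention_weights(query_it_empty, keys)

for word, w_full, w_empty in zip(words, weights_full, weights_empty):
    print(f"{word}: full={w_full:.2f} empty={w_empty:.2f}")
```

In a real transformer the query and key vectors are learned during training rather than written by hand, and this computation runs for every token against every other token.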

2018: GPT
- Generative: Predict the next word(s) from a sequence of words
- Pre-trained: Trained on a large corpus of text
- Transformer: Built on the transformer architecture

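Training the model
Training Data: about 570 GB of filtered Common Crawl text for GPT-3 (roughly 45 TB before filtering); not disclosed for GPT-4
Language Model: 175 billion parameters for GPT-3; not officially disclosed for GPT-4 (about 1 trillion is an unconfirmed estimate)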

Training the model - word embeddings
Cat (other words encountered, with the distance to those words):
Animal 0.3, Domestic 0.2, Hairy 0.25, Smelly 0.25, Food 0.3, PC 0.98, Mouse 0.3

Training the model - word embeddings
Cat:   Animal 0.3, Domestic 0.2, Hairy 0.25, Smelly 0.25, Food 0.3, PC 0.98, Mouse 0.3
Mouse: Animal 0.3, Domestic 0.5, Hairy 0.25, Smelly 0.25, Food 0.7, PC 0.2, Cat 0.3
Dog:   Animal 0.3, Domestic 0.2, Hairy 0.25, Smelly 0.4, Food 0.25, PC 0.99, Mouse 0.7
PC:    Animal 0.98, Domestic 0.99, Hairy 0.7, Smelly 0.8, Food 0.8, Cat 0.98, Mouse 0.3
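The slides list distances between words directly. In practice a model stores one vector per word and closeness is derived from those vectors. A minimal sketch, assuming made-up three-dimensional embeddings (the axes and all values are hypothetical) and cosine similarity as the closeness measure:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: (animal-ness, domestic-ness, tech-ness).
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.9, 0.9, 0.1],
    "pc": [0.0, 0.3, 0.9],
}

def nearest(word):
    """Return the other word whose vector is most similar to `word`."""
    others = [w for w in embeddings if w != word]
    return max(others, key=lambda w: cosine_similarity(embeddings[word], embeddings[w]))

print(nearest("cat"))
```

Real embeddings have hundreds or thousands of dimensions, and the dimensions do not carry human-readable labels like the ones assumed here.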

Word embeddings are useful for self-attention
The cat drank from the waterhole until it was full
Related words: "cat" and "full"

The model 'knows' relationships between words
Cat, Drink, Pet, Water, Fat, Rat, Dog, PC, Mouse, Jungle, Rhyme, Vast
These relationships are the 'parameters'

Then it starts predicting
Given a prompt, what is the most likely word that comes next?
What is a cat?
96% 20% 94%
Fat / A pet / An animal

Like humans, not always the same
What is a cat?
96% 20% 94%
Fat / A pet / An animal
'Temperature' introduces some randomness between similar answers
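How temperature changes the choice between similar answers can be sketched as follows. The raw scores (logits) below are invented for illustration and are not taken from the slide; the technique shown is the common softmax-with-temperature sampling.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Divide the scores by the temperature, then normalise to probabilities."""
    scaled = [value / temperature for value in logits]
    highest = max(scaled)
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

candidates = ["A pet", "An animal", "Fat"]
logits = [2.0, 1.9, 0.2]  # hypothetical raw scores for each candidate

def sample(temperature, rng):
    """Pick one candidate at random, weighted by its probability."""
    probabilities = softmax_with_temperature(logits, temperature)
    return rng.choices(candidates, weights=probabilities, k=1)[0]

cold = softmax_with_temperature(logits, 0.1)  # almost always the top answer
warm = softmax_with_temperature(logits, 1.0)  # similar answers alternate

print([round(p, 2) for p in cold])
print([round(p, 2) for p in warm])
print(sample(1.0, random.Random()))
```

A low temperature concentrates nearly all probability on the highest-scoring answer, while a higher temperature lets close runners-up be picked regularly, which is why the same prompt does not always give the same reply.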

Word by word
Given a prompt, what is the most likely word that comes next?
Input: Why does a cat drink water?
Prediction: A cat

Word by word by word
Input: Why does a cat drink water? A cat
Prediction: needs

Word by word by word by word
Input: Why does a cat drink water? A cat needs
Prediction: nutrition

Done: Why does a cat drink water? A cat needs nutrition
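The word-by-word loop can be sketched like this. The lookup table is a hypothetical stand-in for a language model: it only looks at the last word, whereas a real model scores every possible next token given the whole context.

```python
# Hypothetical stand-in for a model: maps the last word seen to the next word.
NEXT_WORD = {
    "water?": "A",
    "A": "cat",
    "cat": "needs",
    "needs": "nutrition",
}

def predict_next(context):
    """Return the next word for this context, or None when there is nothing to add."""
    return NEXT_WORD.get(context[-1])

def generate(prompt, max_words=10):
    """Append one predicted word at a time, feeding each output back in as input."""
    context = prompt.split()
    generated = []
    for _ in range(max_words):
        word = predict_next(context + generated)
        if word is None:
            break
        generated.append(word)
    return " ".join(generated)

print(generate("Why does a cat drink water?"))
```

The important part is the loop: every predicted word becomes part of the input for the next prediction.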

Note: a model works with tokens, not words
The cat drank from the waterhole until it was full
All tokens are converted to numbers:
17 1345 98 45 17 2624 213 21 78 723
A language model is essentially an abstract model of relationships between numbers.
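A minimal sketch of the conversion, using the numbers from the slide as a toy vocabulary. Two simplifications are assumed here: the vocabulary is hand-written, and each word is one token, whereas real tokenizers typically split text into subword pieces.

```python
# Toy vocabulary built from the slide's example; the ids are illustrative only.
VOCAB = {
    "the": 17, "cat": 1345, "drank": 98, "from": 45, "waterhole": 2624,
    "until": 213, "it": 21, "was": 78, "full": 723,
}
INVERSE_VOCAB = {token_id: word for word, token_id in VOCAB.items()}

def encode(text):
    """Convert a sentence into a list of token ids (lowercased, split on spaces)."""
    return [VOCAB[word] for word in text.lower().split()]

def decode(token_ids):
    """Convert token ids back into (lowercased) text."""
    return " ".join(INVERSE_VOCAB[token_id] for token_id in token_ids)

ids = encode("The cat drank from the waterhole until it was full")
print(ids)
print(decode(ids))
```

Everything the model does after this step operates on the numbers, never on the original text.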

GPTs are "stochastic parrots"
- They "understand"
- Yet, they don't truly understand what they are talking about
- They predict which numbers are most likely to follow other numbers, given the context of other numbers.

Disclaimer
That was by no means a scientifically correct explanation, but it explains in a very simplified way how it works.

If it works for words, it can also work for other things
Music… Images… Film…

News this week:

Chapter 2: Generative AI as assistant
Midjourney - "a development team sitting around a table, everyone wearing AR glasses. Science fiction style"

Increase Coding Productivity

Coaching Junior Developers

Designing Data Models
LLMs can help with creating data models and normalization.

Chapter 3: Generative AI application development
Midjourney - "smart ai robots in a factory, performing tedious manual tasks for humans, science fiction style"

Application areas
Models are good at:
- Enhancing
- Copywriting
- Gap filling
- Summarising
- Extracting
- Analysing
- Translating
- Converting
- Copy editing
But not so suitable for fact-finding / data accuracy

Application Approach 1: Train your own model
Huge Dataset -> Training -> Custom LLM
Input -> Prompt engineering -> Custom LLM -> Output

Application Approach 2: Model finetuning
Existing LLM + Additional Dataset -> Training -> Custom LLM
Input -> Prompt engineering -> Custom LLM -> Output

Application Approach 3: Vector databases
Additional Dataset -> Create Embeddings -> Vector Database
Input + Vector Database -> Prompt engineering -> Existing LLM -> Output
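The vector database approach can be sketched with a toy in-memory index. Everything here is a stand-in: `embed` uses plain word counts instead of a real embedding model, the list `index` plays the role of the vector database, and the example documents are invented. The resulting prompt is what would be sent to an existing LLM.

```python
import math
from collections import Counter

def embed(text):
    """Hypothetical embedding: a bag of word counts instead of a learned vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# The "additional dataset", embedded once and kept in a toy index.
documents = [
    "the cat drinks water from the waterhole",
    "the pc needs a new mouse and keyboard",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(question, k=1):
    """Return the k documents whose embeddings are closest to the question."""
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question):
    """Prompt engineering step: put the retrieved context in front of the question."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("where does the cat drink water"))
```

The model itself is not retrained in this approach; only the prompt changes, which is why the dataset can be updated without touching the LLM.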

Application Approach 4: LLMs as tool in a chain
Input -> Existing LLM (Translate to query) -> SQL Query -> Additional Dataset -> Factual Results -> Existing LLM (Summarize) -> Output
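A minimal sketch of such a chain, assuming an in-memory SQLite table as the additional dataset. The two functions `translate_to_query` and `summarize` are hypothetical placeholders for LLM calls and are hard-coded here; the table name and rows are invented. The point is that the facts come from the database, not from the model.

```python
import sqlite3

def translate_to_query(question):
    """Placeholder for an LLM call that turns a question into SQL (hard-coded here)."""
    return "SELECT COUNT(*) FROM questions WHERE topic = 'drones'"

def summarize(question, rows):
    """Placeholder for an LLM call that phrases the factual result for the user."""
    return f"{rows[0][0]} question(s) matched."

def answer(conn, question):
    """Chain: question -> SQL query -> factual rows from the database -> summary."""
    sql = translate_to_query(question)
    rows = conn.execute(sql).fetchall()
    return summarize(question, rows)

# Invented example data standing in for the additional dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (id INTEGER PRIMARY KEY, topic TEXT)")
conn.executemany(
    "INSERT INTO questions (topic) VALUES (?)",
    [("drones",), ("budget",), ("drones",)],
)

print(answer(conn, "How many questions were about drones?"))
```

In a real system the generated SQL would need validation before execution, for example by running it with a read-only database user.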

Example use case "Can AI assist with answering parliamentary questions?"

Application proof of concept
Data sources: Historic Parliamentary Questions, News, Party Programs
- Historic Parliamentary Questions -> Create Embeddings -> Vector Database
- Party Programs -> Party Programs Vector DB
- LLM: Predict Parliamentary Questions
- Prompt Engineering -> Formulate Response
- LLM: Verify against party program / prior statements
- Verify 💻
- LLM: Assist copy-writing for chamber, press and public

Chapter 4: The dark side
Midjourney - "a postapocalyptic city run by AI, full of drones. Humans have disappeared or are enslaved. Photorealistic image"

Hallucinations

What probably happened
Tokens: 5+ real methods starting with SecKeyCopy

Copyright concerns
The output can contain copyrighted material!

Security concerns
Your input can end up in training data!

Security & Quality concerns
Paper: https://arxiv.org/abs/2211.03622

Environmental concerns
Source: https://blogs.nvidia.com/blog/2022/03/25/what-is-a-transformer-model

Existential concerns
Asimov's First Law of Robotics: "A robot may not injure a human being or, through inaction, allow a human being to come to harm."
FAKE NEWS

Things to pay attention to
AI tools are useful, but:
• Ensure human supervision -> Code reviews
• Be transparent about the use of AI
• Pay attention to Terms & Conditions of the tools you use
• Keep an eye on copyright legislation
• Choose the right tool for the job
  ◦ ChatGPT is not the only player in town
  ◦ Consider open source alternatives such as LLaMA
    ▪ https://github.com/eugeneyan/open-llms

Chapter 5: The future
Midjourney - "A robot and a human holding hands, watching the sunset in the distance. Science fiction style"

Will AI replace developers?
I gave it a try:

Let's see how far we can take this…

At this point, ChatGPT is lying through its teeth

Now the tricky part…

But ChatGPT is easily convinced…

Chugging along…

And we're done! Little apple, little egg

Ouch…

AI PROOF!

Will AI replace developers?
Visual Basic didn't make developers obsolete
Outsourcing didn't make developers obsolete
No-code systems didn't make developers obsolete
AI won't make developers obsolete

Will AI replace developers?
Development is so much more than producing code.
AI will make programming more productive, but we will still require software engineering.

Thank you!
Ivo Jansch [email protected] @ijansch
