Confidential - Customized for company name - Version 1.0
Generative AI
Vereniging Informatici Defensie - October 5, 2023
Midjourney - "a symposium for ministry of defense staff,
with drones in attendance, sci-fi movie style"
Ivo Jansch
[email protected]
@ijansch
Slide 2
DISCLAIMER
By the time we finish this talk it's probably
already outdated
Midjourney - "people attending a presentation in an auditorium.
There's a time machine phonebooth from Dr Who in the corner"
Slide 3
Chapter 1:
How does Generative AI work?
Midjourney - "a brain of a robot, full of neurons.
Science fiction style"
Slide 4
2005 - 2015: Rise of Deep Learning
Introducing layers into the field of machine learning:
pixels
grouped pixels
edges
green hues
…
a cat
a cat staring
a cat staring at you
a cat at a waterhole in the jungle
Slide 5
2017: Invention of the Transformer Architecture
Has nothing to do with The Transformers
Has everything to do with a new, scalable architecture that made it possible to process large quantities of data and, most importantly, enabled:
- Self-supervised learning
- Self-attention
Slide 6
Understanding attention
The cat drank from the waterhole until it was full ("it" refers to the cat)
The cat drank from the waterhole until it was empty ("it" refers to the waterhole)
Slide 8
2018: GPT
- Generative: predicts the next word(s) from a sequence of words
- Pre-trained: trained on a large corpus of text
- Transformer: built on the transformer architecture
Slide 9
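Training the model
Training data
- GPT-3: about 570 GB of filtered Common Crawl text (from roughly 45 TB of raw data), plus smaller curated datasets
- GPT-4: not disclosed by OpenAI
Language model
- GPT-3: 175 billion parameters
- GPT-4: not disclosed by OpenAI (unconfirmed estimates put it at 1 trillion or more)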
Slide 10
Training the model - word embeddings
Cat
  Other words encountered   Distance to those words
  Animal                    0.3
  Domestic                  0.2
  Hairy                     0.25
  Smelly                    0.25
  Food                      0.3
  PC                        0.98
  Mouse                     0.3
Slide 11
Training the model - word embeddings
Distance from each word (column) to the other words encountered (row); "-" means not shown on the slide:
            Cat    Mouse  Dog    PC
  Animal    0.3    0.3    0.3    0.98
  Domestic  0.2    0.5    0.2    0.99
  Hairy     0.25   0.25   0.25   0.7
  Smelly    0.25   0.25   0.4    0.8
  Food      0.3    0.7    0.25   0.8
  PC        0.98   0.2    0.99   -
  Cat       -      0.3    -      0.98
  Mouse     0.3    -      0.7    0.3
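The distance tables on this slide can be read as a simple lookup structure. The sketch below is only an illustration: it copies the slide's hand-written distances (real models learn high-dimensional vectors instead), and the helper name `closest_words` is hypothetical.

```python
# Illustrative distances copied from the slide. A real model learns
# high-dimensional vectors; it does not store hand-written tables like this.
DISTANCES = {
    "cat": {"animal": 0.3, "domestic": 0.2, "hairy": 0.25, "smelly": 0.25,
            "food": 0.3, "pc": 0.98, "mouse": 0.3},
    "mouse": {"animal": 0.3, "domestic": 0.5, "hairy": 0.25, "smelly": 0.25,
              "food": 0.7, "pc": 0.2, "cat": 0.3},
    "dog": {"animal": 0.3, "domestic": 0.2, "hairy": 0.25, "smelly": 0.4,
            "food": 0.25, "pc": 0.99, "mouse": 0.7},
    "pc": {"animal": 0.98, "domestic": 0.99, "hairy": 0.7, "smelly": 0.8,
           "food": 0.8, "cat": 0.98, "mouse": 0.3},
}


def closest_words(word, n=3):
    """Return the n words with the smallest distance to `word`."""
    neighbours = DISTANCES[word.lower()]
    # sorted() is stable, so ties keep the order in which they were listed.
    ranked = sorted(neighbours.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:n]]


print(closest_words("cat"))
print(closest_words("pc", n=2))
```

A small distance means the two words tend to show up in similar contexts, which is why "mouse" sits close to both "cat" and "pc" in this toy data.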
Slide 12
Word embeddings are useful for self-attention
The cat drank from the waterhole until it was full
Cat
Full
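To make the link between embeddings and attention a little more concrete, here is a minimal sketch of scaled dot-product attention weights in plain Python. The 2-D vectors are hypothetical and hand-picked for illustration. A real transformer learns its embeddings and also applies learned query, key and value projections, which this sketch leaves out.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over named key vectors."""
    scale = math.sqrt(len(query))
    names = list(keys)
    scores = [
        sum(q * k for q, k in zip(query, keys[name])) / scale
        for name in names
    ]
    return dict(zip(names, softmax(scores)))


# Hypothetical 2-D embeddings: the first axis loosely means "living thing",
# the second axis loosely means "container".
EMBEDDINGS = {"cat": [1.0, 0.0], "waterhole": [0.0, 1.0]}
FULL = [2.0, 0.5]   # assumed to lean towards living things
EMPTY = [0.5, 2.0]  # assumed to lean towards containers

print(attention_weights(FULL, EMBEDDINGS))
print(attention_weights(EMPTY, EMBEDDINGS))
```

With these made-up vectors, "full" puts most of its weight on "cat" and "empty" puts most of its weight on "waterhole", mirroring how "it" resolves differently in the two example sentences.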
Slide 13
The model 'knows' relationships between words
Cat
Drink
Pet
Water
Fat
Rat
Dog
PC
Mouse
Jungle
Rhyme
Vast
These are the 'parameters'
Slide 15
Then it starts predicting
Given a prompt, what is the most likely word that comes next
What is a cat?
96%
20%
94%
Fat
A pet
An animal
Slide 16
Like humans, not always the same
What is a cat?
96%
20%
94%
Fat
A pet
An animal
'Temperature' introduces some randomness
between similar answers
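A minimal sketch of how temperature might be applied when picking the next answer. The candidate scores are hypothetical (they are not the percentages from the slide), and the function name `sample_with_temperature` is made up for this example.

```python
import math
import random


def sample_with_temperature(scores, temperature=1.0, rng=random):
    """Pick one candidate; a higher temperature flattens the distribution."""
    if temperature <= 0:
        # Treat zero temperature as "always take the top-scoring candidate".
        return max(scores, key=scores.get)
    names = list(scores)
    scaled = [scores[name] / temperature for name in names]
    peak = max(scaled)
    # Softmax weights (unnormalised); random.choices normalises them itself.
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical raw scores for possible answers to "What is a cat?"
SCORES = {"a pet": 3.0, "an animal": 2.9, "fat": 0.5}

rng = random.Random(0)  # fixed seed so the run is repeatable
print(sample_with_temperature(SCORES, temperature=0))
print([sample_with_temperature(SCORES, 0.8, rng) for _ in range(5)])
```

At temperature 0 the same top answer comes back every time. At higher temperatures the two near-equal answers start to alternate, while the low-scoring one stays rare.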
Slide 17
Word by word
Given a prompt, what is the most likely word that comes next
Input:
Why does a cat drink water?
Prediction:
A cat
Slide 18
Word by word by word
Given a prompt, what is the most likely word that comes next
Input:
Why does a cat drink water?
A cat
Prediction:
needs
Slide 19
Word by word by word by word
Given a prompt, what is the most likely word that comes next
Input:
Why does a cat drink water?
A cat needs
Prediction:
nutrition
Slide 20
Word by word by word by word
Given a prompt, what is the most likely word that comes next
Done:
Why does a cat drink water?
A cat needs nutrition
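The word-by-word loop from the last few slides can be sketched as follows. The lookup table `NEXT_WORD` is a hypothetical stand-in for a trained model: it simply maps the text so far to the single most likely continuation, using the example from the slides.

```python
# Hypothetical stand-in for a trained model: maps the text seen so far
# (prompt plus answer so far) to the most likely continuation.
NEXT_WORD = {
    "Why does a cat drink water?": "A cat",
    "Why does a cat drink water? A cat": "needs",
    "Why does a cat drink water? A cat needs": "nutrition",
}


def generate(prompt, max_steps=10):
    """Greedy generation: append one prediction at a time to the input."""
    context = prompt
    for _ in range(max_steps):
        prediction = NEXT_WORD.get(context)
        if prediction is None:
            # The toy "model" has nothing more to add, so we are done.
            break
        # The prediction becomes part of the input for the next step.
        context = f"{context} {prediction}"
    return context[len(prompt):].strip()


print(generate("Why does a cat drink water?"))  # prints "A cat needs nutrition"
```

The key point is that each prediction is fed back in as input, so the model only ever answers the question "what comes next?".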
Slide 21
Note: a model works with tokens, not words
The cat drank from the waterhole until it was full
All tokens are converted to numbers.
17 1345 98 45 17 2624 213 21 78 723
A language model is essentially an abstract model
of relationships between numbers.
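As a rough illustration of the text-to-numbers step, the sketch below builds a toy vocabulary from the ids on this slide, assuming one id per word in the order shown. Real tokenizers usually split text into sub-word pieces and use much larger vocabularies, so treat `VOCAB`, `encode` and `decode` as hypothetical.

```python
# Toy vocabulary built from the ids on the slide, assuming one id per word.
VOCAB = {
    "the": 17, "cat": 1345, "drank": 98, "from": 45, "waterhole": 2624,
    "until": 213, "it": 21, "was": 78, "full": 723,
}
ID_TO_WORD = {token_id: word for word, token_id in VOCAB.items()}


def encode(text):
    """Convert text to a list of token ids (whitespace split, lowercased)."""
    return [VOCAB[word] for word in text.lower().split()]


def decode(token_ids):
    """Convert token ids back to (lowercased) text."""
    return " ".join(ID_TO_WORD[token_id] for token_id in token_ids)


ids = encode("The cat drank from the waterhole until it was full")
print(ids)
print(decode(ids))
```

Both occurrences of "the" map to the same id, 17, which matches the repeated number on the slide.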
Slide 22
GPTs are "stochastic parrots"
- They "understand"
- Yet they don't truly understand what they are talking about
- They predict which numbers are most likely to follow other numbers, given the context of other numbers
Slide 23
Disclaimer
That was by no means a scientifically correct explanation,
but it explains, in a very simplified way, how it works.
Slide 24
If it works for words, it can also work for other things
Music… Images… Film…
Slide 25
News this week:
Slide 26
Chapter 2:
Generative AI as assistant
Midjourney - "a development team sitting around a table,
everyone wearing AR glasses. Science fiction style"
Slide 27
Increase Coding Productivity
Slide 28
Coaching
Junior Developers
Slide 30
Designing Data Models
LLMs can help with creating data models and normalization.
Slide 32
Chapter 3:
Generative AI application development
Midjourney - "smart ai robots in a factory, performing
tedious manual tasks for humans, science fiction style"
Slide 33
Application areas
Models are good at:
- Enhancing
- Copy writing
- Gap filling
- Summarising
- Extracting
- Analysing
- Translating
- Converting
- Copy editing
But not so suitable for fact-finding / data accuracy
Slide 34
Approach 1: Train your own model
Diagram labels:
- Huge Dataset
- Training
- Custom LLM
- Application
- Prompt engineering
- Input
- Output
Slide 35
Approach 2: Model finetuning
Diagram labels:
- Additional Dataset
- Training
- Existing LLM
- Custom LLM
- Application
- Prompt engineering
- Input
- Output
Approach 4: LLMs as tool in a chain
Diagram labels:
- Application
- Input
- Translate to query
- Existing LLM
- SQL Query
- Additional Dataset
- Factual Results
- Summarize
- Output
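One way the chain in Approach 4 could look in code: the LLM translates the input to a query, a database supplies the factual results, and the LLM summarizes them. This is a sketch under assumptions. The two LLM steps are hard-coded stand-ins, and the table `units` and its data are invented for the example.

```python
import sqlite3


def translate_to_query(question):
    """Stand-in for the LLM step that turns a question into SQL (hypothetical)."""
    # A real implementation would prompt an LLM with the schema and the question.
    return "SELECT name, headcount FROM units ORDER BY headcount DESC LIMIT 1"


def summarize(question, rows):
    """Stand-in for the LLM step that phrases factual rows as an answer (hypothetical)."""
    name, headcount = rows[0]
    return f"{name} is the largest unit, with {headcount} people."


def answer(question, connection):
    sql = translate_to_query(question)         # LLM: translate to query
    rows = connection.execute(sql).fetchall()  # database: factual results
    return summarize(question, rows)           # LLM: summarize


# Invented example data standing in for the "additional dataset".
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE units (name TEXT, headcount INTEGER)")
connection.executemany(
    "INSERT INTO units VALUES (?, ?)",
    [("Alpha", 120), ("Bravo", 340), ("Charlie", 95)],
)

print(answer("Which unit is the largest?", connection))
```

The facts come from the database rather than from the model, which addresses the earlier point that models are not so suitable for fact-finding.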
Slide 38
Example use case
"Can AI assist with answering parliamentary questions?"
Slide 40
Application
Proof of concept:
- LLM: Predict Parliamentary Questions
- Historic Parliamentary Questions
- Vector Database
- Create Embeddings
- News
- Prompt Engineering
- Formulate Response
- LLM: Verify against party program / prior statements
- LLM: Assist copy-writing for chamber, press and public
- Verify
- Party Programs
- Vector DB
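The "create embeddings" and "vector database" steps in this proof of concept could be sketched as below. Everything here is a simplified assumption: `embed` is a toy bag-of-words counter rather than a learned embedding model, `VectorStore` is a hypothetical in-memory stand-in for a real vector database, and the stored questions are invented.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorStore:
    """Minimal in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(
            self.items,
            key=lambda item: cosine_similarity(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


# Invented examples standing in for historic parliamentary questions.
store = VectorStore()
store.add("What is the budget for drone procurement?")
store.add("How many recruits joined the navy last year?")
store.add("When will the new barracks be finished?")

print(store.search("drone procurement budget"))
```

The retrieved questions would then be placed into the prompt for the LLM, so that it formulates a response grounded in comparable earlier material.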
Slide 41
Chapter 4:
The dark side
Midjourney - "a postapocalyptic city run by AI, full of drones.
Humans have disappeared or are enslaved. Photorealistic image"
Slide 42
Hallucinations
Slide 47
What probably happened
Tokens: 5+ real methods starting with SecKeyCopy
Slide 48
Copyright
concerns
The output can contain copyrighted material!
Slide 50
Security
concerns
Your input can end up in training data!
Slide 55
Existential
concerns
FAKE NEWS
Asimov's first law of robotics: "A robot may not injure a human being or, through inaction, allow a human being to come to harm."
Slide 56
Things to pay attention to
AI tools are useful, but:
● Ensure human supervision -> Code reviews
● Be transparent about the use of AI
● Pay attention to Terms & Conditions of the tools you use
● Keep an eye on copyright legislation
● Choose the right tool for the job
○ ChatGPT is not the only player in town
○ Consider open source alternatives such as LLAMA
■ https://github.com/eugeneyan/open-llms
Slide 57
Chapter 5:
The future
Midjourney - "A robot and a human holding hands,
watching the sunset in the distance. Science fiction style"
Slide 58
Will AI replace developers?
I gave it a try:
Slide 59
Let's see how far
we can take this…
Slide 61
At this point, ChatGPT is lying through its teeth
Slide 62
Yay!
Slide 63
Now the tricky part…
Slide 64
But ChatGPT is easily convinced…
Slide 65
Chugging along…
Slide 66
And we're done! Little apple, little egg
Slide 67
Ouch…
Slide 69
AI PROOF!
Slide 70
Will AI replace developers?
Visual Basic didn't make developers obsolete
Outsourcing didn't make developers obsolete
No-code systems didn't make developers obsolete
AI won't make developers obsolete
Slide 71
Will AI replace developers?
Development is so much more than producing code.
AI will make programming more productive,
but we will still require software engineering.
Slide 72
Thank you!
Vereniging Informatici Defensie - October 5, 2023
Midjourney - "a symposium for ministry of defense staff,
with drones in attendance, sci-fi movie style"
Ivo Jansch
[email protected]
@ijansch