Dr. Christoph Mittendorf - Beyond Bard and Transformers: Unconventional ML Use Cases

Proprietary + Confidential
Munich Datageeks: Machine Learning & Data, June 20th 2023
Dr. Christoph Mittendorf, Machine Learning Specialist

Google is the pioneer in AI
• 2015 / 2016: Google open sourced TensorFlow
• 2017: Google invents the Transformer, kickstarting the LLM revolution
• 2018: Google’s groundbreaking large language model, BERT
• 2019: Text-to-Text Transfer Transformer, a 10B-parameter LLM, open sourced
• 2020: Google LaMDA, a model trained to converse
• 2021: AlphaFold predicts protein structures
• 2022: PaLM, a 540 billion parameter model; Imagen, a realistic text-to-image diffusion model
• 2023: GenAI / Bard, a conversational AI powered by LaMDA; PaLM 2, a 340 billion parameter model trained on 3.6 trillion tokens
Responsible AI (3,000 researchers, 7,000 publications)
• Accountable to people
• Built & tested for safety
• Socially beneficial
• Avoid creating unfair bias
• Upholds high scientific standards
• Privacy in design

1. Neural Networks: Building Blocks & Parameters

Building AI Systems
AI approach: learning systems (timeline: 1959 to 2016)
• Learns solutions from first principles (data / experience)
• Can generalize to new tasks
• Intuition rather than calculation
• Enormous search space / vast number of combinations
• Inspired by neuroscience (NN), aka our brain
• Deep Blue (1997) vs AlphaGo (2016)
Source: DeepMind

From Triathlon to Machine Learning
Electrocardiogram (performance diagnostics)
[Figure: ECG trace with the P, Q, R, S and T waves, the QRS complex, the PR and QT intervals, and the PR and ST segments]

Building AI Systems: Neuroscience + Heartbeat

Inspiration
• Neuroscience: structure of neural networks
• Cardiology: activation of neurons!

From Triathlon to Machine Learning: Popular Activation Functions
[Figure: ECG trace (P, Q, R, S, T waves; QRS complex; PR and QT intervals; PR and ST segments) shown alongside popular activation functions]

Neural Networks: Structure and Building Blocks
Multilayer Perceptron / Fully Connected Feedforward Network / Deep Neural Network
[Figure: network with Input Layer, Hidden Layer 1, Hidden Layer 2 and Output Layer]

Neural Networks: Structure and Building Blocks
Number of parameters?
[Figure: a single neuron. The inputs x1, x2, x3 (input layer) are multiplied by the weights w1, w2, w3, summed together with a bias (sum function, hidden layer) and passed through an activation function to give the prediction Y (output layer)]

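Neural Networks: Structure and Building Blocks
Number of parameters?
[Figure: a single neuron. The inputs x1, x2, x3 are multiplied by the weights w1, w2, w3, summed together with a bias (sum function) and passed through an activation function to give the output / prediction Y. A loss function compares Y with the ground truth, and an optimizer uses the loss to update the weights]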
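The building blocks on this slide (weighted sum plus bias, activation function, loss function, optimizer) can be sketched in a few lines of NumPy. This is only an illustration: the sigmoid activation, the squared-error loss, plain gradient descent, the learning rate and all numbers are assumptions chosen for the example, not values from the talk.

```python
import numpy as np

def sigmoid(z):
    """Example activation function (an assumption; the slide does not name one)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_forward(x, w, b):
    """Sum function + bias, followed by the activation function."""
    return sigmoid(np.dot(w, x) + b)

def squared_error(y_pred, y_true):
    """Example loss function comparing the prediction with the ground truth."""
    return 0.5 * (y_pred - y_true) ** 2

def gradient_step(x, w, b, y_true, lr=0.1):
    """One plain gradient-descent update of the weights and the bias."""
    y_pred = neuron_forward(x, w, b)
    # Chain rule: dL/dz = (y_pred - y_true) * sigmoid'(z), with sigmoid'(z) = y * (1 - y)
    dz = (y_pred - y_true) * y_pred * (1.0 - y_pred)
    return w - lr * dz * x, b - lr * dz

# Hypothetical inputs x1..x3, weights w1..w3, bias and ground truth.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.2, -0.3])
b = 0.0
y_true = 1.0

n_parameters = w.size + 1  # three weights + one bias
loss_before = squared_error(neuron_forward(x, w, b), y_true)
for _ in range(100):
    w, b = gradient_step(x, w, b, y_true)
loss_after = squared_error(neuron_forward(x, w, b), y_true)
print(n_parameters, loss_before, loss_after)
```

With three inputs, this single neuron has four trainable parameters: three weights and one bias.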

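Neural Networks: Structure and Building Blocks
Number of parameters?
[Figure: network with two hidden layers and an output layer; each neuron takes inputs x_i and weights w_j, sums them (∑) and applies an activation function]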

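Neural Networks: Structure and Building Blocks
Number of parameters = 41
[Figure: network with two hidden layers and an output layer; each neuron takes inputs x_i and weights w_j, sums them (∑) and applies an activation function]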
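A fully connected layer with n_in inputs and n_out neurons has n_in * n_out weights plus n_out biases. The sketch below counts parameters this way. The layer sizes 3-4-4-1 are an assumption: the transcript does not state them, but three inputs, two hidden layers of four neurons and one output happen to give the 41 parameters on the slide.

```python
def count_dense_parameters(layer_sizes):
    """Count the weights and biases of a fully connected feedforward network.

    layer_sizes lists the number of units per layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection
        total += n_out         # one bias per neuron
    return total

# Assumed architecture: 3 inputs, two hidden layers of 4 neurons, 1 output.
print(count_dense_parameters([3, 4, 4, 1]))  # (12 + 4) + (16 + 4) + (4 + 1) = 41
```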

Neural Networks: How to get this in code…
[Figure: ECG trace]

Neural Networks: Vertex AI - Workbench
Workbench is a Jupyter-based development environment that is fully managed, scalable, and enterprise-ready.
• Easy exploration and data analysis
• Rapid prototyping and model development

Neural Networks: Iterative Process
from keras import backend as K

Neural Networks: Vertex AI - Workbench

Neural Networks: Vertex AI - Workbench
Heartbeat Activation
[Figure: the heartbeat activation function, zoomed in and zoomed out]
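The transcript does not include the formula of the heartbeat activation. Purely as an illustration of what an ECG-shaped activation could look like, the sketch below builds a curve from five Gaussian bumps, one each for the P, Q, R, S and T waves. The function name, the bump positions, widths and amplitudes are all hypothetical, and this is not the function used in the talk.

```python
import numpy as np

# Hypothetical (center, width, amplitude) for the P, Q, R, S and T waves.
ECG_WAVES = [
    (-2.00, 0.25, 0.15),   # P wave: small positive bump
    (-0.25, 0.05, -0.10),  # Q wave: small dip before the peak
    (0.00, 0.05, 1.00),    # R wave: the sharp main peak
    (0.25, 0.05, -0.20),   # S wave: dip after the peak
    (2.00, 0.40, 0.30),    # T wave: broad positive bump
]

def heartbeat(x):
    """ECG-shaped activation: a sum of Gaussian bumps, applied element-wise."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for center, width, amplitude in ECG_WAVES:
        y = y + amplitude * np.exp(-0.5 * ((x - center) / width) ** 2)
    return y

pre_activations = np.linspace(-4.0, 4.0, 9)
print(heartbeat(pre_activations))
```

Because the curve is built only from exponentials, sums and products, it is differentiable, so the same expression could in principle be written with backend tensor operations and passed to a Keras layer as a callable activation.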

Vertex AI - Workbench: Nobel Prize?
MNIST challenge: classification of binary images of handwritten digits.
Total params: 59,526
Accuracy: number of correct predictions / total number of predictions.
• 97.0% on validation data using Heartbeat
• 99.3% on validation data using ReLU
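The accuracy definition on this slide translates directly into code. A minimal sketch follows; the labels and predictions are made-up values for illustration, not the talk's validation data.

```python
def accuracy(y_true, y_pred):
    """Number of correct predictions divided by the total number of predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one prediction")
    correct = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return correct / len(y_true)

# Hypothetical digit labels and model predictions.
labels = [7, 2, 1, 0, 4]
predictions = [7, 2, 1, 0, 9]
print(accuracy(labels, predictions))  # 4 of 5 predictions are correct
```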

Key learning: Maybe your heartbeat is the key!
ML research can be done by everyone who has their heart in the right place and a good toolbox.

2. Computer Vision: Running Efficiency

How to become a Marathon World Champion (using ML)?
Tl;dr: Eliud Kipchoge has an exceptional running efficiency.
Our benchmark: world record holder Eliud Kipchoge
Eliud Kipchoge, born 5 November 1984, is a Kenyan long-distance runner who competes in the marathon. Widely regarded as one of the greatest marathon runners of all time, he is the 2016 and 2020 Olympic marathon champion and the world record holder in the marathon with a time of 2:01:09, set at the 2022 Berlin Marathon. He has run four of the six fastest marathons in history. In October 2019, Eliud ran a 1:59:40 marathon, becoming the first person in recorded history to break the two-hour barrier over the marathon distance. He did so under experimental conditions, including a pace car and rotating teams of pacemakers, to maximize his running efficiency.

How to become a Marathon World Champion (using ML)?
World record holder Eliud Kipchoge
Goal: evaluate running efficiency.
MoveNet is a pretrained ML model that detects 17 keypoints of a body. The model is available on TF Hub, our open repository and library for reusable machine learning (https://www.tensorflow.org/hub).

How to become a Marathon World Champion (using ML)?
MoveNet model architecture
The architecture consists of two components: a feature extractor and a set of prediction heads. All models are trained using the TensorFlow Object Detection API. The feature extractor in MoveNet is MobileNetV2 with an attached feature pyramid network (FPN).

How to become a Marathon World Champion (using ML)?
MoveNet model architecture
There are four prediction heads attached to the feature extractor:
(1) Person center heatmap: predicts the geometric center of person instances
(2) Keypoint regression field: predicts the full set of keypoints for a person, used for grouping keypoints into instances
(3) Person keypoint heatmap: predicts the location of all keypoints, independent of person instances
(4) 2D per-keypoint offset field: predicts local offsets from each output feature map pixel to the precise sub-pixel location of each keypoint

How to become a Marathon World Champion (using ML)?
World record holder Eliud Kipchoge
Our goal is a custom classification of “running efficiency”.
Using MoveNet, we have built a similarity model with a similarity function that measures how similar or related two keypoints are. In detail, we compare certain movements (the position of a keypoint during steps) at particular speed(s). The output is the Eliud Efficiency Model (EEM).
Prediction: Eliud Efficiency
• 0-25% Eliud: Needs improvement
• 26-50% Eliud: Meets expectations
• 51-75% Eliud: Exceeds expectations
• 76-100% Eliud: Superb
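The four efficiency bands on this slide amount to a simple threshold lookup. A minimal sketch follows; the function name is hypothetical, and sending fractional scores such as 25.5 to the next higher band is an assumption, since the slide only lists whole-number ranges.

```python
def eliud_efficiency_label(percent):
    """Map an efficiency score (0-100 % Eliud) to the rating bands on the slide."""
    if not 0 <= percent <= 100:
        raise ValueError("efficiency must be between 0 and 100 percent")
    if percent <= 25:
        return "Needs improvement"
    if percent <= 50:
        return "Meets expectations"
    if percent <= 75:
        return "Exceeds expectations"
    return "Superb"

for score in (49, 70, 76):
    print(score, eliud_efficiency_label(score))
```

The three example scores are the efficiencies reported for the three shoe types later in the deck.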

One Platform - Google Cloud: Implementation & Architecture
Architecture & Libraries
[Diagram labels]
• Data: Benchmark (Eliud Kipchoge), Inference
• Workbench
• MoveNet (pretrained): keypoint detection
• Custom loss functions: location loss, contrastive loss
• Training
• Similarity evaluation
• Model registry
• Model serving

How to become a Marathon World Champion (using ML)?
World record holder Eliud Kipchoge
The Perfect Style: 100% Eliud (Superb)
[Figure: four video frames annotated with MoveNet keypoints, 17 pixel-coordinate pairs per frame. Three of the four arrays on the slide are identical (A); one differs (B).]
Keypoints A:
[160.26083 96.47073] [166.83864 89.2794] [154.45134 88.84522] [169.83588 89.157646]
[141.07724 87.254] [171.61955 122.288956] [128.79294 126.7425] [177.90474 166.47029]
[110.68105 188.49167] [186.85052 190.10062] [159.1282 183.19426] [160.78021 235.62056]
[135.02223 236.45792] [174.34465 313.08237] [179.8592 317.48926] [157.75436 383.82657]
[158.09479 394.32785]
Keypoints B:
[170.6349 80.21549] [172.98218 75.30904] [165.7312 75.021126] [166.37975 75.49159]
[147.61896 74.129814] [172.6098 106.84404] [124.246284 112.40009] [177.53781 147.2778]
[86.22109 160.25734] [177.21365 155.98389] [128.97253 175.98878] [157.27747 224.38336]
[131.50974 226.93161] [119.27184 302.12704] [190.04367 301.92426] [56.288105 270.6312]
[215.00241 390.95316]

How to become a Marathon World Champion (using ML)?
World record holder Eliud Kipchoge
Leveraging MoveNet’s keypoints, we applied a similarity function:
• Step 1: Evaluating keypoint positions across consecutive frames (movement). Location loss: Euclidean (L2) distance between the pixel coordinates.
• Step 2: Evaluating the relative position between different keypoints (posture). Contrastive loss: keypoint location in relation to other keypoints in the same frame.
[Figure: video frames annotated with MoveNet keypoint coordinates, 17 pixel-coordinate pairs per frame]
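The two steps can be sketched with NumPy. The first function follows the slide's description of the location loss (Euclidean distance between pixel coordinates). The second is only a stand-in for what the slide calls the contrastive loss: it compares the distances between all keypoint pairs within each frame, which is one assumed way of expressing "keypoint location in relation to other keypoints"; it is not the talk's implementation, and a contrastive loss in the usual metric-learning sense is defined differently. The function names are hypothetical.

```python
import numpy as np

def location_loss(kp_a, kp_b):
    """Step 1 (movement): mean Euclidean (L2) distance between matching keypoints, in pixels."""
    return float(np.mean(np.linalg.norm(kp_a - kp_b, axis=1)))

def pairwise_distances(kp):
    """Distance from every keypoint to every other keypoint within one frame."""
    diff = kp[:, None, :] - kp[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def posture_loss(kp_a, kp_b):
    """Step 2 (posture): mean absolute difference between the two frames' pairwise distances."""
    return float(np.mean(np.abs(pairwise_distances(kp_a) - pairwise_distances(kp_b))))

# First three keypoints of one frame from the slides; the shifted copy is made up.
reference = np.array([[160.26083, 96.47073],
                      [166.83864, 89.2794],
                      [154.45134, 88.84522]])
shifted = reference + np.array([10.0, 0.0])  # same posture, moved 10 pixels

print(location_loss(reference, shifted))  # about 10 pixels: the keypoints moved
print(posture_loss(reference, shifted))   # about 0: the relative positions are unchanged
```

The example shows why both steps are useful: a runner who appears at a different place in the frame scores badly on location but perfectly on posture.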

How to become a Marathon World Champion (using ML)?
World record holder Eliud Kipchoge vs amateur Christoph Mittendorf
Result: 70% Eliud (Exceeds expectations)
[Figure: four video frames annotated with MoveNet keypoints, 17 pixel-coordinate pairs per frame. Three of the four arrays on the slide are identical (A); one differs (B).]
Keypoints A:
[344.19147 181.68451] [332.36035 171.67693] [342.86694 171.40009] [292.6452 180.04279]
[329.6357 174.82632] [268.9414 238.52542] [341.15375 222.92154] [200.4838 278.67752]
[354.53955 294.47366] [248.2235 321.2749] [381.76706 274.0878] [272.6782 382.45728]
[306.644 387.27908] [350.14877 487.54935] [238.31728 503.66275] [306.85602 581.99554]
[146.82817 608.27313]
Keypoints B:
[347.23654 182.92076] [341.32898 172.06996] [345.10815 171.49492] [318.03317 171.599]
[328.86688 171.00182] [312.2267 226.91055] [302.70245 210.58597] [319.71695 279.5299]
[239.49994 233.21846] [332.75214 271.21823] [303.0601 264.2513] [287.06555 380.84973]
[312.01486 381.8607] [221.09059 500.76703] [382.81763 485.10123] [122.54193 586.4088]
[345.3879 596.1339]

How to become a Marathon World Champion (using ML)?
World record holder Eliud Kipchoge vs amateur Christoph Mittendorf
What are the best shoes?

How to become a Marathon World Champion (using ML)?
World record holder Eliud Kipchoge vs amateur Christoph Mittendorf
What can I improve immediately?
• 10 Euro - Flip Flops: ?
• 50 Euro - Running Shoe: 70%
• 160 Euro - Running Shoe: ?

The Eliud Model - Inferences
Type                       Efficiency   Benchmark
10 Euro - Flip Flops       49%          Meets expectations
50 Euro - Running Shoe     70%          Exceeds expectations
160 Euro - Running Shoe    76%          Superb

3. Diffusion Models: ML Shoe Improvement

From Research to Practice…
Encoder network -> 'context vector' -> decoder network
“Encoding means to convert data into a lower-dimensional format.”
“Decoding means to recreate the input data (or an association) from the encoded representation.”

How to become a Marathon World Champion (using ML)?
Goal: update shoes for Chris using GenAI
Starting point: 160 Euro - Running Shoe (carbon fibre)
Closing the gap to the world record (+24%)
Prompt: “Make the shoe lighter and faster so that I can fly like a bird!”

How to become a Marathon World Champion (using ML)?
Updated shoes for Chris
Starting point: 160 Euro - Running Shoe (carbon fibre)
Closing the gap to the world record (+24%)
Response: “Here is your improved shoe”
• Feathers for flying
• Breathable material
• Beautiful color combination

Getting to 120 km/h with the new ML feather shoes!
• Old (carbon fibre): slow and clumsy
• Updated shoes for Chris: fast and flying, on to the world record

Seek Inspiration - A Shoe for Every Occasion!
• Original - Base Model
• The Gold Medal
• The Best of Nike & Hoka
• MindCraft Gaming Shoe
• Christopher Street Day

Limitations: Model and Result Limitations
1. Human limitations:
   a. Human attributes are different
   b. Running speed might vary over time
   c. Stride length / step size is individual
2. Technical limitations:
   a. Camera angle(s) influence keypoints
   b. Lighting conditions decrease accuracy
   c. Frame rate consistency is required
3. Model & data limitations:
   a. Efficiency measurement is subjective
   b. Training data is limited (1x Elite) & (1x Nerd)
   c. Similarity function has model bias
   d. Only short video sequences were used
   e. Custom loss functions & custom weights for keypoints are biased
Missing evaluation of Bundeswehr boots?


Key learning
1. Let’s Challenge the State of Possible
2. Seek inspiration from your personal passion
3. Combine different technologies and iterate
4. If we are honest - Flip Flops suck for running!

4. Deep Learning Model: Transformers in a Nutshell

From Research to Practice…
Research paper (2017): The Architecture
Source: https://arxiv.org/pdf/1706.03762.pdf

From Research to Practice…
Explanation: Embeddings
Word embeddings (i.e., distributed representations, word vectors) are dense feature vector representations of words in a specific dimensional space, which are usually learned by an unsupervised algorithm when fed with large amounts of tokens.

From Research to Practice…
Function: Positional Encoder
PE is a vector that gives context based on the position of the word in the sentence (a unique representation). Since there are no recurrent networks that remember the order in which a sequence is fed into the model, we need to give every item in the sequence a position, because a sequence depends on the order of its elements. These positions are added to the embedded representation (an n-dimensional vector) of each item. This is done using positional encoding, which can be any function that assigns numerical position values to the different parts of the input sequence.
Source: https://arxiv.org/pdf/1706.03762.pdf
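One concrete choice of such a function is the sinusoidal encoding from the cited paper, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). A minimal NumPy sketch follows; the function name, the sequence length and the embedding size are chosen only for illustration, and the sketch assumes an even d_model.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding; returns an array of shape (seq_len, d_model)."""
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    positions = np.arange(seq_len)[:, None]    # (seq_len, 1)
    even_dims = np.arange(0, d_model, 2)[None, :]  # (1, d_model / 2), the values 2i
    angles = positions / np.power(10000.0, even_dims / d_model)
    encoding = np.zeros((seq_len, d_model))
    encoding[:, 0::2] = np.sin(angles)  # even dimensions
    encoding[:, 1::2] = np.cos(angles)  # odd dimensions
    return encoding

# Hypothetical token embeddings: 6 tokens, 8 dimensions each.
embeddings = np.random.default_rng(0).normal(size=(6, 8))
encoded_input = embeddings + sinusoidal_positional_encoding(6, 8)
print(encoded_input.shape)
```

The encoding is added element-wise to the embeddings, so every position in the sequence receives its own distinct offset vector.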

From Research to Practice…
Function: Positional Encoder
[Figure]
Source: https://arxiv.org/pdf/1706.03762.pdf

From Research to Practice…
Attention: Self-Attention
A self-attention module works by comparing every input in the sequence to every other input in the same sequence, including itself, and reweighting the embeddings of each input to include contextual relevance. At a high level, the self-attention block consists of three steps:
• Dot product similarity to find alignment scores
• Normalization of the scores to get the weights
• Reweighting of the original embeddings using the weights
Source: https://arxiv.org/pdf/1706.03762.pdf
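The three steps can be written out directly in NumPy. This sketch follows the simplified description on the slide: it uses the raw embeddings for all three roles and a softmax for the normalization. The attention in the cited paper additionally applies learned query, key and value projections and scales the scores by the square root of the key dimension; both are left out here. Function names and the random input are illustrative only.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def simple_self_attention(embeddings):
    """Simplified self-attention over a (seq_len, d_model) array of embeddings."""
    scores = embeddings @ embeddings.T  # step 1: dot product similarity (alignment scores)
    weights = softmax(scores)           # step 2: normalize the scores to get the weights
    contextual = weights @ embeddings   # step 3: reweight the original embeddings
    return contextual, weights

# Hypothetical input: a sequence of 4 tokens with 8-dimensional embeddings.
tokens = np.random.default_rng(0).normal(size=(4, 8))
contextual, weights = simple_self_attention(tokens)
print(contextual.shape, weights.sum(axis=1))
```

Each row of `weights` sums to 1, so every output embedding is a weighted average of all input embeddings in the sequence, including the token's own.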

From Research to Practice…
Research paper: The Architecture
“The Transformer is a magnificent neural network architecture because it is a general-purpose differentiable computer.” (Andrej Karpathy)
Advantages
• Transformers process the entire input all at once -> do not forget
• Transformers allow for more parallelization -> excellent training on our hardware (GPUs & TPUs)
• Transformers allow backpropagation + gradient descent -> allow great optimization
Source: https://arxiv.org/pdf/1706.03762.pdf

Google Cloud
PaLM 2: A true next-generation LLM
• Multilingual: understand, translate and generate across 100+ languages
• Elastic: natively multi-sized with optimized scaling architecture
• Scientific reasoning: next level of deep understanding, from mathematics to logic
• Advanced coding: from code generation & completion to advanced code translation
Source: https://ai.google/static/documents/palm2techreport.pdf
PaLM 2 LLM sizes
• Gecko: small model designed for specific use cases, incl. interactive applications. Can run offline, incl. on mobile devices
• Otter: medium-sized model designed for general-purpose use, including generating text, translating languages and answering questions
• Bison: large model designed for very complex tasks, incl. industry-specific use cases and advanced natural language processing
• Unicorn: full-sized model designed for the most demanding tasks, incl. generating large amounts of very complex text or translating between multiple languages

AI Systems: Safety & Ethics Overview

“AI systems can only benefit the world if we make them reliable and fair.” (Google)
Source: Google - https://blog.google/technology/ai/update-work-ai-responsible-innovation/

“AI systems can only benefit the world if we make them reliable and fair.”
• Fairness refers to the attempt to correct bias.*
• Reliability is the overall consistency of a measure: it produces similar results under consistent conditions.
*As is the case with many ethical concepts, definitions of fairness and bias are always controversial.

“A reliable and fair AI system, like all technology, needs to be built and used responsibly.” (Google)
Source: Google - https://blog.google/technology/ai/update-work-ai-responsible-innovation/

Google AI Principles (2018-today)
AI should:
1. be socially beneficial
2. avoid creating or reinforcing unfair bias
3. be built and tested for safety
4. be accountable to people
5. incorporate privacy design principles
6. uphold high standards of scientific excellence
7. be made available for uses that accord with these principles
Applications we will not pursue:
1. likely to cause overall harm
2. principal purpose to direct injury
3. surveillance violating internationally accepted norms
4. purpose contravenes international law and human rights

Cloud. To German standards. Thank you.
