Slide 1

Slide 1 text

© Fraunhofer IESE
Dr. Matthias Naab, Dr. Dominik Rost
05.02.2020, OOP 2020 | München
The Role of Architecture in the Age of AI and Autonomous Systems

Slide 2

Slide 2 text

Recommendation Systems

Slide 3

Slide 3 text

Automatic ID Verification

Slide 4

Slide 4 text

Traffic Prediction

Slide 5

Slide 5 text

Email Categorization & Spam Detection

Slide 6

Slide 6 text

Image tagging and filtering

Slide 7

Slide 7 text

Huge number of examples
AI as building block in applications

Slide 8

Slide 8 text

Dominik Rost: Software Architect
Matthias Naab: Software Architect
Data Science ¯\_(ツ)_/¯
Artificial Intelligence ¯\_(ツ)_/¯
Autonomous Systems ¯\_(ツ)_/¯

Slide 9

Slide 9 text

Information Sources
Company Websites, Success Stories, etc.
Architecture?
Technologies, Tech Tutorials, etc.

Slide 10

Slide 10 text

Goals of this Talk
We elaborate on the topic for software architects
◼ Create
  ◼ “Big Picture” of the architecture of ML-based systems
  ◼ Architecture language for AI-based systems
◼ Foundation for
  ◼ Structured thinking about and designing ML-based systems
  ◼ Talking to AI experts and data scientists
  ◼ Judging existing concepts and technologies and filling one's own toolbox

Slide 11

Slide 11 text

Approach
Top-Down: System Decomposition according to Architecture Views (Functions, Data, Deployment, …)
Bottom-Up: Explanation & Classification of Major Concepts (ML Concepts, Process Steps, …)

Slide 12

Slide 12 text

Example: Autonomous Driving
Information partially based on “Tesla Autonomy Day”, https://www.youtube.com/watch?v=-b041NXGPZ8

Slide 13

Slide 13 text

Brief Recap of Foundations

Slide 14

Slide 14 text

Some Terms
What is Artificial Intelligence? In 5 Minutes: https://www.youtube.com/watch?v=2ePf9rue1Ao

Slide 15

Slide 15 text

Source: AWS Summit Berlin 2019: ML Crash Course; 15 Algorithms, 60 Minutes, No Equations

Slide 16

Slide 16 text

Engineering Traditional Systems vs. ML-based Systems

Slide 17

Slide 17 text

Engineering Traditional Systems vs. ML-based Systems
Traditional: Input Data + Program → System → Output
ML-Based: Input Data + Expected Output → System → Model
Mix of Dimensions :(
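
The contrast on this slide can be illustrated with a minimal sketch. It assumes a hypothetical spam check based on a single feature (number of links in an email); the task and the names `traditional_program`, `learn_threshold` and `make_model` are illustrative assumptions, not part of the talk. In the traditional case a developer writes the rule. In the ML-based case the rule (here just a threshold) is derived from input data plus expected output.

```python
from typing import Callable


def traditional_program(link_count: int) -> bool:
    """Traditional: the developer hard-codes the rule."""
    return link_count > 3


def learn_threshold(inputs: list[int], expected: list[bool]) -> int:
    """ML-based (toy version): pick the threshold with the fewest errors."""
    best_threshold, best_errors = 0, len(inputs) + 1
    for candidate in sorted(set(inputs)):
        # Count the examples this candidate threshold would misclassify.
        errors = sum((x > candidate) != y for x, y in zip(inputs, expected))
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


def make_model(threshold: int) -> Callable[[int], bool]:
    """The 'model' is the learned rule, usable like a program."""
    return lambda link_count: link_count > threshold


# Input data plus expected output go in, a model comes out.
training_inputs = [0, 1, 2, 5, 7, 9]
training_expected = [False, False, False, True, True, True]
learned_threshold = learn_threshold(training_inputs, training_expected)
model = make_model(learned_threshold)
```

Both `traditional_program` and `model` are then called the same way at runtime; the difference lies in how the rule came into existence.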

Slide 18

Slide 18 text

Engineering Traditional Systems vs. ML-based Systems
Traditional
  DevTime: Requirements → Traditional Software Engineering (Methods & Tools) → Software System
  RunTime: Input Data → Software System → Output Data
ML-Based
  DevTime (ML-Training): Requirements, Training Data → SE for ML-Based Systems (Methods & Tools); Machine Learning, Data Science (Methods & Tools) → Software System based on ML (containing the ML Component)
  RunTime (ML-Inference): Input Data → Software System based on ML (containing the ML Component) → Output Data

Slide 19

Slide 19 text

Scope

Slide 20

Slide 20 text

Scope and Focus with respect to AI / ML
Traditional Software Engineering (Methods & Tools): used to develop the Software System
SE for ML-Based Systems (Methods & Tools): used to develop the Software System based on ML
Data Science (Methods & Tools): used to develop the ML Component
System with substantial size, complexity, quality requirements

Slide 21

Slide 21 text

Out of Scope
◼ Foundations of ML
◼ Algorithms in detail
◼ Topology design of NNs
◼ Detailed technologies in ML
◼ Data analytics with respective tools
◼ Detailed architecture of autonomous driving systems

Slide 22

Slide 22 text

Functional & Data Decomposition

Slide 23

Slide 23 text

Data Flow through Software System based on ML
Software System based on ML: Input Data → Data Pre-Processing → ML Component → Data Post-Processing → Output Data
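
The data flow above can be sketched as three composed functions. This is a minimal illustration under stated assumptions: the sensor value range (0 to 255), the fixed weights standing in for a trained model, and the output labels `"brake"` / `"continue"` are all hypothetical and not taken from the talk.

```python
from typing import Sequence


def preprocess(raw: Sequence[float]) -> list[float]:
    """Data pre-processing: scale raw readings (assumed range 0..255) to 0..1."""
    return [min(max(value / 255.0, 0.0), 1.0) for value in raw]


def ml_component(features: Sequence[float]) -> float:
    """Stand-in for the ML component: a fixed weighted sum used as a score."""
    weights = [0.5, 0.3, 0.2]  # assumed values; a real model would be trained
    return sum(w * x for w, x in zip(weights, features))


def postprocess(score: float) -> str:
    """Data post-processing: turn the raw score into a domain-level output."""
    return "brake" if score >= 0.5 else "continue"


def run_system(raw: Sequence[float]) -> str:
    """Software system based on ML: input → pre → ML component → post → output."""
    return postprocess(ml_component(preprocess(raw)))
```

Keeping the three stages separate means the ML component can be retrained or replaced without touching the surrounding system code.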

Slide 24

Slide 24 text

Multiple ML-Components in a System
Software System based on ML: Input Data → ML C1, ML C2, ML C3, ML C4 → Output Data
Architecture Decision: How many ML-Components and which ones?

Slide 25

Slide 25 text

Example Autonomous Driving
Alternative Functions
Alternative A: Sensors, Cameras, … → Data Pre-Processing → “Driving” → Data Post-Processing → Driving Actuators
Alternative B: Sensors, Cameras, … → Driving Area Detection, Obstacle Detection, Road Sign Detection → “Steering” → Driving Actuators
Alternative C: Sensors, Cameras, … → Road Marking Detection, … → Driving Actuators

Slide 26

Slide 26 text

Logical Structure of an ML Component (Generalized, Neural Network)
ML Component
◼ Config Data: Weights (Trained), Topology (Layers, Nodes, Relationships), Hyperparameters
◼ Code / Logic: Basic Neural Network Logic, Learning / Training Logic
◼ Data: Activations, State (Optional)
◼ ML Model (fixed in inference)
Inputs and outputs: Training Data, Input Data, Config Data, Output Data
The ML Component can be treated as a black box, architecturally
The ML Component is the unit of training
State: e.g. in Recurrent Neural Networks with feedback relationships
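
The logical structure can be sketched as two small data classes. This is an illustrative assumption about how the parts relate, not the authors' design: the class names `MLModel` and `MLComponent`, the plain nested lists for weights, and the ReLU activation are chosen only to keep the example self-contained. Training logic is left out.

```python
from dataclasses import dataclass, field


@dataclass
class MLModel:
    """Config data: fixed during inference, changed only by training."""
    topology: list[int]                # nodes per layer, e.g. [2, 2, 1]
    weights: list[list[list[float]]]   # one weight matrix per pair of adjacent layers
    hyperparameters: dict[str, float] = field(default_factory=dict)


@dataclass
class MLComponent:
    """Black box to the rest of the system: input data in, output data out."""
    model: MLModel
    activations: list[list[float]] = field(default_factory=list)  # data of the last run

    def infer(self, inputs: list[float]) -> list[float]:
        """Basic neural network logic: weighted sums followed by ReLU."""
        current = list(inputs)
        self.activations = [current]
        for matrix in self.model.weights:
            current = [
                max(0.0, sum(w * x for w, x in zip(row, current)))
                for row in matrix
            ]
            self.activations.append(current)
        return current


component = MLComponent(
    MLModel(
        topology=[2, 2, 1],
        weights=[[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]],
        hyperparameters={"learning_rate": 0.1},
    )
)
```

Callers only use `infer`; topology, weights and hyperparameters stay internal, which matches the black-box view on the slide.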

Slide 27

Slide 27 text

ML Component Example: Topology of a Convolutional Neural Network (CNN)
Topology (Layers, Nodes, Relationships)
Decisions about the topology of the Neural Network are mainly made by data scientists. Architects need a basic understanding to judge external implications.
https://www.easy-tensorflow.com/tf-tutorials/convolutional-neural-nets-cnns

Slide 28

Slide 28 text

Learning

Slide 29

Slide 29 text

Differences between Types of Learning / Training
Supervised Learning: Training Data + Labels → ML Component
Unsupervised Learning: Training Data → ML Component
Reinforcement Learning (active): ML Component → Action → Environment (e.g. simulated, real) → Observation, Reward → ML Component

Slide 30

Slide 30 text

1 Learning / Training Step
ML Component
◼ Config Data: Weights (Trained), Topology (Layers, Nodes, Relationships), Hyperparameters
◼ Code / Logic: Basic Neural Network Logic, Learning / Training Logic
◼ Data: Activations, State (Optional)
◼ ML Model (fixed in inference)
1) Feed selected training data into the NN
2) Training logic analyses the output data
3) Adjust … (by learning logic or data scientist)
 - Weights
 - Add / delete neurons
 - Add / delete relationships
 - Functions of neurons
 - Hyperparameters
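
The three numbered steps can be sketched for the simplest possible case. The sketch assumes a single linear neuron and a squared-error gradient update, which is one common choice and not necessarily what the talk has in mind. It only adjusts weights; changes to topology, neuron functions or hyperparameters are out of scope here. The function name `training_step` is hypothetical.

```python
def training_step(
    weights: list[float],
    inputs: list[float],
    target: float,
    learning_rate: float,
) -> tuple[list[float], float]:
    """Run one learning step and return (new_weights, error_before_update)."""
    # 1) Feed training data into the (single-neuron) network.
    output = sum(w * x for w, x in zip(weights, inputs))
    # 2) Training logic analyses the output data.
    error = output - target
    # 3) Adjust the weights against the gradient of 0.5 * error**2.
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
    return new_weights, error


weights = [0.0, 0.0]
errors = []
for _ in range(3):
    weights, step_error = training_step(weights, [1.0, 2.0], 1.0, 0.1)
    errors.append(abs(step_error))
```

With these example values the absolute error shrinks from one step to the next, which is the behaviour a training loop relies on.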

Slide 31

Slide 31 text

Overall Lifecycles / Workflows and Data Involved
ML-Training (DevTime): Data Collection → Data Ingestion → Data Preparation → Model Selection & Training & Evaluation → Model Persistence
 - Large amounts of data
 - Computing-intensive training
 - Exploratory approach
Model Persistence → Model Deployment → Inference
ML-Inference (RunTime): Data Ingestion → Data Preparation → Inference
 - Concrete input data
 - Inference is comparatively cheap
 - “Just computation”
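
The split between the two lifecycles can be sketched end to end. This is a toy example under stated assumptions: the "model" is a single slope fitted by least squares through the origin, persistence is a JSON file in a temporary directory, and all function names are hypothetical. Real systems would use far larger data and dedicated model formats.

```python
import json
import tempfile
from pathlib import Path


def train(samples: list[tuple[float, float]]) -> dict[str, float]:
    """DevTime: fit y = slope * x by least squares through the origin."""
    numerator = sum(x * y for x, y in samples)
    denominator = sum(x * x for x, _ in samples)
    return {"slope": numerator / denominator}


def persist(model: dict[str, float], path: Path) -> None:
    """DevTime: model persistence."""
    path.write_text(json.dumps(model))


def deploy(path: Path) -> dict[str, float]:
    """Model deployment: load the persisted model into the runtime system."""
    return json.loads(path.read_text())


def infer(model: dict[str, float], x: float) -> float:
    """RunTime: inference is 'just computation' on concrete input data."""
    return model["slope"] * x


def run_lifecycle() -> float:
    training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    with tempfile.TemporaryDirectory() as directory:
        model_file = Path(directory) / "model.json"
        persist(train(training_data), model_file)
        deployed_model = deploy(model_file)
    return infer(deployed_model, 5.0)
```

The persisted file is the only artifact that crosses from DevTime to RunTime, which is why model persistence and deployment appear as explicit steps.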

Slide 32

Slide 32 text

Feedback Data and Optimization (Batch Learning)
ML-Training (DevTime): Data Collection → Data Ingestion → Data Preparation → Model Selection & Training & Evaluation → Model Persistence
ML-Inference (RunTime): Data Ingestion → Data Preparation → Inference
RunTime → DevTime: New Training Data from Live Operation
DevTime → RunTime: Deploy optimized model (Model Deployment)

Slide 33

Slide 33 text

Example Autonomous Driving
Tesla: Data collection from the current fleet, driving in the real world, not autonomously yet
New Training Data from Live Operation
 - Camera images
 - Driving situations
 - Data labelled from driver behaviour / steering
 - Data labelled from explicit user feedback
 - Data labelled from additional sensors (e.g. radar)
Central Data Collection and Learning: Data Preparation → Model Selection & Training & Evaluation → Model Persistence
Partially human pre-processed data
Instruct cars which data to collect
Deploy optimized driving functions model

Slide 34

Slide 34 text

Example Autonomous Driving
Software System based on ML: Data Pre-Processing → “Driving” → Data Post-Processing
Central Data Collection and Learning: Data Preparation → Model Selection & Training & Evaluation → Model Persistence
▪ Architects need overall system perspective
▪ Strong integration between runtime system (cars) and development time (learning and improvement)
▪ Continuous improvement and deployment
▪ Learning from the pre-phase of autonomous driving and continuously during operation

Slide 35

Slide 35 text

Feedback Data and Optimization (Online Learning)
(DevTime): … Model Selection & Training → Model Persistence → Model Deployment
ML-Inference (RunTime): Data Ingestion → Data Preparation → Inference
ML-Training / Retraining (RunTime): New Training Data from Live Operation → Model Training / Optimization → Model Persistence → Model Deployment
Learning can happen at defined points in time (typically not after every inference)
The data science work is still done at DevTime: the model is selected and trained
At RunTime, only optimization of the model takes place
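
The idea of learning at defined points in time can be sketched with a small buffer. This is a minimal illustration, assuming a one-parameter model delivered from DevTime and an averaged gradient update once enough feedback has accumulated; the class name `OnlineLearner` and its parameters are hypothetical.

```python
class OnlineLearner:
    """Runtime component that infers continuously but learns only in batches."""

    def __init__(self, slope: float, update_every: int, learning_rate: float):
        self.slope = slope                # model selected and trained at DevTime
        self.update_every = update_every  # the "defined point in time" for learning
        self.learning_rate = learning_rate
        self.buffer: list[tuple[float, float]] = []

    def infer(self, x: float) -> float:
        """Inference never changes the model."""
        return self.slope * x

    def feedback(self, x: float, observed_y: float) -> None:
        """Collect new training data from live operation."""
        self.buffer.append((x, observed_y))
        if len(self.buffer) >= self.update_every:
            self._optimize()

    def _optimize(self) -> None:
        """Runtime optimization: one averaged gradient step, then clear the buffer."""
        gradient = sum((self.slope * x - y) * x for x, y in self.buffer) / len(self.buffer)
        self.slope -= self.learning_rate * gradient
        self.buffer.clear()


learner = OnlineLearner(slope=1.0, update_every=2, learning_rate=0.1)
learner.feedback(1.0, 2.0)
slope_after_first_feedback = learner.slope
learner.feedback(2.0, 4.0)
slope_after_second_feedback = learner.slope
```

The model structure is never changed at runtime in this sketch; only the existing parameter is optimized, matching the division of work described on the slide.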

Slide 36

Slide 36 text

Data

Slide 37

Slide 37 text

Data Aspects
◼ Large amounts of data for training needed
  ◼ Amount depends on application area, available data and on ML models / algorithms
◼ Very different types and formats of data
  ◼ Text
  ◼ Images
  ◼ Video
  ◼ Audio
  ◼ …
  ◼ → require very different treatment
  ◼ → result in very different computational load

Slide 38

Slide 38 text

Example Autonomous Driving
Data Aspects in Autonomous Driving
◼ Data needs
  ◼ Large data
  ◼ Varied data
  ◼ Real data
◼ Collect data from the fleet
◼ Create simulation data
◼ Cover edge and unusual cases
Image: https://www.youtube.com/watch?v=-b041NXGPZ8

Slide 39

Slide 39 text

Deployment

Slide 40

Slide 40 text

Design Alternatives: Deployment Options
(DevTime): … Model Selection & Training → Model Persistence → Model Deployment
  Runs on: Training HW / Powerful Server hosting the ML Component
ML-Inference (RunTime): Data Ingestion → Data Preparation → Inference
  Design alternatives: ML Component on the Server, or ML Component on the Client
ML-Training (RunTime): Model Training / Optimization → Model Persistence → Model Deployment
  Design alternatives: ML Component on the Server, or ML Component on the Client
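
One way to keep the client/server decision open is to hide the ML component behind a common interface. The following is a sketch under stated assumptions: the names `InferenceBackend`, `ClientSideBackend` and `ServerSideBackend` are hypothetical, and the "remote call" is simulated by a plain callable instead of a real network request.

```python
from typing import Callable, Protocol


class InferenceBackend(Protocol):
    """Common interface, independent of where the ML component is deployed."""

    def predict(self, features: list[float]) -> float: ...


class ClientSideBackend:
    """ML component deployed on the client: inference runs locally."""

    def __init__(self, weights: list[float]):
        self._weights = weights

    def predict(self, features: list[float]) -> float:
        return sum(w * x for w, x in zip(self._weights, features))


class ServerSideBackend:
    """ML component deployed on the server: the client only forwards the request."""

    def __init__(self, send: Callable[[list[float]], float]):
        self._send = send  # stand-in for a remote call, e.g. over HTTP

    def predict(self, features: list[float]) -> float:
        return self._send(features)


def classify(backend: InferenceBackend, features: list[float], threshold: float = 0.5) -> bool:
    """Application code that does not care about the deployment alternative."""
    return backend.predict(features) >= threshold


server_model = ClientSideBackend([0.5, 0.5])  # pretend this instance lives on a server
local_backend = ClientSideBackend([0.5, 0.5])
remote_backend = ServerSideBackend(server_model.predict)
```

Which alternative fits depends on factors such as latency, connectivity, client hardware and how often the model is updated; the interface only keeps that choice reversible.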

Slide 41

Slide 41 text

Example Autonomous Driving
Multiple Instances of Systems (Cars)
◼ Learning strategies
  ◼ Online learning in each car?
  ◼ Batch learning only in a central system?
◼ Can cars communicate?
◼ Compare:
  ◼ Learning of typing recognition on a mobile phone
ML-Training (DevTime): Training HW / Powerful Server hosting the ML Component
ML-Inference (RunTime): multiple instances of the Software System based on ML, each with its own ML Component
RunTime → DevTime: New Training Data from Live Operation
DevTime → RunTime: Deploy optimized model

Slide 42

Slide 42 text

Technologies

Slide 43

Slide 43 text

Available Technologies as Services / Libraries for ML
Different Levels of Reuse
ML Component
◼ Config Data: Weights (Trained), Topology (Layers, Nodes, Relationships), Hyperparameters
◼ Code / Logic: Basic Neural Network Logic, Learning / Training Logic
◼ Data: Activations, State (Optional)
◼ ML Model (fixed in inference)
Levels of reuse
◼ Fully trained model, immutable (as API or library) [e.g. Service for image tagging]
◼ Fully trained model, retrainable (as API or library) [e.g. Service for image tagging]
◼ Predefined topology (as API or library) [e.g. predefined CNNs]
◼ Basic ML model (as library) [e.g. general NN logic]
Scales shown alongside the levels: Degree of freedom, Knowledge needed, Effort needed

Slide 44

Slide 44 text

Microsoft AI & ML Technologies
https://www.credera.com/wp-content/uploads/2018/04/The-Microsoft-AI-platform.png
◼ Fully trained model, immutable (as API or library) [e.g. Service for image tagging]
◼ Fully trained model, retrainable (as API or library) [e.g. Service for image tagging]
◼ Predefined topology (as API or library) [e.g. predefined CNNs]
◼ Basic ML model (as library) [e.g. general NN logic]

Slide 45

Slide 45 text

AWS AI & ML Technologies
https://www.quantiphi.com/machine_learning/

Slide 46

Slide 46 text

Deployment / Technology Options
◼ Types of Hardware
  ◼ CPU
  ◼ GPU
  ◼ NPU

Slide 47

Slide 47 text

Example Autonomous Driving
Tesla: Own “Full Self-Driving” Computer

Slide 48

Slide 48 text

Relationship to Quality Attributes

Slide 49

Slide 49 text

Quality Attributes in ML-based Systems (1/2)
◼ ML as a technology inherently aims more at realizing functionality than at realizing quality attributes (in contrast to e.g. communication middleware, blockchain, …)
◼ However, ML can be used to support achieving some quality attributes (e.g. certain aspects of security, by detecting attack patterns with ML)
◼ The usage of ML has significant impact on quality attributes, and thus needs architectural treatment
◼ One key aspect: missing comprehensibility / explainability of what is happening in the ML component
  ◼ Safety, reliability: conflicting with safety standards, needs counter-measures
  ◼ UX: Explaining to the user what happens / integrating the user into the overall flow

Slide 50

Slide 50 text

Quality Attributes in ML-based Systems (2/2)
◼ Fulfil the respective quality attributes of the system, respecting the overall “scale” of the system
  ◼ Performance (latency, throughput), scalability, …
  ◼ Considering the runtime system, but also the devtime / learning system
◼ Completely different settings for quality attributes in different systems
  ◼ Playing “Go” against the world champion
    ◼ Massive power on a single complex task
  ◼ Calculation of a model for all product recommendations of Amazon
    ◼ Massive power on many smaller tasks
◼ Provide an adequate execution environment
  ◼ Sufficient computing power
  ◼ Sufficient storage capacity
  ◼ Provide the right data with adequate frequency and latency
◼ Architect has to know the requirements / implications of the ML algorithm / model

Slide 51

Slide 51 text

Conclusion: What does it mean for me? What can I do?
◼ Keep an eye on the architectural big picture, even if there is ML in the system ;-)
◼ Understand the very nature of ML-based systems
◼ Learn from existing systems and their solution approaches
◼ Remember the essentials of software architecture
  ◼ Achieving quality attributes
  ◼ Dealing with uncertainty
  ◼ Organizing and distributing work
◼ Fill your toolbox with knowledge about patterns and technologies in the ML area
◼ Start working with data scientists / data engineers and establish a common language

Slide 52

Slide 52 text

Dr. Matthias Naab, Dr. Dominik Rost
05.02.2020, OOP 2020 | München
The Role of Architecture in the Age of AI and Autonomous Systems