Slide 1

Slide 1 text

Machine Learning meets Software Performance: A journey from optimization to transfer learning all the way to counterfactual causal inference. ASE 2020 Tutorial, Friday 25 Sep. Pooyan Jamshidi, UofSC. (Images: Europa Lander, NASA; Europa Clipper, NASA)

Slide 2

Slide 2 text

Artificial Intelligence and Systems Laboratory (AISys Lab) Machine Learning Computer Systems Autonomy Learning-enabled Autonomous Systems https://pooyanjamshidi.github.io/AISys/ 2

Slide 3

Slide 3 text

Research Directions at AISys 3 Theory:
 - Transfer Learning
 - Causal Invariances
 - Structure Learning
 - Concept Learning
 - Physics-Informed
 
 Applications:
 - Systems
 - Autonomy
 - Robotics

 (Diagram: spectrum from well-known physics / big data to limited known physics / small data; Causal AI.)

Slide 4

Slide 4 text

Team effort 4 Rahul Krishna Columbia Shahriar Iqbal UofSC M. A. Javidian Purdue Baishakhi Ray Columbia Christian Kästner CMU Norbert Siegmund Leipzig Miguel Velez CMU Sven Apel Saarland Lars Kotthoff Wyoming Marco Valtorta UofSC Vivek Nair Facebook Tim Menzies NCSU

Slide 5

Slide 5 text

5 Goal: Enable developers/users to find the right quality tradeoff

Slide 6

Slide 6 text

Today’s most popular systems are configurable 6

Slide 7

Slide 7 text

7

Slide 8

Slide 8 text

Empirical observations confirm that systems are becoming increasingly configurable 8 (plots: number of configuration parameters vs. release time for Apache and Hadoop MapReduce/HDFS, among others, growing steadily across releases) [Tianyin Xu, et al., “Too Many Knobs…”, FSE’15]

Slide 9

Slide 9 text

Empirical observations confirm that systems are becoming increasingly configurable 9 (plots from the paper: number of configuration parameters vs. release time for Storage-A, MySQL, Apache, and Hadoop MapReduce/HDFS, each growing into the hundreds) [Tianyin Xu, et al., “Too Many Knobs…”, FSE’15]

Slide 10

Slide 10 text

Configurations determine the performance behavior 10

void Parrot_setenv(... name, ... value) {
#ifdef PARROT_HAS_SETENV
    my_setenv(name, value, 1);
#else
    int name_len = strlen(name);
    int val_len = strlen(value);
    char *envs = glob_env;
    if (envs == NULL) {
        return;
    }
    strcpy(envs, name);
    strcpy(envs + name_len, "=");
    strcpy(envs + name_len + 1, value);
    putenv(envs);
#endif
}

#ifdef LINUX
extern int Parrot_signbit(double x) { ... }
#endif

(Compile-time options such as PARROT_HAS_SETENV and LINUX affect Speed and Energy.)

Slide 11

Slide 11 text

11 How do we understand the performance behavior of real-world, highly configurable systems that scale well… and enable developers/users to reason about qualities (performance, energy) and make tradeoffs?

Slide 12

Slide 12 text

Scope: Configuration across stack 12 CPU Memory Controller GPU Lib API Clients Devices Network Task Scheduler Device Drivers File System Compilers Memory Manager Process Manager Frontend Application Layer OS/Kernel Layer Hardware Layer Deployment SoC Generic hardware Production Servers

Slide 13

Slide 13 text

Composed Systems (Single-node) • Online processing of sensory data • Neural network models • Homogeneous tasks 13

Slide 14

Slide 14 text

Composed Systems (Multi-node) • Online processing of sensory data • Graph-type models • Heterogeneous tasks 14

Slide 15

Slide 15 text

Composed Systems (IoT) • They are integrated with cloud services; we do not have access to those systems, but we can configure them to some extent. 15

Slide 16

Slide 16 text

Distributed (big data) • The components may be assigned to different hardware nodes without direct control by the users • These are typically configurable, but finding the best configuration needs expertise 16

Slide 17

Slide 17 text

Cyber-physical systems • We may not have direct access to the hardware, so remote debugging is needed. 17

Slide 18

Slide 18 text

Cloud, Multi-cloud Systems; Event-driven systems • Code migrates from one hardware platform to another (lots of interactions) 18

Slide 19

Slide 19 text

Outline 19 Case Study Transfer Learning Theory Building Guided Sampling Current Research [SEAMS’17] [ASE’17] [FSE’18]

Slide 20

Slide 20 text

SocialSensor • Identifying trending topics • Identifying user-defined topics • Social media search 20

Slide 21

Slide 21 text

SocialSensor 21 Content Analysis Orchestrator Crawling Search and Integration Tweets: [5k-20k/min] Every 10 min: [100k tweets] Tweets: [10M] Fetch Store Push Store Crawled items Fetch Internet

Slide 22

Slide 22 text

Challenges 22 Content Analysis Orchestrator Crawling Search and Integration Tweets: [5k-20k/min] Every 10 min: [100k tweets] Tweets: [10M] Fetch Store Push Store Crawled items Fetch Internet 100X 10X Real time

Slide 23

Slide 23 text

23 How can we gain better performance without using more resources?

Slide 24

Slide 24 text

24 Let’s try out different system configurations!

Slide 25

Slide 25 text

Opportunity: The data processing engines in the pipeline were all configurable 25 (> 100, > 100, and > 100 options each; about 2^300 configurations in total)

Slide 26

Slide 26 text

26 More combinations than the estimated number of atoms in the universe

Slide 27

Slide 27 text

The default configuration was bad, and so was the expert’s 27 (plot: throughput (ops/sec) vs. average write latency (µs); markers for Default, Recommended by an expert, and Optimal Configuration; higher throughput and lower latency are better)

Slide 28

Slide 28 text

The default configuration was bad, and so was the expert’s 28 (plot: throughput (ops/sec, ×10^4) vs. latency (ms); markers for Default, Recommended by an expert, and Optimal Configuration; higher throughput and lower latency are better)

Slide 29

Slide 29 text

The default configuration is typically bad, and the optimal configuration is noticeably better than the median 29 (plot: throughput (ops/sec) vs. average write latency (µs); markers for Default Configuration and Optimal Configuration; higher throughput and lower latency are better) • Default is bad • 2X-10X faster than worst • Noticeably faster than median

Slide 30

Slide 30 text

100X more users, cloud resources reduced by 20%, outperformed expert recommendation

Slide 31

Slide 31 text

Identifying the root cause of performance faults is difficult
● Code was transplanted from TX1 to TX2
● TX2 is more powerful, but the software was 2x slower than on TX1
● Three misconfigurations:
○ Wrong compilation flags for compiling CUDA (didn't use the 'dynamic' flag)
○ Wrong CPU/GPU modes (didn't use TX2-optimized cores)
○ Wrong fan mode (didn't change it to handle thermal throttling)
Fig 1. Performance fault on NVIDIA TX2 https://forums.developer.nvidia.com/t/50477 31

Slide 32

Slide 32 text

Fixing performance faults is difficult
● These options were not in the default settings
● Took 1 month to fix in the end...
● We need to do this better
Fig 1. Performance fault on NVIDIA TX2 https://forums.developer.nvidia.com/t/50477 32

Slide 33

Slide 33 text

Identifying the root cause of performance faults is difficult 33

Slide 34

Slide 34 text

Identifying the root cause of performance faults is difficult 34

Slide 35

Slide 35 text

Identifying the root cause of performance faults is difficult 35

Slide 36

Slide 36 text

Identifying the root cause of performance faults is difficult 36

Slide 37

Slide 37 text

Performance distributions are multi-modal and have long tails
 • Certain configurations can cause performance to take abnormally large values
 • Faulty configurations take the tail values (worse than the 99.99th percentile)
 • Certain configurations can cause faults on multiple performance objectives
 37

Slide 38

Slide 38 text

Outline 38 Case Study Transfer Learning Theory Building Guided Sampling Current Research

Slide 39

Slide 39 text

Setting the scene 39. Configuration space: ℂ = O1 × O2 × ⋯ × O19 × O20, with options such as dead-code removal, constant folding, loop unrolling, and function inlining. A configuration is a member of this space, e.g., c1 = 0 × 0 × ⋯ × 0 × 1, c1 ∈ ℂ. A compiler (e.g., SaC, LLVM) compiles the program, the compiled code is deployed as an instrumented binary, and the binary is configured on the hardware. Non-functional, measurable/quantifiable aspects: compile time fc(c1) = 11.1 ms, execution time fe(c1) = 110.3 ms, energy fen(c1) = 100 mWh.
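A minimal sketch of this formalization in Python (binary options are assumed, as on the slide; the measured values attached to c1 are the slide's numbers):

    from itertools import product

    d = 20                                 # binary options O1..O20
    space = product([0, 1], repeat=d)      # lazily enumerates all 2^20 configurations
    c1 = next(space)                       # one concrete configuration, c1 in C
    # each configuration maps to measurable non-functional properties:
    measures_c1 = {"compile_time_ms": 11.1, "exec_time_ms": 110.3, "energy_mWh": 100}
    print(sum(c1), measures_c1)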

Slide 40

Slide 40 text

A typical approach for understanding the performance behavior is sensitivity analysis 40: sample configurations c1, c2, c3, …, cn from O1 × O2 × ⋯ × O19 × O20, measure y1 = f(c1), y2 = f(c2), …, yn = f(cn), and learn a model f̂ ∼ f(·) from this training/sample set.
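A hedged sketch of this sample-and-learn loop (the measure function below is a synthetic stand-in for a real benchmark run; it mirrors the interaction model shown later in the deck):

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def measure(c):
        # synthetic stand-in for running the system under configuration c
        return 1.2 + 3*c[0] + 5*c[2] + 0.9*c[6] + 0.8*c[2]*c[6] + 4*c[0]*c[2]*c[6]

    rng = np.random.default_rng(0)
    d, n = 20, 100
    C = rng.integers(0, 2, size=(n, d))        # sampled configurations c_1..c_n
    y = np.array([measure(c) for c in C])      # y_i = f(c_i)
    f_hat = DecisionTreeRegressor().fit(C, y)  # learn f_hat ~ f(.)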

Slide 41

Slide 41 text

The performance model can be any appropriate form of black-box model 41 (same setup: configurations c1, …, cn, measurements yi = f(ci), and a training/sample set from which f̂ ∼ f(·) is learned)

Slide 42

Slide 42 text

Evaluating a performance model 42: learn f̂ ∼ f(·) from the training/sample set as before, then evaluate its accuracy: APE(f̂, f) = |f̂(c) − f(c)| / f(c) × 100

Slide 43

Slide 43 text

A performance model contains useful information about influential options and interactions 43: f : ℂ → ℝ, e.g., f(·) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
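One way to obtain such an interpretable model from measurements is to expand the binary options into interaction terms and fit a sparse linear model; the sketch below is an assumption, not the tutorial's exact method, and it approximately recovers the influential terms of the f(·) above:

    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(1)
    C = rng.integers(0, 2, size=(200, 8))                    # options o1..o8
    y = (1.2 + 3*C[:, 0] + 5*C[:, 2] + 0.9*C[:, 6]
         + 0.8*C[:, 2]*C[:, 6] + 4*C[:, 0]*C[:, 2]*C[:, 6])  # the f(.) above

    poly = PolynomialFeatures(degree=3, interaction_only=True, include_bias=False)
    X = poly.fit_transform(C)
    model = Lasso(alpha=0.01).fit(X, y)
    for name, w in zip(poly.get_feature_names_out(), model.coef_):
        if abs(w) > 0.1:     # keep only the influential options/interactions
            print(name, round(w, 2))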

Slide 44

Slide 44 text

Performance models can then be used to reason about qualities 44: for the Parrot_setenv code shown earlier, with one compile-time option as o1, execution time (s) is f(·) = 5 + 3 × o1, hence f(o1 := 0) = 5 and f(o1 := 1) = 8

Slide 45

Slide 45 text

Insight: Performance measurements of the real system are “similar” to the ones from the simulators 45 (diagram: measure configurations in the simulator (Gazebo), collect performance data). So why not reuse these data, instead of measuring on the real robot?

Slide 46

Slide 46 text

We developed methods to make learning cheaper via transfer learning 46 (diagram: Source (given) and Target (learn); extract transferable knowledge from the source data/model and reuse it when learning the target model). Goal: Gain strength by transferring information across environments

Slide 47

Slide 47 text

What is the advantage of transfer learning?
• During learning you may need thousands of rotten and fresh potatoes and hours of training to learn.
• But now, using the same knowledge of rotten features, you can identify rotten tomatoes with fewer samples and less training time.
• You may have learned during daytime with enough light and exposure; but your present tomato-identification job is at night.
• You may have learned sitting very close, just beside the box of potatoes; but now, for tomato identification, you are on the other side of the glass.
47

Slide 48

Slide 48 text

Our transfer learning solution 48 (diagram: measure configurations on the TurtleBot and in the simulator (Gazebo), reuse the simulator data, and learn f(o1, o2) = 5 + 3o1 + 15o2 − 7o1 × o2) [P. Jamshidi, et al., “Transfer learning for improving model predictions ….”, SEAMS’17]

Slide 49

Slide 49 text

Gaussian processes for performance modeling 49 (plots at t = n and t = n + 1 over input x and output f(x): observations, posterior mean, uncertainty bands, and a new observation)

Slide 50

Slide 50 text

Gaussian Processes enable reasoning about performance 50. Step 1: Fit a GP to the data seen so far. Step 2: Explore the model for regions of most variance. Step 3: Sample that region. Step 4: Repeat. (Figure: empirical model over the configuration space, experiments, and the selection criteria of the sequential design.)
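A minimal sketch of this sequential design loop with a GP surrogate (the one-dimensional f below is a toy stand-in for a real measurement, and the Matern kernel is an assumption):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    def f(x):                                    # toy stand-in for a measurement
        return np.sin(3 * x) + 0.1 * x

    X_pool = np.linspace(0, 5, 200).reshape(-1, 1)     # candidate configurations
    X, y = X_pool[[0, -1]], f(X_pool[[0, -1]]).ravel()

    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(10):
        gp.fit(X, y)                                   # Step 1: fit GP to data so far
        _, std = gp.predict(X_pool, return_std=True)   # Step 2: region of most variance
        x_next = X_pool[np.argmax(std)]                # Step 3: sample that region
        X = np.vstack([X, x_next])                     # Step 4: repeat
        y = np.append(y, f(x_next))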

Slide 51

Slide 51 text

The intuition behind our transfer learning approach 51. Intuition: observations on the source(s) can affect predictions on the target. Example: learning the game of chess makes learning the game of Go a lot easier! To exploit the relationship between the source and target functions, g, f, using observations Ds, Dt to build the predictive model f̂, we define the kernel k(f, g, x, x′) = kt(f, g) × kxx(x, x′), where the kernel kt represents the correlation between the source and target function, while kxx is the covariance over inputs. Typically, kxx is parameterized and its parameters are learnt by maximizing the marginal likelihood of the model given the observations from source and target.
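A hedged sketch of that factorized kernel (the squared-exponential k_xx and the fixed task matrix below are assumptions; in the approach the kernel parameters are learnt from the source and target observations):

    import numpy as np

    def k_xx(x, x2, ls=1.0):
        # covariance over configuration inputs (squared exponential, assumed)
        return np.exp(-0.5 * np.sum((x - x2) ** 2) / ls**2)

    def k(ti, tj, x, x2, Kt):
        # k(f, g, x, x') = k_t(f, g) * k_xx(x, x')
        return Kt[ti, tj] * k_xx(x, x2)

    Kt = np.array([[1.0, 0.8],
                   [0.8, 1.0]])   # illustrative source/target correlation
    print(k(0, 1, np.zeros(3), np.ones(3), Kt))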

Slide 52

Slide 52 text

CoBot experiment: DARPA BRASS 52 (plot: localization error [m] vs. CPU utilization [%], with an energy constraint, a safety constraint, the Pareto front, and a sweet spot; options no_of_particles=x, no_of_refinement=y; lower error and lower utilization are better)

Slide 53

Slide 53 text

CoBot experiment 53 (CPU [%] heatmaps over the configuration space: source (given), target (ground truth, 6 months later), prediction with 4 samples, and prediction with transfer learning)

Slide 54

Slide 54 text

Transfer Learning for Improving Model Predictions in Highly Configurable Software. Pooyan Jamshidi, Miguel Velez, Christian Kästner (Carnegie Mellon University), Norbert Siegmund (Bauhaus-University Weimar), Prasad Kawthekar (Stanford University). Fig. 1: Transfer learning for performance model learning (measure the simulator (source) and the robot (target), learn the model with transfer learning, use it for adaptation). Details: [SEAMS ’17] 54

Slide 55

Slide 55 text

Outline 55 Case Study Transfer Learning Theory Building Guided Sampling Current Research

Slide 56

Slide 56 text

Looking further: When transfer learning goes wrong 56 (boxplots of absolute percentage error [%] for sources s, s1–s6)

Source: s / s1 / s2 / s3 / s4 / s5 / s6
noise-level: 0 / 5 / 10 / 15 / 20 / 25 / 30
corr. coeff.: 0.98 / 0.95 / 0.89 / 0.75 / 0.54 / 0.34 / 0.19
µ(pe): 15.34 / 14.14 / 17.09 / 18.71 / 33.06 / 40.93 / 46.75

It worked! It didn’t! Insight: Predictions become more accurate when the source is more related to the target. (Dashed line: non-transfer-learning baseline.)

Slide 57

Slide 57 text

(Heatmaps (a)–(f): CPU usage [%] over number of particles × number of refinements for six environments. Transfer worked in (a)–(c); it didn’t in (d)–(f).)

Slide 58

Slide 58 text

Key question: Can we develop a theory to explain when transfer learning works? 58 (diagram: Source (given) and Target (learn); extract transferable knowledge from the source data/model and reuse it when learning the target) Q1: How are the source and target “related”? Q2: What characteristics are preserved? Q3: What are the actionable insights?

Slide 59

Slide 59 text

We hypothesized that we can exploit similarities across environments to learn “cheaper” performance models 59 O1 × O2 × ⋯ × O19 × O20 0 × 0 × ⋯ × 0 × 1 0 × 0 × ⋯ × 1 × 0 0 × 0 × ⋯ × 1 × 1 1 × 1 × ⋯ × 1 × 0 1 × 1 × ⋯ × 1 × 1 ⋯ c1 c2 c3 cn ys1 = fs (c1 ) ys2 = fs (c2 ) ys3 = fs (c3 ) ysn = fs (cn ) O1 × O2 × ⋯ × O19 × O20 0 × 0 × ⋯ × 0 × 1 0 × 0 × ⋯ × 1 × 0 0 × 0 × ⋯ × 1 × 1 1 × 1 × ⋯ × 1 × 0 1 × 1 × ⋯ × 1 × 1 ⋯ yt1 = ft (c1 ) yt2 = ft (c2 ) yt3 = ft (c3 ) ytn = ft (cn ) Source Environment (Execution time of Program X) Target Environment (Execution time of Program Y) Similarity [P. Jamshidi, et al., “Transfer learning for performance modeling of configurable systems….”, ASE’17]
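A small sketch of how such similarity can be checked in practice: measure the same configurations in both environments and compute a rank correlation (the numbers below are illustrative, not measurements):

    import numpy as np
    from scipy.stats import spearmanr

    ys = np.array([11.1, 14.9, 9.3, 20.4, 7.7])       # f_s(c_i) in the source
    yt = np.array([110.3, 151.2, 95.0, 210.8, 80.1])  # f_t(c_i) in the target
    rho, _ = spearmanr(ys, yt)
    print(f"source/target rank correlation: {rho:.2f}")  # high => transfer promising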

Slide 60

Slide 60 text

Our empirical study: We looked at different highly configurable systems to gain insights 60 [P. Jamshidi, et al., “Transfer learning for performance modeling of configurable systems….”, ASE’17] SPEAR (SAT solver): analysis time; 14 options, 16,384 configurations; SAT problems; 3 hardware, 2 versions. X264 (video encoder): encoding time; 16 options, 4,000 configurations; video quality/size; 2 hardware, 3 versions. SQLite (DB engine): query time; 14 options, 1,000 configurations; DB queries; 2 hardware, 2 versions. SaC (compiler): execution time; 50 options, 71,267 configurations; 10 demo programs.

Slide 61

Slide 61 text

Linear shift happens only in limited environmental changes 61

Software / Environmental change / Severity / Corr.
SPEAR / NUC/2 -> NUC/4 / Small / 1.00
SPEAR / Amazon_nano -> NUC / Large / 0.59
SPEAR / Hardware/workload/version / Very Large / -0.10
x264 / Version / Large / 0.06
x264 / Workload / Medium / 0.65
SQLite / write-seq -> write-batch / Small / 0.96
SQLite / read-rand -> read-seq / Medium / 0.50

(Scatter plot: source vs. target throughput.) Implication: Simple transfer learning is limited to hardware changes in practice.

Slide 62

Slide 62 text

Influential options and interactions are preserved across environments 62

Software / Environmental change / Severity / Dim / t-test
x264 / Version / Large / 16 / 12, 10
x264 / Hardware/workload/version / Very Large / 16 / 8, 9
SQLite / write-seq -> write-batch / Very Large / 14 / 3, 4
SQLite / read-rand -> read-seq / Medium / 14 / 1, 1
SaC / Workload / Very Large / 50 / 16, 10

Implication: Avoid wasting budget on non-informative parts of the configuration space and focus where it matters. We only need to explore part of the space: 2^16 / 2^50 = 0.000000000058

Slide 63

Slide 63 text

Transfer learning across environment 63 O1 × O2 × ⋯ × O19 × O20 0 × 0 × ⋯ × 0 × 1 0 × 0 × ⋯ × 1 × 0 0 × 0 × ⋯ × 1 × 1 1 × 1 × ⋯ × 1 × 0 1 × 1 × ⋯ × 1 × 1 ⋯ c1 c2 c3 cn ys1 = fs (c1 ) ys2 = fs (c2 ) ys3 = fs (c3 ) ysn = fs (cn ) Source (Execution time of Program X) Learn performance model ̂ fs ∼ fs ( ⋅ )

Slide 64

Slide 64 text

Observation 1: Not all options and interactions are influential, and the degree of interaction between options is not high 64 f̂s(·) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7, over ℂ = O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10

Slide 65

Slide 65 text

Observation 2: Influential options and interactions are preserved across environments 65 f̂s(·) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7; f̂t(·) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7

Slide 66

Slide 66 text

Transfer Learning for Performance Modeling of Configurable Systems: An Exploratory Analysis. Pooyan Jamshidi (Carnegie Mellon University), Norbert Siegmund (Bauhaus-University Weimar), Miguel Velez, Christian Kästner, Akshay Patel, Yuvraj Agarwal (Carnegie Mellon University). Fig. 1: Transfer learning takes advantage of transferable knowledge from the source to learn an accurate, reliable, and less costly model for the target environment. Details: [ASE ’17] 66

Slide 67

Slide 67 text

67 Details: [AAAI Spring Symposium ’19]

Slide 68

Slide 68 text

Outline 68 Case Study Transfer Learning Theory Building Guided Sampling Current Research

Slide 69

Slide 69 text

How to sample the configuration space to learn a “better” performance behavior? How to select the most informative configurations?

Slide 70

Slide 70 text

The similarity across environments is a rich source of knowledge for exploration of the configuration space

Slide 71

Slide 71 text

When we treat the system as a black box, we cannot typically distinguish between different configurations c1, c2, c3, …, cn from O1 × O2 × ⋯ × O19 × O20 • We therefore end up blindly exploring the configuration space • That is essentially the key reason why “most” work in this area considers random sampling. 71

Slide 72

Slide 72 text

Without considering this knowledge, many samples may not provide new information 72. With f̂s(·) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7 over O1 × O2 × ⋯ × O10, all 128 configurations that share o1 = o3 = o7 = 1 and differ only in the non-influential options yield the same prediction: f̂s(c1) = f̂s(c2) = f̂s(c3) = ⋯ = f̂s(c128) = 14.9
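A sketch of the guided alternative: once the source model says only o1, o3, and o7 matter, enumerate those and pin the rest to a default, instead of blindly sampling all 2^10 configurations (indices follow the slide's example):

    from itertools import product

    d = 10
    influential = [0, 2, 6]              # o1, o3, o7 from the source model

    def guided_samples():
        for bits in product([0, 1], repeat=len(influential)):
            c = [0] * d                  # non-influential options pinned to a default
            for i, b in zip(influential, bits):
                c[i] = b
            yield tuple(c)

    samples = list(guided_samples())
    print(len(samples))                  # 8 informative samples instead of 1024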

Slide 73

Slide 73 text

Without this knowledge, many blind/random samples may not provide any additional information about the performance of the system

Slide 74

Slide 74 text

Evaluation: Learning performance behavior of Machine Learning Systems ML system: https://pooyanjamshidi.github.io/mls

Slide 75

Slide 75 text

Configurations of deep neural networks affect accuracy and energy consumption 75 (plot: energy consumption [J] vs. validation (test) error for a CNN on the CNAE-9 data set; a 72% error range and a 22X energy range; plus a network-architecture sketch: image, sepconv/conv cells, global pool, linear, softmax)

Slide 76

Slide 76 text

DNN measurements are costly: each sample costs ~1h, and 4000 samples × 1h ≈ 6 months. Yes, that’s the cost we paid for conducting our measurements!

Slide 77

Slide 77 text

L2S enables learning a more accurate model with fewer samples by exploiting the knowledge from the source 77 (plot (a), DNN (hard): mean absolute percentage error vs. sample size for L2S+GP, L2S+DataReuseTL, DataReuseTL, ModelShift, and Random+CART) Convolutional Neural Network

Slide 78

Slide 78 text

L2S may also help the data-reuse approach to learn faster 78 (plots (a)–(c): mean absolute percentage error vs. sample size for DNN (hard), XGBoost (hard), and Storm (hard), comparing L2S+GP, L2S+DataReuseTL, DataReuseTL, ModelShift, and Random+CART) XGBoost

Slide 79

Slide 79 text

Evaluation: Learning performance behavior of Big Data Systems

Slide 80

Slide 80 text

In some environments the similarities across environments may be too low, and this results in “negative transfer” 80 (plot (c), Storm (hard): mean absolute percentage error vs. sample size for L2S+GP, L2S+DataReuseTL, DataReuseTL, ModelShift, and Random+CART) Apache Storm

Slide 81

Slide 81 text

Why are performance models using L2S samples more accurate?

Slide 82

Slide 82 text

The samples generated by L2S contain more information… “entropy <-> information gain” 82 (plots: entropy [bits] vs. sample size for max entropy, L2S, and random sampling, on DNN, XGBoost, and Storm)
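A minimal sketch of this entropy comparison: bin the measured responses over a fixed range and compute Shannon entropy in bits (synthetic data; redundant samples concentrate in one bin, informative samples spread out):

    import numpy as np

    def sample_entropy(y, bins=16, value_range=(0, 30)):
        counts, _ = np.histogram(y, bins=bins, range=value_range)
        p = counts[counts > 0] / counts.sum()
        return -(p * np.log2(p)).sum()

    rng = np.random.default_rng(0)
    y_random = 14.9 + rng.normal(0, 0.01, 64)   # redundant (blind) samples
    y_guided = rng.uniform(5, 25, 64)           # samples covering more behaviors
    print(sample_entropy(y_random), sample_entropy(y_guided))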

Slide 83

Slide 83 text

Limitations • Limited number of systems and environmental changes • Synthetic models • https://github.com/pooyanjamshidi/GenPerf • Binary options • Non-binary options -> binary • Negative transfer 83

Slide 84

Slide 84 text

Details: [FSE ’18] 84

Slide 85

Slide 85 text

Outline 85 Case Study Transfer Learning Empirical Study Guided Sampling Current Research

Slide 86

Slide 86 text

What will the software systems of the future look like?

Slide 87

Slide 87 text

Software 2.0 87 Increasingly customized and configurable VISION Increasingly competing objectives Accuracy Training speed Inference speed Model size Energy

Slide 88

Slide 88 text

Deep neural network as a highly configurable system 88 (DNN system development stack: network design, model compiler, hybrid deployment, OS/hardware; the configuration space spans neural search, hyper-parameters, deployment topology, and hardware optimization; scope of this project highlighted)

Slide 89

Slide 89 text

We found many configurations with the same accuracy while having drastically different energy demands 89 (plots: energy consumption [J] vs. validation (test) error for a CNN on the CNAE-9 data set; a 72% error range and a 22X energy range overall; zoomed view: within 10% error, configurations differ by up to 22X / 300 J in energy along the Pareto frontier)
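A minimal sketch of extracting such a Pareto frontier, minimizing both energy and error (the points below are illustrative, not measurements):

    import numpy as np

    def pareto_front(points):
        # keep rows not dominated by any other row (<= in all objectives, < in some)
        keep = []
        for i, p in enumerate(points):
            dominated = np.any(np.all(points <= p, axis=1) &
                               np.any(points < p, axis=1))
            if not dominated:
                keep.append(i)
        return points[keep]

    pts = np.array([[300, 10], [2500, 9], [150, 18], [400, 8], [120, 30]])
    print(pareto_front(pts))   # the non-dominated (energy, error) trade-offs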

Slide 90

Slide 90 text

Details: [FlexiBO] 90

Slide 91

Slide 91 text

Outline 91 Case Study Transfer Learning Empirical Study Guided Sampling Current Research

Slide 92

Slide 92 text

Debugging based on statistical correlation can be misleading 92 (causal graphs and scatter plots relating GPU Growth, Swap Mem, and Latency for GPU Growth = 33%/66% and Swap = 1Gb/4Gb)
● The correlation between GPU Growth and Latency is as strong as that between Swap Mem and Latency, but considerably less noisy.
● Therefore, a feature-selection method based on correlation that ignores the causal structure prefers GPU Growth as the predictor for Latency, which is misleading.

Slide 93

Slide 93 text

Why knowing the underlying causal structure matters: a transfer learning scenario 93
• The relation between X1 and X2 is about as strong as the relation between X2 and X3, but noisier.
• {X3} and {X1, X3} are preferred over {X1}, because predicting Y from X1 leads to:
• a larger variance than predicting Y from X3
• a larger bias than predicting Y from both X1 and X3.
Magliacane, Sara, et al. "Domain adaptation by using causal inference to predict invariant conditional distributions." Advances in Neural Information Processing Systems. 2018.

Slide 94

Slide 94 text

CAUPER: Causal Performance Debugger localizes and repairs performance faults with approx. 50 samples 94

Slide 95

Slide 95 text

CAUPER is centered around causal structure discovery 95

Slide 96

Slide 96 text

● We measure the Individual Treatment Effect (ITE) of each repair: ● the difference between the probability that the performance fault is fixed after a repair and the probability that the fault persists after the repair. ● The larger the value, the more likely we are to repair the fault. ● We pick the repair with the largest ITE. CAUPER iteratively explores potential performance repairs 96
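A hedged sketch of this selection rule (the repair names and probabilities below are illustrative placeholders, not CAUPER's actual output):

    # ITE(repair) = P(fault fixed | repair) - P(fault persists | repair)
    p_fixed = {
        "swap_mem=4Gb": 0.81,      # hypothetical candidate repairs
        "gpu_growth=66%": 0.55,
        "fan_mode=max": 0.62,
    }
    ite = {r: p - (1 - p) for r, p in p_fixed.items()}
    best = max(ite, key=ite.get)
    print(best, round(ite[best], 2))   # apply the repair with the largest ITE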

Slide 97

Slide 97 text

CAUPER is able to find more accurate causes compared with statistical debugging 97

Slide 98

Slide 98 text

CAUPER is able to find comparable and even better repairs for performance faults compared with performance optimization 98 (bar plots: latency gain, heat gain, and energy gain for CAUPER vs. SMAC across workloads X5k, X10k, X20k, X50k)

Slide 99

Slide 99 text

CAUPER is able to find repairs at lower cost 99 (bar plot: time for CAUPER vs. SMAC across workloads X5k, X10k, X20k, X50k)

Slide 100

Slide 100 text

CAUPER’s repairs are transferable across environments 100 (bar plot: latency gain for CAUPER, SMAC-Rerun, and SMAC-Reuse across workload changes X5k-X10k, X5k-X20k, X5k-X50k)

Slide 101

Slide 101 text

No content

Slide 102

Slide 102 text

Team effort Rahul Krishna Columbia Shahriar Iqbal UofSC M. A. Javidian Purdue Baishakhi Ray Columbia Christian Kästner CMU Norbert Siegmund Leipzig Miguel Velez CMU Sven Apel Saarland Lars Kotthoff Wyoming Marco Valtorta UofSC Vivek Nair Facebook Tim Menzies NCSU

Slide 103

Slide 103 text

No content