[Figure: slide titled "[...] transfer learning works?". Panels: Source (Given) and Target (Learn), each with Data and Model, connected through Transferable Knowledge; arrow labels: Extract, Reuse, Learn, Learn. Q1: How are source and target "related"? Q2: What characteristics are preserved? Q3: What are the actionable insights?]

II. INTUITION

Understanding the performance behavior of configurable software systems can enable (i) performance debugging, (ii) performance tuning, (iii) design-time evolution, or (iv) runtime adaptation [11]. We lack empirical understanding of how the performance behavior of a system will vary when the environment of the system changes. Such empirical understanding will provide important insights to develop faster and more accurate learning techniques that allow us to make predictions and optimizations of performance for highly configurable systems in changing environments [10]. For instance, we can learn the performance behavior of a system on cheap hardware in a controlled lab environment and use that to understand the performance behavior of the system on a production server before shipping to the end user. More specifically, we would like to know what the relationship is between the performance of a system in a specific environment (characterized by software configuration, hardware, workload, and system version) and its performance when we vary the environmental conditions.

In this research, we aim for an empirical understanding of performance behavior to improve learning via an informed sampling process. In other words, we aim at learning a performance model in a changed environment based on a well-suited sampling set that has been determined by the knowledge we gained in other environments. Therefore, the main research [...]

A. Preliminary concepts

In this section, we provide formal definitions of four concepts that we use throughout this study. The formal notation enables us to convey these concepts concisely throughout the paper.

1) Configuration and environment space: Let $F_i$ indicate the $i$-th feature of a configurable system $A$, which is either enabled or disabled, one of which holds by default. The configuration space is mathematically the Cartesian product of the domains of all features, $\mathbb{C} = \mathrm{Dom}(F_1) \times \cdots \times \mathrm{Dom}(F_d)$, where $\mathrm{Dom}(F_i) = \{0, 1\}$ (the remainder of this section writes the configuration space as $\mathbb{F}$). A configuration of a system is then a member of the configuration space (feature space) in which all the parameters are assigned a specific value in their range (i.e., a complete instantiation of the system's parameters). We also describe an environment instance by three variables $e = [w, h, v]$ drawn from a given environment space $\mathbb{E} = W \times H \times V$, where $W$, $H$, and $V$ respectively represent the sets of possible values for workload, hardware, and system version.

2) Performance model: Given a software system $A$ with configuration space $\mathbb{F}$ and environmental instances $\mathbb{E}$, a performance model is a black-box function $f : \mathbb{F} \times \mathbb{E} \to \mathbb{R}$ given some observations of the system performance for each combination of the system's features $x \in \mathbb{F}$ in an environment $e \in \mathbb{E}$. To construct a performance model for a system $A$ with configuration space $\mathbb{F}$, we run $A$ in environment instance $e \in \mathbb{E}$ on various combinations of configurations $x_i \in \mathbb{F}$, and record the resulting performance values $y_i = f(x_i) + \epsilon_i$, $x_i \in \mathbb{F}$, where $\epsilon_i \sim \mathcal{N}(0, \sigma_i)$. The training data for our regression models is then simply $D_{tr} = \{(x_i, y_i)\}_{i=1}^{n}$. In other words, a response function is simply a mapping from the input space to a measurable performance metric that produces interval-scaled data (here we assume it produces real numbers).

3) Performance distribution: For the performance model, we measured the performance response and associated it with each configuration. We now introduce another concept, in which we vary the environment and measure the performance. An empirical performance distribution is a stochastic process, $pd : \mathbb{E} \to \Delta(\mathbb{R})$, that defines a probability distribution over performance measures for each environmental condition. To construct a performance distribution for a system $A$ with configuration space $\mathbb{F}$, similarly to the process of deriving the performance models, we run $A$ on various combinations of configurations $x_i \in \mathbb{F}$, for a specific environment instance [...]
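The three definitions above can be made concrete with a small sketch. The code below is illustrative only and is not taken from the study: the response function `response`, the numeric encoding of workload, hardware, and version, and all helper names are hypothetical, and the noise term treats $\sigma_i$ as a single standard deviation shared by all measurements. It enumerates a binary configuration space, builds an environment space as a Cartesian product, collects training data $D_{tr}$ in one environment, and gathers the measured values that form an empirical performance distribution for that environment.

```python
from itertools import product
import random


def configuration_space(d):
    """All complete assignments of d binary features: Dom(F_1) x ... x Dom(F_d)."""
    return list(product((0, 1), repeat=d))


def environment_space(workloads, hardware, versions):
    """Environment instances e = (w, h, v) drawn from W x H x V."""
    return list(product(workloads, hardware, versions))


def response(x, e):
    """Hypothetical noise-free response f(x, e); invented purely for illustration."""
    w, h, v = e
    base = 10.0 * w / h  # assumption: larger workload is slower, faster hardware is quicker
    return base + 2.0 * x[0] + 1.5 * x[1] * x[2] + 0.1 * v


def measure(x, e, sigma, rng):
    """One observation y = f(x, e) + eps, with eps drawn from a zero-mean Gaussian."""
    return response(x, e) + rng.gauss(0.0, sigma)


def training_data(configs, e, n, sigma, rng):
    """D_tr = {(x_i, y_i)}: n distinct sampled configurations measured in environment e."""
    sample = rng.sample(configs, n)
    return [(x, measure(x, e, sigma, rng)) for x in sample]


def performance_distribution(configs, e, n, sigma, rng):
    """Measured performance values for environment e (an empirical sample of pd(e))."""
    return [y for _, y in training_data(configs, e, n, sigma, rng)]


if __name__ == "__main__":
    rng = random.Random(0)
    configs = configuration_space(3)
    envs = environment_space([1, 2], [1.0, 2.0], [1])
    source, target = envs[0], envs[-1]
    print("training data (source):", training_data(configs, source, 5, 0.1, rng))
    print("pd sample (target):", performance_distribution(configs, target, 5, 0.1, rng))
```

Sampling without replacement via `rng.sample` is one possible choice; the text only says that the system is run on "various combinations" of configurations, so any other sampling scheme would fit the definitions equally well.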