Andrea Burattin
September 11, 2018

Who is behind the Model? Classifying Modelers based on Pragmatic Model Features

Process modeling tools typically aid end users in generic, non-personalized ways. However, it is well conceivable that different types of end users may profit from different types of modeling support. In this paper, we propose an approach based on machine learning that is able to classify modelers regarding their expertise while they are creating a process model. To do so, it takes into account pragmatic features of the model under development. The proposed approach is fully automatic, unobtrusive, tool independent, and based on objective measures. An evaluation based on two data sets resulted in a prediction performance of around 90%. Our results further show that all features can be efficiently calculated, which makes the approach applicable to online settings like adaptive modeling environments. In this way, this work contributes to improving the performance of process modelers.

More info: https://andrea.burattin.net/publications/2018-bpm-2

Transcript

  1. Who is behind the Model? Classifying Modelers based on Pragmatic

    Model Features A. Burattin (1), P. Soffer (2), D. Fahland (3), J. Mendling (4), H. A. Reijers (3,5), I. Vanderfeesten (3), M. Weidlich (6), B. Weber (1,7). Affiliations: (1) Technical University of Denmark, Kgs. Lyngby, Denmark; (2) University of Haifa, Haifa, Israel; (3) Eindhoven University of Technology, Eindhoven, The Netherlands; (4) Vienna University of Economics and Business, Vienna, Austria; (5) Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; (6) Humboldt-University, Berlin, Germany; (7) University of Innsbruck, Innsbruck, Austria
  2. Creation of process models • Creating process models is a

    complex cognitive design activity Soffer, P., Kaner, M., Wand, Y.: Towards Understanding the Process of Process Modeling: Theoretical and Empirical Considerations. In: Proc. ER-BPM'11. (2012) 357-369 Claes, J., Vanderfeesten, I., Pinggera, J., Reijers, H.A., Weber, B., Poels, G.: Visualizing the Process of Process Modeling with PPM Charts. In: Proc. TAProViz'12. (2013) 744-755 Wickens, C.D., Hollands, J.G.: Engineering Psychology and Human Performance. 3 edn. Pearson (1999) 2
  3. Creation of process models • Creating process models is a

    complex cognitive design activity • To accomplish that, the modeller has to • Construct a mental representation of the problem domain • Externalize the mental model into a process model Soffer, P., Kaner, M., Wand, Y.: Towards Understanding the Process of Process Modeling: Theoretical and Empirical Considerations. In: Proc. ER-BPM'11. (2012) 357-369 Claes, J., Vanderfeesten, I., Pinggera, J., Reijers, H.A., Weber, B., Poels, G.: Visualizing the Process of Process Modeling with PPM Charts. In: Proc. TAProViz'12. (2013) 744-755 Wickens, C.D., Hollands, J.G.: Engineering Psychology and Human Performance. 3 edn. Pearson (1999) 2
  4. Creation of process models • Creating process models is a

    complex cognitive design activity • To accomplish that, the modeller has to • Construct a mental representation of the problem domain • Externalize the mental model into a process model • Modelling does not come for free: it imposes a substantial cognitive load • Cognitive load is a good predictor of task performance • Overload causes a drop in performance Soffer, P., Kaner, M., Wand, Y.: Towards Understanding the Process of Process Modeling: Theoretical and Empirical Considerations. In: Proc. ER-BPM'11. (2012) 357-369 Claes, J., Vanderfeesten, I., Pinggera, J., Reijers, H.A., Weber, B., Poels, G.: Visualizing the Process of Process Modeling with PPM Charts. In: Proc. TAProViz'12. (2013) 744-755 Wickens, C.D., Hollands, J.G.: Engineering Psychology and Human Performance. 3 edn. Pearson (1999) 2
  5. Experts and novices • Experts and novices respond differently to

    model creation tasks Batra, D., Davis, J.G.: Conceptual data modelling in database design: similarities and differences between expert and novice designers. International Journal of Man-Machine Studies 37(1) (1992) 83-101 Bolloju, N., Leung, F.S.K.: Assisting novice analysts in developing quality conceptual models with UML. Communications of the ACM 49(7) (2006) 108-112 3
  6. Experts and novices • Experts and novices respond differently to

    model creation tasks • Novices • Challenged in integrating parts of the problem description • Challenged in mapping the problem description into knowledge structures • Challenged in making abstractions (focus on specific functional details) Batra, D., Davis, J.G.: Conceptual data modelling in database design: similarities and differences between expert and novice designers. International Journal of Man-Machine Studies 37(1) (1992) 83-101 Bolloju, N., Leung, F.S.K.: Assisting novice analysts in developing quality conceptual models with UML. Communications of the ACM 49(7) (2006) 108-112 3
  7. Experts and novices • Experts and novices respond differently to

    model creation tasks • Novices • Challenged in integrating parts of the problem description • Challenged in mapping the problem description into knowledge structures • Challenged in making abstractions (focus on specific functional details) • …and experts • Tend to develop a holistic understanding • Abstract from specific problem characteristics • Categorize textual descriptions before developing solutions Batra, D., Davis, J.G.: Conceptual data modelling in database design: similarities and differences between expert and novice designers. International Journal of Man-Machine Studies 37(1) (1992) 83-101 Bolloju, N., Leung, F.S.K.: Assisting novice analysts in developing quality conceptual models with UML. Communications of the ACM 49(7) (2006) 108-112 3
  8. The role of the modelling tool • Externalization of the

    mental model is achieved by interacting with a modelling tool 4
  9. The role of the modelling tool • Externalization of the

    mental model is achieved by interacting with a modelling tool • The modeller performs a sequence of interactions which results in [intermediate] models 4
  10. The role of the modelling tool • Externalization of the

    mental model is achieved by interacting with a modelling tool • The modeller performs a sequence of interactions which results in [intermediate] models • Differences between experts and novices suggest that the modelling tool should provide different kinds of support and guidance 4
  11. The role of the modelling tool • Externalization of the

    mental model is achieved by interacting with a modelling tool • The modeller performs a sequence of interactions which results in [intermediate] models • Differences between experts and novices suggest that the modelling tool should provide different kinds of support and guidance Can a modelling tool distinguish between expert and novice modellers? 4
  13. The role of the modelling tool • Externalization of the

    mental model is achieved by interacting with a modelling tool • The modeller performs a sequence of interactions which results in [intermediate] models • Differences between experts and novices suggest that the modelling tool should provide different kinds of support and guidance Can a modelling tool distinguish between expert and novice modellers? Yes. 4
  14. Table of contents • Introduction and motivation • Automatic identification

    of expertise levels • Formal requirements • Feature extraction • Classification technique • Case studies and evaluation • Conclusion and future work 5
  15. Requirements • Requirements for expertise prediction R1. Based on objective

    measures R2. Unobtrusive and no additional effort required 6
  16. Requirements • Requirements for expertise prediction R1. Based on objective

    measures R2. Unobtrusive and no additional effort required R3. Work “online” and applicable to intermediate models 6
  17. Requirements • Requirements for expertise prediction R1. Based on objective

    measures R2. Unobtrusive and no additional effort required R3. Work “online” and applicable to intermediate models R4. Be independent of the modelling tool 6
  18. Requirements • Requirements for expertise prediction R1. Based on objective

    measures R2. Unobtrusive and no additional effort required R3. Work “online” and applicable to intermediate models R4. Be independent of the modelling tool • Possible approaches 6
  19. Requirements • Requirements for expertise prediction R1. Based on objective

    measures R2. Unobtrusive and no additional effort required R3. Work “online” and applicable to intermediate models R4. Be independent of the modelling tool • Possible approaches • Self-assessment of the user… violates R1 and R3 6
  20. Requirements • Requirements for expertise prediction R1. Based on objective

    measures R2. Unobtrusive and no additional effort required R3. Work “online” and applicable to intermediate models R4. Be independent of the modelling tool • Possible approaches • Self-assessment of the user… violates R1 and R3 • Use a questionnaire to elicit expertise… violates R2 and R3 6
  21. Requirements • Requirements for expertise prediction R1. Based on objective

    measures R2. Unobtrusive and no additional effort required R3. Work “online” and applicable to intermediate models R4. Be independent of the modelling tool • Possible approaches • Self-assessment of the user… violates R1 and R3 • Use a questionnaire to elicit expertise… violates R2 and R3 • Use neuro-physiological data (e.g., EEG)… violates R2 6
  22. Requirements • Requirements for expertise prediction R1. Based on objective

    measures R2. Unobtrusive and no additional effort required R3. Work “online” and applicable to intermediate models R4. Be independent of the modelling tool • Possible approaches • Self-assessment of the user… violates R1 and R3 • Use a questionnaire to elicit expertise… violates R2 and R3 • Use neuro-physiological data (e.g., EEG)… violates R2 • Analyze interactions with modeling platform… violates R4 6
  23. Requirements • Requirements for expertise prediction R1. Based on objective

    measures R2. Unobtrusive and no additional effort required R3. Work “online” and applicable to intermediate models R4. Be independent of the modelling tool • Possible approaches • Self-assessment of the user… violates R1 and R3 • Use a questionnaire to elicit expertise… violates R2 and R3 • Use neuro-physiological data (e.g., EEG)… violates R2 • Analyze interactions with modeling platform… violates R4 • Analyze pragmatic features of (intermediate) artifacts 6
  24. General idea of the approach • After each interaction with

    the modelling tool an intermediate model is created 7
  25. General idea of the approach • After each interaction with

    the modelling tool an intermediate model is created [Figure: timeline of interactions with the modelling tool] 7
  26. General idea of the approach • After each interaction with

    the modelling tool an intermediate model is created [Figure: timeline of interactions with the modelling tool; step 1, feature extraction, maps each intermediate model to a feature vector (x1, x2, x3, …, xn)] 7
  27. General idea of the approach • After each interaction with

    the modelling tool an intermediate model is created [Figure: timeline of interactions with the modelling tool; step 1, feature extraction, maps each intermediate model to a feature vector (x1, x2, x3, …, xn); step 2, a classification model labels the model as Novice or Expert] 7
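
To make the pipeline on this slide concrete, here is a minimal sketch of the online loop. All type and method names (OnlineExpertisePipeline, FeatureExtractor, ExpertiseClassifier) are hypothetical illustrations, not names from the paper or its tool.

```java
// Minimal sketch of the online pipeline: after every interaction the current
// intermediate model is turned into a feature vector and classified.
// All type names here are illustrative, not taken from the paper's implementation.
public class OnlineExpertisePipeline {

    interface BpmnModel { }                           // the current (intermediate) model

    interface FeatureExtractor {
        double[] extract(BpmnModel model);            // F1..F10 as a numeric vector
    }

    interface ExpertiseClassifier {
        double probabilityExpert(double[] features);  // likelihood of the "expert" class
    }

    private final FeatureExtractor extractor;
    private final ExpertiseClassifier classifier;

    OnlineExpertisePipeline(FeatureExtractor extractor, ExpertiseClassifier classifier) {
        this.extractor = extractor;
        this.classifier = classifier;
    }

    // Called by the modelling tool after each user interaction.
    public String onInteraction(BpmnModel intermediateModel) {
        double[] features = extractor.extract(intermediateModel);
        double pExpert = classifier.probabilityExpert(features);
        return pExpert >= 0.5 ? "Expert" : "Novice";
    }
}
```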
  28. Feature identification • Given a BPMN model we extract the

    following pragmatic features • Features referring to the alignment of elements: two nodes are aligned if they share at least one coordinate (within a threshold) 8
  29. Feature identification • Given a BPMN model we extract the

    following pragmatic features • Features referring to the alignment of elements: two nodes are aligned if they share at least one coordinate (within a threshold) F1. Percentage of aligned SESE (single-entry, single-exit) fragments F2. Percentage of activities in aligned SESE fragments F3. Percentage of activities in not-aligned SESE fragments 8
  30. Feature identification • Given a BPMN model we extract the

    following pragmatic features • Features referring to the alignment of elements: two nodes are aligned if they share at least one coordinate (within a threshold) F1. Percentage of aligned SESE (single-entry, single-exit) fragments F2. Percentage of activities in aligned SESE fragments F3. Percentage of activities in not-aligned SESE fragments [Figure: aligned vs. not-aligned layout] 8
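
As an illustration of the alignment test behind F1–F3 (the actual threshold value and the SESE-fragment decomposition used in the paper are not reproduced here), two nodes could be compared as follows:

```java
// Illustrative alignment test for F1-F3: two nodes count as aligned when their
// x or y coordinates differ by at most a threshold. The threshold value below is
// arbitrary and only serves the example.
public class AlignmentCheck {

    static final double THRESHOLD = 5.0; // pixels; illustrative value, not from the paper

    static class Node {
        final double x, y;
        Node(double x, double y) { this.x = x; this.y = y; }
    }

    static boolean aligned(Node a, Node b) {
        return Math.abs(a.x - b.x) <= THRESHOLD || Math.abs(a.y - b.y) <= THRESHOLD;
    }

    public static void main(String[] args) {
        Node n1 = new Node(100, 200);
        Node n2 = new Node(400, 203);        // same row, within the threshold
        System.out.println(aligned(n1, n2)); // prints: true
    }
}
```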
  31. Feature identification (cont.) • Features referring to the type and

    usage of gateways F4. Number of explicit gateways 9
  32. Feature identification (cont.) • Features referring to the type and

    usage of gateways F4. Number of explicit gateways F5. Number of implicit gateways 9
  33. Feature identification (cont.) • Features referring to the type and

    usage of gateways F4. Number of explicit gateways F5. Number of implicit gateways F6. Number of reused gateways 9
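
For illustration, F4–F6 could be counted roughly as below. The definitions encoded here (implicit gateway: a non-gateway node with multiple incoming or outgoing sequence flows; reused gateway: a gateway that both joins and splits) are assumptions made for this sketch, not necessarily the paper's exact definitions.

```java
import java.util.List;

// Illustrative counts for F4-F6 over the nodes of a BPMN model. The definitions of
// "implicit" and "reused" gateways below are assumptions for this sketch only.
public class GatewayFeatures {

    static class Node {
        final boolean isGateway;
        final int incoming, outgoing;   // number of incoming/outgoing sequence flows
        Node(boolean isGateway, int incoming, int outgoing) {
            this.isGateway = isGateway; this.incoming = incoming; this.outgoing = outgoing;
        }
    }

    static long explicitGateways(List<Node> nodes) {            // F4
        return nodes.stream().filter(n -> n.isGateway).count();
    }

    static long implicitGateways(List<Node> nodes) {            // F5 (assumed definition)
        return nodes.stream()
                .filter(n -> !n.isGateway && (n.incoming > 1 || n.outgoing > 1))
                .count();
    }

    static long reusedGateways(List<Node> nodes) {              // F6 (assumed definition)
        return nodes.stream()
                .filter(n -> n.isGateway && n.incoming > 1 && n.outgoing > 1)
                .count();
    }
}
```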
  34. Feature identification (cont.) • Features referring to the style of

    edges F7. Percentage of orthogonal segments 10
  36. Feature identification (cont.) • Features referring to the style of

    edges F7. Percentage of orthogonal segments [Figure: orthogonal vs. non-orthogonal edge routing] 10
  37. Feature identification (cont.) • Features referring to the style of

    edges F7. Percentage of orthogonal segments F8. Percentage of crossing edges [Figure: orthogonal vs. non-orthogonal edge routing] 10
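
A sketch of the geometry that F7 and F8 rely on, assuming a segment counts as orthogonal when it runs (nearly) horizontally or vertically and that crossings are detected per pair of edge segments; the tolerance is illustrative.

```java
// Illustrative geometry for F7 (orthogonal segments) and F8 (crossing edges).
// Tolerance and counting rules are assumptions for this sketch.
public class EdgeStyleFeatures {

    static final double TOLERANCE = 1.0; // pixels

    static class Segment {
        final double x1, y1, x2, y2;
        Segment(double x1, double y1, double x2, double y2) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
        }
    }

    // F7: a segment is orthogonal if it is (nearly) horizontal or vertical.
    static boolean orthogonal(Segment s) {
        return Math.abs(s.x1 - s.x2) <= TOLERANCE || Math.abs(s.y1 - s.y2) <= TOLERANCE;
    }

    // F8 helper: standard orientation test for proper intersection of two segments
    // (collinear/touching cases are ignored for simplicity).
    static boolean cross(Segment a, Segment b) {
        double d1 = orient(b, a.x1, a.y1);
        double d2 = orient(b, a.x2, a.y2);
        double d3 = orient(a, b.x1, b.y1);
        double d4 = orient(a, b.x2, b.y2);
        return ((d1 > 0 && d2 < 0) || (d1 < 0 && d2 > 0))
            && ((d3 > 0 && d4 < 0) || (d3 < 0 && d4 > 0));
    }

    private static double orient(Segment s, double px, double py) {
        return (s.x2 - s.x1) * (py - s.y1) - (s.y2 - s.y1) * (px - s.x1);
    }
}
```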
  38. Feature identification (cont.) • Feature referring to the process “as

    a whole” F9. M-BP: consistency of the flow with respect to temporal logical ordering 11
  39. Feature identification (cont.) • Feature referring to the process “as

    a whole” F9. M-BP: consistency of the flow with respect to temporal logical ordering [Figure: example activities A, B, C] 11
  40. Feature identification (cont.) • Feature referring to the process “as

    a whole” F9. M-BP: consistency of the flow with respect to temporal logical ordering [Figure: two flows over activities A, B, C, one consistent with the ordering vs. one that is not] 11
  41. Feature identification (cont.) • Feature referring to the process “as

    a whole” F9. M-BP: consistency of the flow with respect to temporal logical ordering F10. Number of ending points [Figure: two flows over activities A, B, C, one consistent with the ordering vs. one that is not] 11
  42. Feature identification (cont.) • Feature referring to the process “as

    a whole” F9. M-BP: consistency of the flow with respect to temporal logical ordering F10. Number of ending points [Figure: two flows over activities A, B, C, one consistent with the ordering vs. one that is not; Figure: models with different numbers of ending points compared] 11
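
F9 involves checking the flow against an expected ordering of activities and is not sketched here. F10, by contrast, can be approximated by counting nodes without outgoing sequence flows; the data representation below is an assumption for the sketch.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative computation of F10: nodes with no outgoing sequence flow are counted
// as ending points. Edges are given as (source id, target id) pairs.
public class EndPointFeature {

    static long endingPoints(Set<String> nodeIds, List<String[]> edges) {
        Set<String> withOutgoing = edges.stream()
                .map(e -> e[0])                        // source of each sequence flow
                .collect(Collectors.toSet());
        return nodeIds.stream().filter(id -> !withOutgoing.contains(id)).count();
    }
}
```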
  43. Datasets used for validation • Two modelling datasets collected during

    modelling sessions at TU Eindhoven and in Berlin in 2010, DOI: 10.5281/zenodo.1194779 12
  44. Datasets used for validation • Two modelling datasets collected during

    modelling sessions at TU Eindhoven and in Berlin in 2010, DOI: 10.5281/zenodo.1194779 • Cheetah as the modelling platform 12
  45. Datasets used for validation • Two modelling datasets collected during

    modelling sessions at TU Eindhoven and in Berlin in 2010, DOI: 10.5281/zenodo.1194779 • Cheetah as the modelling platform • Subjects were asked to model two processes 12
  46. Datasets used for validation • Two modelling datasets collected during

    modelling sessions at TU Eindhoven and in Berlin in 2010, DOI: 10.5281/zenodo.1194779 • Cheetah as the modelling platform • Subjects were asked to model two processes • “pre-flight”, reference: 12
  47. Datasets used for validation • Two modelling datasets collected during

    modelling sessions at TU Eindhoven and in Berlin in 2010, DOI: 10.5281/zenodo.1194779 • Cheetah as the modelling platform • Subjects were asked to model two processes • “pre-flight”, reference: • “mortgage-1”, reference: 12
  48. Datasets used for validation (cont.) • Number of models and

    modelling sessions
                     Experts                          Novices
                     Sessions  Intermediate models    Sessions  Intermediate models
    pre-flight       39        7299                   118       14147
    mortgage-1       31        4856                   144       36141
    13
  50. Datasets used for validation (cont.) • Number of models and

    modelling sessions
                     Experts                          Novices
                     Sessions  Intermediate models    Sessions  Intermediate models
    pre-flight       39        7299                   118       14147
    mortgage-1       31        4856                   144       36141
    • Mann-Whitney U test (are features significant discriminators of expertise levels?)
    13
  51. Datasets used for validation (cont.) • Number of models and

    modelling sessions
                     Experts                          Novices
                     Sessions  Intermediate models    Sessions  Intermediate models
    pre-flight       39        7299                   118       14147
    mortgage-1       31        4856                   144       36141
    • Mann-Whitney U test (are features significant discriminators of expertise levels?)
      Result: p < .001 for every feature F1–F10, on both pre-flight and mortgage-1
    13
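
The per-feature significance check can be reproduced along these lines, assuming Apache Commons Math is on the classpath (the slides do not state which statistics implementation was actually used):

```java
import org.apache.commons.math3.stat.inference.MannWhitneyUTest;

// Sketch: Mann-Whitney U test for one feature, comparing its values over all
// intermediate models of experts against those of novices.
public class FeatureSignificance {

    static double pValue(double[] expertValues, double[] noviceValues) {
        return new MannWhitneyUTest().mannWhitneyUTest(expertValues, noviceValues);
    }

    public static void main(String[] args) {
        double[] experts = {0.71, 0.68, 0.74, 0.70, 0.73}; // e.g. F7 values (made-up data)
        double[] novices = {0.60, 0.58, 0.62, 0.57, 0.61};
        System.out.printf("p = %.4f%n", pValue(experts, novices));
    }
}
```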
  52. Descriptive statistics (mean)

    Mean values                      mortgage-1              pre-flight
                                     Experts     Novices     Experts     Novices
    F1. Alignment of fragments       0.86     >  0.81        0.82     >  0.76
    F2. % acts in aligned frags      0.46     >  0.43        0.50     >  0.44
    F3. % acts in not-align frags    0.09     <  0.10        0.08     <  0.10
    F4. # explicit gateways          11.90    >  10.19       6.84     >  5.94
    F5. # implicit gateways          1.31     <  1.58        0.37     <  0.49
    F6. # reused gateways            0.34     >  0.32        0.50     >  0.47
    F7. % orthogonal segments        0.71     >  0.60        0.57     >  0.49
    F8. % crossing edges             0.01     <  0.02        0.012    >  0.008
    F9. Flow consistency             0.95     >  0.88        0.95     >  0.91
    F10. # end points                2.74     >  2.27        1.60     <  1.64
    14
  56. Descriptive statistics (correlations) • Pearson correlation coefficient of features •

    Little indication of correlation: features capture complementary aspects 15
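
A corresponding sketch for the correlation analysis, again assuming Apache Commons Math; rows are intermediate models, columns the ten features F1–F10:

```java
import org.apache.commons.math3.linear.RealMatrix;
import org.apache.commons.math3.stat.correlation.PearsonsCorrelation;

// Sketch: pairwise Pearson correlations between the ten features.
public class FeatureCorrelation {

    static RealMatrix correlationMatrix(double[][] featureTable) {
        // featureTable[i][j] = value of feature j on intermediate model i
        return new PearsonsCorrelation().computeCorrelationMatrix(featureTable);
    }
}
```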
  57. Problem as classification • Classification problem • Input: 10-dimensional feature

    space (one dimension per feature) • Each intermediate model treated as an independent sample • Only models from the last 70% of the modelling session (to avoid almost-empty models) 16
  58. Problem as classification • Classification problem • Input: 10-dimensional feature

    space (one dimension per feature) • Each intermediate model treated as an independent sample • Only models from the last 70% of the modelling session (to avoid almost-empty models) • Output: likelihood of each class 16
  59. Problem as classification • Classification problem • Input: 10-dimensional feature

    space (one dimension per feature) • Each intermediate model treated as an independent sample • Only models from the last 70% of the modelling session (to avoid almost-empty models) • Output: likelihood of each class • We used a feed-forward neural network with one hidden layer of 50 neurons • Training with Weka's MultilayerPerceptron • Software based on Weka, available at github.com/DTU-SPE/ExpertisePredictor4BPMN 16
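
A rough sketch of how such a network can be configured with Weka; the linked repository is the authoritative implementation, and features.arff is a placeholder file name assumed to contain the ten feature columns plus a nominal class {novice, expert}:

```java
import weka.classifiers.functions.MultilayerPerceptron;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Sketch: train a feed-forward network with one hidden layer of 50 neurons on the
// extracted features. "features.arff" is a placeholder, not a file from the paper.
public class TrainExpertiseClassifier {

    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("features.arff");
        data.setClassIndex(data.numAttributes() - 1);    // last attribute = novice/expert

        MultilayerPerceptron mlp = new MultilayerPerceptron();
        mlp.setHiddenLayers("50");                       // one hidden layer with 50 neurons
        mlp.buildClassifier(data);

        // Class likelihoods for a single (intermediate) model sample.
        double[] likelihoods = mlp.distributionForInstance(data.firstInstance());
        System.out.printf("P(%s) = %.3f%n", data.classAttribute().value(0), likelihoods[0]);
    }
}
```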
  60. Classification performance • Tests on random datasets of (intermediate) BPMN

    models • Quality in terms of the F-score (F1): the harmonic mean of precision and recall • Results averaged over 10-fold cross-validation 17
  61. Classification performance • Tests on random datasets of (intermediate) BPMN

    models • Quality in terms of the F-score (F1): the harmonic mean of precision and recall • Results averaged over 10-fold cross-validation [Chart: F-score vs. dataset size (1000, 2000, 4000, 8000 random BPMN models) for mortgage-1 and pre-flight] 17
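
The cross-validated F-score can be obtained with Weka's Evaluation class, as in this sketch (same placeholder ARFF file as above):

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.functions.MultilayerPerceptron;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Sketch: 10-fold cross-validation of the classifier, reporting the weighted F-score.
public class EvaluateExpertiseClassifier {

    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("features.arff"); // placeholder file name
        data.setClassIndex(data.numAttributes() - 1);

        MultilayerPerceptron mlp = new MultilayerPerceptron();
        mlp.setHiddenLayers("50");

        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(mlp, data, 10, new Random(1));
        System.out.printf("Weighted F-score: %.3f%n", eval.weightedFMeasure());
    }
}
```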
  64. Time performance • Time required to compute each of the

    10 features • Standard Java implementation (Cheetah) on a standard laptop • Tests run with typical PC background load maintained (to simulate a modeller's working conditions) • Average time over 18k samples from the largest dataset (mortgage-1) 18
  65. Time performance • Time required to compute each of the

    10 features • Standard Java implementation (Cheetah) on a standard laptop • Tests run with typical PC background load maintained (to simulate a modeller's working conditions) • Average time over 18k samples from the largest dataset (mortgage-1) • Average computation time per feature (ms): F1 5.36, F2 5.31, F3 5.44, F4 0.03, F5 0.03, F6 0.03, F7 0.02, F8 1.19, F9 93.12, F10 0.01 18
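
The timing setup can be approximated as below; BpmnModel and the feature function are placeholders, and the numbers on the slide come from the authors' Cheetah-based implementation, not from this sketch.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Sketch: average wall-clock time of one feature computation over a collection of
// intermediate models. BpmnModel and the feature function are placeholders.
public class FeatureTiming {

    interface BpmnModel { }

    static double averageMillis(List<BpmnModel> samples, ToDoubleFunction<BpmnModel> feature) {
        long start = System.nanoTime();
        for (BpmnModel m : samples) {
            feature.applyAsDouble(m);                    // compute the feature, discard result
        }
        long elapsed = System.nanoTime() - start;
        return (elapsed / 1_000_000.0) / samples.size(); // milliseconds per intermediate model
    }
}
```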
  68. Conclusion and limitations • We presented an approach to classify

    modellers • Decision is based on objective measures • Decision according to artifacts being modelled • Fast computation, applicable to intermediate models • Identified requirements are all met 19
  69. Conclusion and limitations • We presented an approach to classify

    modellers • Decision is based on objective measures • Decision according to artifacts being modelled • Fast computation, applicable to intermediate models • Identified requirements are all met R1. Based on objective measures R2. Unobtrusive and no additional effort R3. “Online” and intermediate models R4. Independent of the modelling tool 19
  70. Conclusion and limitations • We presented an approach to classify

    modellers • Decision is based on objective measures • Decision according to artifacts being modelled • Fast computation, applicable to intermediate models • Identified requirements are all met • Classification results as F-score • On mortgage-1: 0.94 • On pre-flight: 0.88 (the process lacks complex behavioural structures) R1. Based on objective measures R2. Unobtrusive and no additional effort R3. “Online” and intermediate models R4. Independent of the modelling tool 19
  71. Conclusion and limitations • We presented an approach to classify

    modellers • Decision is based on objective measures • Decision according to artifacts being modelled • Fast computation, applicable to intermediate models • Identified requirements are all met • Classification results as F-score • On mortgage-1: 0.94 • On pre-flight: 0.88 (the process lacks complex behavioural structures) • Limitations • Currently only applicable to BPMN models • Big models (> 30 activities) might require more time to compute features • Same modelling task used for training and prediction R1. Based on objective measures R2. Unobtrusive and no additional effort R3. “Online” and intermediate models R4. Independent of the modelling tool 19
  72. Impact and future work • Potential impact on several aspects

    • For developers: design tools that adapt themselves to the user 20
  73. Impact and future work • Potential impact on several aspects

    • For developers: design tools that adapt themselves to the user • For educators: assess user capabilities and form groups 20
  74. Impact and future work • Potential impact on several aspects

    • For developers: design tools that adapt themselves to the user • For educators: assess user capabilities and form groups • For practitioners: recruitment, task allocation and team formation 20
  75. Impact and future work • Potential impact on several aspects

    • For developers: design tools that adapt themselves to the user • For educators: assess user capabilities and form groups • For practitioners: recruitment, task allocation and team formation • Future work includes • Generalizing the approach to predict for modelling tasks not used during training • Improving the prediction of whole sessions rather than individual models • Continuing the feature engineering process 20