data analysis
• Some fictional examples of game play experiments
  – Case 1: Repetition
  – Case 2: Randomization
  – Case 3: Local control (Blocking)
  – Case 4: Designing experiments efficiently
• Discussion on experimental study using games
• Summary
questions
• Research questions can be answered:
  – Deductively: theoretical research, e.g., in mathematics
  – Inductively: confirmation, quantification, etc. of hypotheses
  – Abductively: generation of hypotheses
    The constructive approach also falls into this class.
• The values of explanatory variables are set or changed to specified ones through intervention, and the resulting response is measured.
• Suitable for testing or quantifying a causal relationship.

Observational study
• Both explanatory and response variables are merely measured, sometimes with no prior distinction between them.
• A cohort or case-control approach is used for causal inference.
it is not easy to control the values of explanatory variables.
• A cohort or case-control approach may be used in such cases.
• Obtaining data in a balanced way makes the analysis easier.
• Stratification can be combined with exploratory analysis.
[Figure labels: Confirmatory, Exploratory]
RQ: Does factor A have an effect on response y?
Levels of factor A: A1, A2, …
Suppose that we let a participant play the game once with every level of factor A, and measure the response. Is it OK to answer YES when the response values are different?
Response needs to be treated as a random variable*. A difference between realized values may be attributed to chance. What is of concern is the difference between the distributions behind them.
(*By appeal to the central limit theorem, a normal distribution is often assumed.)
[Figure: response y with underlying distributions $N(\mu_1, \sigma^2)$ and $N(\mu_2, \sigma^2)$. Effect?]
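Factorial effect model for one-way design

$$y_{in} = \mu + a_i + e_{in}, \qquad i = 1, \dots, I, \quad n = 1, \dots, N$$

where $\mu$ is the general mean, $a_i$ is the main effect of A, and $e_{in}$ is the error.

Assumptions
– Sum of main effects is 0: $a_1 + a_2 + \cdots + a_I = 0$.
– Error terms independently follow a normal distribution with mean 0: $e_{in} \sim N(0, \sigma_e^2)$.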
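Expected value of the residual sum of squares $S_e$, where $\mathbf{y}_i = (y_{i1}, \dots, y_{iN})^T$, $\mathbf{e}_i = (e_{i1}, \dots, e_{iN})^T$, $\mathbf{1}$ is the $N$-vector of ones, and $\mathbf{I}_N$ is the $N \times N$ identity matrix:

$$
\begin{aligned}
E[S_e] &= E\left[\sum_{i=1}^{I}\sum_{n=1}^{N}\left(y_{in}-\bar{y}_{i\cdot}\right)^2\right] \\
&= E\left[\sum_{i=1}^{I}\left\|\left(\mathbf{I}_N-\tfrac{1}{N}\mathbf{1}\mathbf{1}^T\right)\mathbf{y}_i\right\|^2\right] \\
&= E\left[\sum_{i=1}^{I}\left\|\left(\mathbf{I}_N-\tfrac{1}{N}\mathbf{1}\mathbf{1}^T\right)\mathbf{e}_i\right\|^2\right] \\
&= E\left[\sum_{i=1}^{I}\mathbf{e}_i^T\left(\mathbf{I}_N-\tfrac{1}{N}\mathbf{1}\mathbf{1}^T\right)\mathbf{e}_i\right] \\
&= \sum_{i=1}^{I}\operatorname{tr}\left(\mathbf{I}_N-\tfrac{1}{N}\mathbf{1}\mathbf{1}^T\right)\sigma_e^2 \\
&= IN\left(1-\tfrac{1}{N}\right)\sigma_e^2 \\
&= I(N-1)\,\sigma_e^2
\end{aligned}
$$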
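Expected value of the sum of squares $S_A$ for factor A, where $\mathbf{e}$ is the stacked $IN$-vector of errors, $\mathbf{P}_A$ is the matrix that replaces each entry by the mean of its level, and $\mathbf{P}_0$ is the matrix that replaces each entry by the grand mean:

$$
\begin{aligned}
E[S_A] &= E\left[N\sum_{i=1}^{I}\left(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot}\right)^2\right] \\
&= E\left[N\sum_{i=1}^{I}\left(a_i+\bar{e}_{i\cdot}-\bar{e}_{\cdot\cdot}\right)^2\right] \\
&= N\sum_{i=1}^{I}a_i^2 + E\left[N\sum_{i=1}^{I}\left(\bar{e}_{i\cdot}-\bar{e}_{\cdot\cdot}\right)^2\right] \\
&= N\sum_{i=1}^{I}a_i^2 + E\left[\mathbf{e}^T\left(\mathbf{P}_A-\mathbf{P}_0\right)\mathbf{e}\right] \\
&= N\sum_{i=1}^{I}a_i^2 + \operatorname{tr}\left(\mathbf{P}_A-\mathbf{P}_0\right)\sigma_e^2 \\
&= N\sum_{i=1}^{I}a_i^2 + \left(\frac{IN}{N}-\frac{IN}{IN}\right)\sigma_e^2 \\
&= N\sum_{i=1}^{I}a_i^2 + (I-1)\,\sigma_e^2
\end{aligned}
$$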
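The two expected values, $E[S_e] = I(N-1)\sigma_e^2$ and $E[S_A] = N\sum a_i^2 + (I-1)\sigma_e^2$, can be checked numerically. The following is a minimal Monte Carlo sketch under the one-way model with normal errors; the effect sizes, the general mean, the error standard deviation, and all function names are made up for illustration.

```python
import random
import statistics


def sums_of_squares(data):
    """Return (S_A, S_e) for a balanced one-way layout given as a list of groups."""
    n_rep = len(data[0])
    group_means = [statistics.fmean(g) for g in data]
    # The mean of group means equals the grand mean because the design is balanced.
    grand_mean = statistics.fmean(group_means)
    s_a = n_rep * sum((m - grand_mean) ** 2 for m in group_means)
    s_e = sum((y - m) ** 2 for g, m in zip(data, group_means) for y in g)
    return s_a, s_e


def simulate_expected_ss(effects, n_rep, sigma_e, mu=10.0, n_sim=20000, seed=0):
    """Average S_A and S_e over many simulated experiments."""
    rng = random.Random(seed)
    total_a = total_e = 0.0
    for _ in range(n_sim):
        data = [[mu + a + rng.gauss(0.0, sigma_e) for _ in range(n_rep)]
                for a in effects]
        s_a, s_e = sums_of_squares(data)
        total_a += s_a
        total_e += s_e
    return total_a / n_sim, total_e / n_sim


effects = [-1.0, 0.0, 1.0]  # hypothetical main effects, chosen to sum to zero
n_rep, sigma_e = 4, 2.0     # hypothetical N and error standard deviation
n_levels = len(effects)

mean_s_a, mean_s_e = simulate_expected_ss(effects, n_rep, sigma_e)
theory_s_a = n_rep * sum(a * a for a in effects) + (n_levels - 1) * sigma_e ** 2
theory_s_e = n_levels * (n_rep - 1) * sigma_e ** 2

print(f"S_A: simulated {mean_s_a:.2f}, theory {theory_s_a:.2f}")
print(f"S_e: simulated {mean_s_e:.2f}, theory {theory_s_e:.2f}")
```

The simulated averages should land close to the theoretical values, with the gap shrinking as `n_sim` grows.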
| Source | Sum of sq. | DF | Mean sq. | F value |
|---|---|---|---|---|
| A | $S_A$ | $f_A = I-1$ | $V_A = S_A/(I-1)$ | $F_A = V_A/V_e$ |
| Residual | $S_e$ | $f_e = I(N-1)$ | $V_e = S_e/(I(N-1))$ | ― |
| Total | $S_T$ | $IN-1$ | ― | ― |

Null hypothesis: Factor A has no effect on y ($a_1 = a_2 = \cdots = a_I = 0$).

[Figure: distribution of $F_A$ when the null hypothesis is true]
A likely value under the hypothesis => hold it
An unlikely value under the hypothesis => reject it
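As an illustration of how the entries of the ANOVA table are computed, here is a minimal sketch for a balanced one-way design. The scores are fictional, and the function name `one_way_anova` and the dictionary keys are choices made for this example only.

```python
import statistics


def one_way_anova(groups):
    """Balanced one-way ANOVA; `groups` maps level name -> list of N responses."""
    data = list(groups.values())
    n_levels = len(data)
    n_rep = len(data[0])
    if any(len(g) != n_rep for g in data):
        raise ValueError("this sketch only handles balanced designs")

    level_means = [statistics.fmean(g) for g in data]
    grand_mean = statistics.fmean([y for g in data for y in g])

    s_a = n_rep * sum((m - grand_mean) ** 2 for m in level_means)
    s_e = sum((y - m) ** 2 for g, m in zip(data, level_means) for y in g)
    f_a = n_levels - 1            # degrees of freedom for A
    f_e = n_levels * (n_rep - 1)  # degrees of freedom for the residual
    v_a = s_a / f_a
    v_e = s_e / f_e
    return {"S_A": s_a, "S_e": s_e, "S_T": s_a + s_e,
            "f_A": f_a, "f_e": f_e, "V_A": v_a, "V_e": v_e, "F_A": v_a / v_e}


# Fictional game scores: three levels of factor A, four plays each.
scores = {
    "A1": [12.0, 15.0, 11.0, 14.0],
    "A2": [16.0, 18.0, 15.0, 19.0],
    "A3": [13.0, 12.0, 16.0, 15.0],
}
table = one_way_anova(scores)
for name, value in table.items():
    print(f"{name} = {value:.4g}")
```

For these made-up scores the level means are 13, 17, and 14, which gives $S_e = 30$ and $F_A = 5.2$ on (2, 9) degrees of freedom.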
• The typical hypothesis is that the factor has no effect on the response.
Test statistic and its distribution under the null hypothesis
• In ANOVA, the statistic is the F value, which is known to follow an F distribution.
Realized value of the test statistic and its p-value
• The smaller the p-value, the more unlikely the realized value is under the null hypothesis.
Hold or reject the null hypothesis
• Determined according to the p-value, but there is no universal threshold.
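The steps above can be sketched in code. In practice the p-value is read from the tail of the F distribution using a statistics library; to stay dependency-free, this sketch swaps in a permutation test, which is a different technique that approximates the null distribution of the F value by reshuffling the responses across levels. The data and function names are fictional.

```python
import random
import statistics


def f_statistic(groups):
    """F_A = V_A / V_e for a balanced one-way layout (list of equal-sized groups).

    Assumes the residual sum of squares is non-zero.
    """
    n_levels, n_rep = len(groups), len(groups[0])
    means = [statistics.fmean(g) for g in groups]
    grand = statistics.fmean(means)  # valid because the design is balanced
    s_a = n_rep * sum((m - grand) ** 2 for m in means)
    s_e = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (s_a / (n_levels - 1)) / (s_e / (n_levels * (n_rep - 1)))


def permutation_p_value(groups, n_perm=5000, seed=1):
    """Approximate p-value: share of random relabelings with F >= observed F."""
    rng = random.Random(seed)
    n_rep = len(groups[0])
    observed = f_statistic(groups)
    pooled = [y for g in groups for y in g]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled = [pooled[k:k + n_rep] for k in range(0, len(pooled), n_rep)]
        if f_statistic(shuffled) >= observed - 1e-12:
            hits += 1
    # Adding 1 to both counts keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Fictional game scores: three levels of factor A, four plays each.
scores = [
    [12.0, 15.0, 11.0, 14.0],
    [16.0, 18.0, 15.0, 19.0],
    [13.0, 12.0, 16.0, 15.0],
]
p_value = permutation_p_value(scores)
print(f"F_A = {f_statistic(scores):.3f}, permutation p-value = {p_value:.4f}")
```

Whether the resulting p-value leads to holding or rejecting the null hypothesis depends on a threshold that should be fixed before the games are played.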
the results are known.
• This is likely to lead to an overconfident claim, and should be avoided.
Dos and don’ts
• Do not turn an exploratory study into a confirmatory one midway, especially after data are obtained.
• In a confirmatory study, clearly specify the hypotheses and the analysis methods for them before playing games.
RQ: Does factor A have an effect on response y?
Levels of factor A: A1, A2, …
Suppose that the game is played several times at each level of A by the same player (or team of players), but different players play at different levels, so that the player is a confounder. Is it an adequate experimental design?
The potential effect of factor A cannot be tested or evaluated separately from that of the player. This phenomenon is called full confounding. One tool to resolve this is randomization, which converts the confounder’s effect from systematic error into random error.
All experimental runs are conducted in a random order.
[Figure: response y for Player 1 and Player 2]
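A randomized plan of this kind can be generated with a few lines of code. The sketch below makes two assumptions that are not in the slides: players are drawn from a balanced pool, and they are assigned to runs at random so that no level is tied to one player. The function name and arguments are hypothetical.

```python
import random
from collections import Counter


def randomized_runs(levels, players, n_rep, seed=None):
    """Return a randomly ordered list of (level, player) runs.

    Assumption: players come from a balanced pool and are assigned to runs
    at random, so that no level is tied to one particular player.
    """
    rng = random.Random(seed)
    runs = [level for level in levels for _ in range(n_rep)]
    rng.shuffle(runs)  # random run order
    # Balanced pool of players, long enough to cover every run.
    pool = [players[k % len(players)] for k in range(len(runs))]
    rng.shuffle(pool)  # random assignment of players to runs
    return list(zip(runs, pool))


plan = randomized_runs(["A1", "A2"], ["Player 1", "Player 2"], n_rep=4, seed=42)
level_counts = Counter(level for level, _ in plan)
player_counts = Counter(player for _, player in plan)
for order, (level, player) in enumerate(plan, start=1):
    print(order, level, player)
```

Fixing the seed makes the plan reproducible, which helps when the design has to be documented before the games are played.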
RQ: Does factor A have an effect on response y?
Levels of factor A: A1, A2, …
Suppose that the game is played several times at each level of A by the same player (or team of players), but different players play at different levels, so that the player is a confounder. Is it an adequate experimental design?
Another tool to deal with a confounding variable is local control, or blocking, which treats the confounding variable as a factor and incorporates it into the experimental design. This makes it possible to separate the effect of the factor of interest from that of the confounder.
Experimental runs of each player are conducted in a random order.
[Figure: response y at levels A1 and A2 for Player 1, Player 2, and Player 3]
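Model for randomized block design

$$y_{in} = \mu + a_i + r_n + e_{in}, \qquad i = 1, \dots, I, \quad n = 1, \dots, N$$

where $\mu$ is the general mean, $a_i$ is factor A’s main effect, $r_n$ is block R’s main effect, and $e_{in}$ is the error.

Assumptions
– Main effects satisfy: $\sum_{i=1}^{I} a_i = 0$, $\; r_n \sim N(0, \sigma_R^2)$
– Interaction between block and other factors can be ignored.
– Error terms follow: $e_{in} \sim N(0, \sigma_e^2)$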
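| Source | Sum of sq. | DF | Mean sq. | F value |
|---|---|---|---|---|
| A | $S_A$ | $f_A = I-1$ | $V_A = S_A/f_A$ | $F_A = V_A/V_e$ |
| R | $S_R$ | $f_R = N-1$ | $V_R = S_R/f_R$ | $F_R = V_R/V_e$ |
| Residual | $S_e$ | $f_e = (I-1)(N-1)$ | $V_e = S_e/f_e$ | ― |
| Total | $S_T$ | $IN-1$ | ― | ― |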
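The table entries for a randomized block design can be computed as in the following minimal sketch, which assumes one observation per combination of level and block. The scores are fictional and the function name is hypothetical.

```python
import statistics


def randomized_block_anova(data):
    """ANOVA for a randomized block design.

    `data[i][n]` is the response at level i of factor A in block n
    (one observation per cell).
    """
    n_levels, n_blocks = len(data), len(data[0])
    grand = statistics.fmean([y for row in data for y in row])
    level_means = [statistics.fmean(row) for row in data]
    block_means = [statistics.fmean([row[n] for row in data])
                   for n in range(n_blocks)]

    s_a = n_blocks * sum((m - grand) ** 2 for m in level_means)
    s_r = n_levels * sum((m - grand) ** 2 for m in block_means)
    # Residual: what is left after removing level and block means.
    s_e = sum((data[i][n] - level_means[i] - block_means[n] + grand) ** 2
              for i in range(n_levels) for n in range(n_blocks))

    f_a, f_r = n_levels - 1, n_blocks - 1
    f_e = f_a * f_r
    v_e = s_e / f_e
    return {"S_A": s_a, "S_R": s_r, "S_e": s_e,
            "f_A": f_a, "f_R": f_r, "f_e": f_e,
            "F_A": (s_a / f_a) / v_e, "F_R": (s_r / f_r) / v_e}


# Fictional scores: rows are levels A1..A3, columns are players (blocks) 1..3.
scores = [
    [10.0, 14.0, 18.0],
    [13.0, 16.0, 22.0],
    [10.0, 15.0, 17.0],
]
result = randomized_block_anova(scores)
for name, value in result.items():
    print(f"{name} = {value:.4g}")
```

With these made-up numbers, $S_A = 18$, $S_R = 96$, and $S_e = 4$, so $F_A = 9$. If the player differences were left in the residual instead, the residual sum of squares would be 100 on 6 degrees of freedom and $F_A$ would drop to 0.54.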
• Randomization is simple and easy, but enlarges the error variance and thus degrades the power of the statistical test.
• Blocking gives you a more detailed result, but tends to require more experimental runs.
• Blocking is only applicable to known confounders, and they need to be controllable or at least observable.
• Blocking is restricted by the number of units available to each block.
• Even when blocking is used, randomization within each block is recommended.

| | Day 1 | Day 2 | Day 3 |
|---|---|---|---|
| Full randomization | A2, A3, A3 | A1, A1, A2 | A1, A3, A2 |
| Randomized block | A3, A1, A2 | A1, A3, A2 | A1, A2, A3 |
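The two kinds of schedule in the table can be generated as follows. This is a minimal sketch in which each day is treated as a block with as many runs as there are levels; the function names are hypothetical.

```python
import random
from collections import Counter


def full_randomization(levels, n_days, rng):
    """Shuffle all runs together, then cut the sequence into days."""
    runs = list(levels) * n_days
    rng.shuffle(runs)
    k = len(levels)
    return [runs[d * k:(d + 1) * k] for d in range(n_days)]


def randomized_block(levels, n_days, rng):
    """Every day (block) contains each level once, in a random order."""
    schedule = []
    for _ in range(n_days):
        day = list(levels)
        rng.shuffle(day)  # randomization within the block
        schedule.append(day)
    return schedule


rng = random.Random(7)
levels = ["A1", "A2", "A3"]
full = full_randomization(levels, n_days=3, rng=rng)
blocked = randomized_block(levels, n_days=3, rng=rng)

for day, (f_runs, b_runs) in enumerate(zip(full, blocked), start=1):
    print(f"Day {day}: full = {f_runs}, blocked = {b_runs}")
```

Under full randomization a level may appear twice or not at all on a given day, whereas the blocked schedule guarantees one run of every level per day.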
design
• The most basic experimental design when two or more factors are taken up.
• Multiple experimental runs are conducted in every possible combination of the levels of the factors.
• Total number of runs: $(\text{number of repetitions}) \times \prod_{\text{every factor}} (\text{number of levels})$ => Combinatorial explosion
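The run count can be made concrete by enumerating a factorial design. In this minimal sketch the factor names and levels are hypothetical and only serve to show how quickly the number of runs grows.

```python
import itertools
import math


def full_factorial(factors, n_rep=1):
    """List every combination of factor levels, each repeated n_rep times.

    `factors` maps factor name -> list of levels.
    """
    names = list(factors)
    combos = itertools.product(*(factors[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos for _ in range(n_rep)]


# Hypothetical factors of a game play experiment.
factors = {
    "difficulty": ["easy", "normal", "hard"],
    "team_size": [2, 4],
    "time_limit": ["short", "long"],
}
runs = full_factorial(factors, n_rep=2)
expected_runs = 2 * math.prod(len(levels) for levels in factors.values())

print(len(runs), "runs in total")
print(runs[0])
```

Three small factors with two repetitions already need 24 runs; adding one more three-level factor would triple that number.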
of necessary experimental runs is drastically reduced by ignoring higher-order interactions.
• Only a fraction of the factorial design needs to be carried out.
• How to construct such a design is a bit mathematical.
• Orthogonal arrays are a useful tool for constructing the design.
• Computer support (e.g., R packages) is also available.
Fractional factorial design & orthogonal arrays
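To give a flavor of the construction, the sketch below builds a two-level orthogonal array with 8 runs and 7 columns from three basic columns and their products (computed as parities). It is one possible construction, written for illustration; the column order need not match published L8 tables, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def l8_orthogonal_array():
    """Build an 8-run, 7-column, two-level orthogonal array.

    Row r is read as three bits; column `mask` holds the parity of the
    bits of r selected by `mask` (mask = 1..7).
    """
    return [[bin(r & mask).count("1") % 2 for mask in range(1, 8)]
            for r in range(8)]


def is_orthogonal(array):
    """Check that every pair of columns shows each level pair equally often."""
    n_cols = len(array[0])
    for c1, c2 in combinations(range(n_cols), 2):
        counts = Counter((row[c1], row[c2]) for row in array)
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True


array = l8_orthogonal_array()
for row in array:
    print(" ".join(str(v) for v in row))
print("orthogonal:", is_orthogonal(array))
```

Assigning up to seven two-level factors to the columns gives a design with 8 runs instead of the $2^7 = 128$ runs of the full factorial, on the assumption that interactions can be ignored.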
by using games applicable to reality?
• What to measure and how?
• Are statistical methods suitable/effective for analyzing game data?
  – Games often seem to involve chaotic (irreproducible) behaviors.
  – Game data tend to have large variances.
  – There may be too many outliers.
  – The distribution of the data does not seem to be stable.
Reluctantly yes. Since no perfect alternative exists, we have little choice but to rely on them. We should use them with care.
specify the characteristics of the study and design experiments appropriately before playing games.
• We should design experiments according to Fisher’s principles:
  1. Repetition makes it possible to evaluate the magnitude of the error variance.
  2. Randomization converts the potential effects of confounders from systematic error into random error.
  3. Local control (blocking) makes it possible to separate the effects of the factors of interest from those of confounders.
• Design of experiments (DOE) techniques will help enhance the efficiency of experiments.
and estimation
• Fractional factorial designs, and orthogonal arrays
• Linear regression, and response surface methodology
• Nonparametric statistical methods
• Multivariate data analysis
• Data visualization, and exploratory analysis
• Machine learning (supervised and unsupervised learning), etc.