run 50 times for each test function. Each run stops either when the error obtained is less than $10^{-8}$, or when the maximum number of evaluations (100,000) is reached.
not significantly different from each other.
The test statistic (computed from the difference between the two sets of results) follows a t-distribution.
t-distribution: the distribution of the location of the true mean relative to the sample mean, divided by the sample standard deviation (here, of the differences).
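A minimal sketch of the statistic described on this slide, assuming the comparison is a paired t-test on the per-problem differences between two algorithms (the slide does not say so explicitly). The function name `paired_t_statistic` and its arguments are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(results_a, results_b):
    """Return (t, degrees_of_freedom) for paired samples.

    Assumption: results_a[i] and results_b[i] were obtained on the same
    problem i, so the test works on the differences.
    """
    if len(results_a) != len(results_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(results_a, results_b)]
    n = len(diffs)
    # Sample mean of the differences divided by its standard error.
    # Raises ZeroDivisionError if all differences are identical.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


t, df = paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

The returned value would then be compared against a t-distribution with `df` degrees of freedom to decide whether the two result sets differ significantly.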
others
H0: $P(x_{i,j} - x_{i,1} \ge 0) = P(x_{i,j} - x_{i,1} \le 0) = \tfrac{1}{2}$
$r_j \le R_j$ rejects H0
$r_j$: number of differences $x_{i,j} - x_{i,1}$ that have the less frequently occurring sign
$R_j$: critical value from the table for the multiple comparison sign test
Example: $k = 9$ and $n = 25$ ⇒ $R_j = 5$
SaDE outperforms PSO and CHC. Only this result!
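The counting step of the multiple sign test can be sketched as follows. This is an illustration under stated assumptions: column 0 holds the control algorithm, zero differences are ignored, and the critical value $R_j$ is supplied by the caller from the published table (it is not computed here). The function name `multiple_sign_test` is hypothetical.

```python
def multiple_sign_test(results, critical_value):
    """results[i][j]: result of algorithm j on problem i; column 0 is the control.

    Returns {j: (r_j, rejects_h0)} for every non-control algorithm j.
    """
    k = len(results[0])
    decisions = {}
    for j in range(1, k):
        diffs = [row[j] - row[0] for row in results]
        plus = sum(1 for d in diffs if d > 0)
        minus = sum(1 for d in diffs if d < 0)
        # r_j counts the less frequently occurring sign (ties d == 0 are skipped).
        r_j = min(plus, minus)
        decisions[j] = (r_j, r_j <= critical_value)
    return decisions


# Toy data: 4 problems, control plus 2 algorithms.
# The critical value 0 is made up for the example, not taken from the table.
toy = [[1.0, 2.0, 0.5], [1.0, 3.0, 0.5], [1.0, 4.0, 2.0], [1.0, 5.0, 0.1]]
outcome = multiple_sign_test(toy, critical_value=0)
```

For the slide's example ($k = 9$, $n = 25$) the caller would pass `critical_value=5`.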
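problem sets
H0: the medians of the algorithms are equal
$$T = \frac{(k-1)\left[\sum_{j=1}^{k} \hat{R}_{\cdot j}^{2} - \frac{k n^{2}}{4}(kn+1)^{2}\right]}{\frac{kn(kn+1)(2kn+1)}{6} - \frac{1}{k}\sum_{i=1}^{n} \hat{R}_{i\cdot}^{2}} \sim \chi^{2}_{k-1}$$
where $\hat{R}_{\cdot j}$ is the aligned-rank total of algorithm $j$ and $\hat{R}_{i\cdot}$ the aligned-rank total of problem $i$
A value of $T$ beyond the $\chi^{2}_{k-1}$ critical value rejects H0
Aligned observation: cell value minus column mean (value of location)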
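A minimal sketch of the Friedman aligned ranks statistic, written with the standard library only. Assumptions: rows of `results` are problems and columns are algorithms, so the value of location is the mean over the algorithms on one problem (the "column mean" if problems are laid out as columns); ties receive average ranks. The names `average_ranks` and `friedman_aligned_ranks` are hypothetical helpers.

```python
def average_ranks(values):
    """Rank ascending (1 = smallest); tied values share their mean rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def friedman_aligned_ranks(results):
    """results[i][j]: result of algorithm j on problem i. Returns T."""
    n, k = len(results), len(results[0])
    aligned = []
    for row in results:
        location = sum(row) / k  # value of location for this problem
        aligned.extend(x - location for x in row)
    # Rank all k*n aligned observations together.
    ranks = average_ranks(aligned)
    grid = [ranks[i * k:(i + 1) * k] for i in range(n)]
    problem_totals = [sum(r) for r in grid]
    algorithm_totals = [sum(grid[i][j] for i in range(n)) for j in range(k)]
    numerator = (k - 1) * (
        sum(c * c for c in algorithm_totals) - (k * n * n / 4) * (k * n + 1) ** 2
    )
    denominator = (
        k * n * (k * n + 1) * (2 * k * n + 1) / 6
        - sum(r * r for r in problem_totals) / k
    )
    return numerator / denominator


T = friedman_aligned_ranks([[1.0, 2.0], [3.0, 5.0]])
```

The result would be compared against a chi-square distribution with $k-1$ degrees of freedom; that lookup is left out to keep the sketch dependency-free.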
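equal in terms of importance
Problem rank $Q_i$: based on the range, i.e. the difference between the largest and the smallest observations within that problem. Rank 1 goes to the problem with the minimum range.
Problem weight: $W_{i,j} = Q_i\, r_{i,j}$, $S_{i,j} = Q_i \left[ r_{i,j} - \frac{k+1}{2} \right]$
$T = \frac{(n-1)B}{A - B}$, with $A = \sum_{i=1}^{n}\sum_{j=1}^{k} S_{i,j}^{2}$, $B = \frac{1}{n}\sum_{j=1}^{k} S_{j}^{2}$ and $S_j = \sum_{i=1}^{n} S_{i,j}$
Under H0: $T \sim F_{k-1,\,(k-1)(n-1)}$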
false hypotheses are possible. Example of an impossible combination: M1 better than M2, M1 same as M3, M2 same as M3
Adjustments
Shaffer's static: $APV_i = \min\{v, 1\}$; $v = \max\{t_j\, p_j : 1 \le j \le i\}$
$t_i$: maximum number of hypotheses which can be true given that $i-1$ hypotheses are false
Bergmann-Hommel: $APV_i = \min\{v, 1\}$; $v = \max\{|I| \cdot \min\{p_j,\ j \in I\} : I \text{ exhaustive},\ i \in I\}$
Finding all elementary hypotheses which can't be rejected
Exhaustive set: a set of hypotheses that could all be true at the same time
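A sketch of Shaffer's static adjustment for all pairwise comparisons of `k` algorithms. It assumes the usual recursive construction of the set $S(k)$ of possible numbers of simultaneously true hypotheses, and takes $t_i$ as the largest element of $S(k)$ not exceeding $m - i + 1$, with $m = k(k-1)/2$. The function names are hypothetical; the Bergmann-Hommel procedure needs an enumeration of exhaustive sets and is not covered here.

```python
from math import comb


def possible_true_counts(k):
    """S(k): numbers of pairwise hypotheses that can be true at the same time."""
    sets = {0: {0}, 1: {0}}
    for size in range(2, k + 1):
        sets[size] = {
            comb(j, 2) + x for j in range(1, size + 1) for x in sets[size - j]
        }
    return sets[k]


def shaffer_static_apv(p_values, k):
    """Adjusted p-values, returned in the same order as p_values."""
    m = k * (k - 1) // 2
    if len(p_values) != m:
        raise ValueError("expected one p-value per pair of algorithms")
    s_k = possible_true_counts(k)
    order = sorted(range(m), key=lambda idx: p_values[idx])
    adjusted = [0.0] * m
    running_max = 0.0
    for i, idx in enumerate(order, start=1):
        # t_i: most hypotheses that can still be true once i - 1 are false.
        t_i = max(s for s in s_k if s <= m - i + 1)
        running_max = max(running_max, t_i * p_values[idx])
        adjusted[idx] = min(running_max, 1.0)
    return adjusted


apv = shaffer_static_apv([0.5, 0.01, 0.02], k=3)
```

For three algorithms the multipliers are 3, 1, 1, which reflects the logical relation on the slide: once one pairwise equality is rejected, at most one of the remaining two can still hold.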
How many comparisons are you looking for? Pairwise comparison; Multiple comparison; Do you care about the level of significance? Sign test; Rank test; Problem difficulty; Quade test; Taking into account relative algorithm comparisons; Post-hoc adjustments
Herrera, “A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms,” Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 3–18, Mar. 2011.