Kendall's Clustering Analysis

almo
June 17, 2021

Presentation of Kendall's Clustering Analysis, a categorization of a population of software engineers in the software quality domain, based on the Kendall coefficient (consensus).

Delphi Analysis is a structured communication technique, originally developed as a systematic, interactive forecasting method that relies on a panel of experts.

Delphi Analysis was used to evaluate how well the attributes of an empirical quality model cover ISO 25000. Kendall's Clustering Analysis evaluates the distribution of agreement for smaller clusters of experts, connecting it with the taxonomy of the research.

More information
Modelling Web Component Quality Using Delphi Study https://doi.org/10.1016/j.csi.2021.103547

Transcript

  1. Kendall's Clustering Analysis. Software Engineers Associations: The Kendall Coefficient Revisited

    Andrés-Leonardo Martínez-Ortiz, Ph.D. Zürich (Switzerland), 2021
  2. Table of Contents

    Kendall's Clusters Analysis. Software Engineers Associations: The Kendall Coefficient Revisited
    1. Research context
       a. Mixed Research
       b. Delphi Method
    2. Kendall’s Clustering Analysis
       a. Kendall’s Coefficient
       b. Kendall’s Clustering Analysis
    3. Software Engineers Association
       a. Consensus Distributions
       b. Dynamic Analysis
       c. Clusters Analysis
    4. Conclusion & Future Research
  3. Mixed Methods Research

    Research Area: Analysis and Characterization of Quality Models of Open Source Web Components:
    • Methodologically, could quality in open source repositories be analyzed?
    • In open repositories, do Web Components have a quality model?
    • Does this model cover ISO 25010 quality characteristics?
    • Are these characteristics automatically evaluable?
    [Diagram: Mixed Methods Research, Exploratory Sequential Design. Stages shown: Qualitative Analysis, Quantitative Analysis, Instrument Development]
  4. Delphi Analysis I

    20 experts, encoded as EYYX:
    - YY is the sequence number [00..19]
    - X is the expertise level* [1..5]
    Categorization according to the following criteria:
    - Expertise
    - Years of Experience
    - Education
    - Sector
    - Open Source Developer
    - QA Role in the team
    - Quality Static Analysis
    - Quality Dynamic Analysis
    - Web Component Technology
    - Use of Static Analysis (Linters)
    * Self-assigned expertise level according to the following criteria:
    1) Don’t know Web Components
    2) Know Web Components but don’t use them
    3) Know Web Components and use them
    4) Know Web Components and develop them
    5) Develop libraries of Web Components
  5. Delphi Analysis II

    Sample: 20 experts
    Statistics:
    • Cronbach's alpha
    • Kendall's coefficient of concordance
      ◦ Inter
      ◦ Intra
    • Simple Correspondence Analysis
    Final round results:
    • Cronbach's alpha: 0.861
    • Kendall's coefficient of concordance
      ◦ Inter: 0.987
      ◦ Intra: 0.564
    Expert judgement using the Delphi Method determines the coverage between the experimental quality model and ISO 25010.
    Moderate strong consensus among experts on the final result.
    Are there smaller clusters of experts with higher agreement?
    [Slide annotation: Stable]
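As a rough illustration of the first statistic on this slide, the sketch below computes Cronbach's alpha with the standard textbook formula. The function name `cronbach_alpha`, the toy scores, and the layout (rows are respondents, columns are items) are assumptions made for the example; the deck does not say which dimension of the Delphi answers was treated as the items, and the 0.861 result is not reproduced here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table with one row per respondent and one column per item.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance(col) for col in zip(*rows)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy scores (3 respondents, 2 items), not data from the study.
toy_scores = [[2, 3], [4, 4], [6, 5]]
toy_alpha = cronbach_alpha(toy_scores)
```

For the toy table the item variances are 4 and 1 and the variance of the totals is 9, so alpha is 2 * (1 - 5/9) = 8/9.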
  6. Kendall’s Coefficient

    Application to Delphi Method, the inter round consensus analysis:
    • p experts (20)
    • n questions (45)
    • T ties correction factor
    Final round results:
    • Cronbach's alpha: 0.861
    • Kendall's coefficient of concordance
      ◦ Inter: 0.987
      ◦ Intra: 0.564
    [Diagram: for Delphi Analysis Round n-1 and Round n, a matrix of answers QD for experts E00x..E19x (p) and questions 01..45 (n). Each round is reduced to an Average Expert (AE) whose answer to question j is 1/20 ∑ E00..E19 QD j.]
    Inter Round Kendall Analysis: if Kendall(R n-1 AE, R n AE) ~ 1.0 then STOP Delphi Analysis
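The slide names p, n and T but the transcript does not carry the formula itself. For reference, the standard textbook form of Kendall's coefficient of concordance with a ties correction is sketched below, assuming it is the one the deck relies on. The symbols r_{ij}, R_j, S and t_{ig} are introduced here for the example and do not appear in the slides.

```latex
% r_{ij}: rank that expert i gives to question j (tied answers share the average rank)
% t_{ig}: size of the g-th group of tied answers of expert i
\[
  R_j = \sum_{i=1}^{p} r_{ij}, \qquad
  \bar{R} = \frac{p\,(n+1)}{2}, \qquad
  S = \sum_{j=1}^{n} \bigl(R_j - \bar{R}\bigr)^2
\]
\[
  T = \sum_{i=1}^{p} \sum_{g} \bigl(t_{ig}^{3} - t_{ig}\bigr), \qquad
  W = \frac{12\,S}{p^{2}\,(n^{3} - n) - p\,T}
\]
```

With no ties T = 0 and W reduces to 12 S / (p^2 (n^3 - n)); W = 1 means all experts rank the questions identically.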
  7. Kendall’s Clustering Analysis I

    Definitions
    #1 K_i is the i-distribution of the Kendall Coefficient, i.e. the Kendall Coefficient of all the clusters of i experts.
    #2 For a Delphi Analysis with n experts, the Kendall Clustering Analysis is the combinatorial analysis of the distributions of the Kendall Coefficient for different sizes of clusters of experts, i.e. K_i for i=2 to n-1.
    Intra Round Kendall Analysis (Delphi Analysis Final Round, experts E00x..E19x (p), questions 01..45 (n)):
    Kendall(E00x, E01x, ..., E19x) = 0.564 = K_20
    Kendall Clustering Analysis: K_2, K_3, ..., K_19
    For each K_i → Max(K_i), Min(K_i), E(K_i) and Std(K_i)
    [Scale: 0.0 to 0.4 Weak Consensus; 0.4 to 0.7 Moderate Consensus, split at 0.55 into Moderate Weak and Moderate Strong; 0.7 to 1.0 Strong Consensus]
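The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the author's repository: the function names, the toy panel, the use of the population standard deviation for Std(K_i), and the tie-corrected textbook form W = 12 S / (p^2 (n^3 - n) - p T) are all assumptions of the example.

```python
from itertools import combinations
from statistics import mean, pstdev


def average_ranks(scores):
    """Rank one expert's answers (1-based); tied answers share the average rank.

    Returns the ranks and this expert's tie term, the sum of t^3 - t over tie groups.
    """
    order = sorted(range(len(scores)), key=lambda j: scores[j])
    ranks = [0.0] * len(scores)
    tie_term = 0
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        size = end - start + 1
        tie_term += size ** 3 - size
        start = end + 1
    return ranks, tie_term


def kendall_w(answers):
    """Kendall's W for p experts (rows), each answering the same n questions."""
    p, n = len(answers), len(answers[0])
    ranked = [average_ranks(row) for row in answers]
    rank_sums = [sum(ranks[j] for ranks, _ in ranked) for j in range(n)]
    mean_sum = p * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    ties = sum(tie_term for _, tie_term in ranked)
    return 12 * s / (p ** 2 * (n ** 3 - n) - p * ties)


def kendall_distribution(panel, size):
    """K_size: Kendall's W of every cluster of `size` experts, keyed by expert ids."""
    return {
        cluster: kendall_w([panel[expert] for expert in cluster])
        for cluster in combinations(sorted(panel), size)
    }


def summarize(distribution):
    """Max, Min, E and Std of one K_i (Std taken as population std, an assumption)."""
    values = list(distribution.values())
    best = max(distribution, key=distribution.get)
    worst = min(distribution, key=distribution.get)
    return {
        "max": (best, distribution[best]),
        "min": (worst, distribution[worst]),
        "mean": mean(values),
        "std": pstdev(values),
    }


# Hypothetical toy panel: 4 experts, 5 questions. Not data from the study.
toy_panel = {
    "E00": [1, 2, 3, 4, 5],
    "E01": [1, 2, 3, 4, 5],
    "E02": [5, 4, 3, 2, 1],
    "E03": [1, 3, 2, 4, 5],
}
k2 = kendall_distribution(toy_panel, 2)
k2_summary = summarize(k2)
```

In the toy panel E00 and E01 answer identically, so their pair is the maximum of K_2 with W = 1, while E00 and E02 give opposite rankings and form the minimum with W = 0. Running the same functions for every size from 2 to n-1 yields the full clustering analysis, at a cost that grows with the number of combinations.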
  8. Kendall’s Clustering Analysis II

    Example: K_4 (4845 clusters)
    Max(K_4) = Kendall(E07x, E10x, E13x, E17x) = 0.83: maximum level of the Kendall coefficient for a 4-expert cluster, i.e. maximum agreement
    Min(K_4) = Kendall(E03x, E05x, E08x, E19x) = 0.43: minimum level of the Kendall coefficient for a 4-expert cluster, i.e. minimum agreement
    E(K_4) = 0.77
    Std(K_4) = 0.07
    Shapiro-Wilk W = 0.98865, p-value < 2.2e-16
    [Figure: K_4 distribution on the consensus scale (0.0 to 0.4 Weak Consensus; 0.4 to 0.7 Moderate Consensus, split at 0.55 into Moderate Weak and Moderate Strong; 0.7 to 1.0 Strong Consensus), with Max(K_4) = 0.83 and Min(K_4) = 0.43 marked]
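The cluster counts quoted on the slides are binomial coefficients of the panel size. The short sketch below checks them and totals the work for a full analysis over K_2 to K_19; the names `PANEL_SIZE`, `cluster_count`, `cluster_counts` and `total_clusters` are made up for the example.

```python
from math import comb

PANEL_SIZE = 20  # experts in the Delphi panel, as stated on the slides


def cluster_count(size, panel=PANEL_SIZE):
    """Number of distinct clusters of `size` experts drawn from the panel."""
    return comb(panel, size)


# One entry per distribution K_2 .. K_19.
cluster_counts = {size: cluster_count(size) for size in range(2, PANEL_SIZE)}
total_clusters = sum(cluster_counts.values())
```

Here `cluster_counts[4]` is 4845, matching the K_4 example, and `total_clusters` is 1048554 (2^20 minus the 22 trivial subsets of size 0, 1 and 20). Every extra expert roughly doubles that total, which is the scaling concern for exhaustive enumeration.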
  9. Software Engineers Association

    Clustering analysis I
    • K_2 - K_10
    • Maximum
    • Minimum
    • Experts of the cluster
    • Overlap between maximum and minimum clusters
    K_6 (38760 clusters) - Biggest cluster, maximizing agreement for non overlapping experts
    Max(K_6) = Kendall(E01x, E05x, E12x, E14x, E16x, E17x) = 0.786 (strong agreement)
    K_10 (184756 clusters) - Biggest cluster, for strong agreement, overlapping experts
    Max(K_10) = Kendall(E01x, E03x, E05x, E07x, E12x, E14x, E15x, E16x, E17x, E20x) = 0.707 (strong agreement)
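The "strong agreement" labels follow the consensus scale drawn on the earlier slides, with marks at 0.0, 0.4, 0.55, 0.7 and 1.0. A minimal sketch of that mapping is shown below. The function name `consensus_level` is hypothetical, and treating each boundary as belonging to the higher band is an assumption, since the slides do not say which side is inclusive.

```python
def consensus_level(w):
    """Map a Kendall coefficient in [0, 1] to the consensus band shown on the slides."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("Kendall's coefficient must lie in [0, 1]")
    if w < 0.4:
        return "weak"
    if w < 0.55:
        return "moderate weak"
    if w < 0.7:
        return "moderate strong"
    return "strong"


# Values quoted in the deck.
levels = {w: consensus_level(w) for w in (0.564, 0.786, 0.707, 0.43)}
```

Under these assumptions the full panel value 0.564 is "moderate strong", both Max(K_6) = 0.786 and Max(K_10) = 0.707 are "strong", and Min(K_4) = 0.43 is "moderate weak".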
  10. Software Engineers Association

    Clustering analysis II
    • K_11 - K_19
    • Maximum
    • Minimum
    • Experts of the cluster
    • Overlap between maximum and minimum clusters
  11. Software Engineers Association

    K_6 (38760 clusters) - Biggest cluster, maximizing agreement for non overlapping experts
    Max(K_6) = Kendall(E01x, E05x, E12x, E14x, E16x, E17x) = 0.786 (strong agreement)
    K_10 (184756 clusters) - Biggest cluster, for strong agreement, overlapping experts
    Max(K_10) = Kendall(E01x, E03x, E05x, E07x, E12x, E14x, E15x, E16x, E17x, E20x) = 0.707 (strong agreement)
  12. Kendall’s Clustering Analysis: Conclusions, Future Research

    Conclusion
    • Intra round application of Kendall’s Coefficient provides insights and evidence about the level of agreement of the experts for the stable state.
    • Kendall’s Distributions and dynamic analysis of the Kendall Coefficient are effective for analyzing the taxonomy of the panel of experts in Delphi Analysis.
    • This information might be used to improve Delphi Analysis.
    Future Research
    • Qualitative analysis of the cluster and taxonomy definitions.
    • Dynamic design of Delphi questionnaires using information from clusters.
    • Improvement of the algorithms to scale up with bigger clusters.
  13. Further references

    Modelling Web Component Quality Using Delphi Study https://doi.org/10.1016/j.csi.2021.103547
    End‐user modeling of quality for web components https://doi.org/10.1002/smr.2256
    Investigación y caracterización de modelos de calidad de componentes web https://doi.org/10.20868/UPM.thesis.65879
    Andrés-Leonardo Martínez-Ortiz (ORCID) https://orcid.org/0000-0001-5434-5600
    Code repository https://github.com/WCMetrics/SurveyAnalysis
  14. Kendall's Clusters Analysis Software Engineers Associations: The Kendall Coefficient Revisited

    Andrés-Leonardo Martínez-Ortiz, Ph.D. Zürich (Switzerland), 2021
  15. Andrés-Leonardo Martínez-Ortiz, PhD @davilagrau

    Education
    Universidad Politécnica de Madrid
    BS Computer Science
    MS Computer Science
    PhD Software, Systems and Computing
    Career
    Google
    Machine Learning SRE TPgM
    Europe, Zürich (Switzerland)
    Additional details: https://www.linkedin.com/in/almo