University, add Acknowledgement page.
• ver0.95 2024/07/16 22:30: second CA example translated into English.
• ver0.9 2024/07/16
• Edited from the presentation reported at the first meeting of the "Beyond Governance-Type Ethics" workshop, Ver1.0 2024/06/29.
2024/07/17 Response Analysis Methodology and Examples 3
• Graduated from Sophia University, Faculty of Science and Engineering, Department of Electrical and Electronic Engineering.
• After graduation, worked for a company until 2002/3 as an engineer and as marketing staff.
• Studied sociology at Tokyo Metropolitan University Graduate School from 1990 to 1992.
• 2002/4-2020/3: Worked at Sakushin Gakuin University.
  • Sociology
  • Communication theory
  • IT Security Issues/Information Systems Theory
  • Social Research Practice
• 2020/3: Retired. Professor Emeritus.
• After retirement from Sakushin Gakuin University:
  • From April 2021, Project Researcher at the Institute for Mathematics and Computer Science, Tsuda College.
  • From September 2021, Invited Expert at the National Institute of Information and Communications Technology (NICT).
Correspondence Analysis Related Translations
• 2015, "Introduction to Correspondence Analysis," Ohmsha, Inc.
• 2020, "Theory and Practice of Correspondence Analysis," Ohmsha, Inc.
Current Research Topics
• "A Study of Categorical Data Analysis Focusing on the Geometrical Structure of Data."
• KAKEN DB https://bit.ly/3sOg8JI
data often coded numerically.
• 1, 2, 3, 4, 5
• Do you calculate the mean or variance even though the number is only a name?
• In Excel, anything that looks like a number can be calculated as a number, even when it is a nominal variable.
• With ordinal variables, such processing seems even more reasonable.
• But when responses are coded on a five-point scale (5, 4, 3, 2, 1), is 4 twice 2? Is there an equal difference of 1 between adjacent responses? To begin with, is the scale from 5 to 1 linear?
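The point about numbers that are really names can be checked directly. The following is a minimal base R sketch, with made-up answers and category names used only for illustration: the same responses are coded with two equally arbitrary numberings. The mean changes with the coding, while the frequency table does not.

```r
# Hypothetical nominal responses (illustrative data, not from any survey)
answers <- c("bus", "car", "car", "train", "bus", "car", "walk")

# Coding A: categories numbered in alphabetical order
levels_a <- c("bus", "car", "train", "walk")
code_a <- match(answers, levels_a)

# Coding B: the same categories numbered in another, equally arbitrary order
levels_b <- c("walk", "train", "car", "bus")
code_b <- match(answers, levels_b)

# The "mean" depends entirely on which numbering was chosen
mean_a <- mean(code_a)
mean_b <- mean(code_b)
cat("mean under coding A:", mean_a, "\n")
cat("mean under coding B:", mean_b, "\n")

# The frequency of each category does not depend on the coding
freq_a <- table(factor(answers, levels = levels_a))
freq_b <- table(factor(answers, levels = levels_b))
print(freq_a)
```

A statistic that changes when the labels are renumbered is describing the coding, not the responses; the frequency table is the part that survives.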
methods, like mean and variance, categorical responses must be converted to numeric values.
• Integer scaling or Likert scaling is certainly one such method.
• But is it adequate for investigating the data (and the response behavior)?
• CA and MCA provide a quantification method that preserves the structure of the data.
• "CA" for short.
• Some call it "Korepon" (in Japanese).
• Multiple Correspondence Analysis (多重対応分析 / 다중 대응 분석)
• "MCA" for short.
• In contrast, CA is sometimes called Simple CA.
Expand each "variable" into columns for its categories (indicator matrix): the "individuals x variables" table is converted into an "individuals x categories" table.
• MCA is performed by applying CA to the indicator matrix, which is converted from the individuals x variables matrix.
• In CA you will see a "row space" and a "column space"; they correspond to the "individual space" and the "variable space" of MCA.
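The conversion from an individuals x variables table to an indicator matrix can be sketched in base R as follows. The data frame and the helper name `to_indicator` are hypothetical, introduced only for this illustration; packages that implement MCA do this step internally.

```r
# Hypothetical survey data: 4 individuals x 2 categorical variables
survey <- data.frame(
  Q1 = factor(c("agree", "disagree", "agree", "neutral")),
  Q2 = factor(c("yes", "no", "no", "yes"))
)

# Expand every variable into one 0/1 column per category
to_indicator <- function(df) {
  blocks <- lapply(names(df), function(v) {
    f <- factor(df[[v]])
    # 1 where the individual chose that category, 0 elsewhere
    block <- outer(as.integer(f), seq_along(levels(f)), "==") * 1
    colnames(block) <- paste(v, levels(f), sep = ".")
    block
  })
  Z <- do.call(cbind, blocks)
  rownames(Z) <- rownames(df)
  Z
}

Z <- to_indicator(survey)
print(Z)
```

Each row of `Z` contains exactly one 1 per original variable, so every row sums to the number of variables. CA applied to `Z` is the indicator-matrix form of MCA described above.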
Cross table
• CA: 2 variables; MCA: multivariate
• In fact, MCA also works on a table of two "variables":
  • rows: individuals x columns: response categories (expanded from the response variables).
• Related methods
  • Quantification for categorical data
  • Chikio Hayashi, Quantification Method Type III
  • Shizuhiko Nishisato, "Necessity of Quantification," Kwansei Gakuin University Booklet
Sociology of Relationships as opposed to the Sociology of Variables
• Adapted from the preface to the German edition of "Le métier de sociologue."
• The "sociology of variables" refers to approaches such as regression analysis.
• Bourdieu stated: "I use Correspondence Analysis very much, because I think that it is essentially a relational procedure whose philosophy fully expresses what in my view constitutes social reality. It is a procedure that 'thinks' in relations, as I try to do with the concept of field". [2]
[2] Preface of the German edition of Le métier de sociologue, 1991, quoted in Lebaron 2009: 13.
Lebaron, Frédéric, 2009, "How Bourdieu 'Quantified' Bourdieu: The Geometric Modelling of Data", in Karen Robson and Chris Sanders (eds.), Quantifying Theory: Pierre Bourdieu, Springer.
of input data
• CA and MCA
• Results generated by CA/MCA
• Two spaces
• Coordinate axes
• Coordinates
• Two kinds of coordinates
  • principal coordinates
  • standard coordinates
• Interpreting the points from a geometric point of view
  • The origin of the graph is the overall average.
  • Similar points are placed close together.
  • Different points are placed far apart.
  • What you are looking at is the row/column profiles.
• Two variables, two-dimensional table
• MCA takes a table with individuals in rows and variables in columns, such as survey data.
• As a process, however, the variable columns are expanded into variable-category (choice) columns, and CA is performed on the resulting indicator matrix, which has 1 in each selected cell. (This is one of the classical forms of MCA.)
Profiles: row ratios, column ratios
• A row profile (row ratio) describes, for each row, how its responses are distributed over the columns.
• Calculate the distances (chi-square distances) between the row/column profile vectors.
• The origin is the overall average point (expected value).
• Similar profiles are located close together and different profiles are located far apart.
• Mathematically, the matrix of residuals of the input matrix from its expected values is decomposed by singular value decomposition.
• The result is a matrix for the row coordinates, a diagonal matrix for the variance of the coordinate axes, and a matrix for the column coordinates.
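The profiles and the chi-square distance can be computed by hand in base R. This is a minimal sketch on a made-up 3 x 3 table; the counts and the helper name `chi_dist` are assumptions for illustration only.

```r
# Hypothetical 3 x 3 contingency table (illustrative counts only)
M <- matrix(c(20, 10,  5,
              10, 15, 10,
               5, 10, 25),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("row1", "row2", "row3"),
                            c("col1", "col2", "col3")))

n <- sum(M)
row_mass <- rowSums(M) / n        # row margins (average column profile)
col_mass <- colSums(M) / n        # column margins (average row profile)

R <- prop.table(M, margin = 1)    # row profiles: each row sums to 1
C <- prop.table(M, margin = 2)    # column profiles: each column sums to 1

# Chi-square distance: squared differences weighted by 1 / average profile
chi_dist <- function(a, b, w) sqrt(sum((a - b)^2 / w))

# Distance between two row profiles
d_12 <- chi_dist(R["row1", ], R["row2", ], col_mass)

# Distance of every row profile from the origin (the average row profile)
d_origin <- apply(R, 1, chi_dist, b = col_mass, w = col_mass)

# Mass-weighted sum of squared distances = total inertia = chi-square / n
total_inertia <- sum(row_mass * d_origin^2)
print(round(d_origin, 3))
print(total_inertia)
```

The last quantity ties the geometry back to the familiar statistic: the total inertia equals the Pearson chi-square statistic of the table divided by n.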
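• $S = D_r^{-1/2}\,(P - r c^{\mathsf T})\,D_c^{-1/2} = U D_\alpha V^{\mathsf T}$, where $P = M/n$ and $r$, $c$ are the row and column margins of $P$.
  • $r c^{\mathsf T}$: expected matrix; $P - r c^{\mathsf T}$: residuals.
  • $D_r^{-1/2}$: diagonal matrix whose entries are the inverse square roots of the row margins (standardization).
  • $D_c^{-1/2}$: diagonal matrix whose entries are the inverse square roots of the column margins (standardization).
• $U$ is related to the row coordinates.
• $V$ is related to the column coordinates.
• $\alpha$ are the singular values (square roots of the eigenvalues).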
SVD
• $P = M/n$
• $D_r^{-1/2}$: diagonal matrix whose entries are the inverse square roots of the row margins.
• $D_c^{-1/2}$: diagonal matrix whose entries are the inverse square roots of the column margins.
• Result of the SVD: $U D_\alpha V^{\mathsf T}$
• Standard coordinates: rows $\Phi = D_r^{-1/2} U$, columns $\Gamma = D_c^{-1/2} V$
• Principal coordinates: rows $F = \Phi D_\alpha$, columns $G = \Gamma D_\alpha$
• Dimension reduction is performed by selecting which $\alpha$ to keep.
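The formulas on this slide can be followed step by step with `svd()` in base R. This is a minimal sketch on a made-up table, not the output of any CA package; the counts and variable names are assumptions for illustration.

```r
# Hypothetical 3 x 3 contingency table (illustrative counts only)
M <- matrix(c(20, 10,  5,
              10, 15, 10,
               5, 10, 25),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("row1", "row2", "row3"),
                            c("col1", "col2", "col3")))

n <- sum(M)
P <- M / n                               # P = M / n
r_mass <- rowSums(P)                     # row margins
c_mass <- colSums(P)                     # column margins
Dr_inv_sqrt <- diag(1 / sqrt(r_mass))    # D_r^(-1/2)
Dc_inv_sqrt <- diag(1 / sqrt(c_mass))    # D_c^(-1/2)

# Standardized residuals from the expected matrix r c^T
S <- Dr_inv_sqrt %*% (P - outer(r_mass, c_mass)) %*% Dc_inv_sqrt

dec <- svd(S)                            # S = U D_alpha V^T
k <- min(dim(M)) - 1                     # number of non-trivial dimensions
alpha <- dec$d[seq_len(k)]
U <- dec$u[, seq_len(k), drop = FALSE]
V <- dec$v[, seq_len(k), drop = FALSE]

Phi_std   <- Dr_inv_sqrt %*% U           # row standard coordinates
Gamma_std <- Dc_inv_sqrt %*% V           # column standard coordinates
F_row <- Phi_std %*% diag(alpha, nrow = k)    # row principal coordinates
G_col <- Gamma_std %*% diag(alpha, nrow = k)  # column principal coordinates
dimnames(F_row) <- list(rownames(M), paste0("Dim", seq_len(k)))
dimnames(G_col) <- list(colnames(M), paste0("Dim", seq_len(k)))

inertia <- alpha^2                       # eigenvalues (variance of each axis)
print(round(100 * inertia / sum(inertia), 1))  # percentage of inertia per axis
print(round(F_row, 3))
print(round(G_col, 3))
```

Keeping only the first columns of `F_row` and `G_col` is the dimension reduction mentioned above; the percentages printed first indicate how much of the total inertia each retained axis carries.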
[Figure: CA input table and two profiles. The input table has rows row1 … rowm and columns col1 … coln, with row sums and column sums. Row profile table R: each row sums to 1, shown with the average row profile (row analysis: the table viewed from the rows). Column profile table C: each column sums to 1, shown with the average column profile (column analysis: the table viewed from the columns).]
[Figure: the m x n input matrix (row1 … rowm by col1 … coln, with row sums and column sums) generates a row space and a column space.]
• Generating a space means generating its axes, dim1 … dimn.
• Each dimension has an inertia (variance), obtained by decomposing the total variance of the input table.
• The inertias of the dimensions are the same in the row space and the column space.
Row profile R before CA
• Column coordinates G after CA, column profile C before CA
These are linked to one another through the inertia (variance, eigenvalue, or squared singular value) of the coordinates generated by CA (transition formula).
• This relationship is what allows "additional variables" and "additional categories", which do not contribute to creating the space, to acquire coordinates in it.
• This is a basic feature of Structured Data Analysis (SDA).
[Tables: row coordinates F (Dim1, Dim2; rows: Oslo, center, northern part), column coordinates G (Dim1, Dim2; columns: robbery, fraud, destruction), and eigenvalues λ (Dim1, Dim2).]
$F = R\,G\,D_\lambda^{-1/2}$
The column profiles are obtained by transposing, so the same form holds for the columns: $G = C\,F\,D_\lambda^{-1/2}$.
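The two transition formulas can be verified numerically. The sketch below uses a made-up table (not the crime data on the slide) and assumes that C is arranged with one column profile per row, so that the matrix products are conformable.

```r
# Hypothetical 3 x 3 contingency table (illustrative counts only)
M <- matrix(c(20, 10,  5,
              10, 15, 10,
               5, 10, 25),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("row1", "row2", "row3"),
                            c("col1", "col2", "col3")))

P <- M / sum(M)
r_mass <- rowSums(P)
c_mass <- colSums(P)
S <- diag(1 / sqrt(r_mass)) %*% (P - outer(r_mass, c_mass)) %*% diag(1 / sqrt(c_mass))
dec <- svd(S)

alpha  <- dec$d[1:2]                    # the two non-trivial singular values
lambda <- alpha^2                       # eigenvalues
F_row <- diag(1 / sqrt(r_mass)) %*% dec$u[, 1:2] %*% diag(alpha)
G_col <- diag(1 / sqrt(c_mass)) %*% dec$v[, 1:2] %*% diag(alpha)

R <- prop.table(M, margin = 1)          # row profiles (one per row)
C <- t(prop.table(M, margin = 2))       # column profiles (one per row after transposing)

# Transition formulas: each set of coordinates rebuilt from the other
F_from_G <- R %*% G_col %*% diag(1 / sqrt(lambda))
G_from_F <- C %*% F_row %*% diag(1 / sqrt(lambda))

print(all.equal(F_from_G, F_row, check.attributes = FALSE))
print(all.equal(G_from_F, G_col, check.attributes = FALSE))
```

Because the coordinates are rebuilt from profiles alone, any new profile can be placed in an existing space without refitting it.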
and Supplementary Variables
• Active variables
  • These contribute to generating the row/column spaces.
• Supplementary variables
  • These do not contribute to the spaces but bring important information when plotted in the row/column spaces.
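A supplementary point can be sketched in base R as follows. The table, the supplementary counts, and the helper name `project_row` are hypothetical. The sketch assumes the usual rule that a supplementary row is placed at the weighted average of the column standard coordinates, with its own profile as the weights.

```r
# Hypothetical 3 x 3 table of active rows and columns (illustrative counts only)
M <- matrix(c(20, 10,  5,
              10, 15, 10,
               5, 10, 25),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("row1", "row2", "row3"),
                            c("col1", "col2", "col3")))

P <- M / sum(M)
r_mass <- rowSums(P)
c_mass <- colSums(P)
S <- diag(1 / sqrt(r_mass)) %*% (P - outer(r_mass, c_mass)) %*% diag(1 / sqrt(c_mass))
dec <- svd(S)

alpha <- dec$d[1:2]
F_row     <- diag(1 / sqrt(r_mass)) %*% dec$u[, 1:2] %*% diag(alpha)  # active rows
Gamma_std <- diag(1 / sqrt(c_mass)) %*% dec$v[, 1:2]                  # column standard coordinates

# Place a row in the existing space: profile-weighted average of Gamma_std
project_row <- function(counts, Gamma_std) {
  profile <- counts / sum(counts)
  drop(profile %*% Gamma_std)
}

# A hypothetical supplementary row: it gets coordinates but did not shape the axes
sup_row   <- c(col1 = 8, col2 = 12, col3 = 10)
sup_coord <- project_row(sup_row, Gamma_std)
print(round(sup_coord, 3))

# Sanity check: projecting an active row reproduces its own coordinates
check_row1 <- project_row(M["row1", ], Gamma_std)
```

The axes and their inertias are computed from `M` only; `sup_row` is read against them afterwards, which is what "does not contribute to the space" means in practice.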
• Second, write R Markdown scripts and run them!
• Warm-up
• Example 1: Housetasks
• Example 2: Gender role consciousness
• Example 3: Gender role consciousness (2), SDA
data structure
• Basic statistical analysis
  • Frequency table
  • Chi-squared test
• Step 2
  • Use a CA/MCA function.
  • For SDA, structured modeling is necessary.
  • Check the fundamental results.
  • Check the inertia (variance).
  • Check the column/variable space and the row/individual space.
• Check the contribution of each axis.
• Name the axes.
• Interpret the column/variable positions and the row/individual positions.
• Step 4 (if needed)
• Project the supplementary points.
• Interpret the space (point positions and axis directions).
• Copy the related files to the project directory.
• Create a new file named start.Rmd.
• Knit it!
• Confirm that an HTML file is generated and opens in your web browser.
R on RStudio using .qmd/.rmd files.
• If we have enough time, please execute it step by step with me.
• If there is not enough time, please watch what I do.
• You can then reproduce it step by step with the example files.
and between column variables are defined.
• But the distance between a row point and a column point is not defined.
• This is very important when you interpret a symmetric map.
• ★ Asymmetric map
• If one set of points uses standard coordinates, it becomes the frame ("box") against which the other set of points is located.
• Direction is the important information.
• Once you have grasped the directions, the symmetric map is useful for interpreting the structure.
If you want to study who the main actor is for each house task, use the blue points as the box (frame) for the red points.
The first quadrant (direction) is mainly the husband.
The second quadrant (direction) is mainly the wife.
Dim 1: right is the husband, left is the wife, and the middle position is alternating.
Dim 2: the upper part is wife or husband "alone"; the lower part is "together".
There are three opinions, A) to C), about the roles of men and women. What do you think? For each, please choose the answer closest to your opinion.
Response codes (the same for every item): 1 = Strongly agree, 2 = Somewhat agree, 3 = Somewhat disagree, 4 = Strongly disagree, 99 = Don't know
A. Men should work outside the home and women should protect the home.
B. Boys and girls should be raised differently.
C. Women are better suited than men for housework and childcare.
the space generated by MCA
• Interpret the reference space (objective variable) with additional variables (explanatory variables).
• You could also run MCA on all the variables together.
  • If so, how would you interpret the generated axes?
• Interpretation of the axes for gender role attitudes (Q16ABC)
  • If you add `age` and `gender` to the axes, can the axes still be interpreted?
• The generated coordinate axes are new variables.
  • Multi-dimensional variables
• The effect of additional variables (categories) on them is then analyzed.
  • Analysis of variance
• Data preparation
• Basic analysis of data
  • Simple tabulation
  • Cross table
  • Chi-square test
    • Chi-square value and p-value
  • Grasping row and column analysis with a mosaic plot
• Relationships among variable categories by CA
  • Variable-to-variable relationships
The p-value is almost zero, so the null hypothesis (job type and leisure time are unrelated) is rejected. However, this does not tell us what kind of relationship exists...
• We need to do row and column analysis. (Do this before the chi-square test!)
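The gap between "a relationship exists" and "what kind of relationship" can be seen in a minimal base R sketch. The table below is invented for illustration and is not the job type and leisure data analyzed in the talk.

```r
# Hypothetical job type x leisure activity table (invented counts)
tab <- matrix(c(30, 20, 10,
                10, 15, 35,
                 5, 25, 30),
              nrow = 3, byrow = TRUE,
              dimnames = list(job     = c("office", "manual", "retired"),
                              leisure = c("sports", "reading", "tv")))

# The test only says whether rows and columns are related
test <- chisq.test(tab)
print(test$statistic)
print(test$p.value)

# Row analysis: how each job type is distributed over leisure activities
row_profiles <- prop.table(tab, margin = 1)
print(round(row_profiles, 2))

# Column analysis: how each leisure activity is distributed over job types
col_profiles <- prop.table(tab, margin = 2)
print(round(col_profiles, 2))

# Standardized residuals show which cells depart from independence, and in which direction
print(round(test$stdres, 2))
```

The profiles and residuals give the direction of the association cell by cell; CA then summarizes the same departures from independence as positions in a low-dimensional map.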
a whole had is broken down into axes. The axes are composed of the points. The contribution ratio shows how much each point contributes to the variance (inertia) of each axis. On Axis 1, "Church services" contributes 35% among the rows, and "Retirees" contributes 71% among the columns. This is the basis for interpreting and naming the axes.
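Contributions of this kind can be sketched in base R. The example assumes the usual definition, in which the contribution of a point to an axis is its mass times its squared principal coordinate, divided by the eigenvalue of that axis. The table is made up and does not reproduce the percentages quoted above.

```r
# Hypothetical 3 x 3 contingency table (illustrative counts only)
M <- matrix(c(20, 10,  5,
              10, 15, 10,
               5, 10, 25),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("row1", "row2", "row3"),
                            c("col1", "col2", "col3")))

P <- M / sum(M)
r_mass <- rowSums(P)
c_mass <- colSums(P)
S <- diag(1 / sqrt(r_mass)) %*% (P - outer(r_mass, c_mass)) %*% diag(1 / sqrt(c_mass))
dec <- svd(S)

alpha  <- dec$d[1:2]
lambda <- alpha^2                               # inertia (variance) of each axis
F_row <- diag(1 / sqrt(r_mass)) %*% dec$u[, 1:2] %*% diag(alpha)
G_col <- diag(1 / sqrt(c_mass)) %*% dec$v[, 1:2] %*% diag(alpha)

# Contribution of each point to each axis: mass * coordinate^2 / eigenvalue
ctr_row <- sweep(r_mass * F_row^2, 2, lambda, "/")
ctr_col <- sweep(c_mass * G_col^2, 2, lambda, "/")
dimnames(ctr_row) <- list(rownames(M), c("Dim1", "Dim2"))
dimnames(ctr_col) <- list(colnames(M), c("Dim1", "Dim2"))

# Within each axis the contributions add up to 100%
print(round(100 * ctr_row, 1))
print(round(100 * ctr_col, 1))
```

Reading down a column of `ctr_row` or `ctr_col` shows which points dominate that axis, and those are the points to look at first when naming it.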
need the showtext package to display CJK (Chinese, Japanese, Korean) characters.
• You can install showtext from CRAN and load it at the top of the script.
• You have to set the option on each graph chunk. Be careful with the separator: a dot in the chunk header, a dash in the `#|` comment form.
• ```{r fig.showtext=TRUE} or
• ```{r} #| fig-showtext: TRUE
function, CA, MCA, PCA, and other results.
  • http://www.sthda.com/english/wiki/factoextra-r-package-easy-multivariate-data-analyses-and-elegant-visualization
• Factoshiny
  • Shiny tools created for FactoMineR results
• explor (not "explore")
  • Tool for dynamically displaying and interpreting the results of various CA/MCA functions (Shiny)
  • https://juba.github.io/explor/
package. This mosaic is different from graphics::mosaicplot in base R.
• Mosaic plot
  • https://cran.r-project.org/web/packages/vcdExtra/vignettes/mosaics.html
• Tutorial
  • Working with categorical data with R and the vcd and vcdExtra packages
  • https://www.datavis.ca/courses/VCD/vcd-tutorial.pdf
• Textbook for categorical data analysis
  • http://ddar.datavis.ca/