PROBLEMS
1. Add the variable you want to use as a filter.
2. Specify the conditions for inclusion. Separate multiple conditions with the operator OR or AND. Pay attention to quotes, spaces, and capitalization.
Variable_Name = 'Value to include' OR Variable_Name = 'Second value to include'
“The relationship between Instagram followers and Canadian born status is significant. The relationship between Instagram time and Canadian born status is not significant. As well as the relationship between Instagram followers and Instagram time is not significant.”
A crosstab shows the relation (or lack thereof) between the independent and the dependent variable.
“The relationship between Instagram followers and time on instagram is spurious”
Why? Always mention the significance (is it above or below 10%?) for both the zero-order relationship and the partials.
A HYPOTHESIS
Hypothesis: People who have a political affiliation will tend to read news more carefully and to investigate further in order to understand whether something they read online is legitimate news or misinformation. On the contrary, people without a political affiliation will be less likely to spend time investigating news sources.
Control variable: We will control the relation between political affiliation and the tendency to investigate news sources through a variable measuring how much of an impact social media have on media consumption.
Null hypothesis: There is no relationship between political affiliation and the tendency to investigate news sources.
• Independent variable:
  ◦ @2718068 (Which Canadian federal political party do you think best represents your personal political orientation?)
• Dependent variable:
  ◦ @2864544 (When presented with news media on social media platforms, how often do you further analyze or investigate the news media in question in order to discern misinformation?)
• Control variable:
  ◦ @2864568 (How much of an impact does social media have on your media consumption?)
All variables (independent, dependent and control) must be binary, which means that each must include only 2 values (for example: younger/older, more followers/less followers, high GPA/low GPA, etc.). Therefore, it is very likely that you will have to recode your variables to convert them from their original format to a binary form (for how to recode variables, see the week 3 lab).
STEP 2: RECODING YOUR VARIABLES
2.2 RECODING THE INDEPENDENT VARIABLE
The univariate analysis shows that the variable @2718068 is not binary. It has 7 values. We need to find a logic for creating 2 groups only:
1. People with political affiliation
2. People without political affiliation
In your assignment, explain the logic you followed to group your variable’s values.
The “Prefer not to say” value might include both people with political affiliation and people without political affiliation. For this reason, we intentionally do not recode this value, so that we can remove these answers later.
Photo: Startup Weekend Hackathon. Nov.2014
2.2 CHECKING IF THE VARIABLE WAS PROPERLY RECODED
1. Once recoded, drag and drop the new variable next to the original variable.
2. Sort the dataset Ascending using the original variable (right-click on the original variable name > Sort Ascending).
3. In this way you can easily check if the old variable was correctly recoded into the new one.
Then, run a quick Descriptive Statistics on the new variable to check once again that the independent variable was properly recoded.
Notice how the table has 3 rows. This means that the variable contains 3 values: “Does not have political opinion”, “Has political opinion” and blank. The blank value includes missing responses and the “Prefer not to say” responses that we decided not to recode. Blank responses (the first row) must be removed. We’ll clean the data later, in step 3.
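The same check can be expressed outside SPSS as a simple frequency count. The following is a minimal Python sketch, not an SPSS procedure: the list of recoded values is made up for illustration, and `None` is an assumed stand-in for a blank cell.

```python
from collections import Counter

# Hypothetical recoded values (illustration only); None stands for a blank cell.
recoded_values = [
    "Has political opinion",
    None,
    "Does not have political opinion",
    "Has political opinion",
    None,
]

# One entry per distinct value, like the rows of a frequency table.
frequencies = Counter(recoded_values)

for value, count in frequencies.items():
    label = "(blank)" if value is None else value
    print(f"{label}: {count}")

# A binary variable should have exactly 2 distinct values.
needs_cleaning = len(frequencies) > 2
```

Here `needs_cleaning` is true because the blank counts as a third value, which mirrors the 3-row table described above.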
2.3 RECODING THE DEPENDENT VARIABLE
Right-click on the dependent variable name and select Descriptive Statistics.
The univariate analysis shows that the variable @2864544 is not binary. It has 5 values. We need to find a logic for creating 2 groups only:
1. People who are more likely to investigate news sources in order to discern misinformation
2. People who are less likely to investigate news sources in order to discern misinformation
In your assignment, explain the logic you followed to group your variable’s values.
Respondents who answered Always, Often or Sometimes will be grouped into the “More likely to investigate news” value of the new recoded variable.
Respondents who answered Not often or Never will be grouped into the “Less likely to investigate news” value of the new recoded variable.
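The grouping logic for the dependent variable can be sketched as a lookup table. This is a minimal Python illustration, not the SPSS recode dialog: the five answer labels and the two group labels come from the slides, while the `recode` helper, the sample answers, and the use of `None` for blanks are assumptions.

```python
# Old value -> new value. Labels are taken from the slides.
RECODE_DEPENDENT = {
    "Always": "More likely to investigate news",
    "Often": "More likely to investigate news",
    "Sometimes": "More likely to investigate news",
    "Not often": "Less likely to investigate news",
    "Never": "Less likely to investigate news",
}


def recode(value, mapping):
    """Return the recoded label, or None (blank) when the value is not mapped."""
    return mapping.get(value)


# Hypothetical answers; the empty string represents a missing response.
answers = ["Always", "Never", "Sometimes", "", "Not often"]
recoded = [recode(answer, RECODE_DEPENDENT) for answer in answers]
```

Any value that is not listed in the mapping stays blank in the new variable, which is the same behaviour used for the answers that are deliberately left unrecoded.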
2.3 CHECKING IF THE VARIABLE WAS PROPERLY RECODED
1. Once recoded, drag and drop the new variable next to the original variable.
2. Sort the dataset Ascending using the original variable (right-click on the original variable name > Sort Ascending).
3. In this way you can easily check if the old variable was correctly recoded into the new one.
Then, run a quick Descriptive Statistics on the new variable to check once again that the dependent variable was properly recoded.
Notice how the table has 3 rows. This means that the variable contains 3 values: “Less likely to investigate news”, “More likely to investigate news” and blank. The blank value includes missing responses. Blank responses (the first row) must be removed. We’ll clean the data later, in step 3.
2.4 RECODING THE CONTROL VARIABLE
The univariate analysis shows that the variable @2864568 is not binary. It has 5 values. We need to find a logic for creating 2 groups only:
1. People whose media consumption choices are impacted by social media
2. People whose media consumption choices are not impacted by social media
In your assignment, explain the logic you followed to group your variable’s values.
Respondents who answered Above Average or Average will be grouped into the “Social media have impact” value of the new recoded variable.
Respondents who answered Very Rarely or Not at all will be grouped into the “Social media do not have impact” value of the new recoded variable.
Respondents who answered “I don’t know/ I don’t use social media” cannot be classified: based on their response, we can’t really tell whether they are impacted by social media or not. Therefore, we will not recode their responses. In this way, all cases with “I don’t know/ I don’t use social media” in the old variable will have empty values in the new recoded variable. We will filter out these empty values later.
2.4 CHECKING IF THE VARIABLE WAS PROPERLY RECODED
1. Once recoded, drag and drop the new variable next to the original variable.
2. Sort the dataset Ascending using the original variable (right-click on the original variable name > Sort Ascending).
3. In this way you can easily check if the old variable was correctly recoded into the new one.
Then, run a quick Descriptive Statistics on the new variable to check once again that the control variable was properly recoded.
Notice how the table has 3 rows. This means that the variable contains 3 values: “Social media have impact”, “Social media do not have impact” and blank. The blank value includes missing responses and the “I don’t know/ I don’t have social media” responses that we have not recoded. Blank responses (the first row) must be removed. We’ll clean the data later, in step 3.
NOTES
1. All variables must be binary (two values only).
2. If you work with numeric variables, rely on one of the two strategies discussed in Lab 7 to decide how to divide the values of the old variable into 2 values in the new variable. These strategies are:
   a. Using a measure of centrality as a threshold value (median or mean)
   b. Relying on an arbitrary rule (as long as it is consistent with your hypothesis)
3. Include at least 1 variable from the student-submitted questions (variables #52 to #138).
4. If one of your variables is already binary, there is no need to recode it.
5. Do not use the same combination of variables as this example.
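Strategy (a) for numeric variables can be sketched as a median split. This is a minimal Python illustration under stated assumptions: the follower counts are made up, the group labels are examples, and placing values equal to the median in the lower group is an arbitrary choice that you would need to justify in your own assignment.

```python
from statistics import median


def median_split(values, low_label, high_label):
    """Recode numeric values into 2 groups, using the median as the threshold.

    Values strictly above the median go to high_label; all others go to
    low_label (an assumed tie-breaking rule).
    """
    cutoff = median(values)
    labels = [high_label if value > cutoff else low_label for value in values]
    return labels, cutoff


# Hypothetical follower counts, for illustration only.
followers = [120, 450, 300, 80, 1000]
labels, cutoff = median_split(followers, "Less followers", "More followers")
```

With these made-up numbers the median is 300, so only the cases with 450 and 1000 followers fall into the “More followers” group.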
STEP 3: CLEANING YOUR VARIABLES
As the previous slides show, all recoded variables have missing or empty values. We now need to remove empty values from all variables. To do so, click on Data in the top bar and then Select Cases.
Select Cases removes undesired values from a variable. This is achieved by writing a simple logical statement in the Select Cases window. This statement determines which cases SPSS will use in all future calculations.
1. Add the variable you want to use as a filter.
2. Specify the conditions for inclusion. Separate multiple conditions with the operator OR or AND. Pay attention to quotes, spaces, and capitalization.
@2718068_New = 'Has political opinion' OR @2718068_New = 'Does not have political opinio'
A single-variable statement is enough if you have to clean only one variable. However, if more than one of your variables includes unwanted values (in other words, if more than one of your variables is not in binary form), then you need to use a more complex formula to remove all unwanted values from all variables at once. The formula is the following:
(VARIABLE1 = 'VALUE1' OR VARIABLE1 = 'VALUE2')
AND (VARIABLE2 = 'VALUE1' OR VARIABLE2 = 'VALUE2')
AND (VARIABLE3 = 'VALUE1' OR VARIABLE3 = 'VALUE2')
Here VARIABLE1 is your independent variable, VARIABLE2 is your dependent variable, VARIABLE3 is your control variable, and VALUE1 and VALUE2 are the respective values you want to include for each variable.
In this example, the inclusion formula is the following:
(@2718068_New = 'Has political opinion' OR @2718068_New = 'Does not have political opinio')
AND (@2864544_New = 'More likely to investigate new' OR @2864544_New = 'Less likely to investigate new')
AND (@2864568_New = 'Social media do not have impact' OR @2864568_New = 'Social media have impact')
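The logic of this inclusion formula can be sketched in Python as "keep a case only if every variable holds one of its two allowed values". This is an illustration of the logic, not SPSS syntax: the variable names and value labels are copied from the formula above, while the `is_included` helper and the sample cases are assumptions.

```python
# Allowed values per variable, copied from the inclusion formula.
ALLOWED = {
    "@2718068_New": {"Has political opinion", "Does not have political opinio"},
    "@2864544_New": {"More likely to investigate new", "Less likely to investigate new"},
    "@2864568_New": {"Social media do not have impact", "Social media have impact"},
}


def is_included(case):
    """True when every variable holds an allowed value (OR within a variable, AND across variables)."""
    return all(case.get(variable) in values for variable, values in ALLOWED.items())


# Hypothetical cases; None stands for a blank value.
cases = [
    {"@2718068_New": "Has political opinion",
     "@2864544_New": "More likely to investigate new",
     "@2864568_New": "Social media have impact"},
    {"@2718068_New": None,
     "@2864544_New": "Less likely to investigate new",
     "@2864568_New": "Social media have impact"},
    {"@2718068_New": "Does not have political opinio",
     "@2864544_New": "Less likely to investigate new",
     "@2864568_New": None},
]

selected = [case for case in cases if is_included(case)]
```

Only the first case is kept: the other two each have a blank in one variable, so they fail the AND across variables even though their remaining values are valid.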
part of Assignment 3 is done. If you successfully recoded and cleaned your variables, you are 80% done. What is left is the fun part: running the analyses.
Drag and drop your three variables (recoded and cleaned) in the box on the right. Select Statistics and choose the main measures of centrality and dispersion (mean, mode, median, std deviation, range, min and max). Repeat this step for all 3 of your variables.
Frequency tables should look like these. If some of your tables have more than 2 rows (plus the total row), it means you have not cleaned the data. Go back to step 3 and check the logical statements for inclusion.
If your variables are strings (text), your statistics table should look like this (remember, you cannot calculate centrality and dispersion for textual variables):
If your variables are numeric, your statistics table should look like this:
If you really want to go the extra mile, add charts to your analysis. If you do, keep in mind the data visualization best practices described in SPSS Lab 5.
Your crosstab should look like this. If you have more than 2 rows for the independent variable or more than 2 columns for the dependent variable (excluding the Total), it means you have not properly cleaned your variables. In other words, you have more than 2 values in the independent or dependent variable. Go back to Step 3 and repeat the cleaning procedure.
Discuss your findings in words. Describe the patterns you found and discuss whether these patterns support your hypothesis. To learn how to interpret a crosstab, see the SPSS Lab 6 slides on Canvas.
81.3% of people with a political affiliation declared that they investigate news sources, while only 64.7% of people without political affiliation declared doing so. Since the difference (16.6%) is greater than 10%, we can conclude that there is a significant relationship between the independent variable and the dependent variable. People with political affiliations are more likely to check news sources than people without political affiliation.
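The comparison behind this interpretation can be sketched as a row-percentage difference checked against the 10% threshold used in this course. This is a minimal Python illustration: the counts are made up and are not the values from the example crosstab, and the helper names are assumptions.

```python
def row_percent(count, row_total):
    """Percentage of a row's cases that fall in one column of the crosstab."""
    return 100.0 * count / row_total


def passes_ten_point_rule(pct_a, pct_b, threshold=10.0):
    """True when the two row percentages differ by more than the threshold."""
    return abs(pct_a - pct_b) > threshold


# Hypothetical counts: cases that investigate news / all cases in the row.
with_affiliation_pct = row_percent(60, 80)
without_affiliation_pct = row_percent(30, 60)

difference = with_affiliation_pct - without_affiliation_pct
is_significant = passes_ten_point_rule(with_affiliation_pct, without_affiliation_pct)
```

With these made-up counts the two row percentages are 75% and 50%, so the difference of 25 points exceeds the threshold. A difference of 5 points, for instance, would not.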
1. Independent variable (recoded and cleaned) in the Rows
2. Dependent variable (recoded and cleaned) in the Columns
3. Control variable (recoded and cleaned) in the Layer box
4. Click Cells
Controlling for social media impact reveals a complex scenario. As already seen in the previous bivariate analysis, the zero-order relationship shows a significant relation between political views and the tendency to investigate news sources. This relation is even stronger among people who declared that social media have a significant impact on their media consumption, as shown by the % difference between people with political affiliation and people without political affiliation (20.6%). Among people who declared that social media do not play a role in their media consumption, by contrast, there is no relation between the dependent variable and the independent variable: people with and without political views are equally likely to investigate news sources further (75% of them declare doing it).
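The partial relationships read off a layered crosstab can be sketched as the same % difference computed separately within each value of the control variable. This is a minimal Python illustration: the short keys `iv`, `dv`, `cv`, the labels, and all counts are assumptions chosen for readability, not the data from the example.

```python
from collections import defaultdict


def percent_difference_by_layer(cases, iv_groups, dv_target):
    """For each control value, return the % of dv_target in iv_groups[0] minus that in iv_groups[1].

    Returns None for a layer in which one of the two groups has no cases.
    """
    # layer -> independent value -> [cases with dv_target, all cases]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for case in cases:
        cell = counts[case["cv"]][case["iv"]]
        cell[1] += 1
        if case["dv"] == dv_target:
            cell[0] += 1

    result = {}
    for layer, by_iv in counts.items():
        if any(by_iv[group][1] == 0 for group in iv_groups):
            result[layer] = None
            continue
        first, second = (100.0 * by_iv[g][0] / by_iv[g][1] for g in iv_groups)
        result[layer] = first - second
    return result


def make_cases(n, iv, dv, cv):
    """Build n identical hypothetical cases."""
    return [{"iv": iv, "dv": dv, "cv": cv} for _ in range(n)]


# Hypothetical data, for illustration only.
cases = (
    make_cases(4, "With", "More", "Impact") + make_cases(1, "With", "Less", "Impact")
    + make_cases(3, "Without", "More", "Impact") + make_cases(2, "Without", "Less", "Impact")
    + make_cases(3, "With", "More", "No impact") + make_cases(1, "With", "Less", "No impact")
    + make_cases(3, "Without", "More", "No impact") + make_cases(1, "Without", "Less", "No impact")
)

partials = percent_difference_by_layer(cases, ("With", "Without"), "More")
```

In this made-up dataset the difference is 20 points in the “Impact” layer and 0 points in the “No impact” layer, the same kind of pattern described above: a relationship that holds in one partial table and disappears in the other.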
See the SPSS Lab 7 slides for instructions on how to read a multivariate analysis crosstab. If you want to go the extra mile, and if possible, try to describe the relation between variables using the models described in Prof. Al-Rawi's Week 8 lecture (specification, interpretation, explanation, replication, suppressor variable, distorter variable).