Lesson 13: Chi-Square Tests
- Perform and interpret a chi-square test of goodness of fit.
- Perform and interpret a chi-square test of independence.
Chi-square tests are used to compare observed frequencies to the frequencies expected under some hypothesis. Tests for one categorical variable are generally called goodness-of-fit tests. In this case, there is a one-way table of observed frequencies of the levels of some categorical variable. The null hypothesis might state that the expected frequencies are equally distributed or that they are unequal on the basis of some theoretical or postulated distribution.
Tests for two categorical variables are usually called tests of independence or association. In this case, there will be a two-way contingency table with one categorical variable occupying rows of the table and the other categorical variable occupying columns of the table. In this analysis, the expected frequencies are commonly derived on the basis of the assumption of independence. That is, if there were no association between the row and column variables, then a cell entry would be expected to be the product of the cell's row and column marginal totals divided by the overall sample size.
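The expected-frequency rule just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not something SPSS exposes: the function name expected_counts and the 2 × 2 counts are hypothetical, chosen so the marginal totals are easy to follow.

```python
def expected_counts(table):
    """Expected cell counts under independence for a two-way table.

    `table` is a list of rows, each row a list of observed counts.
    Each expected count is (row total * column total) / overall N.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed counts (not the lesson's dataset).
observed = [[10, 20],
            [30, 40]]
print(expected_counts(observed))  # [[12.0, 18.0], [28.0, 42.0]]
```

Here the row totals are 30 and 70, the column totals are 40 and 60, and N is 100, so the first cell is expected to hold 30 × 40 / 100 = 12 cases.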
In both tests, the chi-square test statistic is calculated by taking, for each cell, the squared difference between the observed and expected frequencies divided by the expected frequency, and then summing these values over all cells, according to the following simple formula:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O represents the observed frequency in a given cell of the table and E represents the corresponding expected frequency under the null hypothesis.
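As a rough illustration of the calculation, the following Python sketch sums (O − E)² / E over paired lists of observed and expected frequencies. The function name chi_square_statistic and the example counts are assumptions made for this sketch; they are not taken from the lesson's dataset.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells, given flat lists of counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical one-way table with three categories.
observed = [18, 22, 20]
expected = [20, 20, 20]
print(chi_square_statistic(observed, expected))  # 0.4
```

For a two-way table, the same function applies once the cells of the observed and expected tables are flattened into lists in the same order.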
We will illustrate both the goodness-of-fit test and the test of independence using the same dataset. You will find the goodness-of-fit test for equal or unequal expected frequencies as an option under Nonparametric Tests in the Analyze menu. For the chi-square test of independence, you will use the Crosstabs procedure under the Descriptive Statistics menu in SPSS. The Crosstabs procedure can make use of numeric or text entries, while the Nonparametric Tests procedure requires numeric entries. For that reason, you will need to recode any text entries into numeric values for goodness-of-fit tests.
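If your data arrive as text labels, the recoding step might look like the following Python sketch when done outside SPSS. The label strings and the names RETENTION_CODES and recode are hypothetical; only the numeric codes (0, 1, 2) follow the coding scheme used in this lesson.

```python
# Hypothetical text labels mapped to the lesson's numeric retention codes.
RETENTION_CODES = {"not enrolled": 0, "enrolled": 1, "graduated": 2}


def recode(labels, codes):
    """Map text labels to numeric codes.

    Labels are trimmed and lower-cased first; an unknown label raises
    KeyError rather than being silently dropped.
    """
    return [codes[label.strip().lower()] for label in labels]


print(recode(["Graduated", "enrolled", " Not enrolled"], RETENTION_CODES))  # [2, 1, 0]
```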
Assume that you are interested in the effects of peer mentoring on student academic success in a competitive private liberal arts college. A group of 30 students is randomly selected during their freshman orientation. These students are assigned to a team of seniors who have been trained as tutors in various academic subjects, listening skills, and team-building skills. The 30 selected students meet in small group sessions with their peer tutors once each week during their entire freshman year, are encouraged to work with their small group for study sessions, and are encouraged to schedule private sessions with their peer mentors whenever they desire. You identify an additional 30 students at orientation as a control group. The control group members receive no formal peer mentoring. You determine that there are no significant differences between the high school grades and SAT scores of the two groups. At the end of four years, you compare the two groups on academic retention and academic performance. You code mentoring as 1 = present and 0 = absent to identify the two groups. Because GPAs differ by academic major, you generate a binary code for grades. If the student's cumulative GPA is at the median or higher for his or her academic major, you assign a 1. Students whose grades are below the median for their major receive a zero. If the student is no longer enrolled (i.e., has transferred, dropped out, or flunked out), you code a zero for retention. If he or she is still enrolled, but has not yet graduated after four years, you code a 1. If he or she has graduated, you code a 2.
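The coding rules above can be written out explicitly. The Python sketch below is one possible reading of them: the function names, the argument names, and the list of GPAs for a major are all hypothetical, and the median is taken over whatever GPAs you supply for that major.

```python
from statistics import median


def code_grades(gpa, major_gpas):
    """Return 1 if gpa is at or above the median GPA for the major, else 0."""
    return 1 if gpa >= median(major_gpas) else 0


def code_retention(enrolled, graduated):
    """Return 0 = no longer enrolled, 1 = enrolled but not graduated, 2 = graduated."""
    if graduated:
        return 2
    return 1 if enrolled else 0


# Hypothetical GPAs for one academic major (median is 3.2).
major_gpas = [2.8, 3.0, 3.2, 3.4, 3.6]
print(code_grades(3.2, major_gpas))   # 1 (at the median)
print(code_grades(3.0, major_gpas))   # 0 (below the median)
print(code_retention(enrolled=True, graduated=False))   # 1
print(code_retention(enrolled=False, graduated=False))  # 0
print(code_retention(enrolled=False, graduated=True))   # 2
```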
You collect the following (hypothetical) data:
Properly entered in SPSS, the data should look like the following (see Figure 13-1). For your convenience, you may also download a copy of the dataset.
Figure 13-1 Dataset in SPSS (partial data)
Conducting a Goodness-of-Fit Test
To determine whether the three retention outcomes are equally distributed, you can perform a goodness-of-fit test. Because there are three possible outcomes (no longer enrolled, currently enrolled, and graduated) and sixty total students, you would expect each outcome to be observed in 1/3 of the cases if there were no differences in the frequencies of these outcomes. Thus the null hypothesis would be that 20 students would not be enrolled, 20 would be currently enrolled, and 20 would have graduated after four years. To test this hypothesis, you must use the Nonparametric Tests procedure. To conduct the test, select Analyze, Nonparametric Tests, Chi-Square as shown in Figure 13-2.
Figure 13-2 Selecting chi-square test for goodness of fit
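The expected frequencies for a goodness-of-fit test follow directly from the sample size and the hypothesized proportions. The short Python sketch below shows the equal-frequency case used in this lesson (60 students, three outcomes) alongside an unequal case. The function name expected_from_weights and the 1:2:3 weights are hypothetical and serve only to illustrate the arithmetic.

```python
def expected_from_weights(n, weights):
    """Split a sample of size n into expected counts proportional to weights."""
    total = sum(weights)
    return [n * w / total for w in weights]


# Equal expected frequencies: each of three outcomes gets 60 / 3 = 20.
print(expected_from_weights(60, [1, 1, 1]))  # [20.0, 20.0, 20.0]

# Unequal expected frequencies from hypothetical 1:2:3 weights.
print(expected_from_weights(60, [1, 2, 3]))  # [10.0, 20.0, 30.0]
```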
In the resulting dialog box, move Retention to the Test Variable List and accept the default for equal expected frequencies. SPSS counts and tabulates the observed frequencies and performs the chi-square test (see Figure 13-3). The degrees of freedom for the goodness-of-fit test are the number of categories minus one, here 3 − 1 = 2. The significant chi-square shows that the frequencies are not equally distributed, χ²(2, N = 60) = 6.10, p = .047.
Figure 13-3 Chi-square test of goodness of fit
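To see how a result of this kind arises, the goodness-of-fit calculation can be sketched in Python using only the standard library. The observed counts below are hypothetical: they are one set of counts that sums to 60 and yields a statistic of 6.10, and they should not be read as the actual counts in Figure 13-3. The p-value shortcut exp(−χ²/2) is exact only when there are 2 degrees of freedom, as there are here; the function names are assumptions of this sketch.

```python
import math


def goodness_of_fit(observed, expected=None):
    """Return (chi-square statistic, degrees of freedom) for a one-way table.

    If `expected` is omitted, equal expected frequencies are assumed.
    """
    n = sum(observed)
    k = len(observed)
    if expected is None:
        expected = [n / k] * k
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, k - 1


def p_value_df2(stat):
    """Upper-tail p-value for a chi-square statistic with exactly 2 df."""
    return math.exp(-stat / 2)


# Hypothetical counts for not enrolled, enrolled, graduated (sum = 60).
observed = [29, 15, 16]
stat, df = goodness_of_fit(observed)
print(round(stat, 2), df, round(p_value_df2(stat), 3))  # 6.1 2 0.047
```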
Conducting a Chi-Square Test of Independence
If mentoring is not related to retention, you would expect mentored and non-mentored students to have the same outcomes, so that any observed differences in frequencies would be due to chance. That would mean that you would expect half of the students in each outcome group to come from the mentored students, and the other half to come from the non-mentored students. To test the hypothesis that there is an association (or non-independence) between mentoring and retention, you will conduct a chi-square test as part of the cross-tabulation procedure. To conduct the test, select Analyze, Descriptive Statistics, Crosstabs (see Figure 13-4).
Figure 13-4 Preparing for the chi-square test of independence
In the Crosstabs dialog, move one variable to the row field and the other variable to the column field. I typically place the variable with more levels in the row field to keep the output tables narrower (see Figure 13-5), though the results of the test would be identical if you were to reverse the row and column variables.
Figure 13-5 Establishing row and column variables
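Conceptually, Crosstabs counts how many cases fall into each combination of row and column values. A minimal Python sketch of that tabulation follows. The function name crosstab and the eight coded cases are hypothetical, and the levels are listed in ascending order to mirror the default described in this lesson.

```python
from collections import Counter


def crosstab(row_values, col_values):
    """Tabulate paired codes into a two-way table of counts.

    Returns (row levels, column levels, table), with levels in ascending order.
    """
    counts = Counter(zip(row_values, col_values))
    row_levels = sorted(set(row_values))
    col_levels = sorted(set(col_values))
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table


# Hypothetical codes for eight students (not the lesson's dataset).
retention = [0, 0, 1, 2, 2, 2, 1, 0]   # 0 = not enrolled, 1 = enrolled, 2 = graduated
mentoring = [0, 0, 0, 1, 1, 1, 1, 0]   # 0 = absent, 1 = present

rows, cols, table = crosstab(retention, mentoring)
print(rows, cols)  # [0, 1, 2] [0, 1]
print(table)       # [[3, 0], [1, 1], [0, 3]]
```

Swapping the two arguments transposes the table, which is why the choice of row and column variable does not change the test result.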
Clustered bar charts are an excellent way to compare the frequencies visually, so we will select that option (see Figure 13-5). Under the Statistics option, select Chi-square and Phi and Cramer's V (measures of effect size for chi-square tests). You can also click the Cells button to display both observed and expected cell frequencies. The Format button allows you to specify whether the rows are arranged in ascending (the default) or descending order. Click OK to run the Crosstabs procedure and conduct the chi-square test.
Figure 13-6 Partial output from Crosstabs procedure
For the test of independence, the degrees of freedom are the number of rows minus one multiplied by the number of columns minus one, or in this case 2 × 1 = 2. The Pearson Chi-Square is significant, indicating an association between mentoring and retention, χ²(2, N = 60) = 14.58, p < .001. The value of Cramer's V is .493, indicating a large effect size (Gravetter & Wallnau, 2005).
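The degrees of freedom and the effect size reported above can be checked by hand. The Python sketch below uses the standard definition of Cramer's V, the square root of χ² / (N × min(rows − 1, columns − 1)), applied to the reported values χ² = 14.58 and N = 60 for a 3 × 2 table. The function names are assumptions of this sketch, and the p-value shortcut exp(−χ²/2) again holds only for 2 degrees of freedom.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for a chi-square test of independence."""
    return (n_rows - 1) * (n_cols - 1)


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V computed from a chi-square statistic and table dimensions."""
    return math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))


df = degrees_of_freedom(3, 2)
v = cramers_v(14.58, 60, 3, 2)
p = math.exp(-14.58 / 2)  # exact upper-tail p-value only because df = 2

print(df)           # 2
print(round(v, 3))  # 0.493
print(p < 0.001)    # True
```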
The clustered bar chart provides an excellent visual representation of the chi-square test results (see Figure 13-7).
Figure 13-7 Clustered bar chart
For additional practice, you can use the Nonparametric Tests and Crosstabs procedures to determine whether grades differ between mentored and non-mentored students and whether there is an association between grades and retention outcomes.
Gravetter, F. J., & Wallnau, L. B. (2005). Essentials of statistics for the behavioral sciences (5th ed.). Belmont, CA: Thomson/Wadsworth.