# Chi-Square test


The Chi-square test is a hypothesis test used to determine whether there is a relationship between two categorical variables.

What are categorical variables? Examples include a person's gender, preferred
newspaper, frequency of television viewing, or highest level of education. So
whenever you want to test whether there is a relationship between two
categorical variables, you use a Chi-square test.

##### Definition:

The **chi-square test** is a hypothesis test used for categorical variables with
nominal or ordinal
measurement scale. The chi-square
test checks whether the frequencies occurring in the sample differ significantly
from the frequencies one would expect. Thus, the observed frequencies are compared
with the expected frequencies and their deviations are examined.

Let's say we want to investigate whether there is a connection between gender and the highest level of education. To do this, we create a questionnaire in which the participants tick their gender and what their highest educational level is. The result of the survey is then displayed in a contingency table.

The Chi-square test is used to investigate whether there is a relationship between gender and the highest level of education.

### Null hypothesis and alternative hypothesis

The null hypothesis and the alternative hypothesis are then:

**Null hypothesis:** There is no relationship between gender and the highest
educational attainment.

**Alternative hypothesis:** There is a relationship between gender and the highest
educational attainment.

**Tip:** On DATAtab you can calculate the Chi-square test online. Simply visit the
Chi-Square Test Calculator.

## Applications of the Chi-Square Test

The Chi-square test has several applications. It can be used to answer the following questions:

##### 1) Independence test

Are two categorical variables independent of each other? For example, does gender have an impact on whether a person has a Netflix subscription or not?

##### 2) Distribution test

Do the observed frequencies of a categorical variable match the expected frequencies? One question could be whether one of the three video streaming services Netflix, Amazon, and Disney is subscribed to more often than expected.

##### 3) Homogeneity test

Are two or more samples from the same population? One question could be whether the subscription frequencies of the three video streaming services Netflix, Amazon and Disney differ in different age groups.

## Calculate chi-squared

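The chi-squared value is calculated via:

$$\chi^2 = \sum_{k=1}^{n} \frac{(O_k - E_k)^2}{E_k}$$

where $O_k$ is the observed frequency and $E_k$ the expected frequency in cell $k$ of the table, and $n$ is the number of cells.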

To illustrate the calculation of the chi-squared value, consider the following case:
two variables, each with the categories *A* and *B*, were observed in a sample.
Now we want to check whether the frequencies from the sample correspond to the
expected frequencies from the population.

##### Observed frequency:

| | Category A | Category B |
|---|---|---|
| Category A | 10 | 13 |
| Category B | 13 | 14 |

##### Expected frequency:

| | Category A | Category B |
|---|---|---|
| Category A | 9 | 11 |
| Category B | 12 | 13 |

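Inserting these frequencies into the chi-squared formula gives **chi-squared**:

$$\chi^2 = \frac{(10-9)^2}{9} + \frac{(13-11)^2}{11} + \frac{(13-12)^2}{12} + \frac{(14-13)^2}{13} \approx 0.635$$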

After calculating chi-squared, the number of degrees of freedom *df* is needed. This
is given by

$$df = (p - 1) \cdot (q - 1)$$

with

- *p*: number of rows
- *q*: number of columns

The critical chi-squared value can now be read from the table of the chi-squared
distribution. For a significance level of 5 % and a *df* of 1, this results in
3.841. Since the calculated chi-squared value is smaller than this critical value,
there is no significant difference.
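The calculation above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library. The function name `chi_squared` and the constant `CRITICAL_VALUE` are names chosen here for the example, and the frequencies are read row by row from the two tables above.

```python
# Observed and expected frequencies from the tables above, read row by row.
observed = [10, 13, 13, 14]
expected = [9, 11, 12, 13]


def chi_squared(observed, expected):
    """Sum the squared deviations, each divided by its expected frequency."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi2 = chi_squared(observed, expected)

# Degrees of freedom for a table with p rows and q columns.
p, q = 2, 2
df = (p - 1) * (q - 1)

# Critical value quoted in the text for a 5 % significance level and df = 1.
CRITICAL_VALUE = 3.841
significant = chi2 > CRITICAL_VALUE

print(f"chi2 = {chi2:.3f}, df = {df}, significant = {significant}")
```

The statistic comes out at roughly 0.635, well below 3.841, which matches the conclusion that there is no significant difference.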

As a **prerequisite** for this test, please note that all expected frequencies must
be greater than 5.

## Chi-Square Test of Independence

The Chi-Square Test of Independence is used when two categorical variables are to be tested for independence. The aim is to analyze whether the characteristic values of the first variable are influenced by the characteristic values of the second variable and vice versa.

For example, does gender have an influence on whether a person has a Netflix subscription or not? For the two variables gender (male, female) and has Netflix subscription (yes, no), it is tested whether they are independent. If this is not the case, there is a relationship between the characteristics.

The research question that can be answered with the Chi-square test is: Are the
characteristics of *gender* and
*ownership of a Netflix subscription* independent of each other?

To calculate the chi-square, an observed and an expected frequency must be given for each cell. In the independence test, the expected frequency is the one that results when both variables are independent. If two variables are independent, the expected frequencies of the individual cells are obtained with

$$E_{ij} = \frac{R_i \cdot C_j}{N}$$

where *i* and *j* index the rows and columns of the table respectively, $R_i$ is the total of row *i*, $C_j$ is the total of column *j*, and $N$ is the total number of observations.

For the fictitious Netflix example, the following tables could be used. The first table shows the frequencies observed in the sample, and the second shows the frequencies that would result if perfect independence existed.

##### Observed frequency:

| | Male | Female |
|---|---|---|
| Netflix Yes | 10 | 13 |
| Netflix No | 15 | 14 |

##### Expected frequency if independent:

| | Male | Female |
|---|---|---|
| Netflix Yes | (23 · 25) / 52 = 11.06 | (23 · 27) / 52 = 11.94 |
| Netflix No | (29 · 25) / 52 = 13.94 | (29 · 27) / 52 = 15.06 |

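The Chi-square is then calculated as

$$\chi^2 = \frac{(10-11.06)^2}{11.06} + \frac{(13-11.94)^2}{11.94} + \frac{(15-13.94)^2}{13.94} + \frac{(14-15.06)^2}{15.06} \approx 0.35$$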

From the Chi-square table you can now read the critical value again and compare it with the result.

The assumptions for the Chi-square independence test are that the observations are from a random sample and that the expected frequencies per cell are greater than 5.
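As a rough sketch of the independence test for the Netflix example, the following Python code derives the expected frequencies from the row and column totals and then sums the cell contributions. The function names `expected_frequencies` and `chi_squared` are illustrative choices made for this example, not part of any library.

```python
# Observed frequencies: rows are Netflix yes/no, columns are male/female.
observed = [
    [10, 13],  # Netflix yes
    [15, 14],  # Netflix no
]


def expected_frequencies(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared(table):
    """Chi-squared statistic comparing a table with its independence model."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_frequencies(observed)
chi2 = chi_squared(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print([[round(e, 2) for e in row] for row in expected])
print(f"chi2 = {chi2:.3f}, df = {df}")
```

The rounded expected values reproduce the second table (11.06, 11.94, 13.94, 15.06). With one degree of freedom, the resulting statistic of about 0.35 is far below the 5 % critical value of 3.841.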

## Chi-square distribution test

If a variable has two or more categories, the differences in the frequencies of the individual categories can be examined.

The **Chi-square distribution test**, or **Goodness-of-fit test**, checks whether
the frequencies of the individual characteristic values in the sample correspond to the
frequencies of a defined distribution. In most cases, this defined distribution
corresponds to that of the population. In this case, it is tested whether the sample
comes from the respective population.

For market researchers it could be of interest whether there is a difference in the market penetration of the three video streaming services Netflix, Amazon and Disney between Berlin and the whole of Germany. The expected frequency is then the distribution of streaming services throughout Germany and the observed frequency results from a survey in Berlin. In the following tables the fictitious results are shown:

##### Observed frequency in Berlin:

| Video Service | Frequency |
|---|---|
| Netflix | 25 |
| Amazon | 29 |
| Disney | 13 |
| Other or none | 20 |

##### Expected frequency (all Germany):

| Video Service | Frequency |
|---|---|
| Netflix | 23 |
| Amazon | 26 |
| Disney | 16 |
| Other or none | 22 |

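The Chi-square then results in

$$\chi^2 = \frac{(25-23)^2}{23} + \frac{(29-26)^2}{26} + \frac{(13-16)^2}{16} + \frac{(20-22)^2}{22} \approx 1.26$$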
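The Berlin example can be checked with a short Python sketch. The dictionary names are illustrative. The critical value 7.815 (5 % significance level, 3 degrees of freedom) is taken from a standard chi-squared table and is not stated in the example itself.

```python
# Observed counts from the Berlin survey and expected counts for all of Germany.
observed = {"Netflix": 25, "Amazon": 29, "Disney": 13, "Other or none": 20}
expected = {"Netflix": 23, "Amazon": 26, "Disney": 16, "Other or none": 22}

# A goodness-of-fit test compares counts that describe the same total.
assert sum(observed.values()) == sum(expected.values())

# Contribution of each category to the chi-squared statistic.
contributions = {
    service: (observed[service] - expected[service]) ** 2 / expected[service]
    for service in observed
}
chi2 = sum(contributions.values())

# One categorical variable with k categories has k - 1 degrees of freedom.
df = len(observed) - 1

# Assumption: critical value for alpha = 5 % and df = 3 from a standard table.
CRITICAL_VALUE_DF3 = 7.815
significant = chi2 > CRITICAL_VALUE_DF3

for service, value in contributions.items():
    print(f"{service:<14} {value:.3f}")
print(f"chi2 = {chi2:.3f}, df = {df}, significant = {significant}")
```

Under these assumptions the statistic is about 1.26, so the fictitious Berlin sample does not differ significantly from the distribution for all of Germany.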

## Chi-square homogeneity test

The Chi-square homogeneity test can be used to check whether two or more samples come from the same population. One question could be whether the subscription frequencies of the three video streaming services Netflix, Amazon and Disney differ between age groups. As a fictitious example, a survey is conducted in three age groups with the following result:

##### Observed frequency:

| Age | 15-25 | 25-35 | 35-45 |
|---|---|---|---|
| Netflix | 25 | 23 | 20 |
| Amazon | 29 | 30 | 33 |
| Disney | 11 | 13 | 12 |
| Other or none | 16 | 24 | 26 |

As with the Chi-square independence test, this result is compared with the table that would result if the distribution of streaming providers were independent of age.
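A possible way to build that comparison table is sketched below in Python. Under homogeneity, every age group is assumed to share the pooled distribution of services, so each expected count is the pooled share multiplied by the group size. The variable names are illustrative, and the critical value 12.592 (5 % level, 6 degrees of freedom) comes from a standard chi-squared table rather than from the example.

```python
age_groups = ["15-25", "25-35", "35-45"]
observed = {
    "Netflix": [25, 23, 20],
    "Amazon": [29, 30, 33],
    "Disney": [11, 13, 12],
    "Other or none": [16, 24, 26],
}

# Number of respondents per age group (column totals) and overall.
group_sizes = [sum(col) for col in zip(*observed.values())]
n = sum(group_sizes)

# Pooled share of each service across all age groups.
pooled_share = {service: sum(counts) / n for service, counts in observed.items()}

# Expected counts if every age group followed the pooled distribution.
expected = {
    service: [pooled_share[service] * size for size in group_sizes]
    for service in observed
}

chi2 = sum(
    (o - e) ** 2 / e
    for service in observed
    for o, e in zip(observed[service], expected[service])
)
df = (len(observed) - 1) * (len(age_groups) - 1)

# Assumption: critical value for alpha = 5 % and df = 6 from a standard table.
CRITICAL_VALUE_DF6 = 12.592

print(f"chi2 = {chi2:.2f}, df = {df}, significant = {chi2 > CRITICAL_VALUE_DF6}")
```

Numerically this is the same computation as in the independence test. Only the interpretation differs: the columns are treated as separate samples.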

## Effect size in the Chi-square test

So far we only know whether we can reject the null hypothesis or not, but it is very often of great interest to know how strong the relationship between the two variables is. This can be answered with the help of the effect size.

In the Chi-square test, Cramér's V can be used to calculate the effect size. Here a value of 0.1 is small, a value of 0.3 is medium and a value of 0.5 is large. DATAtab will of course calculate the effect size for you very easily.

| Effect size | Cramér’s V |
|---|---|
| Small | 0.1 |
| Medium | 0.3 |
| Large | 0.5 |
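For readers who want to compute the effect size by hand, here is a minimal Python sketch. It assumes the usual definition $V = \sqrt{\chi^2 / (n \cdot \min(r-1, c-1))}$, where $n$ is the sample size and $r$ and $c$ are the numbers of rows and columns. That formula is not spelled out in the text above. The helper names `cramers_v` and `effect_label` are hypothetical, and reading the table values as lower bounds is likewise an assumption.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V for a chi-squared value from an n_rows x n_cols table."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


def effect_label(v):
    """Map V to the labels in the table, treating the values as lower bounds."""
    if v >= 0.5:
        return "large"
    if v >= 0.3:
        return "medium"
    if v >= 0.1:
        return "small"
    return "below small"


# Values from the umbrella example in this tutorial: chi2 = 0.153, n = 22, 2x2 table.
v = cramers_v(chi2=0.153, n=22, n_rows=2, n_cols=2)
label = effect_label(v)
print(f"V = {v:.3f} ({label})")
```

For the umbrella data, V is roughly 0.08, which is below the threshold for a small effect.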

### Effect size vs. p-value

Please note that the p-value does not tell you anything about the strength of the correlation or the effect and depends on the sample size! The following points should therefore be considered:

- If there is a correlation in the population, the larger the sample, the more clearly this will be shown in the p-value.
- If the sample is very large, very small correlations can also be detected in the population.
- These small correlations may no longer be relevant under certain circumstances.

Therefore, if there is a small sample and a large sample and there is an equally large effect in both samples, the p-values would still differ. The larger the sample, the smaller the p-value and therefore even very small correlations can be confirmed with a very large sample.
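The dependence of the p-value on the sample size can be made concrete with a small Python experiment. The sketch takes the 2x2 umbrella table from the example in this tutorial and multiplies every cell by 10 and by 100. These larger samples are hypothetical and keep exactly the same proportions. For one degree of freedom the p-value can be computed with the standard library as `erfc(sqrt(chi2 / 2))`. Cramér's V is computed as $\sqrt{\chi^2 / n}$, which is its usual definition for a 2x2 table.

```python
import math


def chi_squared(table):
    """Return the chi-squared statistic and sample size of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def p_value_df1(chi2):
    """Upper-tail probability of the chi-squared distribution with df = 1."""
    return math.erfc(math.sqrt(chi2 / 2))


base = [[5, 7], [5, 5]]  # rows: female, male; columns: umbrella yes, no

results = []
for factor in (1, 10, 100):  # hypothetical larger samples, same proportions
    table = [[cell * factor for cell in row] for row in base]
    chi2, n = chi_squared(table)
    v = math.sqrt(chi2 / n)  # Cramér's V for a 2x2 table
    results.append((n, chi2, p_value_df1(chi2), v))

for n, chi2, p, v in results:
    print(f"n = {n:5d}  chi2 = {chi2:7.3f}  p = {p:.4f}  V = {v:.3f}")
```

The effect size stays the same in all three rows, while the p-value shrinks as the sample grows and eventually falls below 0.05.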

This is where the effect size plays an important role. With the effect size in the Chi-square test, differences can be made comparable across several studies.

## Example chi-squared test

### Independence test

As an example of a chi-squared test where independence is tested, we consider the use of umbrellas. On a rainy day we counted how many women and how many men come to university with an umbrella.

| Gender | Umbrella present |
|---|---|
| female | yes |
| male | yes |
| female | yes |
| female | yes |
| male | yes |
| male | no |
| female | no |
| male | no |
| female | no |
| female | no |
| male | no |
| female | yes |
| male | yes |
| female | yes |
| male | yes |
| male | yes |
| male | no |
| female | no |
| male | no |
| female | no |
| female | no |
| female | no |

##### Question:

Is the difference in the use of an umbrella for women and men statistically significant or random?

This is how it works in the online statistics calculator: after you have copied the
table above into the hypothesis test calculator, you can calculate the chi-squared
test. To do this, simply click on the two variables *Gender* and *Umbrella*. As a
result, you will get (1) the contingency table, (2) the expected frequencies for
perfectly independent variables and (3) the chi-squared test.

| Gender | Umbrella: yes | Umbrella: no | Total |
|---|---|---|---|
| female | 5 | 7 | 12 |
| male | 5 | 5 | 10 |
| Total | 10 | 12 | 22 |

Expected frequencies for perfectly independent variables:

| Gender | Umbrella: yes | Umbrella: no | Total |
|---|---|---|---|
| female | 5.455 | 6.545 | 12 |
| male | 4.545 | 5.455 | 10 |
| Total | 10 | 12 | 22 |

| Chi-squared test | |
|---|---|
| Chi-squared | 0.153 |
| df | 1 |
| p value | 0.696 |

With a significance level of 5% and 1 degree of freedom, the table of chi-squared values gives a critical value of 3.841. Since the calculated chi-squared value of 0.153 is smaller than the critical value, there is no significant difference in this example and the null hypothesis is not rejected. In practical terms, the data provide no evidence that men and women differ in how often they use an umbrella.
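The same result can be reproduced without the online calculator. The following Python sketch uses only the standard library: it counts the raw observations into a contingency table, derives the expected frequencies and computes the statistic. The abbreviations `f`/`m` and `y`/`n` are shorthand introduced here for the 22 rows of the data table, and the p-value uses the closed form `erfc(sqrt(chi2 / 2))`, which holds for one degree of freedom.

```python
import math
from collections import Counter

# The 22 observations from the data table, in order (f/m = gender, y/n = umbrella).
genders = "f m f f m m f m f f m f m f m m m f m f f f".split()
umbrella = "y y y y y n n n n n n y y y y y n n n n n n".split()

# Contingency table: rows female, male; columns umbrella yes, no.
counts = Counter(zip(genders, umbrella))
table = [[counts[(g, u)] for u in ("y", "n")] for g in ("f", "m")]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Expected frequencies for perfectly independent variables.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)
p_value = math.erfc(math.sqrt(chi2 / 2))  # valid for df = 1 only

print(table)
print([[round(e, 3) for e in row] for row in expected])
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

The printed values agree with the calculator output shown above: a chi-squared of 0.153 with 1 degree of freedom and a p-value of 0.696.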

### Distribution test

In one district of Vienna, the party membership of 22 persons was recorded. Now it is to be examined whether the residents of the district (random sample) have the same voting behaviour as the residents of the entire city of Vienna (population).

| Party |
|---|
| Party A |
| Party C |
| Party A |
| Party C |
| Party A |
| Party C |
| Party B |
| Party B |
| Party C |
| Party A |
| Party C |
| Party A |
| Party A |
| Party B |
| Party B |
| Party A |
| Party A |
| Party B |
| Party A |
| Party A |
| Party C |
| Party C |

To calculate the chi-squared test for the example, simply copy the table above into the Hypothesis Test Calculator.

In Vienna, *Party A* has a 40% share and *Party C* has 35%. With these expected
shares, you receive the following results:

| Party | n | Observed Probability | Expected Probability |
|---|---|---|---|
| Party A | 10 | 45.455% | 40% |
| Party C | 7 | 31.818% | 35% |
| Party B | 5 | 22.727% | |
| Total | 22 | 100% | |

| Chi-squared test | |
|---|---|
| Chi-squared | 0.264 |
| df | 2 |
| p | 0.876 |

If the significance level is set at 0.05, the calculated p-value of 0.876 is greater than the significance level. Thus, the null hypothesis is not rejected: the data give no evidence that the voting behavior of the district's residents differs from that of the entire city of Vienna.
