Friedman Test

The Friedman test is a non-parametric statistical test used for analyzing repeated measures data. It is mainly used when the assumptions of normality and homogeneity of variances are not met, making it a robust alternative to repeated measures ANOVA.

What is a dependent sample (repeated measure)? In a dependent sample, the measured values are connected. For example, if a sample is drawn of people who have knee surgery and these people are each surveyed before the surgery and one and two weeks after the surgery, it is a dependent sample. This is the case because the same person was interviewed at multiple time points.

Friedman Test vs. ANOVA with repeated measures

You might rightly say that the analysis of variance with repeated measures tests exactly the same thing, since it also tests whether there is a difference between three or more dependent samples.

That is correct: the Friedman test is the non-parametric counterpart of the analysis of variance with repeated measures. But what is the difference between the two tests?

The analysis of variance tests the extent to which the measured values of the dependent sample differ. The Friedman test, meanwhile, uses ranks rather than the actual measured values.

With three time points, the point in time where a person has the highest value gets rank 1, the point in time with the second highest value gets rank 2 and the point in time with the smallest value gets rank 3. This is done for every person, i.e. for every row. Afterwards, the ranks at each time point are added up.

In the example, the first time point has a rank sum of 7, the second a rank sum of 8 and the third a rank sum of 9. Now we can check how much these rank sums differ.
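The ranking step can be sketched in a few lines of Python. The measured values below are hypothetical, invented only so that the rank sums come out as 7, 8 and 9. The helper name rank_row is also an assumption, and the sketch assumes there are no tied values within a row.

```python
# Hypothetical measurements (invented for illustration):
# one row per person, one column per time point.
data = [
    [9, 7, 5],
    [6, 8, 4],
    [9, 3, 5],
    [2, 5, 8],
]

def rank_row(row):
    """Rank the values of one person: the largest value gets rank 1.

    Assumes there are no ties within the row.
    """
    order = sorted(row, reverse=True)
    return [order.index(value) + 1 for value in row]

# Rank each row separately, then add up the ranks column by column.
ranks = [rank_row(row) for row in data]
rank_sums = [sum(column) for column in zip(*ranks)]

print(ranks)      # [[1, 2, 3], [2, 1, 3], [1, 3, 2], [3, 2, 1]]
print(rank_sums)  # [7, 8, 9]
```

Only the order of the values within each row matters here. Replacing the 9 in the first row with 900 would leave the ranks and rank sums unchanged.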

Why are ranks used? The big advantage is that if you don't look at the mean difference, but at the rank sum, the data doesn't have to be normally distributed.

Simplified, if your data are normally distributed, parametric tests are used. For more than two dependent samples, this is ANOVA with repeated measures.

If your data are not normally distributed, non-parametric tests are used. For more than two dependent samples, this is the Friedman test.

Hypotheses in the Friedman test

This brings us to the research question that you can answer with the Friedman test: is there a difference between more than two dependent groups? The null and alternative hypotheses are therefore:

Null hypothesis: there is no difference between the dependent groups.
Alternative hypothesis: there is a difference between the dependent groups.

Of course, as already mentioned, the Friedman test does not use the true values, but the ranks.

Friedman test example

You might be interested to know whether therapy after a herniated disc has an influence on the patient's perception of pain. For this purpose, you measure the pain sensation before the therapy, in the middle of the therapy and at the end of the therapy. Now you want to know if there is a difference between the different time points.

So, your independent variable is time, or the progress of the therapy over time. Your dependent variable is the perception of pain. You now have a progression of pain perception for each person over time, and you want to know if the therapy has an effect on the pain perception.

Put simply, there are two possible outcomes. Either the pain perception changes over the course of the therapy, which suggests that the therapy has an influence, or it stays the same over time, which suggests that the therapy has no influence.

Calculate Friedman test

Let's say you want to investigate whether there is a difference in the responsiveness of people in the morning, at noon and in the evening. For this purpose, you measured the responsiveness of 7 people in the morning, at noon and in the evening.

In the first step we have to assign ranks to the values. For this we look at each row separately.

In the first row, i.e. for the first person, 45 is the largest value and gets rank 1, then comes 36 with rank 2 and 34 with rank 3. We now do the same for the second row. Here 36 is the largest value and gets rank 1, then comes 33 with rank 2 and 31 with rank 3. We now do this for each row.

Afterwards we can calculate the rank sum for each time of the day, so we simply sum up all ranks in each column. In the morning we get 17, at noon 11 and in the evening 14.

If there were no difference between the time points in terms of responsiveness, we would expect the same rank sum at every time point. This expected rank sum is N · (k + 1) / 2, where N is the number of persons and k is the number of time points. In this case it is 7 · (3 + 1) / 2 = 14. So if there is no difference between morning, noon and evening, we would expect a rank sum of 14 at all 3 time points.

Next we can calculate the Chi² value with the formula Chi² = 12 / (N · k · (k + 1)) · ΣR² − 3 · N · (k + 1). N is the number of persons, i.e. 7, k is the number of time points, i.e. 3, and ΣR² is the sum of the squared rank sums, i.e. 17² + 11² + 14² = 606. Thus we get a Chi² value of 2.57.
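The calculation above can be reproduced with a short Python sketch. It assumes the standard Friedman statistic without a correction for ties, Chi² = 12 / (N · k · (k + 1)) · ΣR² − 3 · N · (k + 1), and the expected rank sum N · (k + 1) / 2. The function name friedman_chi2 is chosen here for illustration only.

```python
def friedman_chi2(rank_sums, n):
    """Friedman Chi² from the rank sums of k time points and n persons.

    Assumes the standard formula without a correction for tied ranks.
    """
    k = len(rank_sums)
    sum_r_squared = sum(r ** 2 for r in rank_sums)
    return 12 / (n * k * (k + 1)) * sum_r_squared - 3 * n * (k + 1)

rank_sums = [17, 11, 14]  # morning, noon, evening (from the example)
n = 7                     # number of persons

# Rank sum expected at every time point if there is no difference.
expected_rank_sum = n * (len(rank_sums) + 1) / 2
chi2 = friedman_chi2(rank_sums, n)

print(expected_rank_sum)  # 14.0
print(round(chi2, 2))     # 2.57
```

If all three rank sums were equal to the expected value of 14, the same function would return 0, which is the smallest possible Chi² value.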

Now we need the number of degrees of freedom. This is given by the number of time points minus 1, so in our case 2.

At this point we can read the critical Chi² value in the critical values table. For this we need the predefined significance level, let's say 0.05, and the number of degrees of freedom. The table gives a critical Chi² value of 5.99. This is greater than our calculated value of 2.57. Thus, the null hypothesis is not rejected: based on this data, no significant difference in responsiveness between the time points can be shown. If the calculated Chi² value were greater than the critical one, we would reject the null hypothesis.
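The table lookup can also be checked numerically. The sketch below relies on a property that holds only for 2 degrees of freedom: the probability of a Chi² value of at least x is exp(−x / 2). For any other number of degrees of freedom this shortcut does not apply, and a table or a statistics library is needed instead. All variable names are chosen for illustration.

```python
import math

# Chi² value of the example, recomputed from the rank sums 17, 11 and 14
# with 7 persons and 3 time points.
chi2 = 12 / (7 * 3 * 4) * (17 ** 2 + 11 ** 2 + 14 ** 2) - 3 * 7 * 4
alpha = 0.05  # predefined significance level

# Valid for 2 degrees of freedom only: P(Chi² >= x) = exp(-x / 2).
critical_value = -2 * math.log(alpha)
p_value = math.exp(-chi2 / 2)

reject_null = chi2 > critical_value

print(round(critical_value, 2))  # 5.99
print(round(p_value, 3))         # 0.276
print(reject_null)               # False
```

Comparing the Chi² value with the critical value and comparing the p-value with the significance level always lead to the same decision.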

Calculate Friedman test with DATAtab

To calculate the Friedman test you can use DATAtab. To do this, go to the Friedman test calculator on DATAtab and copy your own data into the table.

Now we get the results for the Friedman test.

First you get the descriptive statistics. Then you can read the p-value. If you don't know exactly how to interpret the p-value, you can simply look at the interpretation in words: A Friedman test showed that there is no significant difference between the variables, Chi² = 2.57, p = 0.276.

If your p-value is greater than your set significance level, then your null hypothesis is not rejected. The null hypothesis is that there is no difference between the groups. Usually, a significance level of 0.05 is used, so this p-value is greater.

Post-Hoc Test

In addition, DATAtab provides you with a post-hoc test. If your p-value is smaller than 0.05, you can examine here which of the groups actually differ.

Here, two groups are compared in each row, and the null hypothesis that both samples are the same is tested. The "Adjusted p-value" is obtained by multiplying the p-value by the number of tests.

If the post-hoc test gives an adjusted p-value of less than 0.05 for a pair of groups, it is assumed that these two groups are different.
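The adjustment described above, multiplying each p-value by the number of tests, is usually called a Bonferroni correction. A minimal sketch follows. The unadjusted p-values are hypothetical and are not results from the example. Capping the adjusted value at 1 is an assumption made here, since a probability cannot exceed 1.

```python
from itertools import combinations

groups = ["morning", "noon", "evening"]
pairs = list(combinations(groups, 2))  # 3 pairwise comparisons

# Hypothetical unadjusted p-values, invented for illustration only.
raw_p_values = [0.01, 0.04, 0.50]

def adjust_p_values(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    number_of_tests = len(p_values)
    return [min(p * number_of_tests, 1.0) for p in p_values]

adjusted = adjust_p_values(raw_p_values)

for (first, second), p_adj in zip(pairs, adjusted):
    verdict = "different" if p_adj < 0.05 else "no significant difference"
    print(f"{first} vs. {second}: adjusted p = {p_adj:.2f} ({verdict})")
```

With these made-up numbers, the second comparison has an unadjusted p-value below 0.05 but an adjusted p-value of 0.12, so it is no longer counted as a difference after the adjustment.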
