Two Way ANOVA (without repeated measures)
What is a two-way ANOVA?
Two-way (or two-factor) analysis of variance tests whether the mean of a dependent variable differs between independent groups that are formed by two categorical variables, the factors.
What is a factor?
A factor is, for example, a person's gender with the levels male and female, the form of therapy used for a disease with the levels therapy A, B and C, or the field of study with levels such as medicine, business administration, psychology and math.
In the case of variance analysis, a factor is a categorical variable. You use an analysis of variance whenever you want to test whether these categories have an influence on the so-called dependent variable.
For example, you could test whether gender has an influence on salary, whether therapy has an influence on blood pressure, or whether the field of study has an influence on the duration of studies. Salary, blood pressure and study duration are then the dependent variables. In all these cases you now check whether the factor has an influence on the dependent variable.
Since there is only one factor in each of these cases, you would use a single-factor analysis of variance. The exception is gender: because this variable has only two levels, you would use the t-test for independent samples instead.
Two factors
Now you may have another categorical variable that you want to include as well. You might be interested in whether:
- in addition to gender, the highest level of education also has an influence on salary.
- besides therapy, gender also has an influence on blood pressure.
- in addition to the field of study, the university attended also has an influence on the duration of studies.
Now in all three cases you would not have one factor, but two factors each. And since you now have two factors, you use the two-factor analysis of variance.
Using the two-factor analysis of variance, you can now answer three things:
- Does factor 1 have an effect on the dependent variable?
- Does factor 2 have an effect on the dependent variable?
- Is there an interaction between factor 1 and factor 2?
In a one-factor analysis of variance, the groups are formed from the levels of a single factor. In a two-factor analysis of variance, the groups result from the combinations of the levels of the two factors.
Hypotheses
Three statements can be tested with the 2 factorial ANOVA, so there are 3 null hypotheses and therefore 3 alternative hypotheses.
Null hypotheses H_{0} | Alternative hypotheses H_{1} |
---|---|
There are no significant differences in the mean between the groups (factor levels) of the first factor. | There is a significant difference in the mean between the groups (factor levels) of the first factor. |
There are no significant differences in the mean between the groups (factor levels) of the second factor. | There is a significant difference in the mean between the groups (factor levels) of the second factor. |
One factor has no effect on the effect of the other factor. | One factor has an effect on the effect of the other factor. |
Assumptions
To calculate a two-factor analysis of variance without repeated measures, the following assumptions must be met:
- Scale level: The dependent variable should be metric, and the independent variables (factors) should be nominal or ordinal.
- Independence: The measurements should be independent, i.e. the measured value of one group should not be influenced by the measured value of another group. If this were the case, we would need an analysis of variance with repeated measures.
- Homogeneity: The variances in each group should be approximately equal. This can be checked with Levene's test.
- Normal distribution: The data within the groups should be normally distributed.
So the dependent variable could be, for example, salary, blood pressure, or study duration. These are all metric variables. The independent variables should be nominally or ordinally scaled, for example gender, highest level of education, or type of therapy. Note, however, that the rank order of ordinal variables is not used, so this information is lost.
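The homogeneity assumption listed above can be illustrated with a small sketch of Levene's test statistic. This version uses absolute deviations from each group's mean; whether a given statistics program centers on the mean or the median is not stated in this article, so treat that choice as an assumption. The function name `levene_statistic` and the sample values are hypothetical. In a two-factor design, the groups would be the combinations of the factor levels.

```python
from statistics import mean

def levene_statistic(groups):
    """Levene's W based on absolute deviations from each group's mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every value from its own group mean.
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    group_dev_means = [mean(zs) for zs in deviations]
    grand_dev_mean = mean([z for zs in deviations for z in zs])
    # Variation of the deviations between groups and within groups.
    between = sum(
        len(zs) * (m - grand_dev_mean) ** 2
        for zs, m in zip(deviations, group_dev_means)
    )
    within = sum(
        (z - m) ** 2
        for zs, m in zip(deviations, group_dev_means)
        for z in zs
    )
    return (n_total - k) / (k - 1) * between / within

# Hypothetical values: the second group is twice as spread out as the first.
w = levene_statistic([[1, 2, 3], [2, 4, 6]])
print(round(w, 3))
```

The statistic is compared with an F-distribution with k - 1 and N - k degrees of freedom; a large W (small p-value) speaks against equal variances.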
Calculate two-factor ANOVA
Calculating a two-way ANOVA requires several formulas. Let's go through them with an example.
Let's say you work in the marketing department of a bank and you want to find out whether gender and whether a person has studied or not have an influence on their attitude towards retirement planning.
In this example, your two independent variables (factors) are gender (male or female) and study (yes or no). Your dependent variable is attitude toward retirement planning, where 1 means "not important" and 10 means "very important."
But is attitude toward retirement planning really a metric variable? Let's assume that it was measured using a Likert scale and that we can therefore treat the resulting variable as metric.
Mean values
In the first step we calculate the mean values of the individual groups: male and not studied, which is 5.8, then male and studied, which is 5.4. We then do the same for female.
Then we calculate the mean of all male and all female respondents, and of all not studied and all studied respondents. Finally, we calculate the overall mean as 5.4.
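The three kinds of means described above can be sketched in a few lines of Python. The values below are hypothetical and are not the article's data set, so the results differ from the 5.8, 5.4 and 5.4 quoted in the text; the names `cells` and `marginal_mean` are likewise only illustrative.

```python
from statistics import mean

# Hypothetical ratings (NOT the article's data set), keyed by (gender, studied).
cells = {
    ("male", "no"): [6, 4, 5],
    ("male", "yes"): [7, 5, 6],
    ("female", "no"): [5, 3, 4],
    ("female", "yes"): [9, 7, 8],
}

# Mean of each individual group (combination of the two factors).
cell_means = {group: mean(values) for group, values in cells.items()}

def marginal_mean(position, level):
    """Mean of all values whose key has `level` at `position` (0 = gender, 1 = studied)."""
    return mean(
        [x for key, values in cells.items() if key[position] == level for x in values]
    )

# Means of the factor levels and the overall mean.
gender_means = {g: marginal_mean(0, g) for g in ("male", "female")}
studied_means = {s: marginal_mean(1, s) for s in ("no", "yes")}
overall_mean = mean([x for values in cells.values() for x in values])

print(cell_means)
print(gender_means, studied_means, overall_mean)
```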
Sums of squares
With this, we can now calculate the required sums of squares. SS_{tot} is the sum of the squared differences between each individual value and the overall mean.
SS_{btw} is the sum of the squared differences between the group means and the overall mean, each multiplied by the number of values in the group.
The sums of squares of the factors, SS_{A} and SS_{B}, are the sums of the squared differences between the means of the factor levels and the overall mean, each multiplied by the number of values at that factor level.
Now we can calculate the sum of squares for the interaction. It is obtained by calculating SS_{btw} minus SS_{A} minus SS_{B}.
Finally, we calculate the sum of squares for the error. It is calculated similarly to the total sum of squares, so again we use each individual value. In this case, however, instead of subtracting the overall mean from each value, we subtract the respective group mean.
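Written out, the verbal definitions above might look as follows. The notation is an assumption made for this sketch: factor A has a levels (index i), factor B has b levels (index j), every group contains the same number n of values (index k), \bar{x}_{ij} is a group mean, \bar{x}_{i\cdot} and \bar{x}_{\cdot j} are the means of the factor levels, and \bar{x} is the overall mean. The simple subtraction for the interaction holds in this form when the group sizes are equal.

```latex
\begin{align*}
SS_{tot}   &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} \left(x_{ijk} - \bar{x}\right)^2 \\
SS_{btw}   &= n \sum_{i=1}^{a}\sum_{j=1}^{b} \left(\bar{x}_{ij} - \bar{x}\right)^2 \\
SS_{A}     &= n\,b \sum_{i=1}^{a} \left(\bar{x}_{i\cdot} - \bar{x}\right)^2 \\
SS_{B}     &= n\,a \sum_{j=1}^{b} \left(\bar{x}_{\cdot j} - \bar{x}\right)^2 \\
SS_{AB}    &= SS_{btw} - SS_{A} - SS_{B} \\
SS_{error} &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} \left(x_{ijk} - \bar{x}_{ij}\right)^2
\end{align*}
```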
Degrees of freedom
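The required degrees of freedom are as follows, where a and b are the numbers of levels of factor A and factor B, and N is the total number of values:
- df_{A} = a - 1
- df_{B} = b - 1
- df_{AB} = (a - 1) · (b - 1)
- df_{error} = N - a · b
- df_{tot} = N - 1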
Mean squares or variance
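Together with the sums of squares and the degrees of freedom, the variance (mean square) can now be calculated by dividing each sum of squares (SS) by its degrees of freedom (df):
- MS_{A} = SS_{A} / df_{A}
- MS_{B} = SS_{B} / df_{B}
- MS_{AB} = SS_{AB} / df_{AB}
- MS_{error} = SS_{error} / df_{error}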
F value
And now we can calculate the F values. These are obtained by dividing the variance of factor A, factor B or the interaction AB by the error variance.
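The whole chain from sums of squares to F values can be sketched as follows. This is a minimal illustration that assumes equal group sizes; the data are hypothetical (not the article's data set), and the function name `two_way_anova` is made up for this example.

```python
from statistics import mean

# Hypothetical balanced 2x2 data (NOT the article's data set),
# keyed by (level of factor A, level of factor B).
cells = {
    ("male", "no"): [6, 4, 5],
    ("male", "yes"): [7, 5, 6],
    ("female", "no"): [5, 3, 4],
    ("female", "yes"): [9, 7, 8],
}

def two_way_anova(cells):
    """Return sums of squares, degrees of freedom, mean squares and F values."""
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # values per group
    if any(len(values) != n for values in cells.values()):
        raise ValueError("this sketch assumes equal group sizes")
    all_values = [x for values in cells.values() for x in values]
    grand = mean(all_values)

    def level_mean(position, level):
        return mean([x for key, v in cells.items() if key[position] == level for x in v])

    ss = {
        "total": sum((x - grand) ** 2 for x in all_values),
        "between": n * sum((mean(v) - grand) ** 2 for v in cells.values()),
        "A": n * len(levels_b) * sum((level_mean(0, a) - grand) ** 2 for a in levels_a),
        "B": n * len(levels_a) * sum((level_mean(1, b) - grand) ** 2 for b in levels_b),
        "error": sum((x - mean(v)) ** 2 for v in cells.values() for x in v),
    }
    # Interaction: what remains of the between-group variation.
    ss["AB"] = ss["between"] - ss["A"] - ss["B"]

    a, b = len(levels_a), len(levels_b)
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "error": len(all_values) - a * b}
    ms = {name: ss[name] / df[name] for name in df}
    f_values = {name: ms[name] / ms["error"] for name in ("A", "B", "AB")}
    return ss, df, ms, f_values

ss, df, ms, f_values = two_way_anova(cells)
print(ss)
print(df)
print(f_values)
```

Here factor A is gender and factor B is studied, simply because of the order of the dictionary keys.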
p-value
To calculate the p-value, we need the F-value, the degrees of freedom and the F-distribution. We use the F-distribution p-value calculator on DATAtab. Of course, you can also just calculate the example completely with DATAtab, more about that in the next section.
This gives us a p-value of 0.323 for Factor A, a p-value of 0.686 for Factor B, and a p-value of 0.55 for the interaction. None of these p-values is less than 0.05 and thus we retain the respective null hypotheses.
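As a rough sketch of what such a p-value calculator does, the upper-tail probability of the F-distribution can be computed with the Python standard library. The function name `f_survival` is hypothetical, and the numerical integration (Simpson's rule on the regularized incomplete beta function) is a choice made for this example, not the method used by DATAtab. It assumes the error degrees of freedom are at least 2 and loses accuracy for F values very close to 0. In practice a library routine such as `scipy.stats.f.sf` would usually be the more robust choice.

```python
import math

def f_survival(f_value, df1, df2, steps=2000):
    """Upper-tail probability P(F >= f_value) for an F(df1, df2) distribution.

    Uses p = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * f_value), where
    I_x is the regularized incomplete beta function. Assumes df2 >= 2.
    """
    if f_value <= 0:
        return 1.0
    a, b = df2 / 2, df1 / 2
    x = df2 / (df2 + df1 * f_value)
    # Logarithm of the complete beta function B(a, b).
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    # Simpson's rule on [0, x]; `steps` must be even.
    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3 / math.exp(log_beta)

# Example call with a hypothetical F value and degrees of freedom.
p = f_survival(3.0, 2, 10)
print(f"p = {p:.4f}", "reject H0" if p < 0.05 else "retain H0")
```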
Calculating two factorial ANOVA with DATAtab
We take the same example from above. The data is now arranged so that your statistics software can work with it. Each row contains one respondent.
Attitude towards retirement planning | Studied | Gender |
---|---|---|
6 | no | male |
4 | no | male |
5 | no | female |
... | ... | ... |
5 | yes | female |
9 | yes | female |
2 | yes | female |
3 | yes | female |
This example consists of only 20 cases, which of course is not much and gives us very low test power, but it is sufficient as an example.
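To see how this one-row-per-respondent layout maps onto the groups of the two-factor design, here is a small sketch. It uses only the rows visible in the table excerpt; the rows hidden behind "..." are not reproduced, so this is not the full data set of 20 cases.

```python
from collections import defaultdict

# Only the rows visible in the table excerpt: (attitude, studied, gender).
rows = [
    (6, "no", "male"),
    (4, "no", "male"),
    (5, "no", "female"),
    (5, "yes", "female"),
    (9, "yes", "female"),
    (2, "yes", "female"),
    (3, "yes", "female"),
]

# Collect the attitude values for each combination of the two factors.
cells = defaultdict(list)
for attitude, studied, gender in rows:
    cells[(studied, gender)].append(attitude)

for group, values in sorted(cells.items()):
    print(group, values)
```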
To calculate a two factorial analysis of variance online, simply visit datatab.com and copy your own data into this table.
Then click on hypothesis testing. Under this tab you will find a lot of hypothesis tests and depending on which variable you click on, you will get an appropriate hypothesis test suggested.
When you copy your data into the table, the variables appear under the table. If the correct scale level is not automatically detected, you can simply change it under Variable View.
We want to know whether gender and whether you have studied or not have an impact on your attitude towards retirement planning. So we just click on all three variables.
DATAtab will now automatically calculate a two-factor analysis of variance without repeated measures. DATAtab outputs the three null and the three alternative hypotheses, then the descriptive statistics and the Levene test of equality of variance. With the Levene test you can check if the variances within the groups are equal. The p-value is greater than 0.05, so we assume equality of variance within groups for these data.
Next come the results of the two factorial ANOVA.
Interpreting Two Factorial ANOVA
Most important in this table are the three labeled rows. With these three rows, you can test whether the three null hypotheses stated earlier are retained or rejected.
The first row tests the null hypothesis that whether a person has studied or not has no effect on attitude towards retirement planning, the second row tests whether gender has an effect on attitude, and the third row tests the interaction between studied and gender.
You can read the p-value at the end of each row. Let's say we set the significance level at 5%. If our calculated p-value is less than 0.05, the null hypothesis is rejected, and if the calculated p-value is greater than 0.05, the null hypothesis is retained.
Thus, in this case, we see that all three p-values are greater than 0.05 and thus we cannot reject any of the three null hypotheses.
Therefore, neither whether one has studied or not nor gender has a significant effect on attitudes toward retirement planning. And there is also no significant interaction between studied and gender in terms of attitudes toward retirement planning.
If you don't know exactly how to interpret the results, you can also just click on Summary in Words. You can also check here whether the assumptions for the analysis of variance are met at all.
Interaction effect
But what exactly does interaction mean? Let us first have a look at this diagram.
The dependent variable is plotted on the y axis, in our example the attitude towards retirement planning. On the x axis, one of the two factors is plotted; let's just take gender. The other factor is represented by lines with different colors. Green is studied and red is not studied.
The endpoints of the lines are the mean values of the groups, e.g. male and not studied.
In this diagram, one can see that both gender and the variable of having studied or not have an influence on attitudes toward retirement planning. Females have a higher value than males and studied have a higher value than not studied.
Now to the interaction effects. For that we compare these two graphs.
In the first case, we said there is no interaction effect. If a person has studied, they have a value that is, say, 1.5 higher than a person who has not studied.
This increase of 1.5 is independent of whether the person is male or female.
This case is different. Here, persons who have studied also have a higher value, but how much higher depends on whether they are male or female. For males there is a difference of, say, 0.5, and for females there is a difference of 3.5.
So in this case we clearly have an interaction between gender and study, because the effect of one variable depends on the level of the other: how strong the influence of studying is depends on whether the person is male or female.
In this case, we do have an interaction effect, but the direction still remains the same. So females have higher scores than males and studied have higher scores than non-studied.
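The two situations can also be expressed numerically as a "difference of differences". In this sketch only the differences 1.5, 0.5 and 3.5 come from the text; the baseline group means and the function names are assumptions chosen for illustration.

```python
# Hypothetical group means keyed by (gender, studied). Only the differences
# 1.5, 0.5 and 3.5 are taken from the text; the baselines are assumptions.
no_interaction = {
    ("male", "no"): 4.0, ("male", "yes"): 5.5,
    ("female", "no"): 5.0, ("female", "yes"): 6.5,
}
with_interaction = {
    ("male", "no"): 4.0, ("male", "yes"): 4.5,
    ("female", "no"): 5.0, ("female", "yes"): 8.5,
}

def study_effect(cell_means, gender):
    """How much higher the mean is for studied than for not studied."""
    return cell_means[(gender, "yes")] - cell_means[(gender, "no")]

def interaction_contrast(cell_means):
    """Difference between the studying effect for females and for males."""
    return study_effect(cell_means, "female") - study_effect(cell_means, "male")

print(interaction_contrast(no_interaction))    # parallel lines
print(interaction_contrast(with_interaction))  # lines with different slopes
```

A contrast of zero corresponds to parallel lines in the diagram; the further it is from zero, the more the lines diverge. Whether a non-zero contrast in a sample is significant is what the interaction row of the ANOVA tests.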