Paired sample t-test
What is the t-test for dependent samples?
The paired samples t-test is a statistical test that determines whether there is a difference between two dependent groups or samples.
The paired samples t-test, also known as the dependent t-test, tests whether the mean values of two dependent groups differ significantly from each other.
Why do you need the paired t-test?
You need the paired t-test whenever you survey the same group or sample at two points in time. For example, you might be interested in whether a rehabilitation program has a positive effect on physical fitness. Since you can't ask all the people who go to rehab, you use a random sample. You can then use the paired t-test to draw conclusions about the population from the sample.
What are dependent or paired samples?
In dependent samples, the measured values are available in pairs. The pairs result from repeated measurements, parallelization or matching. This can be the case, for example, in longitudinal studies with several measurement points (time series analyses) or in intervention studies with experimental designs (before-after measurement).
An example of dependent sampling is when the weight of a group of people is measured at two points in time. A person can then be uniquely assigned a weight at the first and second measurement time points and the difference in the measured values can be calculated in each case. If more than two measurement times are available, ANOVA with repeated measures is used.
What is the advantage of a dependent t-test over an independent t-test?
The question of whether to use a dependent t-test or an independent t-test is, of course, already determined as part of the study design, and it is not possible to arbitrarily use either one test or the other. Therefore, the question is rather which type of study makes more sense:
- Conducting a study with one group of participants who are measured twice.
- Conducting a study with two separate groups of participants, each measured once.
The major advantage of a repeated-measures design that then uses the dependent t-test is that individual differences between participants can be eliminated. This means that the probability of detecting a (statistically significant) difference, if one exists, is higher with the dependent t-test than with the independent t-test.
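The gain from pairing can be illustrated with a small sketch. It uses the ten before/after scores from this article's fitness example and, purely for comparison, also analyses them as if they were two unrelated groups (which the study design would not actually allow). The variable names are illustrative assumptions.

```python
import math
import statistics

# Before/after scores of the same ten students (fitness example of this article)
before = [60, 70, 40, 41, 40, 40, 45, 48, 30, 50]
after = [61, 71, 38, 39, 38, 33, 55, 56, 38, 68]
n = len(before)

# Paired analysis: the standard error is based on the per-person differences,
# so stable differences between individuals cancel out.
diffs = [b - a for b, a in zip(before, after)]
se_paired = statistics.stdev(diffs) / math.sqrt(n)
t_paired = statistics.mean(diffs) / se_paired

# Analysis that ignores the pairing: the standard error contains the full
# between-person variability of both measurements.
se_indep = math.sqrt(statistics.variance(before) / n + statistics.variance(after) / n)
t_indep = (statistics.mean(before) - statistics.mean(after)) / se_indep

print(f"paired:      SE = {se_paired:.2f}, t = {t_paired:.2f}")
print(f"independent: SE = {se_indep:.2f}, t = {t_indep:.2f}")
```

The mean difference is the same in both cases, but the paired standard error is much smaller because the two measurements of each person are strongly correlated, so the paired test statistic is larger in absolute value.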
Example of a paired sample t-test
The t-test for dependent samples has numerous applications. Here are three examples.
Medical example:
You want to check whether a new drug increases memory performance. You will test the memory performance of 40 people before and after they take the medicine.
Technical example:
A screw factory complains about very high downtimes at its 5 production plants. You are now to find out whether a newly introduced lubricant has an influence on the downtimes. For this you compare the downtimes of the 5 plants before and after the introduction of the new lubricant.
Social science example:
You want to find out whether the health consciousness of the German population changed between 2010 and 2015. For this, you could rely on data from the Socio-Economic Panel (SOEP). The SOEP is a representative repeat survey of private households in Germany that asks the same people about the same topics at regular intervals. To answer your question, you compare the health consciousness of these persons in 2010 and 2015.
Research Question and hypotheses
In order to calculate a t-test for dependent samples, you first need to define a research question and the hypotheses.
Research Question
In a t-test for dependent samples, the general question is: Is there a statistically significant difference between the mean values of two dependent groups?
The questions for the above examples arise as follows:
- Does the new drug help to increase memory performance?
- Does the newly introduced lubricant have an influence on downtimes?
- Has the health consciousness of the German population changed between 2010 and 2015?
Hypotheses
Now the hypotheses can be derived from the question. A hypothesis is a preliminary, i.e. unsubstantiated, assumption that is to be tested. For a t-test for dependent samples, the hypotheses are:
- Null hypothesis H_{0}: The mean values of the two dependent groups are equal.
- Alternative hypothesis H_{1}: The mean values of the two dependent groups are different.
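Written in terms of the paired differences, the two-sided hypotheses can be stated compactly as follows. The symbol \mu_{diff} for the population mean of the differences is an assumption introduced here for illustration.

```latex
H_{0}: \mu_{diff} = 0
\qquad \text{versus} \qquad
H_{1}: \mu_{diff} \neq 0
```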
Assumptions of the paired t-test
Of course, the assumptions must be checked before calculating the dependent t-test. If assumptions 2 and 3 are not met, the Wilcoxon signed-rank test should be used instead. The Wilcoxon test is the non-parametric counterpart of the paired t-test.
1. There are two dependent groups or samples
As the name paired t-test already suggests, the groups must be dependent, i.e. each value of one group must belong to a value of the other group.
- Suitable: The weight of one and the same person is measured before and after a diet.
- Not suitable: Researchers measure the weight of people who have been on a diet and people who have not.
2. The variables must be interval scaled
In the t-test for dependent samples, the difference between the two dependent values is calculated and then the mean value of these differences. This only makes sense if the values are metric.
- Suitable: The salary of a person (in Euro)
- Not suitable: The educational level of a person
3. The differences of the paired values are normally distributed.
The difference between the paired values must be normally distributed.
- Suitable: The difference between the weights of one person at two points in time.
- Not suitable: The difference in the number of points after throwing two dice.
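Why the dice example does not fit the normality assumption can be seen by enumerating every possible outcome. The following sketch assumes two fair six-sided dice and counts how often each difference occurs among the 36 equally likely combinations.

```python
from collections import Counter
from itertools import product

# All 36 equally likely outcomes of two fair six-sided dice (assumption)
counts = Counter(a - b for a, b in product(range(1, 7), repeat=2))

# Print a simple text histogram of the difference
for diff in sorted(counts):
    print(f"{diff:+d}: {'#' * counts[diff]} ({counts[diff]}/36)")
```

The difference can only take the eleven whole-number values from -5 to 5, with frequencies that rise and fall in a straight line. A normal distribution, in contrast, is continuous and unbounded.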
How does a dependent t-test work?
In the dependent t-test, the difference is calculated for each pair of values. The mean value is then calculated from these differences. Depending on how large this mean value is relative to its standard error, a statement is then made as to how likely such a result would be if there were in fact no difference.
Calculate t-test for dependent samples
For the calculation of the t-test for dependent samples, the difference of each pair from the two groups is first formed. From the resulting differences, the mean value x̄_{diff} is then calculated.
The test statistic t is now calculated in the same way as in the one-sample t-test. If there is no difference between the two groups, the mean value of the differences x̄_{diff} is zero. So the question is whether x̄_{diff} differs from zero. The test statistic t for the t-test for dependent samples is then calculated as
t = \frac{x̄_{diff} - 0}{s_{x̄}} = \frac{x̄_{diff}}{s / \sqrt{n}}
where s_{x̄} = s / \sqrt{n} is the standard error of the mean value and x̄_{diff} = \frac{1}{n} \sum_{i=1}^{n} d_{i}
- d_{i} = Difference between the groups for pair i
- x̄_{diff} = Mean value of the difference between the two groups
- n = Sample size (number of pairs)
- s = Standard deviation of the differences
- s_{x̄} = Estimated standard error of the mean
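The calculation steps described above can be traced in a few lines of code. This is a minimal sketch using only the Python standard library and the before/after scores from this article's fitness example. The variable names are illustrative.

```python
import math
import statistics

before = [60, 70, 40, 41, 40, 40, 45, 48, 30, 50]
after = [61, 71, 38, 39, 38, 33, 55, 56, 38, 68]

# Step 1: form the difference for each pair
diffs = [b - a for b, a in zip(before, after)]

# Step 2: mean and sample standard deviation (n - 1 in the denominator)
n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)

# Step 3: standard error of the mean difference
se = sd_diff / math.sqrt(n)

# Step 4: test statistic and degrees of freedom
t = mean_diff / se
df = n - 1

print(f"mean = {mean_diff:.1f}, sd = {sd_diff:.2f}, se = {se:.2f}")
print(f"t({df}) = {t:.2f}")
```

The printed values agree with the mean, standard deviation, standard error and t value reported in the DATAtab output of the example.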
Effect size dependent t-test
The indication of the effect size is very important for empirical studies. To make a statement about the effect size in a t-test for dependent samples, the following formula can be used
In general, it can be said about the effect size:
- Effect size r: 0.2 small effect
- Effect size r: 0.5 medium effect
- Effect size r: 0.8 large effect
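Two commonly used effect-size measures for paired data, neither confirmed as the one this article intends, are the standardized mean difference d_z = |x̄_{diff}| / s and the correlation-type measure r = \sqrt{t^2 / (t^2 + df)}. The sketch below evaluates both from the summary values reported in this article's fitness example. The names d_z and r and the choice of formulas are assumptions.

```python
import math

# Summary values from the fitness example in this article
mean_diff = -3.3   # mean of the paired differences
sd_diff = 7.5      # standard deviation of the differences
t = -1.39          # test statistic
df = 9             # degrees of freedom (n - 1)

# Assumed measure 1: standardized mean difference of the pairs
d_z = abs(mean_diff) / sd_diff

# Assumed measure 2: correlation-type effect size derived from t and df
r = math.sqrt(t**2 / (t**2 + df))

print(f"d_z = {d_z:.2f}, r = {r:.2f}")
```

Both values fall between the small and the medium benchmark listed above, although the benchmarks 0.2, 0.5 and 0.8 are usually quoted for standardized mean differences.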
Calculate with DATAtab
In the paired t-test example it is examined whether the summer holidays have an effect on the physical fitness of students.
So the question is: Do the summer holidays have an effect on the physical fitness of statistics students? For this purpose, a fitness test is carried out once before and once after the holidays for 10 statistics students (2 measurement points).
Null hypothesis H_{0}
The average difference of the measured value pairs (before and after the holidays) is zero. The summer holidays have no influence on the physical fitness of the students.
Since two test results always come from one student, there is a dependency between the two samples. Therefore the paired t-test is used.
Statistics student | Points before holidays | Points after holidays |
---|---|---|
1 | 60 | 61 |
2 | 70 | 71 |
3 | 40 | 38 |
4 | 41 | 39 |
5 | 40 | 38 |
6 | 40 | 33 |
7 | 45 | 55 |
8 | 48 | 56 |
9 | 30 | 38 |
10 | 50 | 68 |
After copying the table above into the t-test calculator, you can calculate the t-test. The results are as follows:
Statistics
Variable | n | Mean value | Standard deviation | Standard error of the mean |
---|---|---|---|---|
Points before holidays | 10 | 46.4 | 11.452 | 3.622 |
Points after holidays | 10 | 49.7 | 14.095 | 4.457 |
Correlation
Pair | n | Correlation |
---|---|---|
before holidays - after holidays | 10 | 0.847 |
Paired t-test
Pair | t | df | p-value (2-tailed) |
---|---|---|---|
before holidays - after holidays | -1.39 | 9 | 0.197 |
95% confidence interval of the difference
Pair | Mean | Standard deviation | Standard error of the mean | Lower | Upper |
---|---|---|---|---|---|
Points before holidays - Points after holidays | -3.3 | 7.5 | 2.37 | -8.66 | 2.06 |
Interpret t-test for dependent samples
If the calculated p-value is smaller than the specified significance level (usually 5%), the null hypothesis is rejected; otherwise it is retained. For the example above, you can report the results as follows:
The scores before the holidays were lower (M = 46.4, SD = 11.452) than the scores after the holidays (M = 49.7, SD = 14.095). A t-test for dependent samples showed that this difference was not statistically significant, t(9) = -1.392, p = .197, 95% confidence interval [-8.664, 2.064].
This results in a p-value of 0.197 which is above the defined significance level of 0.05. The t-test result is therefore not significant and the null hypothesis is maintained. It is therefore assumed that both samples are from the same population.
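The reported confidence interval and the test decision can be reproduced from the summary values. This is a minimal sketch: the critical value 2.262 is the tabulated two-sided 5% quantile of the t-distribution with 9 degrees of freedom, entered by hand as an assumption instead of being computed.

```python
import math

n = 10
mean_diff = -3.3   # mean of the paired differences
sd_diff = 7.5      # standard deviation of the differences
t_crit = 2.262     # t-table value for df = 9, two-sided, alpha = 0.05

se = sd_diff / math.sqrt(n)
t = mean_diff / se

# 95% confidence interval of the mean difference
lower = mean_diff - t_crit * se
upper = mean_diff + t_crit * se

# Rejecting H0 when |t| exceeds the critical value is equivalent
# to the p-value being below the 5% significance level.
reject_h0 = abs(t) > t_crit

print(f"t = {t:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
print("reject H0" if reject_h0 else "retain H0")
```

The interval contains zero, which matches the non-significant result: a mean difference of zero is compatible with the data.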