# Confidence Interval

## What is the confidence interval?

The confidence interval (CI) is the range in which a parameter (e.g. the mean value) lies with a certain probability.

If several samples are taken from a population, it is very likely that each sample will have a different mean value. However, we want to know the mean of the population, not the mean of the sample. The confidence interval is the range in which the true mean of the population lies with a certain probability.

Caution! The above definition is widely used because it is easy to understand, but not all experts agree that it is correct. The following definition is correct, but more complicated:

The 95% confidence interval (CI) is an interval calculated from sample data; it is one of an infinite sequence of such intervals, each calculated from a new sample, 95% of which include the population parameter. In other words, in the long run, 95% of such intervals include the true mean.
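The long-run reading of this definition can be checked with a small simulation. The following is a minimal sketch, not a definitive implementation: the population (normal, mean 100, standard deviation 15), the sample size, the number of trials and the function name `coverage` are all assumptions chosen for illustration. Each interval is computed as the sample mean plus or minus 1.96 standard errors.

```python
import random
import statistics


def coverage(true_mean=100.0, true_sd=15.0, n=50, trials=2000, seed=1):
    """Fraction of 95% intervals (mean +/- z*s/sqrt(n)) that contain true_mean."""
    rng = random.Random(seed)                   # fixed seed for reproducibility
    z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample from the (assumed) normal population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.mean(sample)
        margin = z * statistics.stdev(sample) / n ** 0.5
        # Count the interval as a hit if it covers the true mean.
        if mean - margin <= true_mean <= mean + margin:
            hits += 1
    return hits / trials


print(coverage())  # a value close to 0.95
```

Any single interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds across many samples.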

## Why do we need the confidence interval?

In statistics, population parameters such as the mean or the variance are often estimated from a sample. However, these are only estimates, and the true value in the population will be somewhere around these estimates. It is therefore very useful to define a range or interval in which the true value is most likely to lie.

## Calculate confidence interval

To calculate the confidence interval, the sampling distribution of the respective estimate (e.g. the sample mean) is required. Assuming that this distribution is normal, the confidence interval for the mean is given by:

$$CI = \bar{x} \pm z \cdot \frac{s}{\sqrt{n}}$$

Where $\bar{x}$ is the sample mean, $z$ is the z-value for the chosen confidence level, $n$ is the sample size and $s$ is the sample standard deviation. Minus and plus give the lower and upper limits of the confidence interval respectively.
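As an illustration, the calculation (sample mean plus or minus the z-value times the standard deviation divided by the square root of the sample size) can be sketched in Python with the standard library. The function name `z_confidence_interval` and the example data are hypothetical.

```python
import statistics


def z_confidence_interval(data, confidence=0.95):
    """Return (lower, upper) for the mean, using the normal distribution."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 in the denominator)
    # Two-sided z-value, e.g. about 1.96 for a 95% confidence level.
    z = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / n ** 0.5
    return mean - margin, mean + margin


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values
lower, upper = z_confidence_interval(data)
print(round(lower, 2), round(upper, 2))  # 3.52 6.48
```

With only eight values, this sample is small enough that the t-distribution would normally be preferred; the data are kept short here only to make the arithmetic easy to follow.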

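If the sample is small, the t-distribution is used instead of the normal distribution. Then the z-value is replaced by t and the formula is:

$$CI = \bar{x} \pm t \cdot \frac{s}{\sqrt{n}}$$

Here $t$ is the critical value of the t-distribution with $n - 1$ degrees of freedom.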
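The t-based variant can be sketched in the same way. Python's standard library has no t-distribution, so this sketch assumes a small hand-entered lookup of two-sided 95% critical values from a standard t-table; the names `T_CRIT_95` and `t_confidence_interval` are hypothetical. In practice a statistics library would supply the critical value, for example `scipy.stats.t.ppf` in SciPy.

```python
import statistics

# Two-sided 95% critical values of the t-distribution, keyed by degrees of
# freedom (taken from a standard t-table; only a few entries for illustration).
T_CRIT_95 = {4: 2.776, 7: 2.365, 9: 2.262, 19: 2.093, 29: 2.045}


def t_confidence_interval(data):
    """Return the 95% (lower, upper) interval for the mean, using t."""
    n = len(data)
    df = n - 1  # degrees of freedom
    if df not in T_CRIT_95:
        raise ValueError(f"no tabulated t-value for df={df}")
    mean = statistics.mean(data)
    margin = T_CRIT_95[df] * statistics.stdev(data) / n ** 0.5
    return mean - margin, mean + margin


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values, n = 8, df = 7
lower, upper = t_confidence_interval(data)
print(round(lower, 2), round(upper, 2))  # 3.21 6.79
```

For the same data, the t-based interval is wider than the z-based one (roughly 3.52 to 6.48), which reflects the extra uncertainty of estimating the standard deviation from a small sample.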

## Confidence interval 95%

To calculate the confidence interval, the probability with which the interval should contain the population mean must be defined. A confidence level of 95% or 99% is very often used for this probability. This probability is also called the confidence coefficient.

For the 95% confidence interval and the 99% confidence interval, the z-values are as follows:

| Confidence level | 95% | 99% |
| --- | --- | --- |
| z-value | 1.96 | 2.58 |
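These z-values are quantiles of the standard normal distribution: for a two-sided interval with confidence level c, the z-value is the quantile at 1 - (1 - c) / 2. A minimal sketch using Python's standard library, where the helper name `z_value` is hypothetical:

```python
from statistics import NormalDist


def z_value(confidence):
    """Two-sided z-value of the standard normal distribution."""
    alpha = 1 - confidence  # probability left outside the interval
    return NormalDist().inv_cdf(1 - alpha / 2)


print(round(z_value(0.95), 2))  # 1.96
print(round(z_value(0.99), 2))  # 2.58
```

The same helper gives the z-value for any other confidence level, such as 90%.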

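If a 95% confidence interval is given, you can informally say that you are 95% sure that the true value of the parameter lies within that interval. Strictly speaking, 95% of the intervals calculated in this way contain the true value.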

## Confidence interval for t-test

A t-test compares differences in means, e.g. you can use a t-test to test whether there is a difference in salary between men and women.

You actually want to make a statement about whether there is a difference in salary in the population. Since you cannot survey the entire population, you use a sample. In this sample, you will very likely observe some difference in salary.

In order to be able to estimate approximately in which range the mean difference in the population lies, you calculate the confidence interval.
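A confidence interval for the mean difference of two independent samples can be sketched as follows. Several assumptions apply: the salary figures are made up, both groups are assumed to have equal variances (pooled-variance t-test), and the critical value 2.306 is the two-sided 95% t-value for 8 degrees of freedom, taken from a standard t-table. The function name `mean_difference_ci` is hypothetical, and this is not necessarily how any particular calculator computes the interval.

```python
import statistics


def mean_difference_ci(group_a, group_b, t_crit):
    """95% CI for mean(group_a) - mean(group_b), assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    # Standard error of the difference between the two means.
    se = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    margin = t_crit * se
    return diff - margin, diff + margin


# Made-up salaries in thousands; 5 + 5 - 2 = 8 degrees of freedom.
men = [52, 48, 55, 60, 50]
women = [47, 45, 51, 49, 43]
lower, upper = mean_difference_ci(men, women, t_crit=2.306)
print(round(lower, 2), round(upper, 2))  # 0.17 11.83
```

In this example the interval does not include 0, which corresponds to a significant two-sided t-test at the 5% level.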

In the t-test calculator on DATAtab you can calculate the confidence interval of the mean difference.

Cite DATAtab: DATAtab Team (2023). DATAtab: Online Statistics Calculator. DATAtab e.U. Graz, Austria. URL https://datatab.net