Regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables.
A regression analysis makes it possible to infer or predict one variable on the basis of one or more other variables.
For example, you might be interested in what influences a person's salary. To find out, you could record each person's highest level of education, weekly working hours, and age.
You could then investigate whether these three variables have an influence on a person's salary. If so, you can predict a person's salary from their highest level of education, weekly working hours, and age.
What are dependent and independent variables?
The variable to be inferred is called the dependent variable (criterion). The variables used for prediction are called independent variables (predictors).
Thus, in the example above, salary is the dependent variable and highest educational attainment, weekly hours worked, and age are the independent variables.
When do I use a regression analysis?
A regression analysis can pursue two goals. On the one hand, it can measure the influence of one or more variables on another variable; on the other hand, it can be used to predict a variable from one or more other variables.
1) Measurement of the influence of one or more variables on another variable
- What influences children's ability to concentrate?
- Do the educational level of the parents and the place of residence affect the future educational attainments of children?
2) Prediction of a variable by one or more other variables
- How long does a patient stay in the hospital?
- What product is a person most likely to buy from an online store?
The regression analysis thus provides information about how the value of the dependent variable changes if one of the independent variables is changed.
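Both goals can be illustrated with a minimal sketch of a simple linear regression in plain Python. The data are invented for illustration and lie exactly on a line, which real data would not; the function name `fit_simple_regression` is hypothetical as well. The slope answers the "influence" question (how much salary changes per additional weekly hour), and the fitted line answers the "prediction" question.

```python
# Invented example data: weekly working hours (independent variable)
# and monthly salary (dependent variable).
hours = [20, 25, 30, 35, 40]
salary = [1500, 1800, 2100, 2400, 2700]


def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_regression(hours, salary)

# Goal 1: measure the influence (change in salary per additional weekly hour).
print(f"slope: {slope:.1f}")  # slope: 60.0

# Goal 2: predict the salary for a person working 32 hours per week.
predicted = intercept + slope * 32
print(f"predicted salary: {predicted:.1f}")  # predicted salary: 2220.0
```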
Types of regression analysis
Regression analyses are divided into simple linear regression, multiple linear regression and logistic regression. The type of regression analysis that should be used depends on the number of independent variables and the scale of measurement of the dependent variable.
| | Number of independent variables | Scale of measurement of the dependent variable | Scale of measurement of the independent variables |
|---|---|---|---|
| Simple linear regression | one | metric | metric, ordinal, nominal |
| Multiple linear regression | multiple | metric | metric, ordinal, nominal |
| Logistic regression | multiple | ordinal, nominal | metric, ordinal, nominal |
If you only want to use one variable for prediction, a simple regression is used. If you use more than one variable, you need to perform a multiple regression. If the dependent variable is nominally scaled, a logistic regression must be calculated. If the dependent variable is metrically scaled, a linear regression is used. Whether a linear or a non-linear regression is appropriate depends on the relationship itself: a linear regression requires a linear relationship between the independent variables and the dependent variable.
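These selection rules can be written down as a small helper. This is only a sketch of the decision logic described in this section; the function name `choose_regression` and the scale labels passed as strings are assumptions made for illustration.

```python
def choose_regression(n_independent, dependent_scale):
    """Suggest a regression type from the number of independent variables
    and the scale of measurement of the dependent variable."""
    if n_independent < 1:
        raise ValueError("at least one independent variable is required")
    if dependent_scale == "metric":
        # Metric dependent variable: linear regression, simple or multiple.
        if n_independent == 1:
            return "simple linear regression"
        return "multiple linear regression"
    if dependent_scale in ("nominal", "ordinal"):
        # Categorical dependent variable: logistic regression.
        return "logistic regression"
    raise ValueError(f"unknown scale of measurement: {dependent_scale!r}")


print(choose_regression(1, "metric"))   # simple linear regression
print(choose_regression(3, "metric"))   # multiple linear regression
print(choose_regression(2, "nominal"))  # logistic regression
```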
Independent variable of the regression
No matter which regression is calculated, the scale level of the independent variables can take any form (metric, ordinal and nominal). However, if there is an ordinal or nominal variable with more than two values, so-called dummy variables must be formed.
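As a rough illustration of dummy coding, the sketch below turns a nominal variable with three values into two 0/1 variables, with one value serving as the reference category. The function name `dummy_code`, the `is_` prefix and the education data are made up for this example.

```python
def dummy_code(values, reference):
    """Create one 0/1 dummy variable per category, except the reference category.

    A case that belongs to the reference category has 0 on every dummy variable.
    """
    categories = sorted(set(values) - {reference})
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


# Invented nominal variable with three values.
education = ["school", "bachelor", "master", "bachelor"]
rows = dummy_code(education, reference="school")

for value, row in zip(education, rows):
    print(value, row)
# school {'is_bachelor': 0, 'is_master': 0}
# bachelor {'is_bachelor': 1, 'is_master': 0}
# master {'is_bachelor': 0, 'is_master': 1}
# bachelor {'is_bachelor': 1, 'is_master': 0}
```

With three categories, two dummy variables are enough: the reference category is identified by all dummy variables being 0.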
Correlation and causality in regression analysis
In the case of linear regression, the independent variable can be used to predict the dependent variable if there is a correlation between the two variables. However, it is important to note that a correlation between two variables does not necessarily mean causality. So what does this mean? If high values of one variable are accompanied by high values of the other variable, this does not mean that the values of one variable increase because the values of the other variable increase.
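A small simulation can make this concrete. In the sketch below, which uses entirely invented variables and numbers, a third variable (temperature) drives both ice cream sales and sunburn cases. The two end up strongly correlated even though neither influences the other.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


random.seed(0)  # fixed seed so the simulation is reproducible

# Temperature is the common cause; the two outcomes never use each other.
temperature = [random.uniform(10, 35) for _ in range(500)]
ice_cream_sales = [2.0 * t + random.gauss(0, 5) for t in temperature]
sunburn_cases = [0.5 * t + random.gauss(0, 2) for t in temperature]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation between ice cream sales and sunburn cases: {r:.2f}")
```

A regression of sunburn cases on ice cream sales would predict well here, but banning ice cream would not prevent a single sunburn.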
Examples of a regression
Simple linear regression
Does the weekly working time have an influence on the hourly wage of employees?
Multiple linear regression
Do the weekly working time and the age of employees have an influence on their hourly wage?
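A question like this could be answered with a least-squares fit on two independent variables. The following is a minimal sketch in plain Python that solves the normal equations by Gaussian elimination; the data are invented and noise-free, and the function names are hypothetical. In practice a statistics package would be used instead.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Return [intercept, b1, b2, ...] minimizing the sum of squared errors."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented data: (weekly working time, age) and hourly wage.
# The wages were generated as 5 + 0.2 * hours + 0.3 * age, without noise.
predictors = [(20, 25), (30, 40), (40, 30), (35, 50), (25, 45), (45, 35)]
hourly_wage = [16.5, 23.0, 22.0, 27.0, 23.5, 24.5]

coefficients = fit_multiple_regression(predictors, hourly_wage)
b0, b_hours, b_age = coefficients
print(f"intercept: {b0:.2f}, per weekly hour: {b_hours:.2f}, per year of age: {b_age:.2f}")
```

Each coefficient describes how the hourly wage changes when that variable increases by one unit while the other is held constant.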
Logistic regression
Do the weekly working time and the age of employees have an influence on the probability that they are at risk of burnout?
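For the burnout question, the dependent variable has only two values, so a logistic regression models the probability of one of them. The sketch below fits such a model by plain gradient descent on standardized predictors. The data, the function names and the settings `learning_rate` and `epochs` are all assumptions chosen for illustration; statistical software typically uses a different fitting procedure.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]
    return scaled, means, sds


def fit_logistic_regression(rows, labels, learning_rate=0.1, epochs=3000):
    """Return [intercept, w1, w2, ...] fitted by gradient descent on the log-loss."""
    n = len(rows)
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for r, label in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], r)))
            error = p - label
            grad[0] += error
            for j, v in enumerate(r):
                grad[j + 1] += error * v
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad)]
    return weights


def predict_probability(weights, means, sds, row):
    """Predicted probability of burnout risk for one unscaled (hours, age) row."""
    scaled = [(v - m) / s for v, m, s in zip(row, means, sds)]
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], scaled)))


# Invented data: (weekly working time, age) and burnout risk (1 = at risk).
employees = [(30, 25), (35, 40), (38, 30), (40, 50), (44, 40),
             (41, 40), (48, 28), (50, 55), (55, 38), (58, 40)]
at_risk = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

scaled_rows, means, sds = standardize(employees)
weights = fit_logistic_regression(scaled_rows, at_risk)

p_low = predict_probability(weights, means, sds, (30, 40))
p_high = predict_probability(weights, means, sds, (60, 40))
print(f"P(burnout risk | 30 h/week, age 40) = {p_low:.2f}")
print(f"P(burnout risk | 60 h/week, age 40) = {p_high:.2f}")
```

The predictors are standardized before fitting so that gradient descent behaves well; the weights therefore refer to standard deviations of working time and age, not to hours and years.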
Only three simple steps are necessary and the regression calculator will give you all important key figures:
1. Copy your data into the table of the statistics calculator
2. Click on "Regression"
3. Select a dependent variable and one or more independent variables
If one of the independent variables has a categorical level of measurement (ordinal or nominal), dummy variables are automatically generated and a reference category is defined. As soon as a series contains only numbers, the statistics calculator automatically defines it as a metric variable.