# Causality

Causality means that there is a clear cause-effect relationship between two variables: there is causation when action A brings about outcome B. A common mistake in interpreting statistics is to infer causality whenever a correlation is present. A correlation, however, only shows that two variables are related; it does not show that one causes the other.

## Causality and correlation

Correlation analysis shows whether there is a relationship between two variables and how strong it is. A correlation alone, however, does not reveal in which direction the relationship runs, i.e., whether X influences Y or Y influences X. To determine this, it must first be checked whether causality exists.

## Why is correlation not causality?

If there is a correlation between variable X and variable Y, this does not mean that the two variables are causally related. It could be, for example, that the correlation is due entirely to a third variable Z that influences both, so that neither X has an influence on Y nor Y on X.
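The third-variable case can be illustrated with a small simulation. This is only a sketch on invented data: the variable names, the noise model, and the helper `pearson_r` are assumptions made for illustration. Here X and Y are both generated from Z plus independent noise, so neither influences the other, yet they correlate (the theoretical value under this setup is 0.5). Once the contribution of Z is subtracted, the correlation essentially disappears.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)  # fixed seed so the run is reproducible
n = 5000

# Z is the common cause (assumption of this simulation).
z = [rng.gauss(0, 1) for _ in range(n)]
# X and Y each depend on Z plus their own independent noise;
# there is no direct link between X and Y.
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)

# Subtract the part explained by Z (its coefficient is 1 by construction)
# and correlate what is left over.
x_rest = [xi - zi for xi, zi in zip(x, z)]
y_rest = [yi - zi for yi, zi in zip(y, z)]
r_rest = pearson_r(x_rest, y_rest)

print(f"corr(X, Y)               = {r_xy:.2f}")
print(f"corr(X, Y) with Z removed = {r_rest:.2f}")
```

In real data the influence of Z is not known in advance and would have to be estimated, for example with a regression of X and Y on Z.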

## Causality and regression

If there is a causal relationship between two variables, a regression analysis can be used to predict one variable from the other. Care must be taken that the direction is correct: a regression model predicts the dependent variable from the independent variable, not the other way around.

By defining one variable as the predictor and the other as the criterion, a regression already presupposes a causal direction. This direction should therefore be justified on the basis of theory.

Therefore, causality, or the direction of effect, must first be derived theoretically before it can be assumed in a regression model. One cannot "search" for causality with a regression; a regression can only be interpreted in causal terms if a causal relationship has been assumed beforehand.
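Why a regression cannot "search" for the direction can be seen in a short sketch. The data and the helper `fit_line` are hypothetical and serve only as illustration. A simple linear regression is fitted once with A as predictor and once with B as predictor. Both fits have exactly the same R², so the fit itself gives no hint as to which direction is the causal one.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Invented example values for two variables A and B.
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

# Direction 1: A is the predictor, B is the criterion.
slope_ab, intercept_ab, r2_ab = fit_line(a, b)
# Direction 2: B is the predictor, A is the criterion.
slope_ba, intercept_ba, r2_ba = fit_line(b, a)

print(f"B predicted from A: slope={slope_ab:.3f}, R^2={r2_ab:.4f}")
print(f"A predicted from B: slope={slope_ba:.3f}, R^2={r2_ba:.4f}")
```

Which of the two models is the meaningful one has to be decided from theory or from the temporal order, not from the numbers.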

By the way, you can easily run a regression analysis and a correlation analysis online with DATAtab.

## Causal Models for Regression

Does linear regression imply causation? No: neither correlation nor regression can by itself indicate causation. A causal model involves regression or correlation analysis and, in addition, a strong theory linking the two or more variables.

## Assumptions for causality

There are two prerequisites for causality. First, there must be a significant relationship, i.e., a significant correlation. Second, the direction of the relationship must be established, which can be done in two ways: either (1) there is a temporal sequence of the variables (e.g., variable A was collected before variable B), or (2) there is a well-founded and plausible theory about the direction in which the causal relationship runs.

If the second condition is met by neither option (1) nor option (2), i.e., there is neither a temporal order nor a well-founded theory to substantiate causality, then one can only speak of a correlation, never of causality. In that case, it cannot be said that variable A influences variable B or vice versa.
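The two conditions can be written down as a small decision rule. The function name, its parameters, and the returned wording are hypothetical and chosen only for this sketch; the logic follows the conditions described above.

```python
def interpret_relationship(significant_correlation, temporal_order, theory_supports_direction):
    """Apply the two prerequisites for a causal interpretation.

    significant_correlation:   condition 1, a significant correlation exists
    temporal_order:            option (1) of condition 2
    theory_supports_direction: option (2) of condition 2
    """
    if not significant_correlation:
        # Condition 1 fails: there is nothing to interpret causally.
        return "no significant relationship"
    if temporal_order or theory_supports_direction:
        # Condition 2 is met by at least one of its two options.
        return "causal interpretation can be justified"
    # Condition 1 holds, but neither option of condition 2 does.
    return "correlation only"


print(interpret_relationship(True, True, False))
print(interpret_relationship(True, False, False))
print(interpret_relationship(False, True, True))
```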

## Example of causality

Let's say the research question is: Is there a causal relationship between the age at which a child speaks their first sentences and later school success?

First, you need to check whether there is a correlation between the two variables; this is done with a correlation analysis. If there is a significant correlation, the second condition must still be checked.
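The first step could look like the following sketch. The data are invented for illustration (age in months at the first sentences and a later school score), and the text does not prescribe a specific significance test. A permutation test is used here as one possible choice because it needs only the standard library. The helper names `pearson_r` and `permutation_p_value` are assumptions of this example.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the correlation between xs and ys.

    The y-values are shuffled repeatedly to break any link to xs; the p-value
    is the share of shuffles whose |r| is at least as large as the observed |r|.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            extreme += 1
    # Add 1 to numerator and denominator so the p-value is never exactly 0.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical values for ten children.
age_first_sentence_months = [14, 16, 18, 19, 20, 22, 24, 25, 27, 30]
school_score = [88, 85, 80, 82, 75, 74, 70, 65, 66, 58]

r_obs = pearson_r(age_first_sentence_months, school_score)
p_value = permutation_p_value(age_first_sentence_months, school_score)

print(f"r = {r_obs:.2f}, permutation p-value = {p_value:.4f}")
print("significant at the 5% level" if p_value < 0.05 else "not significant at the 5% level")
```

A significant result here only satisfies the first condition; the direction still has to be justified separately.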

The second condition can be confirmed either by theory or by a time sequence. In this case, there is a clear time sequence: a child speaks their first sentences long before school success can be measured. If there is a correlation, the direction can therefore only run from the variable "age at which the first sentence is spoken" to the variable "later school success"; the other way around is not possible.

Cite DATAtab: DATAtab Team (2024). DATAtab: Online Statistics Calculator. DATAtab e.U. Graz, Austria. URL https://datatab.net