Cox Regression (Cox Proportional Hazards Survival Regression)
What is Cox Proportional Hazards Survival Regression, or Cox regression for short? Cox regression is used in survival time analysis to determine the influence of different variables on survival time.
The model behind it is the Cox proportional hazards model. So what exactly does survival time analysis mean?
In survival time analysis, the survival times of test subjects are recorded and a survival curve is generated. As a rule, the test subjects have a certain disease.
The survival curve then shows how many of the subjects remain alive over time.
The time considered does not have to be an actual "survival time"; it can be the time until any event of interest. Nevertheless, one still speaks of survival time and survival time analysis.
In general terms, survival time analysis considers a variable that has a start time and an end time when a certain event occurs.

The time between the start time and the event is considered in the survival time analysis. This can be measured in days, weeks or months, for example.
There is now the problem that an investigation cannot last indefinitely. This results from limited time and financial resources and from the fact that one would like to publish the results at some point. Therefore, each study has a start date and an end date. If the event has not been observed for a case by the end of the study, for example because the subject is still alive or dropped out earlier, there is no clear event date and the case is referred to as "censored".
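As a rough illustration of how censoring can be encoded, the following sketch turns dates into a time and a status value. The function name, the dates and the 1/0 coding of the status are assumptions made for this example, not something prescribed by the tutorial.

```python
from datetime import date

def to_time_and_status(start, event, study_end):
    """Return (time_in_days, status); status 1 = event observed, 0 = censored."""
    if event is not None and event <= study_end:
        # The event happened within the study period, so the time is exact.
        return (event - start).days, 1
    # No event seen before the study ended: we only know the subject
    # survived at least until the end of the study (censored case).
    return (study_end - start).days, 0

study_end = date(2023, 12, 31)  # hypothetical end of the study

observed = to_time_and_status(date(2023, 1, 1), date(2023, 3, 2), study_end)
censored = to_time_and_status(date(2023, 6, 1), None, study_end)

print(observed)  # (60, 1)
print(censored)  # (213, 0)
```

Each subject then contributes one pair of time and status to the analysis, and a censored subject still carries the information "no event up to this time".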

Several methods have been developed to deal with this issue. You are welcome to have a look at the tutorial on the Kaplan Meier curve.
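To make the idea of a survival curve with censored cases more concrete, here is a minimal sketch of the Kaplan-Meier estimate in plain Python. The function name and the five data points are made up for illustration; status 1 means the event was observed and 0 means the case was censored.

```python
def kaplan_meier(times, status):
    """Return [(t, S(t))] for every distinct time at which an event occurred."""
    event_times = sorted({t for t, s in zip(times, status) if s == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Events observed exactly at time t (censored cases are not counted).
        events = sum(1 for u, s in zip(times, status) if u == t and s == 1)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

times = [2, 3, 3, 5, 8]    # hypothetical times, e.g. in months
status = [1, 1, 0, 1, 0]   # 1 = event observed, 0 = censored

for t, s in kaplan_meier(times, status):
    print(t, round(s, 3))
```

Censored subjects do not lower the curve themselves, but they leave the group at risk, which changes the size of the later steps.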
Cox Regression Example
Let's go back to the Cox regression. For example, if you want to analyze the survival time after the detection of a disease, you are often not interested in the survival time itself, but in what has an influence on the survival time.
So we want to know if survival time depends on one or more factors, called "predictors."
For simple situations with a single factor that has only two values, the Log Rank Test is used, for example to test whether there is a difference in survival time when two different drugs are given.
If you also want to include a further predictor, such as the age of the subjects, a special type of regression is needed. This is the Proportional Hazards Survival Regression.
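The comparison of two groups can be sketched as follows. This is a minimal version of the standard two-group log-rank test written for illustration, with a made-up function name and made-up data; it is not necessarily how any particular statistics tool computes it.

```python
import math

def log_rank_test(times, status, group):
    """Two-group log-rank test; group entries are 0 or 1. Returns (chi2, p)."""
    observed = expected = variance = 0.0
    for t in sorted({t for t, s in zip(times, status) if s == 1}):
        n = sum(1 for u in times if u >= t)                       # at risk, both groups
        n1 = sum(1 for u, g in zip(times, group) if u >= t and g == 1)
        d = sum(1 for u, s in zip(times, status) if u == t and s == 1)
        d1 = sum(1 for u, s, g in zip(times, status, group)
                 if u == t and s == 1 and g == 1)
        observed += d1
        expected += d * n1 / n        # events expected in group 1 under H0
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = (observed - expected) ** 2 / variance
    # Upper tail of the chi-square distribution with one degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical data: group 1 = drug A, group 0 = drug B.
times = [1, 2, 3, 4]
status = [1, 1, 1, 1]
group = [1, 1, 0, 0]
chi2, p = log_rank_test(times, status, group)
print(round(chi2, 3), round(p, 3))
```

The test compares the number of events observed in one group with the number expected if both groups had the same survival curve.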
This regression is then used to evaluate the effects of the individual predictors on the shape of the survival curve.
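For reference, the proportional hazards model is usually written in the following form. The symbols are notation introduced here for illustration: h_0(t) is the baseline hazard, x_1, ..., x_p are the predictors and b_1, ..., b_p are the regression coefficients.

```latex
h(t \mid x_1, \dots, x_p) = h_0(t)\,\exp\left(b_1 x_1 + b_2 x_2 + \dots + b_p x_p\right),
\qquad
\mathrm{HR}_k = \exp(b_k)
```

Here HR_k is the hazard ratio for a one-unit increase in predictor x_k with the other predictors held fixed. A coefficient of zero corresponds to a hazard ratio of 1, that is, no effect of that predictor.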

In our example, we have as predictors, on the one hand, the drug used and, on the other hand, the age of the subjects. We would now like to know what influence these variables have on the survival time curve. For this purpose, we resort to the Cox regression.
We will now look at the individual steps of the Cox regression using an example. Let's assume that we have the following data and we want to evaluate them.

Each row describes a patient with the corresponding disease. The time indicates when the event or death occurred. Of course, we also have the information about what drug was used and the age of the subjects.
Calculate Cox Regression
In the first step we need to calculate the Cox regression. We do this online using DATAtab and then go through how to interpret the results.
To calculate the Cox Proportional Hazards Survival Regression, we simply go to the Cox Regression Calculator and copy our data into this table, simply using "copy and paste" as in Excel.
Now we click on "Survival Analysis." Depending on which variables you want to select, different methods of survival analysis will be calculated. If you select only the "Time" and the "Status", the Kaplan Meier curve will be displayed.
If you now click on the drug, you will get the log rank test. If you also select the age, the Cox regression will be calculated.
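For readers who want to see what such a calculation involves, the following is a minimal sketch of fitting a Cox model by Newton-Raphson on the partial likelihood, in plain Python. It is an illustration under stated assumptions, not DATAtab's implementation: ties are handled with the Breslow approximation, all function names are made up, and the ten patients are hypothetical data (drug coded 0/1, age in years).

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    a = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for c in range(n):
        pivot_row = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[pivot_row] = a[pivot_row], a[c]
        pivot = a[c][c]
        a[c] = [v / pivot for v in a[c]]
        for r in range(n):
            if r != c:
                factor = a[r][c]
                a[r] = [v - factor * w for v, w in zip(a[r], a[c])]
    return [row[n:] for row in a]

def partial_likelihood(times, status, X, beta):
    """Return (log partial likelihood, gradient, information matrix) at beta."""
    p = len(beta)
    weights = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
    loglik, grad = 0.0, [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for i, (t_i, s_i) in enumerate(zip(times, status)):
        if s_i != 1:
            continue  # censored subjects only contribute through the risk sets
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        s0 = sum(weights[j] for j in risk)
        mean = [sum(weights[j] * X[j][k] for j in risk) / s0 for k in range(p)]
        loglik += math.log(weights[i]) - math.log(s0)
        for k in range(p):
            grad[k] += X[i][k] - mean[k]
            for m in range(p):
                second = sum(weights[j] * X[j][k] * X[j][m] for j in risk) / s0
                info[k][m] += second - mean[k] * mean[m]
    return loglik, grad, info

def cox_fit(times, status, X, max_iter=50, tol=1e-9):
    """Return (coefficients, standard errors) of a Cox proportional hazards fit."""
    beta = [0.0] * len(X[0])
    loglik, grad, info = partial_likelihood(times, status, X, beta)
    for _ in range(max_iter):
        cov = invert(info)
        step = [sum(c * g for c, g in zip(row, grad)) for row in cov]
        while True:  # halve the step until the likelihood does not decrease
            trial = [b + d for b, d in zip(beta, step)]
            new = partial_likelihood(times, status, X, trial)
            if new[0] >= loglik or max(abs(d) for d in step) < tol:
                break
            step = [d / 2 for d in step]
        beta, (loglik, grad, info) = trial, new
        if max(abs(d) for d in step) < tol:
            break
    cov = invert(info)
    se = [math.sqrt(cov[k][k]) for k in range(len(beta))]
    return beta, se

# Hypothetical data for ten patients.
times = [5, 8, 12, 14, 20, 21, 27, 30, 33, 40]
status = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]          # 1 = event, 0 = censored
drug = [0, 1, 0, 0, 1, 0, 1, 0, 1, 1]
age = [70, 55, 62, 48, 66, 59, 51, 64, 45, 58]
X = [[d, a] for d, a in zip(drug, age)]

beta, se = cox_fit(times, status, X)
for name, b, s in zip(["drug", "age"], beta, se):
    print(name, "coefficient:", round(b, 4), "standard error:", round(s, 4))
```

The estimated coefficients and their standard errors are the raw ingredients of a results table like the one discussed in the next section. For real analyses, an established statistics package is the safer choice.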

Interpreting Cox Regression
Now you get the result. Let's have a closer look at it.
In the first column are the names of the variables. The first row shows the variable drug and the second row shows the age of the persons.

The most important values of this table are the estimated regression coefficient and the p-value. Using the p-value, you can tell whether the regression coefficient is significantly different from zero.
So the null hypothesis is that in the population the coefficient is zero. Assuming, as usual, that the significance level is set at 5%, the null hypothesis is rejected for p-values less than 5%, that is, less than 0.05. In that case, the coefficient is significantly different from zero.
In the case of drug, the p-value is less than 0.05 and thus there is a significant difference from zero.
In the case of age, we obtain a p-value of 0.221, which is greater than 0.05. Therefore, in this case, the null hypothesis is not rejected, that is, it is retained, and based on these data we cannot conclude that age has a significant effect on the survival curve.
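How a p-value can be derived from a coefficient is sketched below. The sketch assumes a Wald test, which is a common choice for such tables but is an assumption here, and the coefficient and standard error in the example are made-up numbers, not values from the result table.

```python
import math

def wald_test(coef, se):
    """Return (z, two-sided p-value, hazard ratio) for one coefficient."""
    z = coef / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    hazard_ratio = math.exp(coef)
    return z, p, hazard_ratio

# Hypothetical coefficient and standard error for one predictor.
z, p, hr = wald_test(coef=0.03, se=0.0245)
print("z:", round(z, 3), "p:", round(p, 3), "hazard ratio:", round(hr, 3))
print("significant at 5%:", p < 0.05)
```

The hazard ratio, exp(coefficient), is often easier to interpret than the coefficient itself: a value above 1 means a higher risk of the event per unit of the predictor, a value below 1 a lower risk.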