# Log Rank Test

What is the Log Rank Test? The Log Rank Test is used in survival analysis to compare the distribution of time to event in two or more independent samples.

What does "distribution" mean? What does "time to event" mean, and what is meant by two or more independent samples? Let's start with the last point, two or more independent samples.

## Log Rank Test Example

The log rank test can be used to test whether there is a difference between two or more different groups.

For example, you might want to know if there is a difference between two different materials used for a dental filling.

The next question is, what is the difference? The log rank test checks whether there is a difference in the time it takes for an event to occur.

What does "time to event" mean? The log rank test looks at a variable that has a start time and an end time when a certain event occurs.

Therefore, the log rank test takes into account the time between the start time and the event. This can be measured in days, weeks, months or years.

In our example, we might be interested in whether the material has an effect on the time it takes for the filling to break out again. We have a starting point, which is the time when the filling is placed. We also have an end point or event, which is the time when the filling breaks out again.

We are interested in the time between the start and the end, that is, the time between the insertion of the filling and the breaking out of the filling.

How do we compare the time it takes for the filling to break out again in each of the test subjects?

We do this using the Kaplan-Meier curve or the table used to create this graph. We plot the time on the x-axis and the survival rate on the y-axis.

What is the survival rate? The Kaplan-Meier curve tells us how likely it is that a filling will last longer than a certain amount of time.

Let's say we want to know how likely it is that a filling will last more than 5 years. We read off the value of the Kaplan-Meier curve at 5 years. If that value is 0.7, for example, there is a 70% chance that a filling will last longer than 5 years.
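How the points of such a curve come about can be sketched in a few lines of Python. This is a minimal illustration, not DATAtab's implementation: the function name `kaplan_meier` is made up for this example, and the sample values are the Group 1 times from the calculation section further down (failures after 2, 3, 7, 7 and 8 years, one case censored after 5 years).

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] for each distinct event time."""
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        # Cases that have neither failed nor been censored before time t
        at_risk = sum(1 for u in times if u >= t)
        # Cases whose event happens exactly at time t
        failed = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - failed / at_risk
        curve.append((t, survival))
    return curve

times = [2, 3, 5, 7, 7, 8]   # years until failure or censoring
events = [1, 1, 0, 1, 1, 1]  # 1 = event occurred, 0 = censored

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"S({t}) = {s:.3f}")
```

Each tuple is one step of the curve: the survival probability stays constant between event times and drops at every time at which at least one filling fails.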

But now we want to test whether there is a difference between the two materials, so we plot both curves on the graph.

The question that the log rank test answers is: Is there a significant difference between the two curves? In other words, does the filling material have an effect on the "survival time" of the filling?

## Hypotheses in the Log Rank Test

We can now move on to the null and alternative hypotheses of the log rank test.

**Null hypothesis:** Both groups have identical distribution curves.

**Alternative hypothesis:** Both groups have different distribution curves.

So, as always with a statistical hypothesis test, you get a p-value at the end of the log rank test.

The question is whether or not this p-value is greater than the significance level. In most cases, the significance level is set at 0.05.

If the calculated p-value is greater than 0.05, the null hypothesis is not rejected. The available data then provide no evidence of a difference between the distribution curves of the two groups. This does not prove that the curves are identical.

If the p-value is less than 0.05, the null hypothesis is rejected and it is assumed that the two groups are different.

## Assumptions for the Log Rank Test

The assumptions for the log-rank test are as follows:

**Independence:** The survival times or event times of individuals in each group
should be independent of each other. This assumption implies that the occurrence of an
event (e.g., death or failure) for one individual should not influence the occurrence of
an event for another individual.

**Non-Informative Censoring:** Censoring should not be related to the event being
studied or to the group assignment. In other words, censored and non-censored patients
should not differ in terms of their actual event times. The log-rank test also assumes
that the probability of censoring is the same for all individuals within each group.

**Proportional Hazards:** The ratio of the hazard rates (the risk of an event occurring
at a given moment) of the compared groups should remain approximately constant over
time. The hazard rates themselves may change over time, as long as their ratio stays the
same. If the survival curves cross, the log-rank test may fail to detect a difference
between the groups.

## Calculating the Log Rank Test

In the next step, we will discuss the formulas for the log rank test and how to calculate it manually. Suppose we have Group 1 and Group 2 and we want to test whether the two groups have the same survival function or not.

The table above shows the times when either an event occurred or the case was censored. In this table, '1' means the event occurred and '0' means the case was censored.

If we look at our previous example with the filling materials, then each group would have received a different material for the filling. If we assume that time is measured in years, then in Group 1 the first filling would have failed after 2 years, the second filling after 3 years, and so on.

To calculate a log-rank test, we need to combine the tables of Group 1 and Group 2. To do this, we first write down all the time points that appear in the groups.

These are 2, 3, 4, 6, 7 and 8. It is important that the times when only cases were censored are not included in the table. At time 5 a case was censored, but otherwise 5 does not occur, so we do not include time 5 in this table.
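Collecting these time points can be sketched as follows. The Group 1 values follow the description in this section. The Group 2 values are an assumption: the original table is not reproduced in this text, so hypothetical values were chosen that are consistent with the event times and totals quoted here.

```python
# Each case is (time in years, status); status 1 = event occurred, 0 = censored
group1 = [(2, 1), (3, 1), (5, 0), (7, 1), (7, 1), (8, 1)]
group2 = [(2, 1), (2, 1), (4, 1), (4, 1), (6, 1), (8, 1)]  # ASSUMED values

# Keep only times at which at least one event occurred in either group;
# time 5 drops out because it only appears as a censoring time
event_times = sorted({t for t, status in group1 + group2 if status == 1})
print(event_times)  # [2, 3, 4, 6, 7, 8]
```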

Similar to the Kaplan-Meier curve, we then fill in the columns *m*, *q* and
*n* for groups 1 and 2, respectively. *m* tells us exactly how many cases had
an event at that time.

In group 1, one filling broke out after 2 years, one filling broke out after 3 years, nothing happened at time points 4 and 6, two fillings broke out at time point 7, and one filling broke out at time point 8.

*q* tells us how many cases were censored at each time. Here we only have time 5.
As we have already said, we have not entered this time in the table, so this value is
assigned to the next earlier time, which is 4, so we have a 1 in the third row. *n*
tells us how many cases are still at risk at that time, i.e. have neither had an event
nor been censored before it. We can do the same for the second group.
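As a rough sketch, the three columns for Group 1 can be derived in Python like this. The helper name `risk_table` is made up for this illustration. Censored cases are counted in the row of the latest event time at or before the censoring time, as described above.

```python
def risk_table(group, event_times):
    """Return rows (time, m, q, n) for one group of (time, status) cases."""
    rows = []
    for i, t in enumerate(event_times):
        # Upper bound of the interval that belongs to this row
        nxt = event_times[i + 1] if i + 1 < len(event_times) else float("inf")
        m = sum(1 for u, s in group if u == t and s == 1)       # events at t
        q = sum(1 for u, s in group if t <= u < nxt and s == 0)  # censored in [t, next)
        n = sum(1 for u, _ in group if u >= t)                   # still at risk at t
        rows.append((t, m, q, n))
    return rows

group1 = [(2, 1), (3, 1), (5, 0), (7, 1), (7, 1), (8, 1)]  # (time, status)
event_times = [2, 3, 4, 6, 7, 8]

table1 = risk_table(group1, event_times)
for row in table1:
    print("t=%d  m=%d  q=%d  n=%d" % row)
```

The third row (time 4) has `q = 1` because the case censored at time 5 falls between the event times 4 and 6.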

From the generated tables we can calculate the so-called expected values for each row. For Group 1 and Group 2 this is done using the following equations.
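Written out in the notation used here, the expected values per row look as follows. The formulas are reconstructed from the worked first row below, and the row index *j* is added only for this illustration.

```latex
e_{1j} = \frac{n_{1j}}{n_{1j} + n_{2j}} \,(m_{1j} + m_{2j}), \qquad
e_{2j} = \frac{n_{2j}}{n_{1j} + n_{2j}} \,(m_{1j} + m_{2j})
```

In words: the events observed in a row in both groups together are split between the groups in proportion to how many cases each group still has at risk.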

Let's take a closer look at the first row. *n1* is 6 and *n2* is also 6, so we
have 6 divided by 6 plus 6, which is 0.5. *m1* is 1 and *m2* is 2, so we have 1 plus 2,
which is 3. Multiplying 0.5 by 3 results in 1.5. We repeat this for all the rows and for both groups.

Now we need the observed values minus the expected values. For this we simply calculate
*m1* minus *e1* or *m2* minus *e2*.
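The expected values and the observed-minus-expected differences for every row can be sketched in Python as follows. Group 1 follows the text. The Group 2 values are an assumption, since the original table is not reproduced here: they are hypothetical values chosen to be consistent with the first row and the totals quoted in this section.

```python
# Each case is (time in years, status); status 1 = event occurred, 0 = censored
group1 = [(2, 1), (3, 1), (5, 0), (7, 1), (7, 1), (8, 1)]
group2 = [(2, 1), (2, 1), (4, 1), (4, 1), (6, 1), (8, 1)]  # ASSUMED values

event_times = sorted({t for t, s in group1 + group2 if s == 1})

def at_risk(group, t):
    """Number of cases with neither an event nor censoring before t."""
    return sum(1 for u, _ in group if u >= t)

def events_at(group, t):
    """Number of events exactly at time t."""
    return sum(1 for u, s in group if u == t and s == 1)

o_minus_e = 0.0  # running sum of m2 - e2 over all rows
for t in event_times:
    n1, n2 = at_risk(group1, t), at_risk(group2, t)
    m1, m2 = events_at(group1, t), events_at(group2, t)
    e1 = n1 / (n1 + n2) * (m1 + m2)
    e2 = n2 / (n1 + n2) * (m1 + m2)
    o_minus_e += m2 - e2
    print(f"t={t}: e1={e1:.2f} e2={e2:.2f} m1-e1={m1 - e1:+.2f} m2-e2={m2 - e2:+.2f}")

print(f"Sum of m2 - e2: {o_minus_e:.4f}")
```

With these assumed data the first row gives `e1 = e2 = 1.5`, as in the text. The differences for Group 1 and Group 2 always have the same size and opposite signs in each row.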

Now we can calculate what is called the log rank statistic. We can use either the values from group 1 or the values from group 2. We just take the values from group 2.

*O2* minus *E2* is obtained by adding these values in the column "m2-e2",
which is 1.15. But what is the variance? The variance is given by this formula.

We first calculate the following expression for each row and then add them up. In our case we get 1.78.
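The formula itself is not reproduced in this text. The variance commonly used for the two-group log rank test, written in the notation of this section with an added row index *j*, is:

```latex
\operatorname{Var}(O_2 - E_2) = \sum_{j}
\frac{n_{1j}\, n_{2j}\, (m_{1j} + m_{2j})\, (n_{1j} + n_{2j} - m_{1j} - m_{2j})}
     {(n_{1j} + n_{2j})^{2}\, (n_{1j} + n_{2j} - 1)}
```

For the first row, with *n1* = *n2* = 6, *m1* = 1 and *m2* = 2, this gives 6 · 6 · 3 · 9 / (12² · 11) ≈ 0.61.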

We can now calculate the log rank statistic. In our example we get 0.74.
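The statistic is the squared sum of the differences divided by the variance, i.e. (*O2* minus *E2*)² divided by Var(*O2* minus *E2*). The whole calculation can be sketched in Python as shown below. The function name `log_rank` is made up for this illustration, and the Group 2 values are an assumption (the original table is not reproduced here), chosen to be consistent with the totals quoted in this section.

```python
# Each case is (time in years, status); status 1 = event occurred, 0 = censored
group1 = [(2, 1), (3, 1), (5, 0), (7, 1), (7, 1), (8, 1)]
group2 = [(2, 1), (2, 1), (4, 1), (4, 1), (6, 1), (8, 1)]  # ASSUMED values

def log_rank(g1, g2):
    """Return (O2 - E2, variance, statistic) for two groups of (time, status)."""
    o_minus_e = variance = 0.0
    for t in sorted({t for t, s in g1 + g2 if s == 1}):
        n1 = sum(1 for u, _ in g1 if u >= t)             # at risk in group 1
        n2 = sum(1 for u, _ in g2 if u >= t)             # at risk in group 2
        m1 = sum(1 for u, s in g1 if u == t and s == 1)  # events in group 1
        m2 = sum(1 for u, s in g2 if u == t and s == 1)  # events in group 2
        n, m = n1 + n2, m1 + m2
        o_minus_e += m2 - n2 / n * m                     # observed minus expected
        if n > 1:  # a single case at risk contributes no variance
            variance += n1 * n2 * m * (n - m) / (n ** 2 * (n - 1))
    return o_minus_e, variance, o_minus_e ** 2 / variance

o_minus_e, variance, statistic = log_rank(group1, group2)
print(f"O2 - E2 = {o_minus_e:.2f}, variance = {variance:.2f}, statistic = {statistic:.2f}")
```

With unrounded intermediate values the statistic for these assumed data comes out at about 0.75. Using the rounded values 1.15 and 1.78 gives 1.15² / 1.78 ≈ 0.74, the value quoted above.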

Under the null hypothesis, the log rank statistic approximately follows a Chi2 distribution. Therefore, the p-value can be determined using the Chi2 distribution. The required degrees of freedom are given by the number of groups minus 1.
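For two groups there is 1 degree of freedom, and the p-value can then be computed with the Python standard library alone. The sketch below uses the identity that a Chi2 variable with 1 degree of freedom is the square of a standard normal variable. The helper name `chi2_sf_df1` is made up for this example. For more than two groups a general Chi2 routine such as `scipy.stats.chi2.sf` would be needed instead.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability P(X > x) for a Chi2 variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

statistic = 0.74  # log rank statistic from the worked example
alpha = 0.05      # significance level

p_value = chi2_sf_df1(statistic)
print(f"p = {p_value:.3f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

For the worked example the p-value is well above 0.05, so the null hypothesis is not rejected.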

## Calculate Log Rank Test with DATAtab

Now you may be wondering what the easiest way to calculate the Log Rank Test is. This is best done online with DATAtab. The steps are:

- Go to the statistics calculator on datatab.net
- Copy your own data into the table
- Click on "Plus" and then on the tab Survival Analysis

Here we have a column with the time, then a column telling us whether the event occurred or not. Here 1 stands for "occurred" and 0 for "censored". Then we have the variable "Material" with the two materials A and B.

Depending on what you select here, the appropriate methods will be calculated for you. If you select only the variable "Time", the Kaplan-Meier survival curve will be displayed with the corresponding table. If you do not select a status variable, it is assumed that no case is censored. If some cases are censored, click under "Status" on the variable that records whether the event has occurred or not.

If a further factor is now selected, e.g. "Material", the log-rank test will be calculated. You can read the null and alternative hypotheses and see the results of the Log Rank Test listed.

The null hypothesis is: There is no difference between groups A and B in terms of the distribution of time until the event occurs.

And the alternative hypothesis is: There is a difference between groups A and B in the distribution of the time until the event occurs.

Below you can read the results and you can see the p-value for the log rank test. If you don't know exactly how this is interpreted, you can simply click on Summary in words:

A log-rank test was calculated to find out if there is a difference between groups A and B in terms of the distribution of time until the event occurs.

For the data at hand, the log-rank test showed that there is a difference between the groups in terms of the distribution of the time until the event occurs, p < 0.001. The null hypothesis is thus rejected.

In general, if the p-value is greater than the pre-determined significance level, which in most cases is 5%, the null hypothesis is not rejected, i.e. there is no significant difference.

If the p-value is smaller, the null hypothesis is rejected and it is assumed on the basis of the available data that there is a difference between the curves.
