Log Rank Test
What is the Log Rank Test? The Log Rank Test is used in survival time analysis and compares the distribution of the time until an event occurs across two or more independent samples.
Now we will discuss each of the items step by step. What does "distribution" mean, what does "time until an event occurs" mean, and what is meant by two or more independent samples? Let's start with the last point, the topic of two or more independent samples.
Log Rank Test Example
With the Log Rank Test you can check if there is a difference between two or more different groups.
For example, you might be interested in whether there is a difference between two different materials used for a dental filling.
The next question is, in relation to what is there a difference? The Log Rank Test checks if there is a difference in the time until an event occurs.
What does "time until an event occurs" mean? The Log Rank Test looks at a variable that has a start time and an end time when a certain event occurs.
The time between the start time and the event is considered in the Log Rank Test. This can be measured in days, weeks or months.
In our example, we might be interested in whether the material has an influence on the time after which the tooth filling breaks out again.
In this example, we have a starting point, which is the moment when the filling is placed. We also have an end point or event, which is the time when the filling breaks out again.
We are interested in the time between the start and the end, that is, the time between the insertion of the filling and the breaking out of the filling.
How do we compare the respective time until the filling breaks out again in the test subjects?
We do this with the help of the Kaplan Meier curve or with the table that is used to create this graph. Here, time is plotted on the x-axis and survival rate is plotted on the y-axis.
What is the survival rate? The Kaplan Meier curve tells us how likely a filling is to last longer than a certain amount of time.
Let's say you want to know how likely it is that a filling will last longer than 5 years. If the Kaplan Meier curve has the value 0.7 at 5 years, you can read from it that it is 70% likely that a filling will last longer than 5 years.
But now we want to check if there is a difference between two materials, so we enter both curves in the graph.
The question that the Log Rank Test now answers is: Is there a significant difference between the two curves? Or to put it another way: does the filling material have an influence on the "survival time" of the dental filling?
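To make the idea of the survival rate more concrete, here is a minimal sketch of how a Kaplan Meier curve can be calculated for a single group. The function name `kaplan_meier` and the durations are hypothetical and chosen only for illustration; status 1 means the event occurred and 0 means the case was censored.

```python
def kaplan_meier(times, status):
    """Return a list of (time, survival rate) pairs, one per event time.

    status: 1 = event occurred, 0 = censored.
    """
    curve = []
    surv = 1.0
    # Only times at which at least one event occurred change the curve.
    event_times = sorted({t for t, s in zip(times, status) if s == 1})
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        events = sum(1 for x, s in zip(times, status) if x == t and s == 1)
        surv *= 1 - events / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical durations in years until a filling fails (or is censored).
times = [2, 3, 5, 7, 7, 8]
status = [1, 1, 0, 1, 1, 1]

curve = kaplan_meier(times, status)
for t, s in curve:
    print(f"after {t} years: survival rate {s:.2f}")
```

Each value is the estimated probability that a filling lasts longer than the given time. The censored case still counts as "at risk" up to the time it was censored.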
Hypotheses in the Log Rank Test
Now we can come to the null and alternative hypothesis of the Log Rank Test.
- Null hypothesis: Both groups have identical distribution curves.
- Alternative hypothesis: Both groups have different distribution curves.
As with any statistical hypothesis test, the log rank test yields a p-value.
The question is whether this p-value is greater than the significance level or not. The significance level is set to 0.05 in most cases.
If the calculated p-value is greater than 0.05, the null hypothesis is retained. Based on the available data, there is then no significant evidence that the distribution curves of the groups differ.
If the p-value is less than 0.05, the null hypothesis is rejected, and it is assumed that the distribution curves differ.
Calculating the Log Rank Test
In the next step, we will now discuss the formulas of the Log Rank Test and how it is calculated by hand.
Suppose we have group 1 and group 2 and we now want to test whether both groups have the same survival time function or not.
For each group, the data consist of the times at which either an event occurred or the respective case was censored, together with a status. Here "1" means the event occurred and "0" means the case was censored.
If we look at our previous example with the filling materials, each group would have received a different material for the filling. If we assume that time is measured in years, then for group 1 the first filling would have failed after 2 years, the second filling after 3 years, and so on.
In order for us to calculate a log rank test, we need to combine both tables. To do this, we first note down all the time points that occur in the groups.
These are 2, 3, 4, 6, 7 and 8. It is important that times at which cases were only censored are not included in the table. At time 5, one case was censored, but no event occurred at time 5, so we do not include time 5 in this table.
Similar to the Kaplan Meier curve, we then fill in the columns m, q and n for groups 1 and 2, respectively. m tells us how many cases had an event at this time, and n tells us how many cases were still at risk at this time, i.e. had neither had an event nor been censored before it.
For group 1, one filling has broken out after 2 years and another one after 3 years, at times 4 and 6 nothing has happened, at time 7 two fillings have broken out, and at time 8 one.
q tells us how many cases were censored at each time point. Here only time 5 is affected. Since time 5 was not included in the table, the censored case is assigned to the closest earlier time point, which is 4, so a 1 is entered there.
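The construction of the columns m, q and n can be sketched in a few lines of Python. This is only an illustration: the function name `group_table` is hypothetical, and the raw data for group 1 are reconstructed from the description above (events at 2, 3, 7, 7 and 8, one case censored at 5).

```python
def group_table(times, status, event_times):
    """Build rows (time, m, q, n) for one group.

    m: events at this time
    q: cases censored from this time up to (not including) the next listed time
    n: cases still at risk at this time
    """
    rows = []
    for i, t in enumerate(event_times):
        nxt = event_times[i + 1] if i + 1 < len(event_times) else float("inf")
        m = sum(1 for x, s in zip(times, status) if x == t and s == 1)
        q = sum(1 for x, s in zip(times, status) if t <= x < nxt and s == 0)
        n = sum(1 for x in times if x >= t)
        rows.append((t, m, q, n))
    return rows


# Time points with at least one event in either group (taken from the text).
event_times = [2, 3, 4, 6, 7, 8]

# Group 1, reconstructed from the description: 1 = event, 0 = censored.
times_1 = [2, 3, 5, 7, 7, 8]
status_1 = [1, 1, 0, 1, 1, 1]

table_1 = group_table(times_1, status_1, event_times)
for t, m, q, n in table_1:
    print(f"t={t}: m={m} q={q} n={n}")
```

The case censored at time 5 shows up as q = 1 in the row for time 4, and it no longer counts as at risk from time 6 onwards.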
We can do the same for the second group. From the resulting tables we can now calculate the so-called expected values for each row. For group 1 the formula is e1 = n1 / (n1 + n2) × (m1 + m2), and for group 2 it is e2 = n2 / (n1 + n2) × (m1 + m2).
Let's take a closer look at the first row. n1 is 6 and n2 is also 6, so we have 6 divided by (6 plus 6). m1 is 1 and m2 is 2, so we have 1 plus 2. This results in e1 = 6 / 12 × 3 = 1.5. We repeat this for all rows and for both groups.
Now we need the observed values minus the expected values. For this we simply calculate m1 minus e1 or m2 minus e2.
We get these values for each group.
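The row-by-row calculation of the expected values and of the observed minus expected values could look like the following sketch. The columns m1 and n1 follow the description of group 1 in the text. The columns m2 and n2 are assumed values for illustration, since the text only states n2 = 6 and m2 = 2 for the first row.

```python
# Rows of the combined table: (time, m1, n1, m2, n2).
# m1 and n1 follow the text; m2 and n2 are ASSUMED for illustration.
rows = [
    (2, 1, 6, 2, 6),
    (3, 1, 5, 0, 4),
    (4, 0, 4, 2, 4),
    (6, 0, 3, 1, 2),
    (7, 2, 3, 0, 1),
    (8, 1, 1, 1, 1),
]


def expected(n_own, n_other, m_total):
    """Expected events in one group: its share of the cases at risk times all events."""
    return n_own / (n_own + n_other) * m_total


table = []
for t, m1, n1, m2, n2 in rows:
    e1 = expected(n1, n2, m1 + m2)
    e2 = expected(n2, n1, m1 + m2)
    table.append((t, e1, e2, m1 - e1, m2 - e2))

for t, e1, e2, d1, d2 in table:
    print(f"t={t}: e1={e1:.2f} e2={e2:.2f} m1-e1={d1:+.2f} m2-e2={d2:+.2f}")
```

In every row the two differences cancel each other out, which is why it does not matter later whether the values of group 1 or of group 2 are used.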
Now we can calculate the so-called log rank statistic. For this we can use either the values of group 1 or of group 2. Here we take the values of group 2.
O2 minus E2 is obtained by adding up the values of m2 minus e2 over all rows, which gives 1.15. But what is the variance of this? The variance is also calculated row by row.
We first calculate the variance term for each row and then sum these terms up. In our case we get 1.78.
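The variance formula itself is not reproduced in the text. A commonly used form, assuming the standard hypergeometric variance for each time point i and using the column names from the table, is:

```latex
\operatorname{Var}(O_2 - E_2)
  = \sum_{i} \frac{n_{1i}\, n_{2i}\, (m_{1i} + m_{2i})\, (n_{1i} + n_{2i} - m_{1i} - m_{2i})}
                  {(n_{1i} + n_{2i})^{2}\, (n_{1i} + n_{2i} - 1)}
```

The expression is symmetric in the two groups, so the same variance applies whether group 1 or group 2 is used.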
With this we can now calculate the log rank statistic, which is the squared difference divided by the variance: (O2 − E2)² / Var(O2 − E2). In our example, we get 1.15² / 1.78 ≈ 0.74.
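Putting the steps together, the whole calculation might be sketched as follows. The function name `logrank_statistic` is hypothetical, the variance term assumes the usual hypergeometric form, and the group 2 columns are again assumed values, so the results only come out close to the figures quoted above rather than reproducing them exactly.

```python
def logrank_statistic(rows):
    """Return (O2 - E2, variance, log rank statistic) for rows (time, m1, n1, m2, n2)."""
    diff = 0.0
    var = 0.0
    for _, m1, n1, m2, n2 in rows:
        n = n1 + n2  # cases at risk in both groups
        m = m1 + m2  # events in both groups
        diff += m2 - n2 / n * m
        if n > 1:  # a single case at risk contributes no variance
            var += n1 * n2 * m * (n - m) / (n ** 2 * (n - 1))
    return diff, var, diff ** 2 / var


# m1 and n1 follow the text; m2 and n2 are ASSUMED for illustration.
rows = [
    (2, 1, 6, 2, 6),
    (3, 1, 5, 0, 4),
    (4, 0, 4, 2, 4),
    (6, 0, 3, 1, 2),
    (7, 2, 3, 0, 1),
    (8, 1, 1, 1, 1),
]

diff, var, stat = logrank_statistic(rows)
print(f"O2 - E2 = {diff:.2f}, variance = {var:.2f}, statistic = {stat:.2f}")
```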
The Log Rank statistic corresponds to a chi-square (Chi2) value. Therefore, the p-value can be determined using the Chi2 distribution. The required degrees of freedom result from the number of groups minus 1.
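For two groups there is 1 degree of freedom, and in that special case the p-value can be obtained with the Python standard library alone. The sketch below relies on the identity that the upper tail of the Chi2 distribution with 1 degree of freedom equals erfc(sqrt(x / 2)). The function name `chi2_sf_df1` is hypothetical, and for more than two groups a statistics library would be needed instead.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of the Chi2 distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


statistic = 0.74  # log rank statistic from the example
alpha = 0.05      # significance level

p_value = chi2_sf_df1(statistic)
reject_null = p_value < alpha

print(f"p = {p_value:.3f}, reject null hypothesis: {reject_null}")
```

With a statistic this small the p-value lies well above 0.05, so the null hypothesis would be retained for the example data.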
Calculate Log Rank Test with DATAtab
What is the easiest way to calculate the Log Rank Test? A convenient option is to do it online with DATAtab. The steps are: (1) go to the statistics calculator on datatab.de, (2) copy your own data into the table, (3) click on "Plus", and (4) click on the tab Survival Analysis.
The example data contain a column with the time and a column that tells us whether the event occurred or not. Here 1 stands for "occurred" and 0 for "censored". Then we have the variable Material with the two materials A and B.
Depending on what you select here, the appropriate methods will be calculated for you.
If you select only the variable Time, the Kaplan-Meier Survival Curve will be displayed with the corresponding table. If you do not select a variable with the status, it is assumed that no case is censored. If some cases are censored, you can select, under "Status", the variable that indicates whether the event has occurred or not.
If a factor is additionally selected, e.g. the material, the log-rank test is calculated. You can read the null and the alternative hypothesis and get the results of the Log Rank Test listed.
The null hypothesis is: There is no difference between groups A and B in terms of the distribution of time until the event occurs.
And the alternative hypothesis is: There is a difference between groups A and B in the distribution of the time until the event occurs.
Below the hypotheses you can read the results and see the p-value for the log rank test.
If you don't know exactly how this is interpreted, you can simply click on "Summary in words":
A log-rank test was calculated to find out if there is a difference between groups A and B in terms of the distribution of time until the event occurs.
For the data at hand, the log-rank test showed that there is a difference between the groups in terms of the distribution of the time until the event occurs, p < 0.001. The null hypothesis is thus rejected.
In general, if the p-value is greater than the pre-determined significance level, which in most cases is 5%, the null hypothesis is retained, i.e. there is then no significant difference.
If the p-value is smaller, the null hypothesis is rejected and it is assumed on the basis of the available data that there is a difference between the curves.