A hazard ratio is calculated by dividing the hazard rate (the instantaneous risk of an event) in one group by the hazard rate in another. In its simplest form: HR = hazard rate in the treatment group ÷ hazard rate in the control group. An HR of 1.0 means no difference between groups, below 1.0 means lower risk in the treatment group, and above 1.0 means higher risk. In practice, most researchers calculate hazard ratios using either a log-rank approach from Kaplan-Meier survival data or a Cox proportional hazards regression model.
What a Hazard Ratio Actually Measures
The word “hazard” here doesn’t mean danger in the everyday sense. It refers to an instantaneous rate: the chance of an event (death, disease recurrence, recovery) happening at a specific moment in time, given that the person has survived up to that moment. This is different from cumulative risk, which accumulates over the whole study period. Over any short interval the hazard is small, but it compounds over time; in essence, it describes the event rate at each point in time.
Because of this, the hazard ratio compares how quickly events are happening in two groups at any given instant rather than comparing the total proportion of events at the end of a study. An HR of 0.60 means the treatment group experiences events at 60% of the control group’s rate at any point during follow-up. Another useful way to think about it: the hazard ratio equals the odds that a randomly selected person from the treatment group (the group in the numerator of the ratio) will experience the event before a randomly selected person from the control group.
The Log-Rank Method: Step by Step
If you already have Kaplan-Meier survival data for two groups, the log-rank approach is the most straightforward way to get a hazard ratio. Here’s how it works:
- Step 1: From your Kaplan-Meier calculations, count the observed number of events (O) in each group. Call these Oa and Ob.
- Step 2: Calculate the expected number of events (E) in each group under the assumption that there’s no difference in survival. These expected values, Ea and Eb, come from the log-rank test’s calculations, which distribute expected events proportionally based on how many people in each group were at risk at each time point.
- Step 3: Compute the hazard ratio: HR = (Oa / Ea) ÷ (Ob / Eb)
The ratio O/E for each group tells you whether that group had more or fewer events than expected. Dividing one by the other gives you the hazard ratio.
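The calculation itself is a one-liner once you have the counts. A minimal Python sketch (the counts below are illustrative, not from any real study):

```python
def hazard_ratio(o_a, e_a, o_b, e_b):
    """HR = (Oa / Ea) / (Ob / Eb), from observed and expected event counts."""
    return (o_a / e_a) / (o_b / e_b)

# Illustrative counts: group A saw more events than expected,
# group B fewer, so the HR comes out above 1.0.
print(hazard_ratio(45, 38.2, 31, 37.8))
```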
Adding a Confidence Interval
A single HR number isn’t very useful without knowing how precise it is. To calculate a 95% confidence interval:
- Step 1: Take the natural logarithm of your hazard ratio: L = ln(HR)
- Step 2: Calculate the standard error of L: SE = √(1/Ea + 1/Eb)
- Step 3: Lower bound = e^(L − 1.96 × SE)
- Step 4: Upper bound = e^(L + 1.96 × SE)
If that confidence interval contains 1.0, the difference between your two groups is not statistically significant at the 5% level. If the entire interval falls below 1.0, the treatment group has a significantly lower event rate. If the entire interval is above 1.0, the treatment group has a significantly higher event rate.
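The four steps above translate directly into code. A minimal Python sketch, with placeholder expected counts:

```python
import math

def hr_confidence_interval(hr, e_a, e_b, z=1.96):
    """95% CI for a log-rank hazard ratio, computed on the log scale."""
    log_hr = math.log(hr)                  # Step 1: L = ln(HR)
    se = math.sqrt(1 / e_a + 1 / e_b)      # Step 2: standard error of L
    lower = math.exp(log_hr - z * se)      # Step 3
    upper = math.exp(log_hr + z * se)      # Step 4
    return lower, upper

# Placeholder values: HR = 0.75 with 32 and 28 expected events.
lower, upper = hr_confidence_interval(0.75, 32, 28)
# Here the interval spans 1.0, so this hypothetical difference
# would not be significant at the 5% level.
```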
The Cox Regression Method
The Cox proportional hazards model is the most widely used approach in published research because it can account for multiple variables at once. Rather than just comparing two groups, it lets you adjust for age, sex, disease stage, and other factors that might confound the comparison.
The Cox model produces a coefficient (usually labeled β) for each variable. The hazard ratio for that variable is simply e^β. So if a Cox model gives you a coefficient of −0.522 for a treatment variable, the hazard ratio is e^(−0.522) ≈ 0.59, meaning the treatment group’s event rate is about 59% of the control group’s rate at any given time.
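That conversion is a one-liner in any language; in Python, for example:

```python
import math

beta = -0.522            # Cox coefficient for the treatment variable
hr = math.exp(beta)      # hazard ratio = e^beta
print(round(hr, 2))      # → 0.59
```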
You don’t typically calculate a Cox model by hand. Statistical software does the heavy lifting. In R, the standard approach uses the survival package. The function coxph(Surv(time, status) ~ treatment + age + stage, data = mydata) fits the model, and the output includes each coefficient, its hazard ratio (labeled “exp(coef)”), standard error, and p-value in a single summary table. The predict function can then return the risk score for each individual, which equals e raised to the linear predictor.
A Worked Example
Suppose you’re running a trial of a wound-healing treatment. In the treatment group, you observe 30 healing events where 40 were expected under the null hypothesis. In the control group, you observe 50 events where 40 were expected.
HR = (30/40) ÷ (50/40) = 0.75 ÷ 1.25 = 0.60
This means the treatment group’s rate of healing at any given time is 60% that of the control group. Wait, that sounds like the treatment is worse. And it is, if the event you’re measuring is something desirable like healing. An HR below 1.0 means a lower event rate in the treatment group. Whether that’s good or bad depends entirely on what the event is. For death or disease recurrence, an HR below 1.0 is favorable. For healing or recovery, an HR above 1.0 is favorable.
You can also convert the hazard ratio into a probability. If the HR is 0.60, the probability that a randomly chosen person from the treatment group heals before a randomly chosen person from the control group is: P = HR ÷ (1 + HR) = 0.60 ÷ 1.60 = 0.375, or about 37.5%.
For the 95% confidence interval: L = ln(0.60) = −0.511. SE = √(1/40 + 1/40) = √0.05 = 0.224. Lower bound = e^(−0.511 − 1.96 × 0.224) = e^(−0.950) = 0.387. Upper bound = e^(−0.511 + 1.96 × 0.224) = e^(−0.072) = 0.931. Since the entire interval (0.387 to 0.931) is below 1.0, the difference is statistically significant.
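The whole worked example fits in a few lines of Python, which is a handy way to check the arithmetic:

```python
import math

o_t, e_t = 30, 40        # treatment: observed and expected healing events
o_c, e_c = 50, 40        # control

hr = (o_t / e_t) / (o_c / e_c)        # 0.75 / 1.25 = 0.60
p_first = hr / (1 + hr)               # P(treatment person heals first)
se = math.sqrt(1 / e_t + 1 / e_c)
lower = math.exp(math.log(hr) - 1.96 * se)
upper = math.exp(math.log(hr) + 1.96 * se)

print(hr, round(p_first, 3))              # → 0.6 0.375
print(round(lower, 3), round(upper, 3))   # → 0.387 0.93
```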
The Proportional Hazards Assumption
Every hazard ratio calculation assumes that the ratio between the two groups stays constant over time. This is called the proportional hazards assumption. If one treatment works well early but loses its advantage after six months, a single hazard ratio won’t capture that reality. It will average out the changing relationship and potentially mislead you.
There are three common ways to check this assumption. The simplest is visual: plot the survival curves for both groups and see whether they diverge steadily or instead cross or converge. A more formal approach uses log-minus-log plots, where proportional hazards should produce roughly parallel lines. The most rigorous method tests Schoenfeld residuals, which checks whether the effect of each variable in your model changes over time. Research reviewing orthopedic studies found that log-minus-log plots were the most commonly used check, followed by Schoenfeld residual tests.
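To see why log-minus-log plots work, assume exponential survival in both groups (so proportional hazards holds exactly); the two curves then differ by a constant vertical offset of ln(HR). A small Python illustration with made-up hazards:

```python
import math

h_control = 0.10             # illustrative control-group hazard
hr = 0.60
h_treat = hr * h_control

def log_minus_log(h, t):
    """log(-log(S(t))) for exponential survival, S(t) = exp(-h * t)."""
    return math.log(-math.log(math.exp(-h * t)))

# The vertical gap between the two curves is ln(HR) at every time point,
# which is what "parallel lines" on a log-minus-log plot means.
for t in (1, 5, 10, 20):
    gap = log_minus_log(h_treat, t) - log_minus_log(h_control, t)
    print(t, round(gap, 3))          # gap ≈ ln(0.60) = -0.511 throughout
```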
When the assumption fails, the hazard ratio can be genuinely misleading. This happens more often than many researchers acknowledge. Populations that include people with different susceptibilities to the event, or treatments whose effects change over time, will almost always violate proportional hazards to some degree. In those situations, the HR can deviate substantially from what you’d see on the actual survival or risk scale.
Why HR Doesn’t Equal Risk Reduction
One of the most common misinterpretations is treating a hazard ratio as if it were a straightforward risk reduction. If the HR is 0.60, it’s tempting to say “the treatment reduces the risk of death by 40%.” But the relative reduction in the actual risk of death up to any given time point is always less than what the hazard ratio implies. This has been demonstrated mathematically for any survival distribution.
The reason is that the hazard ratio describes a reduction in the instantaneous rate of events, not in the cumulative chance of experiencing the event. These two things diverge more and more as time passes. So when you see a reported HR of 0.60, the true relative risk reduction at any specific follow-up point will be smaller than 40%. The HR captures a reduction in the rate of events, not the probability of ultimately avoiding the event altogether.
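You can verify this divergence with a toy calculation. Assuming exponential survival, an illustrative control hazard of 0.1 events per unit time, and HR = 0.60, the relative risk reduction at every time point falls short of the naive 40%:

```python
import math

h_c = 0.10                  # control-group hazard (made up for illustration)
hr = 0.60
h_t = hr * h_c              # treatment-group hazard

for t in (5, 10, 20):
    risk_c = 1 - math.exp(-h_c * t)   # cumulative risk by time t, control
    risk_t = 1 - math.exp(-h_t * t)   # cumulative risk by time t, treatment
    rrr = 1 - risk_t / risk_c         # relative risk reduction
    print(t, round(rrr, 3))           # always less than 1 - HR = 0.40
```

Note that the gap widens with longer follow-up: the longer the horizon, the further the cumulative risk reduction falls below the 40% suggested by the hazard ratio.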
How Hazard Ratios Are Reported
Under the CONSORT 2025 guidelines, which govern how randomized trials are published, every primary and secondary outcome must include an estimated effect size with a measure of precision, typically a 95% confidence interval. For survival data, the hazard ratio is one of the standard effect measures, alongside differences in median survival time. Researchers must also report the number of participants in each analysis group and the number with available data at each time point.
When you’re reading a published study, look for the HR, its confidence interval, and the p-value together. A small p-value combined with a confidence interval that doesn’t cross 1.0 indicates a statistically significant difference. But be aware of a technical quirk: the confidence interval from a Cox model (called a Wald interval) can occasionally include 1.0 even when the log-rank test’s p-value is below 0.05. This discrepancy comes from slightly different mathematical approaches. When it happens, the log-rank-based confidence interval is generally considered more reliable.

