When to Use Negative Binomial Regression for Count Data

Negative binomial regression is the right choice when your outcome variable is a count (0, 1, 2, 3…) and the variance in your data exceeds the mean. That combination, count data with overdispersion, is the core scenario this model was designed for. If you’re trying to decide between Poisson regression, negative binomial regression, or something else entirely, the answer almost always comes down to what your variance looks like relative to your mean.

Count Data That Doesn’t Fit Poisson

Poisson regression is the starting point for modeling count outcomes, but it makes a strict assumption: the conditional mean and the conditional variance of your outcome are equal. In practice, real-world count data rarely behaves this neatly. The variance almost always exceeds the mean, a property called overdispersion. When that happens, Poisson regression underestimates the standard errors of your coefficients, which inflates your test statistics and can make predictors look statistically significant when they aren’t.

Negative binomial regression solves this by adding a dispersion parameter that lets the variance differ from the mean. It uses the same basic structure as Poisson regression (a log link function, the same form of model equation) but relaxes that restrictive equal-variance assumption. You can think of it as a more flexible version of Poisson: in fact, the Poisson model is technically nested inside the negative binomial model, with the dispersion parameter constrained to zero.
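The mean–variance relationship is easy to see by simulation. The sketch below (all values are illustrative, not from any real dataset) draws Poisson counts and negative binomial counts with the same mean, using the gamma–Poisson mixture representation of the negative binomial. Under the NB2 parameterization the variance is mu + alpha * mu², so with mean 5 and dispersion alpha = 1 the variance should land near 30 rather than 5.

```python
import numpy as np

rng = np.random.default_rng(42)

mu, alpha, n = 5.0, 1.0, 50_000  # mean, dispersion, sample size (illustrative)

# Poisson draws: variance should track the mean.
y_pois = rng.poisson(mu, size=n)

# Negative binomial via its gamma-Poisson mixture representation:
# lambda ~ Gamma(shape=1/alpha, scale=alpha*mu), then y ~ Poisson(lambda).
lam = rng.gamma(shape=1.0 / alpha, scale=alpha * mu, size=n)
y_nb = rng.poisson(lam)

print(f"Poisson: mean={y_pois.mean():.2f}, var={y_pois.var():.2f}")
print(f"NB:      mean={y_nb.mean():.2f}, var={y_nb.var():.2f}")
# NB sample variance should be near mu + alpha * mu**2 = 30, far above its mean.
```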

How to Check for Overdispersion

The simplest first step is to look at the conditional means and variances of your outcome variable, broken down by the levels of your predictors. If the variances are consistently larger than the means within each group, overdispersion is likely present and negative binomial regression is a better fit than Poisson.
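That check is a one-liner with a grouped summary. The example below uses simulated counts for three hypothetical groups (the group labels, means, and dispersion are made up for illustration); a variance-to-mean ratio well above 1 within each group is the warning sign.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical data: counts for three groups, generated with overdispersion.
groups = np.repeat(["A", "B", "C"], 2000)
mus = np.repeat([2.0, 5.0, 10.0], 2000)
alpha = 0.8
lam = rng.gamma(1.0 / alpha, alpha * mus)  # gamma-Poisson mixture
counts = rng.poisson(lam)

df = pd.DataFrame({"group": groups, "count": counts})

# Conditional means and variances by group:
# variance well above the mean within each group suggests overdispersion.
summary = df.groupby("group")["count"].agg(["mean", "var"])
summary["var_to_mean"] = summary["var"] / summary["mean"]
print(summary)
```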

For a formal test, the likelihood ratio test compares your Poisson model against a negative binomial model. If the test is significant, the dispersion parameter is meaningfully different from zero and the negative binomial model fits better. (One subtlety: because the dispersion parameter can't be negative, it sits on the boundary of its parameter space under the null, so the standard chi-squared p-value is conservative; software often reports a halved "chi-bar-squared" p-value instead.) In one commonly cited example from UCLA's statistical consulting group, this test produced a chi-squared value of 926 with one degree of freedom, strongly confirming that the negative binomial model was appropriate. You can also use a Wald test or a score test. The score test has a practical advantage: it evaluates whether the more complex model is needed without requiring you to actually fit it first.

Why Not Just Log-Transform and Use Linear Regression?

A common workaround is to take the log of the count variable and run ordinary least squares (OLS) regression. This creates two problems. First, the log of zero is undefined, so you either lose every observation where the count is zero or have to add an arbitrary constant, both of which distort your results. Second, OLS on a log-transformed count still can’t properly model the dispersion in your data. Negative binomial regression handles zeros naturally and models the variance structure directly.

Real-World Examples

Negative binomial regression shows up across nearly every field that deals with count outcomes. In public health, researchers use it to model the number of days absent from school, hospital visits, or disease cases in a region. In ecology, it’s the standard choice for species abundance data, such as the number of mosquitoes caught in traps across different landscapes and habitats. A 2023 study in Zhejiang Province, China, used negative binomial regression to model counts of several mosquito species captured in light traps, with landscape type, habitat, and month as predictors. The data were overdispersed (standard deviation greater than the mean), making Poisson regression inappropriate.

Other common applications include the number of insurance claims filed per customer, traffic accidents at an intersection over a year, defects found during manufacturing inspections, and the number of times patients use emergency services. Any time your outcome is “how many times did something happen” and the data show more variability than a Poisson distribution would predict, negative binomial regression is typically the right tool.

Interpreting the Results

The raw coefficients from a negative binomial model represent changes in the log of the expected count. That’s not intuitive, so most analysts convert them to incidence rate ratios (IRRs) by exponentiating the coefficients. An IRR tells you the multiplicative change in the expected count for a one-unit increase in a predictor, holding everything else constant.

For example, if a predictor has an IRR of 1.25, each one-unit increase in that predictor is associated with a 25% increase in the expected count. An IRR of 0.80 means a 20% decrease. The math behind this is straightforward: the coefficient represents the difference between two log-counts, and the difference of two logs equals the log of their ratio. Exponentiating gives you the ratio itself. The word “rate” applies because count outcomes are inherently rates: a number of events over some period of time or unit of exposure.
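The arithmetic behind that interpretation can be checked in a couple of lines. The expected counts below (5.0 and 6.25) are made-up values chosen so the ratio comes out to exactly 1.25.

```python
import numpy as np

# A coefficient is the difference of two log expected counts;
# exponentiating a difference of logs recovers the ratio of the counts.
mu_at_x = 5.0    # hypothetical expected count at predictor value x
mu_at_x1 = 6.25  # hypothetical expected count at x + 1

beta = np.log(mu_at_x1) - np.log(mu_at_x)  # what the model estimates
irr = np.exp(beta)                         # incidence rate ratio

print(f"beta = {beta:.4f}, IRR = {irr:.2f}")  # IRR = 1.25 -> a 25% increase
```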

Negative Binomial vs. Quasi-Poisson

Quasi-Poisson regression is the other common option for overdispersed count data. Both models have the same number of parameters, and both can handle variance that exceeds the mean. The key difference is in how they model the variance. In a quasi-Poisson model, variance is a linear function of the mean. In a negative binomial model, variance is a quadratic function of the mean. This means the two approaches weight large and small counts differently.

In practice, the negative binomial model tends to perform better when the overdispersion increases with larger counts, which is common in biological and health data. Quasi-Poisson may be preferable when the overdispersion is more uniform across the range of counts. If you’re unsure, fitting both and comparing the residual patterns is a reasonable approach.

When You Have Too Many Zeros

Sometimes overdispersion is driven by an unusually large number of zeros in your data, more than a standard negative binomial distribution would predict. This happens when zeros come from two distinct processes. A classic example: you survey visitors to a state park about how many fish they caught. Some visitors didn’t fish at all (they were always going to catch zero), while others fished but still caught nothing. These are two different kinds of zeros.

A zero-inflated negative binomial (ZINB) model handles this by combining two components: a logistic model that predicts whether an observation falls into the “always zero” group, and a negative binomial model for the counts among the remaining observations. You’d choose ZINB over standard negative binomial when your data has excess zeros and you have theoretical reason to believe those zeros arise from a separate process. If your zeros are just part of the natural count distribution and not unusually frequent, a standard negative binomial model is sufficient.
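The "two kinds of zeros" idea can be made concrete by simulation. Below, 30% of observations are structural zeros (the "never fished" group) and the rest are ordinary negative binomial counts; every number here is invented for illustration. The observed zero fraction then sits well above what the count process alone would produce, which is the signature that points toward ZINB.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000

# Two zero-generating processes: 30% structural zeros ("never fished"),
# the rest draw NB counts that can also happen to be zero ("fished, caught none").
mu, alpha, pi_zero = 3.0, 0.5, 0.3
structural = rng.random(n) < pi_zero
lam = rng.gamma(1.0 / alpha, alpha * mu, size=n)
counts = np.where(structural, 0, rng.poisson(lam))

observed_zero_frac = (counts == 0).mean()

# Zero probability from the count process alone, an NB2 with this mu and alpha:
# P(Y = 0) = (1 + alpha * mu) ** (-1 / alpha)
nb_zero_prob = (1 + alpha * mu) ** (-1 / alpha)

print(f"observed zeros: {observed_zero_frac:.3f}, "
      f"NB count process alone: {nb_zero_prob:.3f}")
```

In statsmodels, a ZINB model for data like this lives in `statsmodels.discrete.count_model` (`ZeroInflatedNegativeBinomialP`), which pairs a logistic inflation equation with the NB count equation.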

Quick Decision Framework

  • Outcome is continuous, not counts: use linear regression or another appropriate model, not negative binomial.
  • Outcome is counts, variance roughly equals the mean: Poisson regression is fine.
  • Outcome is counts, variance exceeds the mean: negative binomial regression.
  • Outcome is counts, variance exceeds the mean, and there are excess zeros from a separate process: zero-inflated negative binomial.

The likelihood ratio test gives you a formal way to move between Poisson and negative binomial. If the dispersion parameter is significantly different from zero, go with the negative binomial. If not, Poisson is adequate and more parsimonious.