You use a Poisson distribution when you’re counting how many times something happens in a fixed window of time, space, or another unit of observation, and those events occur independently at a roughly steady average rate. It’s one of the most practical tools in statistics for modeling things like customer arrivals, equipment failures, or rare disease cases, but it only works when a specific set of conditions holds true.
The Three Core Conditions
A Poisson distribution fits your data when three things are true about the events you’re counting:
- Independence: The number of events in one interval has no effect on the number in any other non-overlapping interval. One customer walking into a store doesn’t make the next customer more or less likely to arrive.
- Constant rate: The probability of an event depends only on the length (or size) of the interval, not on when or where the interval falls. If you’re counting emails per hour, the average rate shouldn’t shift dramatically from one hour to the next.
- No simultaneous events: Two events can’t happen at exactly the same instant. Each occurrence is a single, discrete event.
When these hold, the number of events in any interval follows a Poisson distribution with a parameter often written as λ (lambda). Lambda equals the average rate multiplied by the size of the interval. If a call center averages 5 calls per minute and you’re looking at a 10-minute window, λ = 50.
The Mean Equals the Variance
The Poisson distribution has an unusual property: its mean and variance are the same number, both equal to λ. This is actually one of the quickest ways to check whether a Poisson model fits your data. If you calculate the average number of events per interval and the variance across intervals, the two should be close. When the variance is noticeably larger than the mean, your data is “overdispersed,” and the Poisson model will underestimate how spread out the counts really are.
That matters because underestimating spread leads to artificially narrow confidence intervals. You end up flagging patterns as statistically significant when they’re really just noise. If you spot overdispersion, a negative binomial distribution is the standard alternative. It adds a second parameter that captures the extra variability the Poisson can’t account for. This is especially common with recurrent events, where the same person or unit experiences repeated occurrences, since that tends to create clusters that violate the independence assumption.
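The variance-to-mean check described above is easy to automate. The sketch below uses two small hypothetical count series, one steady and one with bursts, to show how the ratio flags overdispersion:

```python
import statistics

def dispersion_ratio(counts):
    """Variance-to-mean ratio: close to 1 suggests a Poisson fit;
    much greater than 1 suggests overdispersion (consider a
    negative binomial model instead)."""
    return statistics.pvariance(counts) / statistics.mean(counts)

# Hypothetical hourly email counts with a steady average rate
steady = [3, 7, 5, 2, 8, 6, 4, 5, 9, 1]
# Hypothetical clustered counts: two burst hours inflate the variance
bursty = [0, 1, 0, 12, 0, 1, 14, 0, 1, 0]

steady_ratio = dispersion_ratio(steady)   # close to 1
bursty_ratio = dispersion_ratio(bursty)   # well above 1: overdispersed
```

The exact threshold is a judgment call, but a ratio several times larger than 1 is a strong signal that the Poisson's narrow spread will mislead you.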
Classic Real-World Examples
The Poisson distribution shows up across a surprisingly wide range of fields. Some of the most common applications:
- Call centers: Counting how many calls arrive in a 15-minute block. Research on banking call centers has confirmed that arrivals often fit a Poisson process well, though the rate may shift throughout the day.
- Emergency departments: Counting patient arrivals per hour. Hospital data generally follows a Poisson pattern, with the caveat that mass-casualty events (like car accidents) create bursts that violate the independence assumption.
- Manufacturing: Counting defects per unit of product, or equipment failures per month.
- Biology and medicine: Counting mutations per DNA strand, bacterial colonies per petri dish, or rare disease cases per 100,000 people.
- Traffic: Counting vehicles passing a sensor per minute on a free-flowing highway.
- Insurance: Counting claims filed per policy period.
Notice the common thread: each example involves counting discrete events within a defined boundary. That boundary can be time (calls per hour), space (defects per square meter), or some other unit (accidents per 1,000 flights). The Poisson works in all these dimensions as long as the three core conditions hold.
When Events Happen in Space, Not Just Time
Most introductions to the Poisson focus on time intervals, but it applies equally well to spatial data. Ecologists use it to model the number of plants in a randomly placed quadrat. Astronomers use it for star counts in a patch of sky. Epidemiologists use it for disease cases within a geographic region.
The logic is identical: divide your area (or volume) into regions, count the events in each region, and check whether those counts are independent and occur at a constant average density. If so, the count in any region follows a Poisson distribution where λ equals the average density multiplied by the region’s area. A process with a uniform density across the entire study area is called a homogeneous Poisson process. When the density varies by location, it’s a nonhomogeneous process, which requires more advanced modeling but still builds on the same Poisson foundation.
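The spatial version of the λ calculation works exactly like the temporal one. A quick sketch with hypothetical ecology numbers (a plant density and a quadrat size chosen for illustration):

```python
import math

# Hypothetical survey: average density of 0.2 plants per square meter
density = 0.2          # plants per m^2
quadrat_area = 25.0    # a 5 m x 5 m quadrat
lam = density * quadrat_area  # expected plants per quadrat = 5.0

# Probability a randomly placed quadrat is empty: P(X = 0) = e^(-lam)
p_empty = math.exp(-lam)

# Probability of finding at least one plant in the quadrat
p_at_least_one = 1 - p_empty
```

Comparing the observed fraction of empty quadrats against e^(−λ) is a classic field check for whether plants are randomly scattered or clustered.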
Its Connection to Other Distributions
The Poisson distribution doesn’t exist in isolation. It connects directly to two other distributions you may encounter.
Poisson as a Binomial Shortcut
When you have a very large number of trials and a very small probability of success on each trial, the Poisson distribution serves as a convenient approximation to the binomial. The standard rule of thumb: if you have at least 100 trials and the expected number of successes (n × p) is 10 or fewer, you can use a Poisson distribution with λ = n × p and get accurate results with much simpler math. Think of quality control where you inspect 10,000 items and the defect rate is 0.05%. That’s a perfect case for the Poisson approximation.
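You can verify how good the approximation is by computing both distributions side by side. This sketch uses the quality-control numbers from the text (10,000 items, 0.05% defect rate, so n × p = 5):

```python
import math

def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact binomial probability P(X = k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability P(X = k)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Quality-control example: 10,000 items, 0.05% defect rate
n, p = 10_000, 0.0005
lam = n * p  # expected defects = 5.0

# The two pmfs agree to several decimal places for small defect counts
max_gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
              for k in range(10))
```

In this regime the largest disagreement across the low counts is on the order of 10⁻⁵, so the simpler Poisson formula costs essentially nothing in accuracy.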
Poisson Counts and Exponential Wait Times
If events follow a Poisson process, the time between consecutive events follows an exponential distribution. These are two sides of the same coin. The Poisson tells you how many events happen in a fixed time window. The exponential tells you how long you’ll wait until the next event. If a server averages 3 requests per second (Poisson with λ = 3), the gap between requests averages one-third of a second (exponential with rate 3). This pairing is useful when you need to model both the count and the timing of events.
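The two-sides-of-the-same-coin relationship can be demonstrated by simulation: draw exponential gaps, then count how many arrivals land in each one-second window. The server numbers below match the example in the text; the seed and window count are arbitrary choices for the sketch:

```python
import random
import statistics

random.seed(42)
rate = 3.0          # requests per second (the server from the text)
seconds = 10_000    # number of one-second windows to simulate

# Generate exponential interarrival gaps and bin arrivals by second
counts = [0] * seconds
t = random.expovariate(rate)
while t < seconds:
    counts[int(t)] += 1
    t += random.expovariate(rate)

mean_count = statistics.mean(counts)  # converges toward rate = 3
mean_gap = 1 / rate                   # average wait = 1/3 second
```

With enough windows, the per-second counts average out to the rate, and their distribution matches a Poisson with λ = 3, even though the simulation only ever drew exponential waiting times.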
When Not to Use It
Recognizing when the Poisson doesn’t fit is just as important as knowing when it does. Several common scenarios violate its assumptions:
- Clustered arrivals: Restaurant customers arriving in groups, hospital admissions after a disaster, or overflow from another system all create bursts that break the independence assumption.
- Scheduled events: Appointments at a doctor’s office or air-traffic-controlled landings at an airport have enforced spacing. These arrival patterns are less variable than a Poisson process would predict.
- Changing rates: Website traffic that spikes after a marketing campaign or seasonal disease patterns violate the constant-rate assumption. You can sometimes handle this with a nonhomogeneous Poisson model that allows the rate to vary over time, but the basic Poisson with a single fixed λ won’t work.
- Overdispersion: When the variance in your data substantially exceeds the mean, a Poisson model is too tight. This is common with recurrent phenomena where certain individuals or locations are inherently more event-prone than others.
Testing Whether Your Data Fits
If you have data and want to verify that it actually follows a Poisson distribution, the chi-squared goodness-of-fit test is the standard approach. You group your observed counts into categories (0 events, 1 event, 2 events, and so on), calculate the expected frequency for each category using the Poisson formula with λ estimated from your sample mean, then compare observed and expected frequencies using the chi-squared statistic.
One practical detail: the test requires that expected frequencies in each category aren’t too small, typically at least 3 to 5. If you have categories with very few expected observations (which often happens in the tail, where high counts are rare), combine adjacent categories until the expected frequency is large enough. The degrees of freedom for the test equal the number of final categories minus 2: one subtracted for the constraint that frequencies must sum to your sample size, and one more because you estimated λ from the data.
Before running a formal test, a quick visual check is often enough. Plot a histogram of your counts and overlay the Poisson probabilities using your sample mean as λ. If the fit looks reasonable and the variance is close to the mean, you’re likely in good shape.