What Is a Discrete Distribution? Types and Examples

A discrete distribution is a probability distribution where the possible outcomes are distinct, countable values, like 0, 1, 2, 3, and so on. Think of it as a way to map out how likely each specific outcome is when the thing you’re measuring can only land on separate, individual numbers. Rolling a die, counting how many emails you get in an hour, or tallying the number of heads in ten coin flips all produce this kind of data.

How a Discrete Distribution Works

The core idea is simple: you list every possible outcome and assign a probability to each one. This list of outcomes and their probabilities is called a probability mass function, or PMF. A PMF has two rules. First, every probability must be zero or positive (you can’t have a negative chance of something happening). Second, all the probabilities must add up to exactly 1, because one of the outcomes has to occur.

Imagine rolling a fair six-sided die. Each face has a probability of 1/6, and those six values sum to 1. That’s the entire distribution. You can look at it and immediately answer questions like “what’s the chance of rolling a 4 or lower?” by simply adding up the individual probabilities: 4/6, or about 67%.
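That whole calculation fits in a few lines of Python. Here’s a sketch of the die’s PMF that checks both rules and answers the “4 or lower” question (using exact fractions so nothing rounds away):

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Rule 1: every probability is zero or positive.
assert all(p >= 0 for p in pmf.values())
# Rule 2: the probabilities sum to exactly 1.
assert sum(pmf.values()) == 1

# P(roll <= 4) by adding up the individual probabilities.
p_le_4 = sum(p for face, p in pmf.items() if face <= 4)
print(p_le_4)  # 2/3, about 67%
```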

Discrete vs. Continuous Distributions

The distinction comes down to what kind of values your variable can take. A discrete distribution handles outcomes you can list and count: number of children in a family, defective parts in a shipment, goals scored in a soccer match. A continuous distribution handles measurements that can fall anywhere along a range, like height, temperature, or time. You could measure someone’s height as 170.2 cm, 170.23 cm, or 170.2347 cm, with infinite precision.

This difference changes the math in a practical way. With discrete distributions, you can ask “what’s the probability of getting exactly 3?” and get a meaningful answer. With continuous distributions, the probability of any single exact value is technically zero. Instead, you calculate the probability of falling within a range, like “between 2.5 and 3.5.” Discrete distributions use summation (adding up individual probabilities), while continuous distributions use integration (calculating area under a curve). If you’ve ever seen a bar chart next to a smooth bell curve, that’s the visual difference.
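A quick numerical sketch makes the summation-versus-integration contrast concrete. The discrete side is a direct lookup; the continuous side (here a standard normal bell curve, chosen purely for illustration) requires approximating the area under the density over a range:

```python
import math

# Discrete: "exactly 3" on a fair die is a meaningful, nonzero probability.
p_exactly_3 = 1 / 6

# Continuous: for a density, a single exact value has probability zero,
# so we approximate the area between 2.5 and 3.5 with a midpoint sum.
def normal_density(x: float) -> float:
    """Standard normal density, used here just as an example curve."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

step = 0.001
area = sum(normal_density(2.5 + (i + 0.5) * step) * step
           for i in range(int(1 / step)))
print(round(area, 4))  # about 0.006 — the mass between 2.5 and 3.5
```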

The Bernoulli and Binomial Distributions

The simplest discrete distribution is the Bernoulli distribution. It covers a single event with two outcomes: success or failure, yes or no, heads or tails. You define the probability of success as p, and the probability of failure is just 1 minus p. Flip a coin once and you have a Bernoulli distribution with p = 0.5.

Repeat that same kind of event multiple times and you get the binomial distribution. It counts the number of successes across a fixed number of independent trials, each with the same probability of success. The two parameters are n (the number of trials) and p (the probability of success on each trial). If you flip a coin 10 times and want to know the probability of getting exactly 7 heads, the binomial distribution gives you that answer. The key requirements: every trial is independent, and p stays the same from trial to trial. Quality control inspections, survey response rates, and clinical trial outcomes are all classic binomial scenarios.
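The binomial formula behind that coin-flip question is short enough to write out directly. This sketch uses only the standard library’s `math.comb` for the number of ways to arrange k successes among n trials:

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of exactly 7 heads in 10 fair coin flips.
p7 = binomial_pmf(7, n=10, p=0.5)
print(round(p7, 4))  # 0.1172
```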

One wrinkle worth knowing: the binomial distribution assumes sampling with replacement, meaning each trial doesn’t change the odds for the next one. If you’re drawing items from a small batch without putting them back, the probabilities shift after each draw, and a different distribution called the hypergeometric distribution applies instead.
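To see how much that wrinkle matters, compare the two formulas on a small batch. The numbers below (20 items, 5 defective, a sample of 4) are illustrative, not from the source:

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(exactly k defectives in a sample of n drawn WITHOUT replacement
    from a batch of N items containing K defectives)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binomial_pmf(k: int, n: int, p: float) -> float:
    """The with-replacement approximation, which ignores shifting odds."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

h = hypergeom_pmf(1, N=20, K=5, n=4)
b = binomial_pmf(1, n=4, p=5 / 20)
print(round(h, 4))  # 0.4696 — exact, without replacement
print(round(b, 4))  # 0.4219 — binomial approximation
```

The gap shrinks as the batch grows, which is why the binomial is usually a safe stand-in when the population is large relative to the sample.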

The Poisson Distribution

The Poisson distribution counts how many times something happens during a fixed interval of time or space. Hospital emergency room visits per day, earthquakes per year, insects per leaf in an orchard: these are all Poisson territory. It has a single parameter, lambda, which represents the average rate of occurrence. A notable quirk of the Poisson is that its mean and its variance are both equal to lambda. So if a hospital averages 5 emergency visits per day, the variance of the daily count is also 5 (a standard deviation of about 2.24 visits).

Three assumptions make the Poisson work. Events in separate time windows must be independent of each other (a busy Monday doesn’t make Tuesday busier). The probability of an event depends only on the length of the time window, not when that window falls. And two events can’t happen at the exact same instant. When these conditions hold, the Poisson distribution is remarkably good at modeling rare or random events. It also serves as a useful approximation of the binomial distribution when the number of trials is large and the probability of success on each trial is very small.
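The Poisson PMF and its mean-equals-variance quirk can both be checked numerically. This sketch reuses the hospital example’s rate of 5 visits per day, truncating the sum at 60 terms since the tail beyond that is negligible:

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(exactly k events) when events occur at average rate lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 5.0  # e.g. an average of 5 emergency visits per day

# Verify numerically that the mean and the variance both equal lambda.
mean = sum(k * poisson_pmf(k, lam) for k in range(60))
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in range(60))
print(round(mean, 6), round(var, 6))  # both come out to 5.0
```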

The Geometric Distribution

Where the binomial counts successes in a fixed number of trials, the geometric distribution asks a different question: how many trials will it take to get the first success? Its only parameter is p, the probability of success on each trial. If you’re rolling a die and waiting for your first 6, the geometric distribution tells you the probability that it takes exactly 1 roll, exactly 2 rolls, exactly 10 rolls, and so on. The lower the probability of success, the more spread out the distribution becomes, reflecting longer expected waits.
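The geometric PMF is just “fail k − 1 times, then succeed once,” which makes it a one-liner. Here it is applied to the die example of waiting for a first 6:

```python
def geometric_pmf(k: int, p: float) -> float:
    """Probability that the first success arrives on trial k (k = 1, 2, ...):
    k - 1 failures followed by one success."""
    return (1 - p) ** (k - 1) * p

# Waiting for the first 6 on a fair die (p = 1/6).
for k in (1, 2, 10):
    print(k, round(geometric_pmf(k, 1 / 6), 4))
# 1 0.1667
# 2 0.1389
# 10 0.0323
```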

Mean and Variance

Two numbers summarize any discrete distribution. The mean (or expected value) tells you the long-run average outcome. You calculate it by multiplying each possible value by its probability and adding the results. For a fair die, that’s (1 × 1/6) + (2 × 1/6) + … + (6 × 1/6) = 3.5. You’ll never actually roll a 3.5, but over thousands of rolls, your average will converge on it.

The variance measures how spread out the outcomes are around that mean. For each possible value, you take the difference from the mean, square it, multiply by the probability of that value, and sum everything up. A small variance means outcomes cluster tightly around the average. A large variance means they’re scattered. The square root of the variance gives you the standard deviation, which is often easier to interpret because it’s in the same units as the original data.
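Both summaries follow directly from the PMF. This sketch works the fair-die case with exact fractions, recovering the 3.5 from above and adding the variance and standard deviation:

```python
from fractions import Fraction

# PMF of a fair six-sided die.
pmf = {v: Fraction(1, 6) for v in range(1, 7)}

# Mean: each value times its probability, summed.
mean = sum(v * p for v, p in pmf.items())

# Variance: probability-weighted squared distance from the mean.
variance = sum((v - mean) ** 2 * p for v, p in pmf.items())

print(mean)                     # 7/2, i.e. 3.5
print(variance)                 # 35/12, about 2.92
print(float(variance) ** 0.5)   # standard deviation, about 1.71
```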

Visualizing Discrete Distributions

Discrete distributions show up as bar charts or dot plots rather than smooth curves. Each bar sits on a specific value (0, 1, 2, 3…) and the height of the bar represents that outcome’s probability. There are gaps between the bars because no outcome can fall between the possible values.

The cumulative distribution function, which shows the probability of getting a value less than or equal to some number, takes the form of a step function. It’s flat between possible values and then jumps up at each one. The size of each jump equals the probability of that specific outcome. If you’re reading one of these graphs, you can find the probability of any range of values by subtracting the CDF at the bottom of the range from the CDF at the top.
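The step behavior is easy to see in code. For the fair die, the CDF below is flat at non-face values and jumps by exactly 1/6 at each face:

```python
from fractions import Fraction

# PMF of a fair six-sided die.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(x: float) -> Fraction:
    """P(X <= x): sum the probabilities of all outcomes at or below x."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

print(cdf(3.5))         # 1/2 — flat between faces, same as cdf(3)
print(cdf(4))           # 2/3 — jumped by exactly P(X = 4) = 1/6
# Probability of a range from two CDF reads: P(2 <= X <= 4)
print(cdf(4) - cdf(1))  # 1/2
```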

Real-World Applications

Discrete distributions are everywhere once you start looking. Research published in the Journal of Modern Applied Statistical Methods analyzed several real datasets using discrete models: monthly fatal accident counts in Dhaka over a five-year period, daily patient visits at a hospital, and earthquake frequency in Bangladesh spanning 35 years. In each case, the data consisted of whole-number counts per time period, fitting naturally into discrete frameworks.

More everyday examples include the number of customers who enter a store each hour, the number of typos on a printed page, the number of thunderstorms a city experiences per year, or the number of times your phone rings during dinner. Insurance companies use discrete distributions to model the number of claims filed per policyholder. Manufacturers use them to predict defect counts per production batch. Anytime you’re counting occurrences rather than measuring a quantity on a continuous scale, a discrete distribution is the right tool.