What Is a Discrete Random Variable in Statistics?

A discrete random variable is a variable whose value is determined by chance, and it can only take on separate, countable values. Think of rolling a die: the result is random, but it can only be 1, 2, 3, 4, 5, or 6. You could list every possible outcome. That “listability” is what makes it discrete. The set of possible values can be finite (like a die) or countably infinite (like the number of emails you receive in a day, which theoretically has no upper limit), but the key is that the values are distinct and separated rather than flowing smoothly along a number line.

Discrete vs. Continuous Variables

The distinction comes down to one question: can you count the possible values, or not? If you can list them out, you have a discrete random variable. The number of customers in a store, the number of heads in 10 coin flips, the number of defective items in a shipment: all countable, all discrete.

A continuous random variable, by contrast, can take any value within a range. An animal’s weight might be 123.759 kilograms, or 123.75921 kilograms, or some even more precise measurement. Between any two values, there are infinitely many others. You could never list them all. Height, temperature, and time are continuous. The number of steps you take in a day is discrete.

This distinction matters because discrete and continuous variables use different mathematical tools. Discrete variables assign probabilities to individual values. Continuous variables assign probabilities to intervals, since the probability of landing on any single exact value (like weighing precisely 72.000000… kg) is exactly zero.

The Probability Mass Function

Every discrete random variable has a probability mass function, often abbreviated PMF. This is simply a rule that assigns a probability to each possible value the variable can take. If you roll a fair six-sided die, the PMF assigns a probability of 1/6 to each outcome from 1 through 6.

A valid PMF has two requirements. First, no probability can be negative. Every value must have a probability of zero or higher. Second, all the probabilities must add up to exactly 1. This makes intuitive sense: something has to happen, and the total chance of all possible outcomes combined is 100%.

You can visualize a PMF using a probability histogram, where each possible value sits on the horizontal axis and a vertical bar shows its probability. Unlike a bar chart for categories, the height of each bar directly represents a probability. Over many observations, the relative frequencies of actual outcomes will tend toward the shape of this histogram.
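The two PMF requirements are easy to check in code. Here's a minimal sketch in Python; the dictionary representation and the `is_valid_pmf` helper are illustrative choices, not a standard library API:

```python
# Illustrative sketch: represent a PMF as a dict mapping values to probabilities.
die_pmf = {face: 1 / 6 for face in range(1, 7)}  # fair six-sided die

def is_valid_pmf(pmf, tol=1e-9):
    """Check the two PMF requirements: no negative probabilities,
    and probabilities summing to 1 (within floating-point tolerance)."""
    no_negatives = all(p >= 0 for p in pmf.values())
    sums_to_one = abs(sum(pmf.values()) - 1) < tol
    return no_negatives and sums_to_one

print(is_valid_pmf(die_pmf))            # True
print(is_valid_pmf({1: 0.5, 2: 0.6}))   # False: probabilities sum to 1.1
```

The tolerance parameter matters in practice: six copies of 1/6 may not sum to exactly 1.0 in floating-point arithmetic, so an exact equality test would wrongly reject a perfectly valid PMF.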

How to Calculate the Expected Value

The expected value is the long-run average of a discrete random variable if you observed it many, many times. To calculate it, you multiply each possible value by its probability, then add up all those products. For a fair die, that’s (1 × 1/6) + (2 × 1/6) + (3 × 1/6) + (4 × 1/6) + (5 × 1/6) + (6 × 1/6) = 3.5.

Notice the expected value doesn’t have to be a value the variable can actually take. You’ll never roll a 3.5, but over thousands of rolls, your average will converge toward it. The expected value tells you where the center of a distribution sits, not what any single outcome will be.
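The "multiply each value by its probability, then sum" recipe translates directly to one line of code. A minimal sketch, again using a dict-based PMF (an illustrative representation, not a standard API):

```python
die_pmf = {face: 1 / 6 for face in range(1, 7)}  # fair six-sided die

def expected_value(pmf):
    """Long-run average: sum of value x probability over all outcomes."""
    return sum(value * prob for value, prob in pmf.items())

print(expected_value(die_pmf))  # approximately 3.5, matching the hand calculation
```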

Measuring Spread With Variance

Variance measures how far a variable’s values typically fall from the expected value. To compute it, you take each possible value, find the difference between it and the expected value, square that difference, multiply by the value’s probability, and sum the results. A small variance means outcomes cluster tightly around the average. A large variance means they’re spread out.

There’s also a shortcut formula: compute the expected value of the squared variable, then subtract the square of the expected value. Both approaches give the same answer. The standard deviation is just the square root of the variance, which puts the spread back into the original units of measurement rather than squared units.
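Both formulas can be sketched side by side to confirm they agree. The function names here are illustrative:

```python
die_pmf = {face: 1 / 6 for face in range(1, 7)}  # fair six-sided die

def variance(pmf):
    """Definitional form: probability-weighted sum of squared deviations
    from the expected value."""
    mu = sum(v * p for v, p in pmf.items())
    return sum(p * (v - mu) ** 2 for v, p in pmf.items())

def variance_shortcut(pmf):
    """Shortcut form: expected value of the squared variable,
    minus the square of the expected value."""
    mu = sum(v * p for v, p in pmf.items())
    mean_of_squares = sum(v ** 2 * p for v, p in pmf.items())
    return mean_of_squares - mu ** 2

print(variance(die_pmf))           # approximately 2.917 (exactly 35/12)
print(variance_shortcut(die_pmf))  # same result via the shortcut
```

The standard deviation is then `variance(die_pmf) ** 0.5`, about 1.71, back in the original units (pips rather than squared pips).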

The Cumulative Distribution Function

While the PMF tells you the probability of each specific value, the cumulative distribution function (CDF) tells you the probability that the variable is less than or equal to some value. For a die, the CDF at 3 equals the probability of rolling a 1, 2, or 3, which is 3/6 or 0.5.

For discrete variables, the CDF looks like a staircase. It stays flat between possible values, then jumps up at each value by an amount equal to that value’s probability. Between 3 and 4 on a die roll, for instance, the CDF remains constant. It only steps up again when you reach 4. This step-function shape is a visual signature of discrete variables, distinguishing them from the smooth curves of continuous distributions.
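The staircase behavior falls out naturally when you compute the CDF by summing the PMF. A minimal sketch (the `cdf` helper is an illustrative name):

```python
die_pmf = {face: 1 / 6 for face in range(1, 7)}  # fair six-sided die

def cdf(pmf, x):
    """P(X <= x): add up the probabilities of every value at or below x."""
    return sum(p for v, p in pmf.items() if v <= x)

print(cdf(die_pmf, 3))    # 3/6, i.e. 0.5
print(cdf(die_pmf, 3.7))  # same as at 3: the CDF is flat between jumps
print(cdf(die_pmf, 6))    # 1.0: all outcomes are at or below 6
```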

Common Discrete Distributions

Several well-known probability distributions describe different types of discrete random variables. The most widely used include:

  • Bernoulli distribution: A single trial with two outcomes, success or failure. Flipping one coin is a Bernoulli trial.
  • Binomial distribution: The number of successes in a fixed number of independent trials, where each trial has the same probability of success. It applies when you flip a coin 20 times and count heads, test 100 products and count defects, or give a multiple-choice exam and count correct answers.
  • Poisson distribution: The count of events occurring in a fixed interval of time or space. It’s used to model things like the number of accidents in a city per month, the number of patient visits to a hospital per day, or the number of earthquakes in a region per year.
  • Geometric distribution: The number of trials needed to get the first success. If you keep rolling a die until you get a 6, the number of rolls follows a geometric distribution.

Each of these distributions has its own formula for calculating probabilities, but they all share the same foundation: a countable set of outcomes, each assigned a specific probability.
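To make a couple of those formulas concrete, here is a sketch of the binomial and geometric PMFs built from first principles with Python's standard library (assuming independent trials with constant success probability, as the definitions above require):

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials,
    each succeeding with probability p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def geometric_pmf(k, p):
    """P(first success occurs on trial k): k - 1 failures, then a success."""
    return (1 - p) ** (k - 1) * p

# Chance of exactly 10 heads in 20 fair coin flips:
print(binomial_pmf(10, 20, 0.5))  # about 0.176

# Chance the first 6 appears on the third die roll:
print(geometric_pmf(3, 1 / 6))    # about 0.116
```

Summing `binomial_pmf(k, 20, 0.5)` over k = 0 through 20 gives 1, as any valid PMF must; the same check works for any of the distributions listed above.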

Real-World Examples

Discrete random variables appear everywhere data involves counting. The number of insurance claims filed per week, the number of patients admitted to an emergency room per day, and the number of goals scored in a soccer match are all discrete. In quality control, manufacturers track the number of defective units per batch. In finance, analysts model the number of loan defaults in a portfolio.

Research applications illustrate the range. One study analyzed the number of fatal traffic accidents per month in a district over 64 months. Another tracked daily patient visits to a hospital over several months. A third counted earthquakes in a region spanning 35 years. In each case, the variable of interest was a count: whole numbers with no values in between. That’s the practical hallmark of a discrete random variable. If your data comes from counting rather than measuring, you’re almost certainly working with one.