How to Calculate Prevalence: Formula and Examples

Prevalence is calculated by dividing the total number of people with a disease or condition by the total population at risk, then multiplying by a convenient number (like 100 or 1,000) to make the result easier to interpret. The core formula is straightforward, but getting an accurate result depends on choosing the right numerator, denominator, and time frame.

The Basic Formula

Prevalence equals the number of existing cases divided by the population at risk during the same time period. Expressed as an equation:

Prevalence = (All cases of disease ÷ Population at risk) × 10ⁿ

The numerator includes every person who has the condition, whether they were just diagnosed or have been living with it for years. This is the critical distinction between prevalence and incidence, which only counts new cases. The denominator is the population that could potentially have the condition. And 10ⁿ is a multiplier you choose to make the number readable.
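
The formula can be sketched as a small Python function. This is a minimal illustration, not a standard library routine: the function name `prevalence`, its parameter names, and the sample figures (25 cases among 500 people) are made up for the example.

```python
def prevalence(cases, population_at_risk, multiplier=100):
    """Return existing cases per `multiplier` people at risk."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    if not 0 <= cases <= population_at_risk:
        raise ValueError("cases must be between 0 and population_at_risk")
    # Multiply before dividing so whole-number inputs give clean results.
    return cases * multiplier / population_at_risk


# Illustrative figures: 25 existing cases among 500 people at risk.
print(prevalence(25, 500))                    # 5.0 (percent)
print(prevalence(25, 500, multiplier=1_000))  # 50.0 (per 1,000)
```

The checks on the inputs reflect the point above: the numerator is a subset of the denominator, so a case count larger than the population at risk signals a data problem.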

Choosing the Right Multiplier

The raw result of dividing cases by population is often a very small decimal. To make it useful, you multiply by a power of 10. The most common choice is 100, which gives you a percentage. If 50 out of 1,000 people in your sample have high blood pressure, dividing gives you 0.05, and multiplying by 100 gives you 5%.

For rarer conditions, percentages become awkwardly small, so researchers use larger multipliers. You might see prevalence reported per 1,000, per 10,000, or per 100,000 people. A condition affecting 3 people out of 100,000 is clearer as “3 per 100,000” than as “0.003%.” The multiplier you pick depends on convention in your field and whatever makes the number easiest to communicate.
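
To see how the same proportion reads under different multipliers, here is a short Python sketch. The helper names `express` and `pick_multiplier` are hypothetical, and the rule inside `pick_multiplier` (use the smallest multiplier that gives a value of at least 1) is only one possible heuristic, not an established convention.

```python
MULTIPLIERS = (100, 1_000, 10_000, 100_000, 1_000_000)


def express(cases, population, multipliers=MULTIPLIERS):
    """Return the same prevalence expressed per each multiplier."""
    return {m: cases * m / population for m in multipliers}


def pick_multiplier(cases, population, candidates=MULTIPLIERS):
    """Heuristic (an assumption): smallest multiplier giving a value >= 1."""
    for m in candidates:
        if cases * m / population >= 1:
            return m
    return candidates[-1]


# The rare-condition example from the text: 3 cases in 100,000 people.
for m, value in express(3, 100_000).items():
    print(f"{value:g} per {m:,}")

print(pick_multiplier(3, 100_000))  # multiplier suggested for the rare condition
print(pick_multiplier(50, 1_000))   # multiplier suggested for the blood pressure example
```

Field convention should still take priority over a rule like this one when the two disagree.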

Point, Period, and Lifetime Prevalence

There are three main types of prevalence, and each one defines the time frame differently.

Point prevalence captures a snapshot. It measures how many people have the condition at one specific moment. The denominator is the population at that same moment. If you survey a school on March 1 and find that 12 out of 400 students currently have the flu, the point prevalence is 12 ÷ 400 = 0.03, or 3%.

Period prevalence widens the window to a stretch of time, often 12 months. The numerator includes everyone who had the condition at any point during that period, whether they already had it when the period started or developed it along the way. The denominator is typically the average or mid-interval population. For the same population, period prevalence is always at least as high as the point prevalence at any moment within the period, because it also counts cases that began or ended at other times in the window.

Lifetime prevalence asks whether a person has ever had the condition at any point in their life. This is commonly used in mental health research. The National Institute of Mental Health, for example, reports lifetime prevalence of disorders like depression based on representative survey samples. The number can be surprisingly high compared to point prevalence because it accumulates every case across a full lifespan.
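
The difference between point and period prevalence can be illustrated with a small Python sketch. Everything here is hypothetical: the four illness episodes, the population of 200, and the function names. The sketch also assumes the population stays constant over the period, which sidesteps the choice between an average and a mid-interval denominator.

```python
from datetime import date

# Hypothetical episodes: (person_id, onset, recovery); None = still ill.
episodes = [
    ("a", date(2023, 11, 20), date(2024, 2, 10)),
    ("b", date(2024, 3, 5), date(2024, 3, 20)),
    ("c", date(2024, 6, 1), None),
    ("d", date(2022, 5, 1), date(2023, 12, 15)),
]
population = 200  # hypothetical and assumed constant


def point_prevalence(episodes, population, day):
    """Share of the population who are ill on one specific day."""
    cases = {
        pid for pid, onset, recovery in episodes
        if onset <= day and (recovery is None or recovery >= day)
    }
    return len(cases) / population


def period_prevalence(episodes, population, start, end):
    """Share of the population who were ill at any time in [start, end]."""
    cases = {
        pid for pid, onset, recovery in episodes
        if onset <= end and (recovery is None or recovery >= start)
    }
    return len(cases) / population


point = point_prevalence(episodes, population, date(2024, 1, 1))
period = period_prevalence(
    episodes, population, date(2024, 1, 1), date(2024, 12, 31)
)
print(point, period)  # 0.005 0.015
```

Only person "a" is ill on January 1, but persons "a", "b", and "c" are all ill at some point during the year, so the period figure is three times the point figure. Person "d" recovered before the period began and is counted in neither.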

Defining the Population at Risk

The most common mistake in calculating prevalence is using the wrong denominator. The denominator should only include people who are actually capable of having the condition. If you’re calculating the prevalence of ovarian cancer, the denominator should be the female population, not the total population. Using the entire U.S. population of roughly 330 million instead of the approximately 170 million females would cut your prevalence estimate nearly in half, giving a misleadingly low figure.

This principle applies broadly. If you’re calculating prevalence of a workplace injury among construction workers, your denominator is construction workers, not all employed people. If you’re studying a disease that only affects adults, children shouldn’t be in your denominator. Getting this wrong doesn’t just produce an inaccurate number; it distorts any decisions made from that number, from funding allocation to staffing at clinics.
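
A quick Python sketch shows how much the wrong denominator dilutes the result. The two population figures are the approximate ones quoted above; the case count is a made-up number used only to make the arithmetic concrete, not a real statistic.

```python
def per_100k(cases, population):
    """Prevalence expressed per 100,000 people."""
    return cases * 100_000 / population


hypothetical_cases = 200_000     # made-up count, for illustration only
females = 170_000_000            # approximate figure from the text
total_population = 330_000_000   # approximate figure from the text

correct = per_100k(hypothetical_cases, females)
diluted = per_100k(hypothetical_cases, total_population)

print(f"Female denominator: {correct:.1f} per 100,000")
print(f"Total denominator:  {diluted:.1f} per 100,000")
print(f"Diluted / correct:  {diluted / correct:.2f}")
```

The ratio of the two estimates depends only on the two denominators (170 million ÷ 330 million, about 0.52), so the distortion is the same whatever the true case count is.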

A Step-by-Step Example

Say you want to calculate the point prevalence of diabetes in a city. Here’s what you need and how to work through it:

Step 1: Define your condition clearly. Are you counting Type 2 diabetes only, or all types? Are you including diagnosed and undiagnosed cases, or only confirmed diagnoses?
Step 2: Count your cases. Identify every person currently living with the condition at your chosen point in time. Suppose you are counting diagnosed Type 2 diabetes among adult residents and find 8,400 cases on January 1.
Step 3: Determine your population at risk. The city has 120,000 adult residents (excluding children, who rarely develop Type 2 diabetes).
Step 4: Divide and multiply. 8,400 ÷ 120,000 = 0.07. Multiply by 100 to get 7%. You can report this as “7% prevalence” or equivalently as “70 per 1,000 adults.”
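
The four steps above can be sketched in Python using the same figures. The variable names are illustrative choices, not part of any standard tool.

```python
# Step 1: the case definition is a decision, recorded here as a label.
condition = "diagnosed diabetes, adult residents, as of January 1"

# Step 2: existing cases at the chosen point in time.
cases = 8_400

# Step 3: population at risk (adult residents only).
population_at_risk = 120_000

# Step 4: divide and multiply (multiplying first keeps the result exact).
percent = cases * 100 / population_at_risk
per_thousand = cases * 1_000 / population_at_risk

print(f"{condition}: {percent:g}% prevalence, "
      f"or {per_thousand:g} per 1,000 adults")
# diagnosed diabetes, adult residents, as of January 1: 7% prevalence, or 70 per 1,000 adults
```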

Why Prevalence Differs From Incidence

Prevalence and incidence answer different questions. Incidence tells you how fast new cases are appearing. Prevalence tells you how many total cases exist right now. The numerator is where they diverge: incidence counts only new cases during a time period, while prevalence counts all cases, new and preexisting.

This means prevalence is shaped by two forces: how often people get sick (incidence) and how long they stay sick (duration). A disease with high incidence but short duration can have low prevalence. During a flu epidemic, many people get infected each week, but most recover within days, so at any given moment the proportion of the population currently sick stays relatively low. A chronic condition like cancer works the opposite way. Fewer people are diagnosed each year, but because they live with the disease for a long time, the total number of cases accumulates and prevalence rises.

This relationship is sometimes expressed as: Prevalence ≈ Incidence × Average duration of disease. It’s an approximation that works best when a disease is stable in a population, but it helps illustrate why prevalence can climb even when new cases hold steady (if people live longer with the disease), or drop even when new cases stay the same (if better treatments shorten the duration).
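
The approximation is easy to try out in Python. In this sketch the incidence rate (2 new cases per 1,000 people per year) and the durations are hypothetical, and the function name `approx_prevalence` is made up for the example; the result is only as good as the stable-disease assumption behind the formula.

```python
def approx_prevalence(incidence_per_year, mean_duration_years):
    """Prevalence ≈ incidence × average duration (stable-disease assumption)."""
    return incidence_per_year * mean_duration_years


incidence = 0.002  # hypothetical: 2 new cases per 1,000 people per year

scenarios = {
    "baseline (5 years ill)": 5,
    "longer survival (8 years ill)": 8,
    "faster cure (3 years ill)": 3,
}
for label, duration in scenarios.items():
    estimate = approx_prevalence(incidence, duration)
    print(f"{label}: {estimate:.1%}")
```

Incidence is identical in all three scenarios, so every difference in the estimated prevalence comes from duration alone.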

Factors That Raise or Lower Prevalence

Anything that adds cases to the numerator increases prevalence, and anything that removes them decreases it. New diagnoses increase it. So do longer survival times, because people remain in the “cases” count for more years. Immigration of people who already have the condition increases it. On the other side, deaths among people with the condition decrease prevalence, as do cures, recoveries, and emigration of affected individuals.

This creates a sometimes counterintuitive situation. A breakthrough treatment that keeps people alive longer will actually increase prevalence, even though it’s a positive development. Conversely, a highly fatal disease may show low prevalence simply because affected individuals don’t survive long enough to be counted. Prevalence alone doesn’t tell you whether a health situation is getting better or worse. You need incidence, mortality, and recovery data alongside it to get the full picture.
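
A toy Python simulation makes the survival effect concrete. It is a deliberately simplified sketch: the function `simulate_cases`, the 500 new cases per year, and the exit fractions are all hypothetical, and it assumes a constant number of new cases each year and a fixed fraction of existing cases leaving the count (through death, cure, or emigration).

```python
def simulate_cases(years, new_cases_per_year, exit_fraction, start_cases=0):
    """Track existing cases year by year under constant inflow and outflow."""
    cases = float(start_cases)
    history = []
    for _ in range(years):
        # Remove the cases that exit this year, then add the new ones.
        cases = cases - cases * exit_fraction + new_cases_per_year
        history.append(cases)
    return history


# Same inflow of new cases; a better treatment halves the yearly exit fraction.
before_treatment = simulate_cases(50, 500, exit_fraction=0.20)
after_treatment = simulate_cases(50, 500, exit_fraction=0.10)

print(f"Existing cases after 50 years, 20% exit per year: {before_treatment[-1]:.0f}")
print(f"Existing cases after 50 years, 10% exit per year: {after_treatment[-1]:.0f}")
```

With the inflow unchanged, the count levels off near new cases ÷ exit fraction, so halving the exit fraction roughly doubles the number of existing cases. Dividing those counts by a fixed population gives the corresponding prevalence.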

Using Survey Samples Instead of Full Populations

In practice, you rarely have health data on an entire population. Instead, researchers estimate prevalence using a randomly selected sample. The logic is the same: divide the number of people in the sample who have the condition by the total number of people in the sample. If the sample is truly representative of the larger population, the prevalence you calculate from it will closely reflect the true prevalence.

The key word is “representative.” If your sample over-represents certain groups (say, people who visit doctors frequently, who are more likely to have a diagnosed condition), your prevalence estimate will be biased, in this case upward. Random selection methods help ensure the sample mirrors the real population. Survey-based prevalence estimates are how major health statistics are produced, including national figures on mental health conditions, chronic diseases, and disability rates.
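
The effect of a non-representative sample can be sketched with Python's standard `random` module. The population here is synthetic (100,000 people with a true prevalence of 7%), and the bias mechanism, in which people with the condition are three times as likely to be selected, is an assumption chosen purely for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Synthetic population: True marks a person with the condition (7% of 100,000).
population = [True] * 7_000 + [False] * 93_000
true_prevalence = sum(population) / len(population)


def sample_prevalence(sample):
    """Share of sampled people who have the condition."""
    return sum(sample) / len(sample)


# Simple random sample without replacement: every person equally likely.
random_sample = rng.sample(population, 5_000)

# Biased sample (assumption): cases are 3x as likely to be selected.
# rng.choices draws with replacement, which is acceptable for this sketch.
weights = [3 if has_condition else 1 for has_condition in population]
biased_sample = rng.choices(population, weights=weights, k=5_000)

print(f"True prevalence:        {true_prevalence:.1%}")
print(f"Random sample estimate: {sample_prevalence(random_sample):.1%}")
print(f"Biased sample estimate: {sample_prevalence(biased_sample):.1%}")
```

The random sample should land close to the true figure, while the biased sample overstates it by a wide margin even though both use the same formula and the same sample size.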