How to Find the Frequency of a Data Set in Statistics

Finding the frequency of a data set means counting how many times each value appears. For a small list of numbers, you can tally each value by hand. For larger or more complex data sets, you’ll organize values into a frequency distribution table, which groups and counts your data so patterns become visible. The process changes slightly depending on whether your data has a few distinct values or spans a wide range.

Counting Frequency in a Simple Data Set

Start with the most straightforward case: a short list with a limited number of distinct values. Suppose your data set is {3, 7, 3, 2, 7, 7, 5, 2, 3, 5}. To find the frequency of each value, list every unique value once, then count how many times it appears.

2 appears 2 times
3 appears 3 times
5 appears 2 times
7 appears 3 times

That’s it. Each count is the frequency for that value. This approach works well when your data has only a handful of unique entries, which is common with survey responses, letter grades, or categorical labels. It’s sometimes called an ungrouped frequency distribution because you’re not combining values into ranges.
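If you would rather let a program do the tally, here is a minimal sketch in Python using the standard library's Counter on the same ten values. The variable names are illustrative only.

```python
from collections import Counter

# The example data set from the text
data = [3, 7, 3, 2, 7, 7, 5, 2, 3, 5]

# Counter maps each unique value to the number of times it appears
frequency = Counter(data)

# Print the values in ascending order with their counts
for value in sorted(frequency):
    print(f"{value} appears {frequency[value]} times")
```

The printed counts should match the hand tally above, and they should add up to the number of values in the list.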

When to Group Your Data Into Classes

Ungrouped tallies become impractical when the variable stretches over a wide range or you have a large number of observations. If you recorded the ages of 500 people and got values from 18 to 92, listing each individual age would produce a table so long it wouldn’t reveal much. Grouped frequency distributions solve this by sorting values into intervals (called classes or bins), like 18–27, 28–37, and so on, then counting how many values fall in each interval.

The general rule: if your data is a manageable list of repeated values, keep it ungrouped. If the values are spread out and numerous, group them.

Building a Grouped Frequency Table

Creating a grouped frequency distribution follows a consistent set of steps. Walk through them in order and you’ll have a clean table at the end.

1. Find the range. Identify the largest and smallest values in your data set, then subtract: Range = Maximum − Minimum. If your data runs from 12 to 87, the range is 75.

2. Choose the number of classes. Somewhere between 5 and 20 classes usually works. Too few and you lose detail; too many and the table gets unwieldy. For a quick estimate, take the square root of the total number of data points and round up, so for 100 observations you would start with 10 classes. Another common guideline, known as Sturges’ rule, uses the formula: number of classes = 1 + log₂(n), where n is the sample size. For 100 data points, that gives 1 + 6.64 ≈ 7.64, which rounds up to 8 classes.

3. Calculate the class width. Divide the range by the number of classes and round up to a convenient number. With a range of 75 and 8 classes, you’d get 75 ÷ 8 = 9.375, which you’d round up to 10.

4. Set the class limits. Pick a starting point less than or equal to your minimum value. This becomes the lower limit of your first class. Add the class width to get each subsequent lower limit. For whole-number data, the upper limit of each class is one less than the lower limit of the next class, so no value can fall into two classes. With a starting point of 10 and a width of 10, your classes would be 10–19, 20–29, 30–39, and so on.

5. Tally and count. Go through your data and place each value into the correct class. The count for each class is its frequency.
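The five steps above can be sketched in Python. This is only an illustration under stated assumptions: the data list is hypothetical (chosen to run from 12 to 87 like the example in the text), the values are whole numbers, and the class count of 8 and starting point of 10 are taken from the worked numbers above.

```python
import math

# Hypothetical whole-number data running from 12 to 87
data = [12, 15, 23, 27, 31, 34, 38, 42, 45, 47, 51, 56, 58, 63, 66, 72, 79, 87]

# Step 1: range = maximum - minimum
data_range = max(data) - min(data)            # 87 - 12 = 75

# Step 2: number of classes (8, as in the worked example)
num_classes = 8

# Step 3: class width, rounded up to a whole number
width = math.ceil(data_range / num_classes)   # ceil(9.375) = 10

# Step 4: class limits, starting at or below the minimum value
start = 10
classes = [(start + i * width, start + (i + 1) * width - 1)
           for i in range(num_classes)]       # (10, 19), (20, 29), ...

# Step 5: tally each value into its class
frequency = {limits: 0 for limits in classes}
for value in data:
    index = (value - start) // width          # position of the class holding this value
    frequency[classes[index]] += 1

for (lower, upper), count in frequency.items():
    print(f"{lower}-{upper}: {count}")
```

Rounding the width with math.ceil happens to give a convenient number here. With other data you may prefer to round up further by hand, for example to the next multiple of 5.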

Relative Frequency and Cumulative Frequency

Raw frequency tells you the count, but sometimes you need proportions or running totals. These two extensions are easy to calculate once you have your basic frequency table.

Relative frequency shows what fraction of the total each value or class represents. The formula is simple: divide the frequency of a class by the total number of data points. If a class has a frequency of 10 and your data set contains 20 observations, the relative frequency is 10 ÷ 20 = 0.5, or 50%. When you add up all the relative frequencies, they should total 1 (or 100%).
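As a small sketch of the same calculation in Python, the snippet below reuses the ten-value data set from the first section. The variable names are illustrative.

```python
from collections import Counter

# The ten-value example data set from earlier in the article
data = [3, 7, 3, 2, 7, 7, 5, 2, 3, 5]

frequency = Counter(data)
total = len(data)

# Relative frequency = frequency of a value / total number of data points
relative = {value: count / total for value, count in sorted(frequency.items())}

for value, share in relative.items():
    print(f"{value}: {share:.0%}")
```

Adding up the values in relative is a quick sanity check. The sum should come out to 1, allowing for tiny floating-point rounding.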

Cumulative frequency is a running total. For each class, add its frequency to the sum of all previous classes. Using a quick example:

Class 5–9: frequency 10, cumulative frequency 10
Class 10–14: frequency 2, cumulative frequency 12
Class 15–19: frequency 4, cumulative frequency 16
Class 20–24: frequency 3, cumulative frequency 19
Class 25–29: frequency 1, cumulative frequency 20

The last cumulative frequency always equals the total number of data points. Cumulative frequency is useful when you want to answer questions like “how many observations fall below a certain value?” You can also compute cumulative relative frequency by dividing each cumulative frequency by the total sample size. The final class will always reach 100%.
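The running total in the table above can be reproduced with a short Python sketch. It assumes the frequencies are already listed in class order and uses itertools.accumulate from the standard library.

```python
from itertools import accumulate

# Classes and frequencies from the example table above
classes = ["5-9", "10-14", "15-19", "20-24", "25-29"]
frequencies = [10, 2, 4, 3, 1]

# Running total of the frequencies, class by class
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: each running total divided by the sample size
total = cumulative[-1]
cumulative_relative = [running / total for running in cumulative]

for label, freq, cum, cum_rel in zip(classes, frequencies, cumulative, cumulative_relative):
    print(f"Class {label}: frequency {freq}, cumulative {cum}, cumulative relative {cum_rel:.0%}")
```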

Visualizing Frequency

Numbers in a table are precise, but a chart often makes the pattern obvious at a glance. The two most common options are histograms and frequency polygons.

A histogram uses bars whose heights represent the frequency of each class. The bars touch each other (no gaps), which signals that the data is continuous and the classes are adjacent. Histograms work well for displaying large data sets and make it easy to spot where values cluster, whether the distribution is symmetric, and where outliers sit.

A frequency polygon plots a point at the midpoint of each class at the height of its frequency, then connects the points with straight lines. It functions like a line graph and is especially useful when you want to compare two distributions on the same chart, since overlapping lines are easier to read than overlapping bars.
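To make the two chart types concrete without a plotting library, here is a rough Python sketch that prints a text histogram and computes the points a frequency polygon would connect. It reuses the classes from the cumulative frequency example; a real chart would normally be drawn with a plotting tool instead.

```python
# Class limits and frequencies from the cumulative frequency example
class_limits = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]
frequencies = [10, 2, 4, 3, 1]

# Histogram: one bar per class, bar length equal to the class frequency
for (lower, upper), freq in zip(class_limits, frequencies):
    print(f"{lower:>2}-{upper:<2} | {'#' * freq}")

# Frequency polygon: one point per class at (class midpoint, frequency)
polygon_points = [((lower + upper) / 2, freq)
                  for (lower, upper), freq in zip(class_limits, frequencies)]
print(polygon_points)
```

Joining polygon_points in order with straight lines gives the frequency polygon for this table.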

Finding Frequency in Excel

Excel has a built-in FREQUENCY function that counts how many values fall into each bin. The syntax is:

FREQUENCY(data_array, bins_array)

The first argument is the range of cells containing your data. The second is a list of bin boundaries (the upper limit of each class). If your data is in cells A1:A50 and your bin edges are in B1:B5, you’d enter =FREQUENCY(A1:A50, B1:B5). In current versions of Microsoft 365, just press Enter and the results will spill down into the cells below the formula automatically. In older versions, you need to select the entire output range first, type the formula, then press Ctrl+Shift+Enter to confirm it as an array formula.

The function returns one more value than the number of bins. That extra value counts anything above your highest bin, which is helpful for spotting data points that fall outside your expected range.
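To see why the result has one extra entry, here is a small Python sketch that mimics the counting rule described above. It is not Excel's implementation: frequency_like_excel is a hypothetical helper, the data and bin values are made up, and it assumes the bins are sorted in ascending order and each bin value is an inclusive upper limit.

```python
from bisect import bisect_left

def frequency_like_excel(data, bins):
    """Hypothetical helper: count values per bin, treating each bin value
    as an inclusive upper limit, plus one extra slot for values above the
    last bin. Assumes bins are sorted in ascending order."""
    counts = [0] * (len(bins) + 1)
    for value in data:
        # bisect_left finds the first bin whose upper limit is >= value;
        # values above every bin land in the extra final slot
        counts[bisect_left(bins, value)] += 1
    return counts

# Made-up data and bin upper limits
data = [12, 19, 20, 35, 47, 95]
bins = [19, 29, 39, 49]

counts = frequency_like_excel(data, bins)
print(counts)  # five counts for four bins; the last one is the overflow slot
```

Here the value 95 is larger than every bin, so it is counted in the final, extra position.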

Finding Frequency in Python

If you’re working in Python, the pandas library makes frequency calculations a one-liner. The value_counts method on a Series returns each unique value paired with its count, sorted from most to least frequent.

For a quick example: if you have a Series called s, calling s.value_counts() gives you the absolute frequency of every value. To get relative frequencies instead, pass normalize=True. This divides each count by the total, returning proportions. A value that appears in 4 out of 10 entries would show as 0.4.

For grouped frequency distributions with continuous data, you can use pandas.cut() to define bins and then apply value_counts to the result. Because value_counts sorts by count by default, call sort_index() on the result if you want the bins listed in order. NumPy’s histogram function offers similar binning and is useful when you need both the counts and bin edges for plotting.
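The calls mentioned in this section can be sketched as follows, assuming pandas and NumPy are installed. The ages and bin edges are made up for illustration.

```python
import numpy as np
import pandas as pd

# Ungrouped: the ten-value example from the start of the article
s = pd.Series([3, 7, 3, 2, 7, 7, 5, 2, 3, 5])
counts = s.value_counts()                       # absolute frequencies, most frequent first
proportions = s.value_counts(normalize=True)    # relative frequencies

# Grouped: hypothetical ages sorted into bins with pandas.cut
ages = pd.Series([18, 22, 25, 31, 36, 44, 52, 58, 67, 92])
edges = [18, 38, 58, 78, 98]
binned = pd.cut(ages, bins=edges, right=False)  # intervals [18, 38), [38, 58), ...
grouped = binned.value_counts().sort_index()    # sort_index keeps the bins in order

# NumPy returns the counts together with the bin edges
hist_counts, hist_edges = np.histogram(ages, bins=edges)

print(counts.sort_index())
print(proportions.sort_index())
print(grouped)
print(hist_counts, hist_edges)
```

Passing right=False makes each bin include its lower edge and exclude its upper edge. That choice is an assumption made here so the pandas bins line up with NumPy's, whose bins are also closed on the left, except that NumPy's last bin includes its upper edge too.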

Choosing the Right Number of Bins

The number of classes you pick directly shapes what your frequency distribution reveals. Too few bins and real patterns get hidden inside overly broad groups. Too many bins and random noise looks like meaningful variation. Several rules of thumb exist to guide this choice.

The square root rule is the simplest: take the square root of the sample size and round up. For 200 data points, √200 ≈ 14.1, which rounds up to 15 bins. Sturges’ rule (1 + log₂n) tends to produce fewer bins and works well for roughly symmetric distributions, but it can give too few bins when the data is skewed. The Rice rule (sometimes called the Rice University rule), which doubles the cube root of the sample size, often lands between the other two and is a solid default. For 1,000 observations, it suggests 2 × 10 = 20 bins.
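For a side-by-side comparison, here is a short Python sketch of the three rules. The function names are illustrative, and rounding every result up with math.ceil is an assumption; some sources round to the nearest whole number instead.

```python
import math

def square_root_rule(n):
    # Square root of the sample size, rounded up
    return math.ceil(math.sqrt(n))

def sturges_rule(n):
    # 1 + log2(n), rounded up
    return math.ceil(1 + math.log2(n))

def rice_rule(n):
    # Twice the cube root of the sample size, rounded up
    return math.ceil(2 * n ** (1 / 3))

for n in (100, 200, 1000):
    print(n, square_root_rule(n), sturges_rule(n), rice_rule(n))
```

For 1,000 observations this gives 32, 11, and 20 bins respectively, which shows how far apart the rules can land on the same data.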

None of these rules is perfect for every situation. Treat them as starting points, then adjust. If your histogram looks jagged and hard to interpret, try fewer bins. If it looks like a single flat block, try more.