In statistics, a bin is a range of values that groups continuous data into smaller, more manageable intervals. If you have 1,000 test scores ranging from 0 to 100, you could sort them into bins like 0–10, 11–20, 21–30, and so on, turning a sprawling dataset into a compact summary you can actually read and visualize. Bins are the building blocks of histograms, and they show up in data preprocessing for machine learning, survey analysis, and nearly any situation where raw numbers need to be organized into patterns.
How Bins Work
Think of bins as buckets lined up along a number line. Each bucket covers a specific range, and every data point gets dropped into whichever bucket its value falls within. A dataset of daily temperatures ranging from 4°C to 36°C might be split into three bins: 4–11, 12–26, and 27–36. Once sorted, you can count how many data points landed in each bin and immediately see which temperature range was most common.
The width of each bin is the size of the range it covers. A bin spanning 0 to 20 has a width of 20. Bins are typically equal in width, though unequal widths are possible. When bins aren’t equal, the height of each bar in a histogram needs to represent density (count per unit of width) rather than raw count, or the visual will be misleading.
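The density adjustment for unequal widths takes only a few lines. In this Python sketch, the bin edges and counts are invented for illustration:

```python
# Hypothetical unequal-width bins, keyed as (left_edge, right_edge): count.
counts = {(0, 20): 40, (20, 30): 30, (30, 100): 30}

# Density = count per unit of width. This is what bar height should show
# when widths differ; otherwise wide bins look misleadingly tall.
density = {edges: c / (edges[1] - edges[0]) for edges, c in counts.items()}

# The (20, 30) bin holds fewer points than (0, 20) but is denser: 3.0 vs 2.0.
```

Plotted by raw count, the (30, 100) bin would look as tall as (20, 30); plotted by density, it correctly appears much shorter.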
There’s also a small but important detail about boundaries. If one bin covers 20–40 and the next covers 40–60, where does a value of exactly 40 go? The standard convention is that each bin includes its left boundary but excludes its right boundary, so a bin written as [20, 40) contains 20 but not 40. The last bin in the series includes both endpoints, so the maximum value isn’t left out. This convention ensures every data point lands in exactly one bin: nothing is counted twice, and nothing falls through a gap.
Why Bins Matter in Histograms
A histogram is the most common place you’ll encounter bins. Each bar represents one bin, and the bar’s height shows how many data points fall within that range. The pattern of bars reveals the shape of your data: whether it’s clustered in the middle, skewed to one side, or spread evenly.
The number of bins you choose dramatically changes what the histogram tells you. Too few bins and the histogram becomes oversmoothed. It hides meaningful patterns, merges outliers with nearby values, and makes it impossible to see how the data is really distributed. A dataset that actually has two peaks might look like it has one. Too many bins create the opposite problem, called undersmoothing. The histogram looks jagged and random, showing noise rather than signal. Every small fluctuation gets its own bar, making genuine patterns harder to spot.
The goal is a middle ground: enough bins to reveal the true shape of the data, but not so many that random variation dominates the picture. For a dataset with clear clusters, you want enough resolution to see each cluster separately. For smooth, bell-shaped data, fewer bins often work fine.
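A tiny example makes the tradeoff concrete. The dataset below is invented for illustration and has two clusters; counting it into one bin versus five shows how bin count changes the story:

```python
def histogram_counts(data, edges):
    """Count data points into half-open bins defined by sorted edges;
    the last bin also includes the final right edge."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1] or (i == len(counts) - 1 and x == edges[-1]):
                counts[i] += 1
                break
    return counts

data = [1, 2, 2, 3, 7, 8, 8, 9]              # two clear clusters
histogram_counts(data, [0, 10])              # one bin: both peaks hidden
histogram_counts(data, [0, 2, 4, 6, 8, 10])  # five bins: the empty middle bin exposes the gap
```

With a single bin the result is just [8], one featureless bar; with five bins, the zero count in the middle bin reveals the two-cluster structure.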
Common Rules for Choosing Bin Count
Rather than guessing, statisticians use formulas to pick a reasonable starting number of bins. None of these are perfect for every situation, but they give you a defensible starting point.
Sturges’ Rule is the simplest. The number of bins equals 1 + log₂(n), where n is the number of data points. For 100 observations, that gives about 8 bins; for 1,000 observations, about 11. The formula assumes your data is roughly bell-shaped: it comes from approximating a normal distribution with a binomial pattern of bin counts. It tends to suggest too few bins for large datasets or for data that isn’t normally distributed.
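In code, the rule is a one-liner; rounding up is the usual choice, since a bin count must be a whole number:

```python
import math

def sturges_bins(n):
    """Suggested bin count for n observations under Sturges' Rule."""
    return math.ceil(1 + math.log2(n))

sturges_bins(100)    # about 8 bins
sturges_bins(1000)   # about 11 bins
```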
Scott’s Rule calculates bin width instead of bin count. The width equals 3.49 × s × n⁻¹/³, where s is the standard deviation and n is the sample size. You then divide your data’s total range by that width to get the number of bins. Because it uses standard deviation, Scott’s Rule is sensitive to outliers that inflate that measure.
The Freedman-Diaconis Rule swaps standard deviation for the interquartile range (IQR), which is the span of the middle 50% of your data. The bin width equals 2 × IQR × n⁻¹/³. This makes it more robust when your data has extreme values or a skewed shape, since the IQR isn’t pulled around by outliers the way standard deviation is.
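Both width-based rules can be sketched with Python’s standard library alone. Here `statistics.quantiles` supplies the quartiles needed for the IQR, and the sample data is invented for illustration:

```python
import math
import statistics

def scott_width(data):
    """Scott's Rule: width = 3.49 * s * n^(-1/3)."""
    return 3.49 * statistics.stdev(data) * len(data) ** (-1 / 3)

def freedman_diaconis_width(data):
    """Freedman-Diaconis Rule: width = 2 * IQR * n^(-1/3)."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles of the data
    return 2 * (q3 - q1) * len(data) ** (-1 / 3)

def bins_from_width(data, width):
    """Convert a bin width into a bin count by covering the data's range."""
    return math.ceil((max(data) - min(data)) / width)

data = list(range(100))  # toy uniform data, 0 through 99
bins_from_width(data, scott_width(data))
bins_from_width(data, freedman_diaconis_width(data))
```

On this uniform toy data the two rules agree closely, which is expected: with no outliers to inflate the standard deviation, Scott and Freedman-Diaconis produce similar widths.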
Most statistical software picks one of these rules as a default. If the result doesn’t look right, adjusting up or down by a few bins is perfectly reasonable.
Bins in Data Preprocessing
Bins aren’t just for visualization. In machine learning and data analysis, binning is a preprocessing technique that converts continuous numerical data into categories. A feature like “outside temperature” with values from 4 to 36 might be split into three bins: cold (4–11), moderate (12–26), and hot (27–36). The model then treats temperature as three separate categories rather than a single continuous number, learning a different relationship for each range.
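As a sketch, that mapping might look like this in Python. The cutoffs and labels are the illustrative ones above, applied with half-open bins so every value between 4 and 36 lands somewhere:

```python
def temp_category(t):
    """Map a temperature in [4, 36] to an illustrative category.
    Bins are half-open: cold [4, 12), moderate [12, 27), hot [27, 36]."""
    if not 4 <= t <= 36:
        raise ValueError("temperature outside the binned range")
    if t < 12:
        return "cold"
    if t < 27:
        return "moderate"
    return "hot"

temp_category(8)    # "cold"
temp_category(20)   # "moderate"
temp_category(31)   # "hot"
```

In a real pipeline the three categories would typically then be one-hot encoded, giving the model a separate input per bin.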
This is especially useful in two situations. First, when the relationship between a variable and the outcome isn’t linear. If the number of shoppers at a store peaks in moderate temperatures but drops in both cold and hot weather, a straight-line model can’t capture that pattern, but a binned version can. Second, when values are naturally clustered into groups. If most temperatures in your data fall into a few tight ranges with gaps between them, bins reflect that structure directly.
Even though the original data is a single column of numbers, binning effectively creates multiple features. A model learns a separate weight for each bin, giving it flexibility to handle patterns that a single linear relationship would miss.
Binning Also Reduces Noise
Another use of binning is smoothing out small measurement errors. When individual data points carry minor inaccuracies, grouping them into bins and replacing each value with the bin’s average (mean or median) reduces the impact of those errors. A sensor that reads 22.3°C one second and 22.7°C the next might be measuring the same real temperature. Binning both readings into a 22–23 range, or replacing both with the bin’s central value, removes that meaningless variation.
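A minimal sketch of bin-mean smoothing, assuming the half-open boundary convention from earlier (the function name is invented):

```python
from statistics import mean

def smooth_by_bin_mean(values, edges):
    """Replace each value with the mean of the bin it falls in."""
    def index(v):
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1] or (i == len(edges) - 2 and v == edges[-1]):
                return i
        raise ValueError(f"{v} is outside the binned range")

    # Group values by bin, compute each occupied bin's mean, then map back.
    grouped = {}
    for v in values:
        grouped.setdefault(index(v), []).append(v)
    bin_means = {i: mean(vs) for i, vs in grouped.items()}
    return [bin_means[index(v)] for v in values]

readings = [22.3, 22.7, 25.1]
smooth_by_bin_mean(readings, [22, 23, 24, 25, 26])
# The two 22-something readings collapse to the same smoothed value (about 22.5)
```

Replacing values with the bin median instead of the mean is a one-word change and is more robust when a bin contains an extreme reading.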
This tradeoff is fundamental: binning always sacrifices some precision in exchange for a clearer, simpler picture of the data. The original exact values are gone, replaced by the bin they belong to. Whether that tradeoff is worth it depends on whether the fine-grained detail matters for your question or whether the broader pattern is what you need.

