What Is a Bin in a Histogram and How Do You Choose One?

A bin in a histogram is a range of values that groups your data into intervals. Each bin becomes one bar on the chart: the width of the bar shows the range of values it covers, and the height shows how many data points fall within that range. If you have exam scores from 0 to 100 and you create bins of width 20, the first bin captures everyone who scored from 0 up to (but not including) 20, the next bin covers 20 up to 40, and so on. The result is a picture of how your data is distributed.
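As a rough illustration of that grouping, here is a minimal Python sketch. The scores are invented, and the choice to put a score sitting exactly on an edge (such as 20) into the upper bin is an assumption of this sketch; a perfect 100 would need its own rule for the top edge.

```python
from collections import Counter

BIN_WIDTH = 20  # bin width from the exam-score example

# Hypothetical exam scores, invented for illustration.
scores = [12, 35, 47, 58, 61, 64, 72, 75, 78, 83, 88, 95]

# Floor division maps each score to the lower edge of its bin: 47 -> 40.
counts = Counter((s // BIN_WIDTH) * BIN_WIDTH for s in scores)

# Print one text "bar" per bin, lowest bin first.
for lower in sorted(counts):
    upper = lower + BIN_WIDTH
    print(f"{lower:>3} to {upper:<3} {'#' * counts[lower]}")
```

Each row of the printout is one bin, and the number of # marks plays the role of the bar height.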

How Bins Turn Raw Numbers Into a Pattern

Continuous data, like temperatures, ages, or test scores, can take on a huge number of individual values. Plotting every single value would give you a cluttered mess. Bins solve this by grouping nearby values together so you can see the overall shape of the data at a glance.

Think of a geology class where the instructor posts quiz scores as a histogram. Each bar might represent a 10-point range: 60 to 70, 70 to 80, 80 to 90, and so on. If you scored an 83, you can quickly see whether most of the class also landed in the 80 to 90 range. You can also spot whether the distribution is lopsided, has two peaks (like half the class earning A-minuses and the other half earning C-pluses), or follows a bell curve.

The same logic applies outside the classroom. Ocean researchers comparing wave heights at two beaches might use bins of 0.5 meters. A histogram for Coos Bay, Oregon, might show the most common wave height is about 2 meters across 15,000 recorded observations, while Bill Baggs State Park in Florida peaks at just 0.5 meters across 80,000 observations. Hydrologists tracking river discharge might bin their data into ranges like 33 to 75 cubic feet per second, 75 to 117, 117 to 159, and so on. In every case, bins are what make the pattern visible.

Choosing the Right Number of Bins

The number of bins you pick changes what the histogram reveals. Too few bins and you lump so much data together that interesting features disappear. Too many bins and every bar looks like random noise, because each one contains only a handful of data points. The goal is to choose enough bins to capture the major features in the data while smoothing over random sampling fluctuations.
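The effect is easy to see on a small made-up dataset with two clusters. The sketch below is only an illustration: the helper name equal_width_counts and the data are hypothetical, and the top edge is folded into the last bin.

```python
def equal_width_counts(values, low, high, n_bins):
    """Count values in n_bins equal-width bins spanning low to high."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        # A value equal to `high` is placed in the last bin.
        idx = min(int((v - low) / width), n_bins - 1)
        counts[idx] += 1
    return counts

# Invented data: one cluster near 30, another near 70.
values = [28, 29, 30, 30, 31, 32, 68, 69, 70, 70, 71, 72]

too_few = equal_width_counts(values, 0, 100, 2)    # [6, 6]
moderate = equal_width_counts(values, 0, 100, 5)   # [0, 6, 0, 6, 0]
too_many = equal_width_counts(values, 0, 100, 50)  # no bar taller than 3

print(too_few)
print(moderate)
print(too_many)
```

With two bins the gap between the clusters vanishes, with five it is obvious, and with fifty the bars are so short that the clusters are hard to tell from noise.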

A simple starting point is the square-root rule: use roughly the square root of your sample size as the number of bins, then divide the range of your data (the difference between the largest and smallest values) by that number to get the bin width. If you measured 100 values ranging from 10 to 26.3, the square root of 100 is 10, so you would use 10 bins, and the range of 16.3 divided by 10 gives a bin width of 1.63.
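The same arithmetic can be sketched in a few lines of Python. The function name sqrt_rule is hypothetical, rounding the bin count up is one reasonable choice among several, and the sample is synthetic, built only to match the 100-value, 10-to-26.3 example.

```python
import math

def sqrt_rule(values):
    """Return (number of bins, bin width) using the square-root rule."""
    n_bins = math.ceil(math.sqrt(len(values)))  # round up to a whole bin
    width = (max(values) - min(values)) / n_bins
    return n_bins, width

# Synthetic sample: 100 evenly spaced values from 10 to 26.3.
values = [10 + 16.3 * i / 99 for i in range(100)]

n_bins, width = sqrt_rule(values)
print(n_bins, round(width, 2))
```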

Three more formal rules exist, each suited to different situations:

  • Sturges’ rule sets the number of bins based on the logarithm of your sample size. It works best when the data roughly follows a bell curve and the dataset is not extremely large.
  • Scott’s rule calculates bin width using the spread (standard deviation) of your data and the cube root of the sample size. It still leans on the assumption of a bell-shaped distribution, but less rigidly than Sturges’ rule.
  • The Freedman-Diaconis rule takes Scott’s approach but replaces the standard deviation with the interquartile range, which is the span covering the middle 50% of your data. Because the interquartile range ignores extreme values, this rule handles outliers and non-bell-shaped data better than the other two.
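For readers who want to see the three rules as formulas, here is a hedged sketch using their commonly quoted textbook forms: Sturges’ as log2(n) rounded up plus one, Scott’s as 3.49 times the standard deviation divided by the cube root of n, and Freedman-Diaconis as twice the interquartile range divided by the cube root of n. Libraries differ slightly in constants, rounding, and how they compute quartiles, so treat the function names and exact results as illustrative.

```python
import math
import statistics

def sturges_bins(n):
    """Number of bins from Sturges' rule for a sample of size n."""
    return math.ceil(math.log2(n)) + 1

def scott_width(values):
    """Bin width from Scott's rule (uses the sample standard deviation)."""
    return 3.49 * statistics.stdev(values) / len(values) ** (1 / 3)

def freedman_diaconis_width(values):
    """Bin width from the Freedman-Diaconis rule (uses the IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 2 * (q3 - q1) / len(values) ** (1 / 3)

# Invented data: the integers 1..100, then the same data plus one outlier.
clean = list(range(1, 101))
with_outlier = clean + [10_000]

print(sturges_bins(len(clean)))
print(scott_width(clean), scott_width(with_outlier))
print(freedman_diaconis_width(clean), freedman_diaconis_width(with_outlier))
```

Adding the single outlier inflates the Scott width many times over, while the Freedman-Diaconis width barely moves, which is the robustness the third bullet describes.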

None of these rules is sacred. They give you a reasonable starting point, and you adjust from there based on what the histogram looks like.

What Happens at Bin Boundaries

When a data point lands exactly on the edge between two bins, the software has to decide which bin gets it. A common convention, used by NumPy among others, is that each bin includes its left edge but not its right edge, except for the very last bin, which includes both edges. So if your bins run from 0 to 10, 10 to 20, and 20 to 30, a value of exactly 10 goes into the second bin, and a value of exactly 30 stays in the third. This “left-closed, right-open” rule prevents any value from being counted twice or left out entirely. Some tools do the reverse and close each bin on the right, so the same data can produce slightly different bars in different software.
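The left-closed, right-open convention can be written down directly. This is a minimal sketch of the rule as described here, not any particular library's implementation; the helper name bin_index is hypothetical, and bins are numbered from zero.

```python
from bisect import bisect_right

def bin_index(value, edges):
    """Return the zero-based bin for value, given sorted bin edges.

    Each bin includes its left edge and excludes its right edge,
    except the last bin, which includes both.
    """
    if value < edges[0] or value > edges[-1]:
        raise ValueError("value lies outside the binned range")
    if value == edges[-1]:
        return len(edges) - 2  # top edge belongs to the last bin
    return bisect_right(edges, value) - 1

edges = [0, 10, 20, 30]
print(bin_index(10, edges))  # 1: the second bin, 10 to 20
print(bin_index(30, edges))  # 2: the third (last) bin, 20 to 30
```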

Most tools handle this automatically, so you rarely need to think about it. But if you’re building a histogram by hand or debugging a chart that looks off by one count in a bar, boundary rules are the first thing to check.

Equal vs. Unequal Bin Widths

Most histograms use bins of equal width, which makes them easy to read: taller bars simply mean more data points. When bins have unequal widths, the visual comparison breaks down. A wide bin naturally collects more data points than a narrow one, so a tall, wide bar could be misleading. To compensate, histograms with unequal bin widths typically switch the vertical axis from raw counts to density, where the area of each bar (not its height) represents the proportion of data in that bin.
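A short sketch of the density calculation, assuming the usual definition (count divided by the total count times the bin width). The edges and counts are invented, and the function name densities is hypothetical.

```python
def densities(counts, edges):
    """Convert raw bin counts to densities so bar areas sum to 1."""
    total = sum(counts)
    return [
        count / (total * (right - left))
        for count, left, right in zip(counts, edges, edges[1:])
    ]

# Invented example: two narrow bins and one bin three times as wide.
edges = [0, 10, 20, 50]
counts = [20, 20, 60]

heights = densities(counts, edges)
areas = [h * (right - left) for h, left, right in zip(heights, edges, edges[1:])]

print(heights)
print(sum(areas))
```

The wide bin holds three times as many points as each narrow bin, yet all three bars come out at the same height of about 0.02, because the data is equally concentrated across the whole range.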

Unequal bin widths are rarely a good idea for everyday analysis. They occasionally appear when you want finer detail in one part of the range and coarser grouping elsewhere, but equal-width bins are the standard for a reason: they’re straightforward to create and interpret.

How Software Picks Bins by Default

If you drop data into a plotting tool without specifying bins, the software will choose for you. Python’s Matplotlib, one of the most widely used charting libraries, passes your data to NumPy’s histogram function and, unless you say otherwise, uses 10 equal-width bins. You can override this by passing a specific number, like requesting four bins, or by passing a rule name such as “sturges”, “fd” (Freedman-Diaconis), or “auto” to let NumPy pick the bin count from the data.
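To compare these options without drawing anything, NumPy’s histogram_bin_edges function returns just the bin edges. The sketch below assumes NumPy is installed; the sample is randomly generated for illustration, so the bin counts chosen by the data-driven rules depend on the seed.

```python
import numpy as np

# Hypothetical sample: 1,000 draws from a bell-shaped distribution.
rng = np.random.default_rng(seed=0)
data = rng.normal(loc=50.0, scale=10.0, size=1000)

# An integer asks for exactly that many equal-width bins.
four_edges = np.histogram_bin_edges(data, bins=4)

# A string names the rule NumPy should use to pick the bin count.
sturges_edges = np.histogram_bin_edges(data, bins="sturges")
fd_edges = np.histogram_bin_edges(data, bins="fd")
auto_edges = np.histogram_bin_edges(data, bins="auto")

for name, edges in [("4", four_edges), ("sturges", sturges_edges),
                    ("fd", fd_edges), ("auto", auto_edges)]:
    print(f"bins={name!r:>9}: {len(edges) - 1} bins")
```

The same values (an integer or a rule name) can be passed as the bins argument of Matplotlib’s hist function.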

Excel, Google Sheets, and similar tools also generate default bins, though they tend to offer less control over the algorithm used. Regardless of the tool, the default is just a starting point. Adjusting the bin count up or down by a factor of two and watching how the shape changes is one of the fastest ways to understand your data.

Practical Tips for Better Histograms

Start with a moderate number of bins and then experiment. If the histogram looks like a flat plateau with no peaks, you likely have too few bins. If it looks like a jagged skyline with no clear pattern, you have too many. Somewhere in between, the true shape of the data emerges: a single peak, two peaks, a long tail to one side, or a roughly uniform spread.

Keep your bin widths in round, human-friendly numbers. Bins that span 0 to 10, 10 to 20, and 20 to 30 are easier to interpret than bins running from 0 to 7.3, 7.3 to 14.6, and 14.6 to 21.9. If you’re comparing two histograms, use the same bin edges for both so the visual comparison is fair. And when you need to compare your data against a theoretical distribution like a bell curve, switch the vertical axis to density so the areas are directly comparable.
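One way to get round, shared edges is to snap the combined range of both datasets outward to multiples of the chosen width. This is a sketch under that assumption; the function name shared_round_edges and the two datasets are hypothetical.

```python
import math

def shared_round_edges(datasets, width):
    """Build bin edges at multiples of width that cover every dataset."""
    lowest = min(min(d) for d in datasets)
    highest = max(max(d) for d in datasets)
    start = math.floor(lowest / width) * width   # snap down to a multiple
    stop = math.ceil(highest / width) * width    # snap up to a multiple
    n_bins = round((stop - start) / width)
    return [start + i * width for i in range(n_bins + 1)]

# Invented measurements from two groups to be compared.
group_a = [3, 12, 18, 27]
group_b = [14, 22, 35, 41]

edges = shared_round_edges([group_a, group_b], width=10)
print(edges)  # [0, 10, 20, 30, 40, 50]
```

Passing the same edges list to the plotting tool for both groups keeps the bars aligned, so differences in the charts reflect the data rather than the binning.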