A density plot is a smoothed, continuous curve that shows how values in a dataset are distributed. Think of it as a refined version of a histogram: instead of stacking data into rigid bars, it draws a flowing line that estimates the shape of your data’s distribution. The curve is generated through a technique called Kernel Density Estimation (KDE), which places a small, smooth bump around every data point and then adds all those bumps together to produce a single, continuous estimate of the overall pattern.
How a Density Plot Is Built
The process behind a density plot is surprisingly intuitive. Every single data point gets its own small, bell-shaped curve (called a kernel) centered on its value. These individual curves are then stacked on top of each other and averaged. Where many data points cluster together, the stacked kernels produce a tall peak. Where data points are sparse, the combined curve stays low. The result is a smooth line that reveals the underlying shape of the data without the choppy edges of a histogram.
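The stacking-and-averaging step can be sketched in a few lines of plain Python. This is a minimal illustration, not how any particular library implements it; the function names, the sample values, and the bandwidth of 0.5 are all made up for the example.

```python
import math

def gaussian_kernel(u):
    """Standard normal density: the bell-shaped bump placed on each point."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, data, bandwidth):
    """Average of the bumps centered on each data point, evaluated at x."""
    total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in data)
    return total / (len(data) * bandwidth)

# Made-up sample: three clustered points and one isolated point.
data = [1.0, 1.2, 1.5, 4.0]

dense = kde(1.2, data, bandwidth=0.5)   # inside the cluster
sparse = kde(3.0, data, bandwidth=0.5)  # in the gap between cluster and outlier
print(f"density near the cluster: {dense:.3f}, in the gap: {sparse:.3f}")
```

The curve is taller inside the cluster because three bumps overlap there, while the gap only catches the tails of distant bumps.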
Two ingredients control how the final curve looks: the kernel function and the bandwidth. The kernel function determines the shape of each individual bump. A Gaussian (normal distribution) shape is the most common choice, but other symmetric shapes work too. The bandwidth controls how wide each bump spreads. A narrow bandwidth keeps each bump tight around its data point, producing a jagged, highly detailed curve. A wide bandwidth spreads each bump further, producing a smoother, more generalized curve. Getting the bandwidth right is the single most important decision when creating a density plot.
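To see the effect of bandwidth in isolation, the sketch below evaluates a Gaussian KDE on a grid and counts its local peaks for a narrow and a wide bandwidth. The data, the grid limits, and the helper names are assumptions chosen for illustration.

```python
import math

def kde(x, data, h):
    """Gaussian KDE at x with bandwidth h."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm

def count_peaks(data, h, lo, hi, steps=400):
    """Count grid points that are strictly higher than both neighbors."""
    ys = [kde(lo + (hi - lo) * i / steps, data, h) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])

data = [0.0, 1.0, 2.0, 3.0, 4.0]  # made-up, evenly spaced sample

narrow_peaks = count_peaks(data, 0.1, -2.0, 6.0)  # tight bumps stay separate
wide_peaks = count_peaks(data, 2.0, -2.0, 6.0)    # broad bumps merge
print(narrow_peaks, wide_peaks)
```

With the narrow bandwidth each point keeps its own spike; with the wide one the bumps blend into a single hump.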
Reading the Y-Axis
The y-axis on a density plot trips up a lot of people because it doesn’t show counts or percentages. It shows probability density, which is a different concept entirely. A value on the y-axis does not tell you the probability of any single point. Instead, probability comes from the area under the curve between two values on the x-axis. If you shade the region between, say, 10 and 20, the area of that shaded region tells you the proportion of data falling in that range.
The total area under the entire curve always equals 1, representing 100% of the data. This is what makes the y-axis values sometimes confusing. If your data is tightly packed into a narrow range on the x-axis, the curve has to rise steeply to keep the total area at 1, so the y-axis can easily exceed 1.0. A y-axis value of 40 is perfectly valid if the x-axis range is very small. The key rule: probability density values are not probabilities on their own. They only become probabilities when you take the area under the curve across a span of x-axis values; for a very narrow span, that area is approximately the curve’s height multiplied by the span’s width.
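A quick numerical check makes the area rule concrete. The sketch below uses a made-up, tightly packed sample and a hand-picked bandwidth, then approximates areas with the trapezoid rule; none of the names or numbers come from a specific library.

```python
import math

def kde(x, data, h):
    """Gaussian KDE at x with bandwidth h."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm

def area(data, h, lo, hi, steps=2000):
    """Trapezoid-rule area under the KDE curve between lo and hi."""
    dx = (hi - lo) / steps
    ys = [kde(lo + i * dx, data, h) for i in range(steps + 1)]
    return dx * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

data = [0.500, 0.502, 0.504, 0.506, 0.508]  # made-up values in a tiny range
h = 0.002                                   # assumed bandwidth

peak = kde(0.504, data, h)            # height of the curve: far above 1
total = area(data, h, 0.48, 0.53)     # area over (nearly) the whole curve
middle = area(data, h, 0.503, 0.505)  # estimated share of data in a narrow span
print(f"peak={peak:.1f}, total area={total:.4f}, middle share={middle:.3f}")
```

The peak height is well above 1, yet the total area stays at 1 and the area over any sub-range remains a valid proportion between 0 and 1.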
Why Use a Density Plot Instead of a Histogram
Histograms require you to choose how many bins to use, and that choice can dramatically change what the data appears to show. Too many bins and the histogram looks noisy, full of spikes that reflect random variation rather than real patterns. Too few bins and the histogram looks oversimplified, hiding important features. This tradeoff between bias (missing real patterns) and variance (showing fake ones) is baked into the histogram’s design.
Density plots handle this tradeoff more gracefully. Because KDE centers a smooth kernel on each data point rather than dumping points into bins, the resulting curve does not depend on where bin edges happen to fall. You still need to pick a bandwidth, which plays a role similar to bin width, but the output is a continuous curve that makes it much easier to spot the overall shape of a distribution. Peaks, valleys, and skewness are all more visually obvious on a smooth curve than on a set of bars.
Density plots also shine when you want to compare multiple groups on the same chart. Overlaying two or three histograms creates a cluttered, hard-to-read graphic. Overlaying two or three density curves, each in a different color, is clean and immediately shows where the distributions overlap and where they differ.
Choosing the Right Bandwidth
Bandwidth selection matters more than any other setting when you create a density plot. Set it too low and your curve hugs every data point, producing a spiky line that overfits the noise in your sample (undersmoothing). Set it too high and your curve glosses over real features, flattening out peaks that genuinely exist in the data (oversmoothing).
Most software uses an automatic method so you don’t have to pick a number yourself. Scott’s rule estimates the bandwidth from the spread of your data and the number of data points, on the assumption that the data follows a roughly normal shape. Silverman’s rule of thumb rests on the same assumption but, in its common form, uses a slightly smaller constant and a measure of spread that is less sensitive to outliers. Cross-validation is a more data-driven approach that tests different bandwidths and picks the one that best balances smoothness and accuracy. For most everyday analysis, the default bandwidth your software selects works well. If the curve looks too jagged or too flat, adjusting the bandwidth up or down by a small factor is usually enough.
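For reference, the two rules of thumb can be written out directly. The sketch below uses the common one-dimensional textbook forms; individual libraries use slightly different constants and spread estimates, so treat the numbers as illustrative rather than as what any given package returns. The function names and sample values are made up.

```python
import statistics

def scott_bandwidth(data):
    """Normal-reference form often attributed to Scott: 1.06 * sd * n^(-1/5)."""
    n = len(data)
    return 1.06 * statistics.stdev(data) * n ** (-1 / 5)

def silverman_bandwidth(data):
    """Silverman's rule of thumb: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(statistics.stdev(data), (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-1 / 5)

# Made-up sample with one large value to show the effect of the robust spread.
data = [2.1, 2.4, 2.5, 2.7, 2.9, 3.0, 3.2, 3.3, 3.6, 9.0]

print(f"Scott: {scott_bandwidth(data):.3f}")
print(f"Silverman: {silverman_bandwidth(data):.3f}")
```

Both rules shrink the bandwidth as the sample grows, which is why larger datasets produce more detailed curves by default.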
Bivariate Density Plots
When you have two variables instead of one, a density plot can extend into two dimensions. The most common version uses color shading or contour lines to show where pairs of values concentrate. Think of it like a topographic map: areas of high density (where many data points cluster) appear as peaks, shown with darker colors or tightly packed contour rings. Areas of low density appear as valleys with lighter colors or no contour lines at all.
These 2D density plots are useful when a scatter plot becomes too crowded to interpret. If you have thousands or millions of data points, individual dots overlap into an unreadable blob. A bivariate density plot replaces that blob with a clear heat map showing exactly where the concentration is highest.
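The two-dimensional case follows the same recipe, with a bump that spreads in both directions. The sketch below evaluates a product-Gaussian KDE on a coarse integer grid and finds the cell with the highest density; the points, bandwidths, and grid are invented for the example, and a real plot would use a much finer grid.

```python
import math

def kde2d(x, y, points, hx, hy):
    """Bivariate KDE at (x, y) using a Gaussian bump in each dimension."""
    norm = len(points) * 2.0 * math.pi * hx * hy
    total = sum(
        math.exp(-0.5 * (((x - px) / hx) ** 2 + ((y - py) / hy) ** 2))
        for px, py in points
    )
    return total / norm

# Made-up pairs: a cluster near (1, 1) and a lone point at (4, 4).
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (4.0, 4.0)]

# Evaluate the density on a 6 x 6 grid of integer coordinates.
grid = {
    (gx, gy): kde2d(gx, gy, points, 0.5, 0.5)
    for gx in range(6)
    for gy in range(6)
}
hottest = max(grid, key=grid.get)
print("highest-density grid cell:", hottest)
```

Coloring each grid cell by its value gives the heat-map view; joining cells of equal value gives the contour view.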
Creating Density Plots in Python and R
In Python, the Seaborn library’s kdeplot function is the most common tool. At its simplest, you pass in your data and an x variable, and the function handles the KDE math automatically. Key parameters include bw_adjust, which lets you scale the default bandwidth up or down, fill to shade the area under the curve, and hue to split the data by a categorical variable and draw separate curves for each group. Setting bw_adjust to 0.5 halves the default bandwidth for a more detailed curve; setting it to 2 doubles the bandwidth for a smoother one. You can also pass both x and y to create a bivariate density plot with contour lines.
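A minimal sketch of the call described above, assuming seaborn and pandas are installed. The data helper, the column names "value" and "group", and the synthetic sample are hypothetical; only kdeplot and its data, x, hue, fill, and bw_adjust arguments come from Seaborn.

```python
import random

def build_rows(n_per_group=200, seed=0):
    """Synthetic long-form data: two groups with different centers."""
    rng = random.Random(seed)
    rows = {"value": [], "group": []}
    for name, center in (("a", 0.0), ("b", 2.0)):
        for _ in range(n_per_group):
            rows["value"].append(rng.gauss(center, 1.0))
            rows["group"].append(name)
    return rows

def plot_groups(rows, bw_adjust=1.0):
    """Draw one filled density curve per group and return the Axes."""
    # Imported here so the data helper above works without plotting libraries.
    import pandas as pd
    import seaborn as sns

    frame = pd.DataFrame(rows)
    return sns.kdeplot(
        data=frame, x="value", hue="group", fill=True, bw_adjust=bw_adjust
    )

rows = build_rows()
# plot_groups(rows, bw_adjust=0.5)  # more detailed curves
# plot_groups(rows, bw_adjust=2)    # smoother curves
```

Uncommenting either call draws the curves on the current Matplotlib axes; showing or saving the figure is left to the surrounding script or notebook.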
In R, the standard approach is ggplot2’s geom_density() layer. You map your variable to the x aesthetic, and the function draws a KDE curve. The adjust parameter works the same way as Seaborn’s bw_adjust, scaling the default bandwidth. To compare groups, you map a grouping variable to the color or fill aesthetic. The defaults differ slightly: Seaborn uses Scott’s rule, while geom_density() uses R’s bw.nrd0, which implements Silverman’s rule of thumb. Either way, out-of-the-box results are usually reasonable without any manual tuning.
When Density Plots Work Best
Density plots are ideal for continuous data where you want to understand shape: is the distribution symmetric, skewed, or bimodal (two peaks)? They are especially valuable when comparing distributions across groups, because overlapping curves communicate differences more clearly than side-by-side histograms. They also work well in exploratory analysis when you want a quick, clean picture of how a variable behaves before running any formal tests.
They are less useful for very small datasets, where the smoothing can create the illusion of structure that doesn’t really exist. With fewer than about 20 to 30 data points, a simple dot plot or strip plot often communicates the data more honestly. Density plots can also mislead with discrete or heavily bounded data, since the smoothing may extend the curve into impossible ranges, like negative values for a variable that can only be positive. In those cases, trimming the curve with a clipping parameter (Seaborn’s kdeplot, for example, offers cut and clip) or switching to a histogram keeps the visualization grounded in reality.
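The boundary problem can be quantified. For a Gaussian kernel, the share of the curve’s area that falls below a threshold is the average of each bump’s normal CDF at that threshold. The sketch below applies this to a made-up set of strictly positive values with an assumed bandwidth; the function and variable names are illustrative.

```python
import math

def mass_below(threshold, data, h):
    """Share of a Gaussian KDE's total area that lies below threshold."""
    def normal_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sum(normal_cdf((threshold - xi) / h) for xi in data) / len(data)

# Made-up waiting times: every value is positive, many are close to zero.
wait_times = [0.1, 0.2, 0.2, 0.4, 0.7, 1.5, 3.0]

leaked = mass_below(0.0, wait_times, h=0.5)
print(f"share of the curve below zero: {leaked:.1%}")
```

In this example roughly a fifth of the area sits below zero even though no observation does, which is the kind of distortion that clipping or a histogram avoids.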

