What Is LOWESS Smoothing in Statistics?

LOWESS smoothing, short for LOcally WEighted Scatterplot Smoothing, is a method for drawing a smooth curve through noisy data. Instead of fitting one equation to your entire dataset (like a straight line or a single curve), LOWESS fits many small equations to overlapping neighborhoods of points, then stitches them together. The result is a flexible line that follows the general shape of your data without being jerked around by every individual point.

The technique was developed by statistician William Cleveland in the late 1970s and has become one of the most widely used nonparametric regression methods in data analysis. You’ll sometimes see it called LOESS (LOcal regrESSion), which is a later generalization by the same author. In practice, most software uses the terms interchangeably.

How LOWESS Works

The core idea is simple: instead of asking “what single line best fits all 500 points?”, LOWESS asks “what line best fits the points near this spot?” for every spot in your data. Here’s what happens under the hood.

For each data point, LOWESS selects a neighborhood of nearby points. It then fits a low-degree polynomial (usually a straight line or a quadratic curve) to just that neighborhood, using weighted least squares. The weighting is the key trick: points close to the target get heavy influence on the fit, while points farther away contribute less and less. Once the local polynomial is fit, LOWESS reads off the smoothed value at that point and moves on to the next one. After doing this for every point in the dataset, the collection of smoothed values forms the final curve.

The weight assigned to each neighbor follows a specific pattern called the tri-cube function. A neighbor sitting right on top of the target point gets full weight. As the distance increases, the weight drops off slowly at first, then falls steeply, reaching zero at the edge of the neighborhood. This creates a smooth transition so that no single distant point can abruptly distort the local fit. The distances are scaled so the farthest point in each neighborhood sits exactly at the cutoff boundary.
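The two paragraphs above can be sketched in a few lines of NumPy. This is an illustrative single-pass version (local linear fits, tri-cube weights, no robustness iterations), not the optimized algorithm real libraries use; the function names are mine.

```python
import numpy as np

def tricube(u):
    # Tri-cube weight: full weight at u = 0, zero at |u| >= 1.
    u = np.clip(np.abs(u), 0.0, 1.0)
    return (1.0 - u**3) ** 3

def lowess_one_pass(x, y, frac=2/3):
    """Minimal LOWESS sketch: one smoothing pass, local linear fits,
    tri-cube weights, no robustness iterations."""
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))   # neighborhood size
    smoothed = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        idx = np.argsort(d)[:k]          # k nearest neighbors
        h = d[idx].max()                 # farthest neighbor sets the distance scale
        w = tricube(d[idx] / h) if h > 0 else np.ones(k)
        # Weighted least-squares line; polyfit squares its weights,
        # so pass sqrt(w) to weight squared residuals by w.
        b = np.polyfit(x[idx], y[idx], deg=1, w=np.sqrt(w))
        smoothed[i] = np.polyval(b, x[i])
    return smoothed

# Toy usage: noisy sine wave.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = np.sin(x) + rng.normal(scale=0.3, size=100)
s = lowess_one_pass(x, y, frac=0.3)
```

The smoothed values track the underlying sine shape far more closely than the raw points do, even though no sine function appears anywhere in the fitting code.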

The Bandwidth Parameter

The single most important setting in LOWESS is the bandwidth (also called the span or fraction), which controls how large each neighborhood is. If you set a bandwidth of 0.33, each local fit uses roughly one-third of your data points. A bandwidth of 0.80 uses 80% of them.

This creates a direct tradeoff. A small bandwidth means each fit uses only the closest points, so the curve can track tight bends and local peaks in your data. But it also picks up more noise, producing a jagged line. A large bandwidth pulls in points from farther away, producing a smoother, more stable curve, but one that may blur over genuine peaks and valleys. In statistical terms, smaller bandwidths reduce bias (the curve stays true to local patterns) at the cost of higher variance (more sensitivity to random fluctuations), while larger bandwidths do the opposite.

There’s no universally correct bandwidth. The right choice depends on how noisy your data is and how much local detail you want to preserve. Most analysts start with a moderate value and adjust up or down by visually inspecting the result.
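To see the tradeoff concretely, here is a small sketch using the statsmodels implementation (described further below), comparing a tight and a loose span on the same noisy sine wave. Total variation of the fitted values is used here as a crude wiggliness measure; the specific spans are arbitrary choices for illustration.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(42)
x = np.linspace(0, 4 * np.pi, 200)
y = np.sin(x) + rng.normal(scale=0.4, size=x.size)

# Small span: tracks local bends but picks up more noise.
tight = lowess(y, x, frac=0.15, return_sorted=False)
# Large span: smoother and more stable, but flattens genuine peaks.
loose = lowess(y, x, frac=0.8, return_sorted=False)

# Sum of absolute step-to-step changes as a rough wiggliness measure.
tv_tight = np.abs(np.diff(tight)).sum()
tv_loose = np.abs(np.diff(loose)).sum()
```

The tight-span curve has much higher total variation: it follows both the sine wave and some of the noise, while the loose-span curve averages over more than a full period and nearly flattens out.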

Handling Outliers

One reason LOWESS became so popular is its built-in resistance to outliers. After the initial smoothing pass, the algorithm can perform additional iterations where it downweights points that had large residuals (points far from the smoothed curve). On each subsequent pass, the outlying points pull less on the local fits, so the curve settles closer to the bulk of the data rather than being dragged toward stray values.

In Python’s statsmodels library, for example, the default setting runs three of these robustness iterations. That’s usually enough to dampen the influence of moderate outliers without overcomplicating the computation.
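A quick sketch of the effect, using statsmodels with and without robustness iterations on a clean linear trend containing one gross outlier (the data and spike size are invented for illustration):

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(5)
x = np.linspace(0, 10, 101)
true_y = 2.0 * x + 1.0
y = true_y + rng.normal(scale=0.2, size=x.size)
y[50] += 30.0                      # one gross outlier mid-series

# it=0 skips the robustness pass; the outlier drags nearby fits upward.
naive = lowess(y, x, frac=0.3, it=0, return_sorted=False)
# it=3 is the default: large residuals are downweighted on each pass.
robust = lowess(y, x, frac=0.3, it=3, return_sorted=False)

err_naive = np.abs(naive - true_y).max()
err_robust = np.abs(robust - true_y).max()
```

With robustness iterations the curve stays close to the true line everywhere; without them, every smoothed value in the outlier's neighborhood is pulled noticeably upward.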

When To Use LOWESS

LOWESS is most useful when you suspect the relationship between two variables isn’t a simple straight line or known curve, and you want the data itself to reveal the shape. Common situations include:

  • Exploratory analysis. Before committing to a specific model, LOWESS gives you a visual feel for how two variables relate. If the smoothed curve looks roughly linear, a simple regression might suffice. If it bends or flattens, you know to consider something more flexible.
  • Time series visualization. Environmental monitoring, stock prices, and sensor readings often have noisy trends. LOWESS can reveal the underlying direction without imposing a rigid functional form.
  • Residual diagnostics. After fitting a model, plotting LOWESS through the residuals can expose patterns you missed, like curvature or variance that changes across the range of your data.
  • Small to moderate datasets. Because LOWESS fits a separate regression at every point, it’s computationally heavier than global models. It works well for datasets in the hundreds to low thousands of points. For very large datasets, approximate or binned versions are available.
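The residual-diagnostics use case is worth a concrete sketch. Here a straight line is deliberately fit to data that is actually quadratic; smoothing the residuals with LOWESS exposes the missed curvature, which a plain residual scatter can hide behind noise. The data and parameter choices are invented for illustration.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(3)
x = np.linspace(0, 10, 200)
y = 0.5 * x**2 + rng.normal(scale=1.0, size=x.size)   # truly quadratic

# Fit a straight line, then smooth the residuals.
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)
smoothed_resid = lowess(resid, x, frac=0.4, return_sorted=False)

# A roughly flat smoothed-residual curve would suggest the line is
# adequate; here it rises at the ends and dips in the middle,
# the classic signature of unmodeled curvature.
```

If the model were adequate, the smoothed residuals would hover near zero across the whole range instead of tracing a systematic bow.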

LOWESS is not a good choice when you need a predictive equation you can write down and reuse. It produces a set of smoothed values, not a formula. If you need to extrapolate beyond your data range or plug new values into a model, parametric approaches (linear regression, polynomial regression, splines with known knot positions) are better suited.

LOWESS vs. LOESS

Cleveland’s original 1979 method, LOWESS, fit only straight lines locally. His 1988 update, LOESS, generalized this to allow local polynomials of any degree (most commonly quadratic) and extended the method to multiple predictor variables. LOESS can better capture curvature within each neighborhood, which matters when your data has sharp bends.

In modern software, the distinction has largely dissolved. R’s built-in lowess() function uses the original linear-fit approach, while its loess() function defaults to quadratic fits and exposes the choice through its degree argument. Python’s statsmodels provides a lowess() function that performs linear local fits only. Most people say “LOWESS” or “LOESS” to mean the same general technique; when the local polynomial degree matters, check whether the function you’re calling lets you set it.

Using LOWESS in Python and R

In Python, the statsmodels library provides a straightforward implementation. The basic call is lowess(y, x), where y is your response variable and x is your explanatory variable. By default the function returns a two-column array of x values (sorted) paired with their smoothed values. The default bandwidth fraction is 2/3 (about 0.667), meaning each local fit uses two-thirds of your data. You can tighten it with something like lowess(y, x, frac=1./3) for a more detailed curve.

The it parameter controls the number of robustness iterations, defaulting to 3. Setting it=0 skips the outlier-reweighting step entirely, which speeds things up if you’re confident your data is clean. A delta parameter lets you speed up computation on large datasets by skipping points that are very close together.
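Putting those parameters together, a minimal sketch (the delta value follows the common rule of thumb of about 1% of the x range; the data here is invented):

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 300)            # unsorted x is fine; output is sorted
y = np.log1p(x) + rng.normal(scale=0.2, size=x.size)

# frac: bandwidth fraction, it: robustness iterations,
# delta: skip local fits for x values closer together than this.
result = lowess(y, x, frac=1/3, it=3, delta=0.01 * (x.max() - x.min()))

xs, ys = result[:, 0], result[:, 1]    # sorted x, smoothed y
```

Plotting ys against xs over the raw scatter gives the usual smoothed-trend overlay.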

In R, lowess(x, y, f=2/3) provides the classic implementation with the same default bandwidth. The loess() function offers more control, including the ability to fit quadratic polynomials locally and work with multiple predictors. For quick visualization, ggplot2's geom_smooth(method="loess") adds a LOESS curve to a scatterplot in a single line of code.

Choosing the Right Bandwidth

If you’re not sure where to start, the default of 2/3 is deliberately conservative. It produces a fairly smooth curve that’s unlikely to overfit. From there, reduce the fraction if you see that the curve is too flat and missing obvious patterns in your data, or increase it if the curve is tracking noise you’d rather ignore.

Some formal methods exist for selecting an optimal bandwidth, including cross-validation (leaving out one point at a time and measuring prediction error) and plug-in estimators that balance bias and variance mathematically. In practice, though, most analysts rely on visual inspection. Plot the curve at two or three different bandwidths and choose the one that captures the pattern you care about without chasing individual data points.
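The leave-one-out idea can be sketched directly with statsmodels. The loo_cv_score helper below is illustrative (not part of any library), uses it=0 for speed, and predicts each held-out point by interpolating the smoothed curve fit to the remaining data:

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

def loo_cv_score(x, y, frac):
    """Leave-one-out CV: refit without point i, predict y[i] by
    interpolating the smoothed curve, average the squared errors."""
    n = len(x)
    sse = 0.0
    for i in range(n):
        mask = np.arange(n) != i
        fit = lowess(y[mask], x[mask], frac=frac, it=0)
        pred = np.interp(x[i], fit[:, 0], fit[:, 1])
        sse += (y[i] - pred) ** 2
    return sse / n

rng = np.random.default_rng(7)
x = np.sort(rng.uniform(0, 10, 80))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

scores = {frac: loo_cv_score(x, y, frac) for frac in (0.2, 0.4, 0.8)}
best = min(scores, key=scores.get)
```

On this wavy example the very wide span scores poorly because its local fits average across entire bends; in practice you would still plot the winning curve before committing to it.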