Making a time series stationary means transforming it so its statistical properties, specifically its mean, variance, and autocorrelation, stay consistent over time. Most forecasting models, including ARIMA, require this stability to estimate reliable coefficients and produce accurate predictions. Without it, the model is trying to learn patterns from a moving target. The core techniques are differencing, variance-stabilizing transformations, and detrending, and the right choice depends on what’s making your series non-stationary in the first place.
What "Stationary" Actually Means
A stationary time series has no predictable long-term patterns. If you plot it, the data will hover roughly around a flat horizontal line with consistent spread. There’s no upward or downward drift, no variance that fans out or contracts over time, and no seasonal humps that grow larger as the series progresses. Cyclic behavior is still allowed, as long as the cycles don’t have a fixed, predictable period.
The practical version most people work with is “weak stationarity,” which requires three things: a constant mean, a constant variance, and an autocovariance that depends only on the lag between two points, not on where you are in the series. If your data violates any of these, you need to fix the specific violation before fitting a model.
How to Check for Stationarity
Before transforming anything, confirm that your series is actually non-stationary and identify what kind of non-stationarity you’re dealing with. There are two main approaches: visual inspection and formal statistical tests.
Visual Inspection
Plot a rolling mean and rolling standard deviation over a window that makes sense for your data (commonly 12 periods for monthly data). In a stationary series, the rolling mean fluctuates around a relatively constant value and the rolling standard deviation stays flat. If the rolling mean drifts upward or downward, you have a trend. If the rolling standard deviation grows over time, you have non-constant variance.
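This check is a few lines with pandas (assumed available); the 12-period window and the toy trending series here are illustrative:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# A series with an upward drift: non-stationary in the mean.
series = pd.Series(np.linspace(0, 10, 120) + rng.normal(0, 1, 120))

rolling_mean = series.rolling(window=12).mean()
rolling_std = series.rolling(window=12).std()

# Plotted, rolling_mean would climb steadily here, flagging a trend,
# while rolling_std stays roughly flat (the variance is constant).
print(rolling_mean.dropna().iloc[0], rolling_mean.dropna().iloc[-1])
```

In a stationary series the two printed values would be close; here the second is far larger, matching the drifting rolling mean described above.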
ADF and KPSS Tests
The two standard statistical tests are the Augmented Dickey-Fuller (ADF) test and the KPSS test, and they work in opposite directions. The ADF test assumes the series is non-stationary (has a unit root) and tries to reject that assumption. If the p-value is below 0.05, you can conclude the series is stationary. The KPSS test assumes the series is stationary and tries to reject that. If its p-value is below 0.05, you have evidence the series is not stationary.
Running both tests together gives you more diagnostic power than either one alone. There are four possible outcomes:
- Both say stationary (ADF rejects, KPSS doesn’t): the series is stationary, no transformation needed.
- Both say non-stationary (ADF doesn’t reject, KPSS rejects): the series is clearly non-stationary.
- ADF says non-stationary, KPSS says stationary: the series is trend stationary. You need to remove the trend (detrending) rather than difference.
- ADF says stationary, KPSS says non-stationary: the series is difference stationary. Differencing is the right approach.
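The four-way decision table can be encoded as a small helper. This is a sketch that assumes you already have the two p-values (for example from statsmodels' `adfuller` and `kpss`); the function name and return strings are illustrative:

```python
def diagnose_stationarity(adf_p, kpss_p, alpha=0.05):
    """Combine ADF and KPSS p-values into one of the four outcomes.

    ADF null: non-stationary (unit root); reject when adf_p < alpha.
    KPSS null: stationary; reject when kpss_p < alpha.
    """
    adf_stationary = adf_p < alpha        # ADF rejects its null
    kpss_stationary = kpss_p >= alpha     # KPSS fails to reject its null
    if adf_stationary and kpss_stationary:
        return "stationary"
    if not adf_stationary and not kpss_stationary:
        return "non-stationary: transform"
    if not adf_stationary and kpss_stationary:
        return "trend stationary: detrend"
    return "difference stationary: difference"

print(diagnose_stationarity(0.01, 0.30))  # → "stationary"
print(diagnose_stationarity(0.40, 0.01))  # → "non-stationary: transform"
```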
Stabilize Variance First
If your data’s spread increases as its level increases (common in financial data, population counts, and sales figures), you need to stabilize the variance before addressing the mean. Differencing a series with non-constant variance will not fix the variance problem and can actually distort the results.
A logarithmic transformation is the simplest and most common fix. Taking the log of each value compresses large values more than small ones, which pulls in the widening spread. This works well when the standard deviation of the series is roughly proportional to its level.
For cases where a log transform isn’t quite right, the Box-Cox transformation offers a more flexible option. It introduces a parameter (lambda) that controls how aggressively the transformation compresses values. When lambda equals zero, the Box-Cox transformation is identical to a log transform. When lambda equals one, no transformation is applied. Most statistical software can automatically select the optimal lambda value for your data. The goal is to find the value that makes the variance independent of the mean.
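The transformation itself is simple enough to write out directly; this sketch hard-codes the standard Box-Cox formula rather than estimating lambda:

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform: (y**lam - 1) / lam, with log(y) as the lam=0 limit.
    Requires strictly positive values."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return np.log(y)
    return (y ** lam - 1.0) / lam

y = np.array([1.0, 10.0, 100.0])
print(box_cox(y, 0))  # identical to np.log(y)
print(box_cox(y, 1))  # y - 1: just a shift, i.e. effectively no transformation
```

In practice you would let software pick lambda; for example, `scipy.stats.boxcox` selects it by maximum likelihood when called without a `lmbda` argument.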
Remove Trends With Differencing
Differencing is the most widely used method for removing trends. First-order differencing replaces each value with the change from the previous value: the new value at time t becomes the original value at t minus the original value at t-1. This eliminates a constant upward or downward drift. A model with one order of non-seasonal differencing essentially assumes the original series has a constant average trend, like a random walk with drift.
If first-order differencing doesn’t produce stationarity (check with ADF/KPSS again), you can apply second-order differencing, which differences the already-differenced series. This handles quadratic-type trends. In practice, you rarely need more than two orders of total differencing.
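With numpy, both orders are one call to `np.diff`. This toy example uses a purely linear series to make the effect exact:

```python
import numpy as np

# A linear-trend series: first differences should be constant.
t = np.arange(10, dtype=float)
y = 2.0 * t + 5.0

d1 = np.diff(y)        # first-order: y[t] - y[t-1]
d2 = np.diff(y, n=2)   # second-order: difference of the differences

print(d1)  # constant 2.0 — the slope of the trend
print(d2)  # all zeros — nothing left to remove
```

Note that each order of differencing shortens the series by one observation, which is why the second-order result has two fewer points than the original.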
Seasonal Differencing
When your data has a repeating seasonal pattern, standard differencing won’t remove it. Seasonal differencing subtracts the value from the same season in the previous cycle. For monthly data with a yearly pattern, the seasonal difference at time t is the value at t minus the value at t-12. If this seasonal difference looks like pure noise with constant variance and no autocorrelation, the original series follows a seasonal random walk model.
You can combine seasonal and non-seasonal differencing. For monthly data, applying both first-order and seasonal differencing produces a value equal to (Y(t) – Y(t-1)) – (Y(t-12) – Y(t-13)). This removes both the trend and the seasonal component simultaneously. A practical rule: never use more than one order of seasonal differencing or more than two orders of total differencing (seasonal plus non-seasonal combined).
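A minimal sketch of both steps, using a synthetic monthly-style series (trend plus a repeating 12-period pattern) so the result is easy to verify:

```python
import numpy as np

def seasonal_difference(y, period=12):
    """y[t] - y[t-period]; drops the first `period` observations."""
    y = np.asarray(y, dtype=float)
    return y[period:] - y[:-period]

# Four "years" of data: linear trend plus an exact 12-period seasonal pattern.
t = np.arange(48, dtype=float)
pattern = np.tile(np.sin(2 * np.pi * np.arange(12) / 12), 4)
y = 0.5 * t + 10 * pattern

sd = seasonal_difference(y, 12)  # seasonal pattern cancels exactly
combined = np.diff(sd)           # equals (Y(t)-Y(t-1)) - (Y(t-12)-Y(t-13))

print(sd[:3])        # constant 6.0: the 12-step trend increment (0.5 * 12)
print(combined[:3])  # zeros: trend and seasonality both removed
```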
Remove Trends With Detrending
When the ADF/KPSS combination suggests your series is trend stationary rather than difference stationary, detrending through regression is the better approach. The idea is straightforward: fit a line (or curve) to your data as a function of time, then subtract that fitted trend. What remains, the residuals, should be stationary.
For a linear trend, you fit a simple regression where time is the predictor. The model estimates a slope (how much the series increases per time step) and an intercept. You subtract the predicted value at each time point from the actual value. The residuals represent the fluctuations around the trend, and these are what you model going forward.
Linear detrending works when the trend is genuinely linear. For curves, you can use polynomial regression or more flexible smoothing methods. After detrending, check the residuals for constant variance and run the stationarity tests again to confirm the job is done.
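Linear detrending needs nothing beyond an ordinary least-squares fit on time; here is a sketch with `np.polyfit` on a synthetic series whose true slope is known:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(100, dtype=float)
y = 3.0 + 0.25 * t + rng.normal(0, 1, 100)  # linear trend plus noise

# Fit y = slope * t + intercept, then subtract the fitted trend.
slope, intercept = np.polyfit(t, y, deg=1)
residuals = y - (slope * t + intercept)

print(round(slope, 2))  # close to the true slope of 0.25
```

The residuals are the detrended series: they fluctuate around zero (least squares guarantees their mean is zero), and they are what you would feed into the stationarity tests and, ultimately, the model.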
How to Tell if You’ve Over-Differenced
More differencing is not always better. Over-differencing introduces artificial patterns and noise into your data, making your model worse rather than better. There are several clear warning signs.
The most reliable indicator is the lag-1 autocorrelation of the differenced series. If it’s more negative than -0.5, you’ve likely over-differenced. Visually, an over-differenced series looks random at first glance, but on closer inspection you’ll see excessive sign-switching: up, down, up, down, in a pattern that’s too regular to be natural variation.
Another telling symptom is the standard deviation. Differencing up to the appropriate order typically lowers the standard deviation, and the best order is often the one where it reaches its minimum. If another round of differencing increases the standard deviation instead, that's a strong signal to step back. In one commonly cited example, the standard deviation jumped from 1.54 to 1.81 after an unnecessary round of differencing.
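Both warning signs show up if you difference a series that is already stationary. This sketch differences white noise, which needs no differencing at all:

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a series (deviations from its mean)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.sum(x[1:] * x[:-1]) / np.sum(x * x)

rng = np.random.default_rng(1)
white_noise = rng.normal(0, 1, 500)  # already stationary
over_diff = np.diff(white_noise)     # an unnecessary round of differencing

# Differencing white noise drives the lag-1 autocorrelation toward the
# theoretical value of -0.5 and inflates the standard deviation by a
# factor of about sqrt(2) — both over-differencing signatures.
print(round(lag1_autocorr(over_diff), 2))
print(white_noise.std() < over_diff.std())  # → True
```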
If you’ve mildly over-differenced, you can compensate by adding moving average (MA) terms to your model rather than undoing the differencing. Similarly, mild under-differencing can be compensated with autoregressive (AR) terms. But getting the differencing order right in the first place saves you from relying on these workarounds.
Putting the Steps Together
A practical workflow looks like this. Start by plotting the series and its rolling statistics to identify what kind of non-stationarity you’re dealing with: trending mean, expanding variance, seasonality, or some combination. Run both ADF and KPSS tests to confirm your visual assessment and determine whether the series is trend stationary or difference stationary.
If the variance is clearly non-constant, apply a log or Box-Cox transformation first. Then address the mean: use differencing for a stochastic trend or regression detrending for a deterministic one. If seasonal patterns are present, apply seasonal differencing. After each step, re-check with plots and tests. Stop as soon as the series passes both stationarity tests and the rolling statistics look flat. The goal is the minimum transformation needed, not the maximum.
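The "stop at the minimum transformation" principle can be sketched as a crude order-selection loop. This stands in for the full ADF/KPSS re-checking cycle by using the minimum-standard-deviation heuristic from the over-differencing section; the function name and cap are illustrative:

```python
import numpy as np

def choose_differencing(y, max_d=2):
    """Pick a differencing order greedily: stop as soon as another
    difference would raise the standard deviation (an over-differencing
    signal), or when the two-order cap is reached."""
    y = np.asarray(y, dtype=float)
    d = 0
    while d < max_d:
        candidate = np.diff(y)
        if candidate.std() >= y.std():
            break  # further differencing inflates variance: stop here
        y, d = candidate, d + 1
    return d, y

t = np.arange(200, dtype=float)
trended = 0.3 * t + np.random.default_rng(2).normal(0, 1, 200)
d, transformed = choose_differencing(trended)
print(d)  # → 1: one difference removes the linear trend; a second would add noise
```

In real use you would replace the standard-deviation check with the ADF/KPSS pair after each step, but the control flow, transform, re-test, and stop early, is the same.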

