What Is Partial Autocorrelation in Time Series?

Partial autocorrelation measures the direct correlation between a time series and a specific lagged version of itself, after removing the effects of all intermediate lags. If you want to know how strongly today’s value relates to the value from, say, three days ago, partial autocorrelation strips out the influence of yesterday and the day before, giving you only the direct relationship at that three-day gap. This makes it one of the most practical tools for identifying which type of time series model fits your data.

How It Differs From Regular Autocorrelation

Regular autocorrelation (ACF) measures the total linear relationship between a time series and its past values at each lag. The problem is that correlations cascade. If today’s value is strongly tied to yesterday’s, and yesterday’s is strongly tied to the day before, then today will appear correlated with two days ago, even if there’s no direct connection. The ACF can’t tell you whether that lag-2 correlation is real or just a side effect of the lag-1 relationship.

Partial autocorrelation (PACF) solves this by isolating the direct effect at each lag. At lag k, it gives you the correlation between the current observation and the observation k steps back that is not accounted for by lags 1 through k-1. Think of it like a controlled experiment: you’re holding all the closer lags constant and asking what additional predictive value the k-th lag provides on its own.

Why It Matters for Model Selection

The main practical use of partial autocorrelation is identifying the right time series model for your data. When building forecasting models, you typically choose among three types: autoregressive (AR) models that predict based on past values, moving average (MA) models that predict based on past forecast errors, and ARMA models that combine both. The PACF plot is your primary tool for determining AR order.

The key patterns to look for:

  • AR(p) process: The PACF cuts off sharply after lag p (values drop to near zero), while the ACF decays gradually. If you see significant PACF spikes at lags 1 and 2 but nothing beyond, your data likely follows an AR(2) model.
  • MA(q) process: The ACF cuts off after lag q, while the PACF decays gradually. This is the mirror image of the AR pattern.
  • ARMA(p,q) process: Both the ACF and PACF tail off gradually, which is why mixed models are harder to identify from plots alone and often require information criteria to pin down.

This symmetry is intentional. The PACF was designed to do for AR models exactly what the ACF does for MA models: provide a clean cutoff that tells you the model’s order.

Reading a PACF Plot

A PACF plot (sometimes called a partial correlogram) shows lag numbers on the horizontal axis and partial autocorrelation values on the vertical axis, ranging from -1 to 1. Most software draws horizontal dashed lines representing the 95% confidence interval, calculated as plus or minus 1.96 divided by the square root of your sample size. Any spike that extends beyond those lines is considered statistically significant.

The interpretation is straightforward. Look for the lag where the PACF values first drop inside the confidence bands and stay there. If significant spikes appear at lags 1, 2, and 3, but lag 4 and beyond are insignificant, that suggests an AR(3) model. A single significant spike at lag 1 followed by insignificance points to AR(1). If the PACF decays slowly rather than cutting off, you’re probably looking at an MA or ARMA process, and you should turn your attention to the ACF plot instead.
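That reading rule can be written as a small helper. This is a sketch of the eyeballing heuristic described above, not a formal statistical test, and the function name and example values are mine:

```python
import numpy as np

def cutoff_lag(pacf_values, band):
    """Return the last lag whose PACF spike falls outside the +/- band,
    i.e. the lag after which the values stay inside the bands.
    Returns 0 if nothing beyond lag 0 is significant."""
    significant = [k for k in range(1, len(pacf_values))
                   if abs(pacf_values[k]) > band]
    return significant[-1] if significant else 0

# Illustrative PACF values that cut off after lag 2
vals = [1.0, 0.72, 0.31, 0.03, -0.02, 0.01]
band = 1.96 / np.sqrt(400)      # 95% band for n = 400 observations
order = cutoff_lag(vals, band)  # suggests AR(2)
```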

One thing to watch for: an isolated significant spike at a high lag (say lag 12 in monthly data) often indicates seasonality rather than a higher-order AR process. Context matters when interpreting these plots.

How PACF Is Calculated

There are several ways to compute partial autocorrelation, but they all target the same quantity: the coefficient on the last lag when the series is regressed on all lags up to that point, whether that regression is fit directly or solved through a recursion.

The most intuitive method works like this. First, regress the time series on just one lag and record that coefficient. That’s the PACF at lag 1 (which always equals the regular autocorrelation at lag 1). Then regress on lags 1 and 2 together, and record only the coefficient on lag 2. That’s the PACF at lag 2. Continue this process, each time adding one more lag to the regression and keeping only the last coefficient. Each retained coefficient captures the additional explanatory power of that lag after all closer lags have already been accounted for.
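The regression recipe translates almost line for line into code. A minimal sketch using plain numpy least squares (demeaning the series rather than fitting an intercept; the function name is mine):

```python
import numpy as np

def pacf_by_regression(x, max_lag):
    """PACF via successive least-squares fits: at each lag k, regress
    x[t] on x[t-1] ... x[t-k] and keep only the coefficient on x[t-k]."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                 # demean instead of fitting an intercept
    out = [1.0]                      # lag 0 is 1 by convention
    for k in range(1, max_lag + 1):
        # Columns are the lagged series x[t-1], ..., x[t-k]
        X = np.column_stack([x[k - j:len(x) - j] for j in range(1, k + 1)])
        y = x[k:]
        coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
        out.append(coefs[-1])        # keep only the last coefficient
    return np.array(out)

# On a simulated AR(1) with coefficient 0.8, the lag-1 value comes out
# close to 0.8 and the lag-2 value close to zero.
rng = np.random.default_rng(0)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.8 * x[t - 1] + rng.normal()
p = pacf_by_regression(x, 3)
```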

The more efficient approach uses recursive algorithms. The Yule-Walker equations express the relationship between autocorrelation values and model coefficients as a system of linear equations. At each step, you solve this system for one additional lag, and the last coefficient in the solution gives you the PACF at that lag. The Durbin-Levinson algorithm makes this recursion particularly efficient by updating the previous step’s solution rather than solving the entire system from scratch each time.
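Here is a sketch of the Durbin-Levinson recursion itself, working from a vector of autocorrelations rather than the raw series (the function name and array layout are my own choices):

```python
import numpy as np

def pacf_durbin_levinson(rho):
    """PACF from autocorrelations rho[0..m] (with rho[0] == 1), using
    the Durbin-Levinson recursion: each step updates the previous
    step's AR coefficients instead of re-solving the full
    Yule-Walker system from scratch."""
    m = len(rho) - 1
    rho = np.asarray(rho, dtype=float)
    out = np.zeros(m + 1)
    out[0] = 1.0
    phi = np.zeros((m + 1, m + 1))   # phi[k, j]: j-th AR coef at order k
    for k in range(1, m + 1):
        if k == 1:
            phi[1, 1] = rho[1]
        else:
            num = rho[k] - np.dot(phi[k - 1, 1:k], rho[k - 1:0:-1])
            den = 1.0 - np.dot(phi[k - 1, 1:k], rho[1:k])
            phi[k, k] = num / den
            phi[k, 1:k] = phi[k - 1, 1:k] - phi[k, k] * phi[k - 1, k - 1:0:-1]
        out[k] = phi[k, k]           # the PACF at lag k
    return out

# For an AR(1) with coefficient 0.6, rho[k] = 0.6**k, so the PACF
# should be 0.6 at lag 1 and zero at every later lag.
p = pacf_durbin_levinson([1.0, 0.6, 0.36, 0.216])
```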

Computing PACF in Practice

In Python, the statsmodels library provides a dedicated function that handles all of this automatically. You pass in your time series and the number of lags you want, and it returns the partial autocorrelation values. The default method uses Yule-Walker estimation, but you can also choose ordinary least squares regression, Levinson-Durbin recursion, or Burg's method. In simulation studies across a range of time series models, the maximum-likelihood variants of Yule-Walker and Levinson-Durbin, along with Burg's method, produced the lowest root mean squared error, while the bias-adjusted variants of Yule-Walker and Levinson-Durbin actually performed worse.

In R, the pacf() function in the base stats package does the same job. Both languages also offer combined plot functions that display ACF and PACF side by side, which is the standard way to visually assess your data before choosing a model.

A common workflow is to first difference your series if it’s non-stationary, then plot both the ACF and PACF, identify the cutoff patterns, and use those to select candidate AR, MA, or ARMA models. You then compare candidates using information criteria or out-of-sample forecast accuracy to make a final choice.

A Concrete Example

Suppose you’re analyzing daily temperature readings. You’d expect today’s temperature to be closely related to yesterday’s, somewhat related to two days ago, and so on. The regular ACF would show a slow, gradual decay across many lags, because each day is indirectly connected to every previous day through the chain of successive days.

The PACF, however, might show a large spike at lag 1, a smaller but still significant spike at lag 2, and then nothing significant beyond that. This tells you that once you know the temperatures from the past two days, knowing three or four days back doesn’t add useful information. Your data follows something close to an AR(2) process, and you’d build your forecast model accordingly. Without the PACF, you’d be staring at a slowly decaying ACF with no clear indication of how many lags to include.