Is Convolution Linear? Yes, and Here’s Why It Matters

Yes, convolution is a linear operation. When you hold one function fixed and convolve it with different inputs, the result obeys both rules that define linearity: scaling an input scales the output by the same amount, and convolving with a sum of inputs gives the same result as convolving with each input separately and adding the results. This property is what makes convolution so central to signal processing, physics, and machine learning.

What “Linear” Means Here

An operation counts as linear if it satisfies two conditions. The first is scaling (sometimes called homogeneity): if you multiply your input by some constant, the output gets multiplied by the same constant. The second is additivity (sometimes called superposition): if you feed in two inputs added together, the output equals the sum of what you’d get from each input on its own.

For convolution, suppose you have a fixed function h and you convolve it with different inputs. Scaling works because multiplying the input by a constant just pulls that constant outside the integral (or sum, in discrete time). Additivity works because the integral of a sum is the sum of the integrals. In formal notation, the distributive property of convolution states that h convolved with (x₁ + x₂) equals h convolved with x₁ plus h convolved with x₂. Together, these two properties confirm that convolution with a fixed function is a linear operator.
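Both properties are easy to check numerically. Here is a minimal sketch using NumPy's `np.convolve` with arbitrary example signals (the particular values of `h`, `x1`, `x2`, and the constant `a` are just illustrations):

```python
import numpy as np

h = np.array([1.0, 0.5, 0.25])      # the fixed function we convolve with
x1 = np.array([2.0, -1.0, 3.0])     # example input 1
x2 = np.array([0.5, 4.0, -2.0])     # example input 2
a = 3.0                             # an arbitrary scaling constant

# Scaling (homogeneity): convolving a scaled input scales the output.
assert np.allclose(np.convolve(h, a * x1), a * np.convolve(h, x1))

# Additivity (superposition): convolving a sum equals the sum of convolutions.
assert np.allclose(np.convolve(h, x1 + x2),
                   np.convolve(h, x1) + np.convolve(h, x2))
```

Both assertions pass because the constant and the sum pull straight through the underlying summation, exactly as described above.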

Continuous and Discrete Versions

The continuous-time convolution of two functions f and g is defined as the integral of f(τ) times g(t − τ) over all values of τ. The discrete-time version replaces the integral with a sum: you multiply one sequence by a shifted, reversed copy of the other and add up all the products. Both versions are linear for the same fundamental reason. The integral and the summation are themselves linear operations, so anything built from them inherits linearity automatically.
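The discrete definition can be written out directly as nested loops, which makes the "multiply by a shifted, reversed copy and sum" structure explicit. This sketch (with made-up example sequences) checks the hand-written sum against NumPy's built-in `np.convolve`:

```python
import numpy as np

def conv_sum(x, h):
    """Discrete convolution y[n] = sum over k of x[k] * h[n - k] (illustrative)."""
    n_out = len(x) + len(h) - 1
    y = np.zeros(n_out)
    for n in range(n_out):
        for k in range(len(x)):
            if 0 <= n - k < len(h):      # only terms where h is defined
                y[n] += x[k] * h[n - k]
    return y

x = np.array([1.0, 2.0, 3.0])
h = np.array([0.0, 1.0, 0.5])
assert np.allclose(conv_sum(x, h), np.convolve(x, h))
```

Because `conv_sum` is built entirely from multiplications by fixed coefficients and additions, its linearity is inherited automatically, just as the text argues.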

In the discrete case, a signal x[n] can be broken into a weighted combination of shifted impulses. If the system is linear, its response to that combination equals the same weighted combination of individual impulse responses. That’s exactly what the convolution sum computes. The output at each time step is the sum of x[k] times h[n − k] over all k, which is precisely the definition of discrete convolution.
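The impulse-decomposition argument can be replayed numerically: build the output by superposing shifted, scaled copies of the impulse response, and confirm it matches the convolution sum. The signals here are arbitrary illustrations:

```python
import numpy as np

h = np.array([1.0, -0.5, 0.25])      # impulse response (example values)
x = np.array([3.0, 0.0, 2.0, -1.0])  # input signal (example values)
n_out = len(x) + len(h) - 1

# Superpose a copy of h delayed by k samples and scaled by x[k], for each k:
y = np.zeros(n_out)
for k, xk in enumerate(x):
    shifted = np.zeros(n_out)
    shifted[k:k + len(h)] = h        # impulse response delayed by k
    y += xk * shifted                # scaled by the input's value at time k

# The superposition equals the convolution of x with h.
assert np.allclose(y, np.convolve(x, h))
```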

Why Linearity Matters for LTI Systems

In engineering and physics, a linear time-invariant (LTI) system is one where the rules don’t change over time and the output scales and adds in proportion to the input. These two properties, linearity and time-invariance, unlock a powerful result: if you know how the system responds to a single instantaneous impulse, you can predict its response to any input whatsoever. That prediction is carried out through convolution.

The logic works like this. You break an arbitrary input signal into tiny packets, each concentrated around a single moment in time. Each packet approximates a scaled, shifted impulse. Because the system is linear, its response to the full signal is the sum of its responses to each packet individually. Because the system is time-invariant, each packet’s response is just a shifted copy of the impulse response, scaled by the input’s value at that moment. Add all those shifted, scaled copies together, and you get the convolution integral. MIT’s course materials describe this reasoning as “no more and no less than an integral expression of the principle of superposition.”
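The same prediction can be demonstrated end to end on a concrete LTI system. The sketch below uses a hypothetical first-order "leaky integrator" recursion as the system: measure its impulse response once, then predict its output on an arbitrary input purely by convolution, and compare against running the system directly:

```python
import numpy as np

def leaky_integrator(x, alpha=0.9):
    """A simple LTI system (illustrative): y[n] = alpha * y[n-1] + x[n]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = (alpha * y[n - 1] if n > 0 else 0.0) + x[n]
    return y

N = 50
impulse = np.zeros(N)
impulse[0] = 1.0
h = leaky_integrator(impulse)         # measure the impulse response

rng = np.random.default_rng(0)
x = rng.standard_normal(N)            # any input whatsoever
direct = leaky_integrator(x)          # run the system directly
predicted = np.convolve(x, h)[:N]     # predict via convolution with h

assert np.allclose(direct, predicted)
```

Nothing about the recursion was used in the prediction except its response to a single impulse, which is exactly the claim being made.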

This is why convolution appears everywhere in audio processing, image filtering, and control systems. Any system that behaves linearly and consistently over time can be fully characterized by its impulse response, and convolution is the tool that turns that impulse response into a prediction for any input you throw at it.

How Convolution Differs From Non-Linear Operations

To see why linearity matters, it helps to consider what non-linear operations look like. Squaring is a simple example: an input of 1 produces 1 and an input of 2 produces 4, but the summed input (1 + 2 = 3) produces 9, not the summed output (1 + 4 = 5). That failure of additivity is the hallmark of non-linearity.
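Spelled out in code, the failure is immediate:

```python
def square(x):
    return x * x

# Additivity fails: square(1 + 2) is not square(1) + square(2).
assert square(1 + 2) == 9
assert square(1) + square(2) == 5
assert square(1 + 2) != square(1) + square(2)

# Scaling fails too: square(2 * 3) is 36, not 2 * square(3) = 18.
assert square(2 * 3) != 2 * square(3)
```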

Non-linear systems are harder to analyze because you can’t decompose inputs and recombine outputs. Airflow over a paper airplane’s wings, for instance, behaves non-linearly: throwing harder doesn’t just scale the flight path proportionally, it can make the plane loop or veer unpredictably. With a linear operation like convolution, doubling the input always doubles the output, and combining inputs always produces a predictable, additive result.

Convolution in Neural Networks

Convolutional neural networks (CNNs) use convolution layers that slide small filters across an image or signal, computing weighted sums at each position. This filter-scanning step is itself a linear operation. If you doubled every pixel value in an image, the raw convolution output would also double.
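The filter-scanning step can be sketched directly. The toy function below implements the weighted-sum slide in its simplest "valid" form (real CNN layers add strides, padding, and channel stacks, and most deep learning libraries actually compute cross-correlation, i.e. they skip the kernel flip, which changes nothing about linearity). The image and kernel values are random placeholders:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a small filter across an image, taking a weighted sum at
    each position (a minimal sketch of a CNN convolution layer)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(1)
image = rng.random((8, 8))
kernel = rng.standard_normal((3, 3))

# Doubling every pixel value doubles the raw convolution output.
assert np.allclose(conv2d_valid(2 * image, kernel),
                   2 * conv2d_valid(image, kernel))
```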

Neural networks need non-linearity to learn complex patterns, so they pair each convolution layer with an activation function. The most common choice, called ReLU, simply sets all negative values to zero while leaving positive values unchanged. That thresholding step is what introduces non-linearity into the network. The convolution operation on its own remains strictly linear. This distinction matters when designing or debugging networks, because the linear and non-linear parts play different roles: convolution detects patterns through weighted combinations, while the activation function allows the network to model relationships that a purely linear system never could.
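A quick check shows where the linearity is broken. ReLU itself fails additivity, so any network layer that includes it is non-linear even though the convolution feeding it is not (example vectors chosen for illustration):

```python
import numpy as np

def relu(x):
    """Set negative values to zero; leave positive values unchanged."""
    return np.maximum(x, 0.0)

x1 = np.array([1.0, -2.0])
x2 = np.array([-3.0, 5.0])

lhs = relu(x1 + x2)            # relu([-2, 3])  -> [0, 3]
rhs = relu(x1) + relu(x2)      # [1, 0] + [0, 5] -> [1, 5]
assert not np.allclose(lhs, rhs)   # additivity fails after ReLU
```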

Additional Properties of Convolution

Beyond linearity, convolution has several other useful algebraic properties. It is commutative: f convolved with g gives the same result as g convolved with f, so it doesn’t matter which function you flip and slide. It is associative: convolving f with g and then with h produces the same result as convolving f with the result of g convolved with h. And it is distributive over addition, which is essentially the additivity property restated. These properties together make convolution behave much like ordinary multiplication, which is part of why it’s so mathematically convenient. In fact, taking the Fourier transform converts convolution into actual multiplication, turning a potentially expensive integral into a simple pointwise product.
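All four claims, including the Fourier one, can be verified numerically in a few lines. This sketch uses random example sequences; the only subtlety is zero-padding the FFTs to the full output length so the pointwise product corresponds to ordinary (not circular) convolution:

```python
import numpy as np

rng = np.random.default_rng(2)
f = rng.standard_normal(16)
g = rng.standard_normal(16)
h = rng.standard_normal(16)

conv = np.convolve
assert np.allclose(conv(f, g), conv(g, f))                        # commutative
assert np.allclose(conv(conv(f, g), h), conv(f, conv(g, h)))      # associative
assert np.allclose(conv(f, g + h), conv(f, g) + conv(f, h))       # distributive

# Fourier transform turns convolution into pointwise multiplication.
n = len(f) + len(g) - 1                      # full linear-convolution length
spectral = np.fft.irfft(np.fft.rfft(f, n) * np.fft.rfft(g, n), n)
assert np.allclose(spectral, conv(f, g))
```

The FFT route is also how fast convolution is implemented in practice: for long signals, transforming, multiplying, and transforming back is far cheaper than evaluating the sum directly.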