An orthogonal projection is the closest point on a line, plane, or subspace to a given point, found by dropping a perpendicular from that point onto the target surface. Think of your shadow on the ground when the sun is directly overhead: the light rays hit the ground at a right angle, and your shadow is the orthogonal projection of your body onto the flat surface. The “orthogonal” part means the connection between the original point and its projection forms a 90-degree angle with the surface you’re projecting onto.
This idea shows up across mathematics, statistics, engineering, and computer graphics. At its core, it always answers the same question: what’s the nearest point on some surface or subspace to the thing I’m starting with?
The Core Idea in Plain Terms
Imagine you’re standing somewhere in a room and you want to find the spot on the floor directly beneath you. You’d drop a plumb line straight down, perpendicular to the floor. The spot where it lands is the orthogonal projection of your position onto the floor. The key property is that this spot is closer to you than any other point on the floor. If you tried to connect yourself to any other floor tile, that line would be longer than the straight-down one.
This is exactly what orthogonal projection does in math: given a point (or vector) and a target subspace, it finds the unique point on that subspace that minimizes the distance. The leftover, the gap between your original point and its projection, points in a direction completely perpendicular to the subspace. Formally, if you project a vector onto a subspace, the difference between the original vector and the projected one belongs to the “orthogonal complement,” meaning it’s at right angles to everything in the subspace.
Projecting One Vector Onto Another
The simplest case is projecting one vector onto another, something you’ll encounter early in a linear algebra or multivariable calculus course. Say you have two vectors, u and v. The projection of u onto v gives you the component of u that points in the same direction as v.
There are two versions of this. The scalar projection tells you how much of u lies along v, as a single number: the dot product of u and v divided by the length of v, written (u · v) / |v|. The vector projection gives you an actual vector pointing along v with that length. The formula multiplies the scalar projection by the unit vector in the direction of v, which works out to ((u · v) / |v|²) v.
A quick example: if u points northeast and v points due east, the projection of u onto v captures just the eastward component of u, stripping away the northward part. The removed northward piece is perpendicular to v, which is what makes the projection orthogonal.
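The northeast/east example can be checked numerically. Here is a minimal NumPy sketch (the vectors and function names are illustrative, not from any particular library):

```python
import numpy as np

def scalar_projection(u, v):
    """Signed length of u's component along v: (u . v) / |v|."""
    return np.dot(u, v) / np.linalg.norm(v)

def vector_projection(u, v):
    """Component of u pointing along v: ((u . v) / |v|^2) v."""
    return (np.dot(u, v) / np.dot(v, v)) * v

u = np.array([1.0, 1.0])   # points "northeast"
v = np.array([1.0, 0.0])   # points due east

p = vector_projection(u, v)   # the eastward component of u
r = u - p                     # the leftover (the "northward part")

print(p)             # [1. 0.]
print(r)             # [0. 1.]
print(np.dot(r, v))  # 0.0 -- the leftover is orthogonal to v
```

The final dot product being zero is exactly the orthogonality property described above: the stripped-away piece is perpendicular to v.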
Projecting Onto a Subspace
Projecting onto a single vector is a special case. More generally, you might project onto an entire plane or higher-dimensional subspace. The principle stays the same: find the point in the subspace that’s closest to your original vector, with the error (the leftover) perpendicular to the subspace.
When the subspace is described by the columns of a matrix A, the projection of a vector b onto that column space is given by the projection matrix P = A(AᵀA)⁻¹Aᵀ, provided the columns of A are linearly independent so that AᵀA is invertible. Multiplying P by b gives you the projected vector. The remaining piece, b minus Pb, lives in the orthogonal complement, meaning it’s perpendicular to every vector in the column space. The matrix (I – P) is itself a projection matrix that captures this perpendicular remainder.
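To make the formula concrete, here is a small NumPy sketch using a made-up matrix A whose two columns span a plane in three-dimensional space (the values are arbitrary example data):

```python
import numpy as np

# Columns of A span a 2-D subspace of R^3 (example values for illustration).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])

# Projection onto the column space of A: P = A (A^T A)^{-1} A^T
P = A @ np.linalg.inv(A.T @ A) @ A.T

b = np.array([3.0, 1.0, 2.0])
proj = P @ b          # the closest point to b inside the column space
resid = b - proj      # lives in the orthogonal complement

# The residual is perpendicular to every column of A (zero up to rounding):
print(A.T @ resid)
```

For real computations, solving a linear system (or using a QR factorization) is numerically preferable to forming the explicit inverse, but the explicit formula mirrors the text.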
Two Key Properties of Projection Matrices
Orthogonal projection matrices have two defining algebraic properties that distinguish them from other kinds of transformations.
- Idempotency (P² = P): If you project something that’s already been projected, nothing changes. This makes intuitive sense. Once your shadow is on the ground, projecting the shadow onto the ground again leaves it exactly where it is.
- Symmetry (Pᵀ = P): The matrix equals its own transpose. This symmetry is what makes the projection orthogonal rather than oblique. An oblique projection can also be idempotent, but it won’t be symmetric, because it drops onto the subspace at a slanted angle rather than a right angle.
Any matrix that satisfies both of these conditions is an orthogonal projection matrix. If a matrix is idempotent but not symmetric, it performs an oblique projection instead.
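Both properties are easy to verify numerically. This sketch builds a projection matrix from an arbitrary example matrix and checks idempotency and symmetry, plus the complementary projection I – P mentioned earlier:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T

# Idempotency: projecting twice is the same as projecting once.
print(np.allclose(P @ P, P))   # True

# Symmetry: P equals its own transpose.
print(np.allclose(P, P.T))     # True

# I - P projects onto the orthogonal complement, and it shares both properties.
Q = np.eye(3) - P
print(np.allclose(Q @ Q, Q), np.allclose(Q, Q.T))  # True True
```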
Orthogonal vs. Oblique vs. Perspective Projection
In an orthogonal (also called orthographic) projection, the projection lines hit the target surface at right angles. Every point drops straight down onto the surface along parallel lines. This preserves true sizes and shapes, which is why engineering drawings and architectural blueprints rely on orthographic views. Each view shows exactly two dimensions at their real measurements.
An oblique projection also uses parallel projection lines, but they hit the surface at an angle other than 90 degrees. The result still shows the object, but distances and angles can be distorted in ways that don’t reflect the real geometry.
A perspective projection is different entirely. Here the projection lines converge at a single point (the viewer’s eye), so objects farther away appear smaller. This is how cameras and human vision work. It looks realistic, but equal distances on the object no longer appear equal in the drawing. That’s why engineering and manufacturing fields prefer orthographic projection for technical work where exact dimensions matter.
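The contrast between parallel and converging projection lines shows up directly in the arithmetic. In this toy sketch (eye at the origin, image plane at z = 1, both assumptions chosen for simplicity), orthographic projection simply drops the depth coordinate, while perspective projection divides by it:

```python
import numpy as np

points = np.array([[1.0, 2.0, 5.0],
                   [1.0, 2.0, 10.0]])  # same x, y; different depths

# Orthographic projection onto the xy-plane: drop z along parallel lines.
ortho = points[:, :2]
print(ortho)   # both land at (1, 2): apparent size ignores depth

# Perspective projection with the eye at the origin and image plane z = 1:
# each point is scaled by 1/z, so farther points move toward the center.
persp = points[:, :2] / points[:, 2:3]
print(persp)   # (0.2, 0.4) vs (0.1, 0.2): the farther point shrinks
```

The two input points differ only in depth, yet they land on top of each other under the orthographic view and at different spots under the perspective one, which is why only the former preserves true measurements.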
Why It Matters in Statistics and Data Science
Orthogonal projection isn’t just a geometry exercise. It’s the mathematical backbone of some widely used techniques in statistics and machine learning.
Linear Regression
When you fit a line (or plane, or hyperplane) through data using ordinary least squares, you’re performing an orthogonal projection. The predicted values are the projection of the observed data onto the column space of your predictor variables. The residuals, the differences between observed and predicted values, are perpendicular to that column space. This is why the method minimizes the sum of squared errors: orthogonal projection always finds the closest point on the subspace, and “closest” in this context means the smallest possible total squared distance.
The least squares solution comes directly from the projection formula. You solve AᵀAx = Aᵀy, which produces the coefficients that make Ax the orthogonal projection of y onto the column space of A. When the columns of A are linearly independent, the unique solution is x = (AᵀA)⁻¹Aᵀy.
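Here is a NumPy sketch of that solution on made-up data, fitting a line y ≈ c₀ + c₁t. It solves the normal equations directly and compares against NumPy's built-in least squares solver:

```python
import numpy as np

# Toy data: fit y ≈ c0 + c1 * t by ordinary least squares.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 2.0, 4.0])

A = np.column_stack([np.ones_like(t), t])   # design matrix: columns 1 and t

# Solve the normal equations A^T A x = A^T y directly...
x_normal = np.linalg.solve(A.T @ A, A.T @ y)

# ...or use the library solver, which is numerically safer in practice.
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.allclose(x_normal, x_lstsq))   # True

# The fitted values A @ x are the orthogonal projection of y onto col(A),
# so the residual is perpendicular to both columns (zero up to rounding):
resid = y - A @ x_normal
print(A.T @ resid)
```

The last line is the geometric statement from the paragraph above: the residual vector is orthogonal to the column space spanned by the intercept column and the predictor.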
Principal Component Analysis
PCA, one of the most common dimensionality reduction techniques, is built on orthogonal projection. It finds the directions in your data that capture the most variance, called principal components, and projects the full dataset onto those directions. The first principal component points in the direction of maximum variance. The second captures the most remaining variance while being perpendicular to the first, and so on.
When you reduce a dataset from, say, 100 features down to 10 principal components, you’re orthogonally projecting each data point from a 100-dimensional space onto a 10-dimensional subspace. The projection is chosen so that the subspace retains as much of the original variation in the data as possible. This lets you compress data, remove noise, and visualize high-dimensional patterns in two or three dimensions.
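This can be sketched with a small synthetic dataset. One standard way to get the principal directions (there are others, such as eigendecomposition of the covariance matrix) is the SVD of the centered data; the random data and the choice of keeping k = 2 components here are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))        # 200 samples, 5 features (toy data)
Xc = X - X.mean(axis=0)              # PCA projects the centered data

# Principal directions = right singular vectors of the centered data.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
W = Vt[:k].T                         # 5 x 2: orthonormal basis of the subspace

scores = Xc @ W                      # each point's coordinates in the subspace
reconstruction = scores @ W.T        # orthogonal projection back in feature space

# Residuals are perpendicular to every retained direction:
resid = Xc - reconstruction
print(np.allclose(resid @ W, 0))     # True
```

The `scores` array is the reduced representation (200 × 2 instead of 200 × 5), and the perpendicular residual is exactly the variation the projection discards.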
Signal Processing and Decomposition
Orthogonal projection also appears whenever you decompose a signal into simpler components. A Fourier series, for instance, breaks a complex signal into a sum of sine and cosine waves at different frequencies. Each coefficient in that sum is the orthogonal projection of the original signal onto one of those basis functions. The basis functions are orthogonal to each other (their overlap is zero), which means each coefficient captures a completely independent piece of the signal with no redundancy. The same principle applies to other decomposition methods like Fourier-Legendre and Fourier-Bessel series, where the choice of basis functions depends on the geometry or physical constraints of the problem.
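The coefficient-as-projection idea can be demonstrated numerically. In this sketch, a signal built from two sine waves is projected onto sampled sine basis functions using the discrete analogue of the inner-product formula, coefficient = ⟨f, sin(kt)⟩ / ⟨sin(kt), sin(kt)⟩; the signal and grid are made-up example values:

```python
import numpy as np

t = np.linspace(0.0, 2 * np.pi, 100001)      # fine uniform grid over one period
f = 3.0 * np.sin(t) + 0.5 * np.sin(3 * t)    # signal with two frequency components

def fourier_coeff(f_vals, basis_vals):
    """Projection coefficient of f onto one sampled basis function."""
    return np.dot(f_vals, basis_vals) / np.dot(basis_vals, basis_vals)

for k in (1, 2, 3):
    c = fourier_coeff(f, np.sin(k * t))
    print(k, c)   # recovers approximately 3.0, 0.0, 0.5
```

Because the sine basis functions are mutually orthogonal over the period, each coefficient comes back independently: the k = 1 projection recovers the 3.0 without any contamination from the k = 3 component, and the k = 2 coefficient is essentially zero.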
In every one of these applications, the underlying idea is identical to the shadow on the floor: find the component of something that lives in a particular subspace, with the leftover pointing in a completely perpendicular direction. That geometric simplicity is what makes orthogonal projection one of the most broadly useful tools in applied mathematics.