What Is a Jacobian? The Derivative in Higher Dimensions

A Jacobian is a matrix of partial derivatives that describes how a function with multiple inputs and multiple outputs changes at any given point. If you have a function that takes in two variables and spits out two outputs, the Jacobian is a 2×2 grid of numbers showing how each output responds to a small change in each input. It’s one of the most widely used tools in calculus, physics, engineering, and machine learning.

The Core Idea

In single-variable calculus, the derivative tells you the rate of change of one output with respect to one input. The Jacobian extends this idea to functions where you have several inputs and several outputs at once. Suppose you have a function that maps two inputs (u, v) to two outputs (x, y). The Jacobian matrix arranges all four possible partial derivatives into a grid: how x changes with u, how x changes with v, how y changes with u, and how y changes with v.

For a function with n inputs and m outputs, the Jacobian is an m×n matrix. Each row corresponds to one output, and each column corresponds to one input. Every entry in the matrix is a partial derivative evaluated at a specific point, which means the Jacobian itself changes from point to point. It’s a local snapshot of how the function behaves right there.
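As a concrete sketch of this definition, here is a minimal finite-difference Jacobian in Python with NumPy. The helper name `numerical_jacobian` and the example map are illustrative, not taken from any particular library:

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Approximate the m x n Jacobian of f at the point x by central differences."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(f(x), dtype=float)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        # Column j: how every output responds to a small nudge in input j
        J[:, j] = (np.asarray(f(x + step)) - np.asarray(f(x - step))) / (2 * eps)
    return J

# Example map with two inputs and two outputs: (u, v) -> (u*v, u + v**2)
f = lambda p: np.array([p[0] * p[1], p[0] + p[1] ** 2])
J = numerical_jacobian(f, [2.0, 3.0])
# Analytic Jacobian at (2, 3) is [[v, u], [1, 2v]] = [[3, 2], [1, 6]]
```

Note how the layout matches the definition: row i is output i, column j is input j, and the numbers are specific to the point (2, 3), since the Jacobian changes from point to point.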

The Best Linear Approximation

The Jacobian matrix serves as the best linear approximation of a nonlinear function near a specific point. This is the multivariable version of the idea that a derivative gives you the tangent line. If you zoom in close enough to any smooth function, it starts to look like a simple linear transformation, and the Jacobian is the matrix that defines that transformation.

This property is heavily used in engineering and control theory. Most real systems are nonlinear: the equations governing a drone’s flight, a chemical reaction, or a power grid don’t follow straight lines. But engineers routinely approximate these systems as linear near a chosen operating point by computing the Jacobian there. The resulting linear model, called a Jacobian linearization, is accurate as long as the system doesn’t stray too far from that point. This lets engineers apply the powerful, well-understood tools of linear algebra to problems that would otherwise be intractable.
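To make the linearization idea concrete, here is a small numerical sketch. The nonlinear map `f` below is made up for illustration (it is not a real drone or power-grid model); the point is that near the operating point, f(x0) plus the Jacobian times the deviation tracks the true function closely:

```python
import numpy as np

# A made-up nonlinear system, purely for illustration
def f(x):
    return np.array([np.sin(x[0]) + x[1] ** 2, x[0] * x[1]])

def jacobian(x):
    # Analytic Jacobian of f at x
    return np.array([[np.cos(x[0]), 2 * x[1]],
                     [x[1],          x[0]]])

x0 = np.array([0.5, 1.0])               # chosen operating point
J = jacobian(x0)
d = np.array([0.01, -0.02])             # small deviation from x0

exact = f(x0 + d)
linear = f(x0) + J @ d                  # Jacobian linearization
# For small d, the error between exact and linear shrinks like |d|^2
```

Doubling the deviation roughly quadruples the error, which is exactly the sense in which the linear model is only trustworthy near the operating point.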

The Jacobian Determinant and Coordinate Changes

When the Jacobian matrix is square (same number of inputs and outputs), you can take its determinant. This single number, often called “the Jacobian” by itself, tells you how much the function stretches or compresses space at that point. A Jacobian determinant of 3 means a tiny patch of area gets tripled when you apply the transformation. A determinant of 0.5 means it gets halved.

This is exactly what you need when changing variables in a multiple integral. In single-variable calculus, substituting x = h(u) requires multiplying the integrand by dx/du. In two or three dimensions, the Jacobian determinant plays the same role. The change-of-variables formula says that to convert an integral from one coordinate system to another, you multiply the integrand by the absolute value of the Jacobian determinant. The absolute value matters because areas and volumes are always positive, even if the transformation reverses orientation.

The most common example is converting from Cartesian to polar coordinates, where x = r cos(θ) and y = r sin(θ). The Jacobian determinant works out to r, which is why area integrals in polar coordinates include that familiar extra factor of r: dA = r dr dθ. For spherical coordinates, with φ measured from the z-axis, the Jacobian determinant gives ρ² sin(φ), producing the volume element dV = ρ² sin(φ) dρ dφ dθ. These aren’t arbitrary conventions. They come directly from computing how much the coordinate transformation stretches space at each point.
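The polar case is easy to verify directly: write out the 2×2 Jacobian of (r, θ) → (x, y) and take its determinant, which collapses to r via cos²θ + sin²θ = 1. A short numerical check:

```python
import numpy as np

def polar_jacobian(r, theta):
    # Rows: (x, y); columns: (r, theta)
    # [[dx/dr, dx/dtheta], [dy/dr, dy/dtheta]]
    return np.array([[np.cos(theta), -r * np.sin(theta)],
                     [np.sin(theta),  r * np.cos(theta)]])

r, theta = 2.5, 0.7
det = np.linalg.det(polar_jacobian(r, theta))
# det = r*cos(theta)**2 + r*sin(theta)**2 = r, so det is approximately 2.5
```

The determinant depends only on r, not θ, which matches the geometry: a thin polar cell at radius r has area roughly r dr dθ regardless of its angular position.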

What a Zero Determinant Means

When the Jacobian determinant equals zero at a point, the transformation collapses at least one dimension there. In mathematical terms, the function is no longer locally invertible: you can’t uniquely reverse the mapping. In physical terms, the consequences depend on the system.

In robotics, a zero Jacobian determinant signals a singularity where the robot arm loses the ability to move in some direction. For a two-link robot arm, this happens when the arm is fully extended or fully folded. At these configurations, no combination of joint motions can produce movement in certain Cartesian directions, and attempting to maintain a desired motion near a singularity would require impossibly fast joint speeds. Identifying these singularities is critical for path planning: you design the robot’s movements to avoid configurations where the Jacobian breaks down.
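For the two-link arm, the singularity is easy to exhibit. Assuming a planar arm with joint angles q1, q2 and link lengths l1, l2 (the standard textbook model, sketched here with made-up numbers), the position Jacobian has determinant l1·l2·sin(q2), which vanishes exactly when the elbow angle q2 is 0 (fully extended) or π (fully folded):

```python
import numpy as np

def two_link_jacobian(q1, q2, l1=1.0, l2=1.0):
    """Jacobian of a planar two-link arm's end-effector position w.r.t. joint angles."""
    s1, c1 = np.sin(q1), np.cos(q1)
    s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

# det(J) = l1 * l2 * sin(q2): zero at the extended and folded configurations
det_extended = np.linalg.det(two_link_jacobian(0.3, 0.0))      # approximately 0
det_folded = np.linalg.det(two_link_jacobian(0.3, np.pi))      # approximately 0
det_bent = np.linalg.det(two_link_jacobian(0.3, np.pi / 2))    # approximately 1
```

Near q2 = 0, inverting this Jacobian divides by sin(q2), which is why commanded Cartesian motions near the singularity demand extreme joint speeds.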

Robotics and Velocity Mapping

One of the most concrete applications of the Jacobian is in robotic arms. A robot arm has joints that rotate or slide, and the position of the hand (the end-effector) depends on all the joint angles through a set of nonlinear equations called forward kinematics. The Jacobian of this mapping relates joint velocities to end-effector velocities. If you know how fast each joint is spinning, multiplying that velocity vector by the Jacobian tells you how fast and in what direction the hand is moving.

You can think of the Jacobian as encoding the sensitivity of the hand’s motion to each joint’s motion. One column of the Jacobian represents the contribution of a single joint: if only that joint moves, the corresponding column tells you what direction and speed the hand follows. The full hand velocity is the sum of all these contributions. This relationship is essential for control. If you want the hand to move along a specific path, you invert the Jacobian (or pseudo-invert it, when the matrix isn’t square) to figure out what joint velocities you need, a process called resolved-rate control.
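Both directions of this mapping fit in a few lines. Using the same planar two-link arm model as a stand-in (unit-length links and the joint angles below are assumed for illustration):

```python
import numpy as np

def two_link_jacobian(q1, q2, l1=1.0, l2=1.0):
    """Jacobian of a planar two-link arm's end-effector position w.r.t. joint angles."""
    s1, c1 = np.sin(q1), np.cos(q1)
    s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

J = two_link_jacobian(0.3, 0.9)
q_dot = np.array([0.5, -0.2])           # joint speeds, rad/s

# Forward: hand velocity = Jacobian times joint velocities,
# i.e. the sum of each column weighted by its joint's speed
hand_vel = J @ q_dot
assert np.allclose(hand_vel, q_dot[0] * J[:, 0] + q_dot[1] * J[:, 1])

# Inverse (resolved-rate idea): which joint speeds produce a desired hand velocity?
desired = np.array([0.1, 0.0])          # move the hand along +x
q_dot_needed = np.linalg.solve(J, desired)
```

The forward product and the column-by-column sum are the same computation, which is the "one column per joint" picture stated above.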

The Jacobian in Machine Learning

Neural networks rely on the Jacobian during training, though it’s rarely called by that name in introductory courses. Backpropagation, the algorithm used to train virtually all neural networks, computes how the network’s loss function changes with respect to each weight. For networks with multiple layers, this requires tracking how the output of one layer affects the output of the next, all the way through the network. The matrix of partial derivatives connecting one layer’s outputs to another layer’s outputs is a Jacobian.

The gradients used to update each weight depend on these Jacobian terms multiplied together across layers. This chain of multiplications creates a well-known practical problem. If the Jacobian matrices at each layer tend to shrink values (singular values less than 1), the gradient signal decays exponentially as it propagates backward through many layers, a problem called vanishing gradients. If they tend to amplify values, gradients explode. Much of modern neural network architecture design, from residual connections to careful weight initialization, exists specifically to keep these layer-to-layer Jacobians well-behaved so that training remains stable.

Putting It Together

Across all these applications, the Jacobian does the same fundamental thing: it captures how a multivariable function transforms small changes in its inputs into changes in its outputs. In calculus, it corrects for distortion when switching coordinate systems. In robotics, it translates joint motions into hand motions. In machine learning, it propagates error signals backward through layers. In control engineering, it linearizes a messy nonlinear system into something manageable. The math is identical each time. What changes is the interpretation.