Manhattan distance is the total distance between two points when you can only travel along horizontal and vertical lines, like a taxi navigating a city grid. Instead of cutting diagonally across blocks, you measure how far you’d go sideways plus how far you’d go up or down. For two points on a flat grid, the formula is simply the absolute difference in their x-coordinates plus the absolute difference in their y-coordinates.
Why It’s Called Manhattan Distance
The name comes from the grid layout of Manhattan’s streets. If you’re standing at the corner of 1st Avenue and 1st Street and need to reach 4th Avenue and 5th Street, you can’t walk through buildings. You walk three blocks east and four blocks north (or some combination), covering seven blocks total. That total, seven blocks, is the Manhattan distance between the two points.
The underlying math was first proposed by Hermann Minkowski around 1900, but the “taxicab” label didn’t appear until 1952, when mathematician Karl Menger created a geometry exhibit at the Museum of Science and Industry in Chicago. A booklet accompanying the exhibit, titled “You Will Like Geometry,” introduced the term “taxicab geometry.” You’ll also see Manhattan distance called city block distance or the L1 norm.
How to Calculate It
Take two points on a grid. Point A is at coordinates (1, 1) and Point B is at (4, 5). Subtract the x-values and ignore the sign: |1 − 4| = 3. Do the same for the y-values: |1 − 5| = 4. Add them together: 3 + 4 = 7. The Manhattan distance is 7 units.
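The worked example above translates directly into a couple of lines of Python:

```python
# Manhattan distance between Point A (1, 1) and Point B (4, 5),
# following the steps above: |1 - 4| + |1 - 5| = 3 + 4.
ax, ay = 1, 1
bx, by = 4, 5

distance = abs(ax - bx) + abs(ay - by)
print(distance)  # 7
```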
This works in higher dimensions too. If your points have three coordinates (x, y, z), you just add a third term. With ten coordinates, you add ten terms. The pattern is always the same: take the absolute difference for each dimension, then sum them all up.
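The same pattern, written as a general function for any number of dimensions (a minimal sketch; the function name is our own):

```python
def manhattan_distance(p, q):
    """Sum of absolute coordinate differences, in any number of dimensions."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(a - b) for a, b in zip(p, q))

print(manhattan_distance((1, 1), (4, 5)))        # 7 (the 2D example above)
print(manhattan_distance((0, 0, 0), (1, 2, 3)))  # 6 (three dimensions: 1 + 2 + 3)
```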
Manhattan Distance vs. Euclidean Distance
Euclidean distance is the length of the straight line between two points, the "as the crow flies" measurement. Using the same points as above, the Euclidean distance between (1, 1) and (4, 5) is 5 units, found by the Pythagorean theorem. The Manhattan distance is 7 units. In flat space, Euclidean distance is always less than or equal to Manhattan distance, because a straight line is the shortest path when nothing blocks you.
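The comparison is easy to check in code. This sketch uses Python's standard-library math.dist for the Euclidean side:

```python
import math

a, b = (1, 1), (4, 5)

euclidean = math.dist(a, b)                        # straight line: sqrt(3^2 + 4^2) = 5.0
manhattan = sum(abs(x - y) for x, y in zip(a, b))  # grid path: 3 + 4 = 7

print(euclidean, manhattan)        # 5.0 7
assert euclidean <= manhattan      # holds for any pair of points in flat space
```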
The choice between the two depends on how movement works in your problem. Manhattan distance fits situations where travel is restricted to a grid: city streets, chessboard-like layouts, circuit board routing. Euclidean distance fits open spaces where diagonal movement costs the same as horizontal or vertical. If you’re measuring how far a drone flies across a field, Euclidean distance makes sense. If you’re measuring how far a delivery driver travels through a city, Manhattan distance is closer to reality.
What Makes It a True Distance Metric
Manhattan distance satisfies three rules that qualify it as a proper mathematical metric. First, it’s never negative, and it equals zero only when both points are the same location. Second, it’s symmetric: the distance from A to B is the same as from B to A. Third, it obeys the triangle inequality, meaning the distance from A to C is never greater than the distance from A to B plus B to C. You can’t find a shortcut by adding a detour. These properties matter because many algorithms rely on distance measures behaving predictably, and Manhattan distance always does.
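The three rules can be spot-checked mechanically. This is a sanity check over a handful of sample points, not a proof (the points are arbitrary examples of our own):

```python
from itertools import product

def d(p, q):
    """Manhattan distance in two dimensions."""
    return sum(abs(a - b) for a, b in zip(p, q))

points = [(0, 0), (1, 1), (4, 5), (-2, 3)]

for p, q, r in product(points, repeat=3):
    assert d(p, q) >= 0                        # never negative
    assert (d(p, q) == 0) == (p == q)          # zero only for identical points
    assert d(p, q) == d(q, p)                  # symmetry
    assert d(p, r) <= d(p, q) + d(q, r)        # triangle inequality
print("all three metric rules hold on the sample points")
```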
Uses in Machine Learning
Manhattan distance shows up frequently in machine learning, especially in algorithms that classify data by finding the nearest neighbors. The K-Nearest Neighbors algorithm, for example, needs a way to measure how “close” two data points are. Manhattan distance works well here because it sums up differences across every feature independently, which makes it easy to interpret: each feature’s contribution to the total distance is straightforward to see.
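To make the idea concrete, here is a bare-bones nearest-neighbors classifier built on Manhattan distance. The training data and labels are made up for illustration; a real project would use a library implementation (scikit-learn's KNeighborsClassifier, for instance, accepts metric="manhattan"):

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points, measured by Manhattan distance."""
    dist = lambda p, q: sum(abs(a - b) for a, b in zip(p, q))
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# (features, label) pairs; labels are illustrative
train = [((1, 1), "red"), ((2, 1), "red"), ((8, 9), "blue"), ((9, 8), "blue")]
print(knn_predict(train, (2, 2)))  # red: both red points are far closer than either blue
```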
It’s often preferred over Euclidean distance when the data has many features (high-dimensional data). In high-dimensional spaces, all distance metrics start to struggle with a phenomenon called the curse of dimensionality, where the gap between the nearest and farthest points shrinks, making it harder to distinguish neighbors from distant points. Research on distance metrics in high-dimensional spaces, notably Aggarwal, Hinneburg, and Keim’s 2001 study, has shown that Euclidean distance becomes less effective as dimensionality increases. Manhattan distance doesn’t eliminate this problem, but it tends to hold up better in practice because squaring differences (as Euclidean distance does) amplifies the effect of outlier values in any single dimension.
Manhattan distance also works well with discrete or categorical data that’s been converted to numbers. If you’re comparing survey responses on a 1-to-5 scale across dozens of questions, Manhattan distance gives you a clean total of how much two respondents differ across all questions.
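For example, with made-up responses from two hypothetical respondents, the total disagreement is just the sum of per-question differences:

```python
# Two respondents' answers to ten questions on a 1-to-5 scale
# (fabricated data, purely for illustration).
alice = [3, 4, 2, 5, 1, 3, 4, 4, 2, 5]
bob   = [3, 2, 2, 4, 1, 5, 4, 3, 2, 5]

disagreement = sum(abs(a - b) for a, b in zip(alice, bob))
print(disagreement)  # 6: they differ by 2, 1, 2, and 1 point on four questions
```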
Uses in Pathfinding and Game Development
In grid-based pathfinding, Manhattan distance serves as a heuristic, an estimate of how far you still need to go. The A* search algorithm, widely used in video games and robotics, needs this kind of estimate to find the shortest path efficiently. On a square grid where movement is limited to four directions (up, down, left, right), Manhattan distance is the ideal heuristic because it exactly represents the minimum number of moves needed to reach the goal if nothing is in the way.
Using Manhattan distance as a heuristic guarantees that A* won’t overestimate the remaining distance, which is a requirement for the algorithm to find the true shortest path. If diagonal movement were allowed, you’d switch to a different heuristic, but for four-directional grids, Manhattan distance is the standard choice.
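A minimal sketch of A* on a four-directional grid, with Manhattan distance as the heuristic. The grid, wall layout, and function names are our own; a production implementation would also reconstruct the path, not just its length:

```python
import heapq

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar(grid, start, goal):
    """A* on a 4-directional grid; cells equal to 1 are walls.
    Returns the length of the shortest path, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    open_heap = [(manhattan(start, goal), 0, start)]  # (f = g + h, g, cell)
    best_g = {start: 0}
    while open_heap:
        f, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue  # stale entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + manhattan((nr, nc), goal), ng, (nr, nc)))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))  # 6: the wall forces a detour
```

Note that the heuristic estimates only 2 moves from (0, 0) to (2, 0), while the true path costs 6. That underestimate is exactly what admissibility requires: the heuristic never claims the goal is farther than it really is.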
Connection to the Broader Family of Distance Metrics
Manhattan distance belongs to a family called Minkowski distances, which are defined by a single parameter, p. When p equals 1, you get Manhattan distance. When p equals 2, you get Euclidean distance. As p increases toward infinity, you get Chebyshev distance, which only considers the largest single-dimension difference. Thinking of these as a family helps clarify when to use each one: lower values of p treat all dimensions more equally, while higher values let the biggest single difference dominate.
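The whole family fits in one formula: raise each absolute difference to the power p, sum, then take the p-th root. A small sketch:

```python
def minkowski(a, b, p=2):
    """Minkowski distance of order p: p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

a, b = (1, 1), (4, 5)
print(minkowski(a, b, p=1))   # 7.0  (Manhattan)
print(minkowski(a, b, p=2))   # 5.0  (Euclidean)
# As p grows, the result approaches Chebyshev distance, max(|dx|, |dy|) = 4:
print(round(minkowski(a, b, p=50), 3))  # 4.0
```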

