What Is the Bellman-Ford Algorithm and How It Works

The Bellman-Ford algorithm is a method for finding the shortest path from a single starting point to every other point in a graph. It works by repeatedly checking every connection in the graph and updating distance estimates until it finds the optimal routes. Unlike some faster alternatives, it can handle graphs where connections have negative values, making it essential in scenarios like network routing where costs can decrease along certain paths.

The algorithm was developed independently by three researchers between 1956 and 1958: Lester Ford Jr. published the first version in a 1956 RAND Corporation note, Edward Moore contributed a queuing-based approach in 1957, and Richard Bellman formalized it using dynamic programming in 1958. It’s sometimes called the Bellman-Ford-Moore algorithm to credit all three contributors.

How the Algorithm Works

The core idea is surprisingly simple. You start by setting the distance to your source node as zero and the distance to every other node as infinity (meaning “unknown”). Then you repeatedly scan every edge in the graph, asking one question each time: “Can I get to this destination faster by going through this connection instead?” If yes, you update the distance. This update step is called “relaxation.”

More precisely, for each connection from node U to node V with a cost of W, the algorithm checks whether the known distance to V is greater than the known distance to U plus W. If it is, the distance to V gets replaced with that smaller value, and the algorithm records that the best way to reach V is through U. This is the relaxation formula at the heart of the algorithm.

The algorithm repeats this full scan of every edge exactly N-1 times, where N is the number of nodes in the graph. Why N-1? Because the longest possible shortest path in a graph with N nodes can pass through at most N-1 edges. Each round of scanning guarantees that paths using one additional edge are correctly computed. After the first round, all shortest paths that use just one edge are correct. After the second round, all paths using up to two edges are correct. By the time you finish N-1 rounds, every shortest path is locked in.

The Dynamic Programming Connection

Bellman-Ford is a textbook example of dynamic programming, a technique where you solve a big problem by building up solutions to smaller subproblems. Here, each subproblem is: “What is the shortest path from node V to the destination using at most I edges?” You solve the one-edge version first, then use those answers to solve the two-edge version, and so on.

The recurrence relation captures this neatly. The shortest path from V using up to I edges equals the minimum, across all neighbors X of V, of the connection cost from V to X plus the shortest path from X using up to I-1 edges. Each layer of the solution builds directly on the previous one, which is why the algorithm is guaranteed to find the correct answer as long as no negative-weight cycles exist.

Detecting Negative-Weight Cycles

A negative-weight cycle is a loop in the graph where the total cost of going around the loop is less than zero. If such a cycle exists on a path from your source to some destination, you could keep looping forever and drive the distance to negative infinity. In that case, no true “shortest path” exists.

Bellman-Ford has a built-in way to detect this. After completing all N-1 rounds of relaxation, you run one additional round. If any distance still decreases during that extra round, a negative-weight cycle is reachable from the source. This detection capability is one of the algorithm’s most important features, since many other shortest-path algorithms simply produce wrong answers when negative cycles are present.

How It Differs From Dijkstra’s Algorithm

Dijkstra’s algorithm solves the same problem (shortest paths from a single source) but takes a greedy approach: it always processes the closest unvisited node next, permanently locking in that node’s distance. This makes Dijkstra faster, but it relies on a critical assumption that no edge has a negative weight. If a negative edge exists, Dijkstra can lock in a distance too early and miss a shorter path discovered later.

Bellman-Ford doesn’t make that assumption. It checks every edge in every round, so a negative edge discovered later can still update a previously computed distance. The trade-off is speed. Dijkstra’s algorithm, with a good priority queue, runs in time proportional to (E + V) × log V, where E is the number of edges and V is the number of nodes. Bellman-Ford runs in time proportional to V × E, which is significantly slower on large graphs. For graphs with only positive weights, Dijkstra is the better choice. When negative weights are possible, Bellman-Ford is the safe option.

Researchers have also explored hybrid approaches that combine both algorithms. One such method uses Bellman-Ford to handle the negative edges first, then switches to Dijkstra for the rest, improving performance on graphs where negative edges are sparse.

Handling Undirected Graphs

Bellman-Ford is designed for directed graphs, where each connection has a specific direction. If you need to use it on an undirected graph (where connections go both ways), you replace each undirected edge with two directed edges pointing in opposite directions, both carrying the same weight. This works fine for positive or zero weights, but be careful: a single undirected edge with a negative weight effectively creates a negative-weight cycle, since you can traverse it back and forth endlessly to reduce cost. In practice, this means Bellman-Ford on undirected graphs only makes sense when all weights are non-negative, which removes its main advantage over Dijkstra.

Early Termination

One practical optimization can speed up Bellman-Ford considerably. During each round, you track whether any distance was actually updated. If a complete round finishes with zero updates, the algorithm has already found all shortest paths, and you can stop early without completing the remaining rounds. On many real-world graphs, shortest paths are much shorter than the theoretical maximum of N-1 edges, so the algorithm often converges well before the final round.

A further refinement tracks exactly which nodes were updated in the previous round and only examines edges connected to those nodes in the next round. Nodes that haven’t changed can’t contribute to new improvements, so skipping them saves work without affecting correctness.

Where Bellman-Ford Is Used

The most prominent real-world application is in computer network routing. The Routing Information Protocol (RIP), one of the oldest internet routing protocols still in use, is built directly on the Bellman-Ford algorithm. In this context, each router is a node, each link between routers is an edge, and the “distance” is typically a hop count or link cost. Each router shares its distance estimates with its neighbors, and neighbors update their own tables accordingly. This distributed version of Bellman-Ford runs continuously, allowing the network to adapt as links go up or down.

This class of routing protocols is called “distance-vector” routing precisely because each node maintains a vector (a list) of distances to every other node and shares it with direct neighbors. The approach scales naturally to large, decentralized networks where no single device has a complete view of the topology.

Beyond networking, Bellman-Ford appears in currency arbitrage detection (where negative cycles represent profitable trading loops), game development for pathfinding with penalty zones, and any domain where edge weights can legitimately be negative.

Performance and Complexity

The worst-case time complexity is O(V × E), since the algorithm performs V-1 rounds and examines every edge in each round. For dense graphs where the number of edges approaches V², this becomes O(V³). The space complexity is O(V), since you need to store a distance value and a predecessor for each node.

In the best case, with the early termination optimization, the algorithm can finish in a single round (O(E) time) if the source node’s direct neighbors are the only reachable nodes, or if the edges happen to be processed in an order that resolves all shortest paths immediately. This best case is rare but illustrates why the optimization is worth implementing.