What Is Kruskal’s Algorithm? Greedy MST Explained

Kruskal’s algorithm is a method for finding the minimum spanning tree of a connected, weighted graph, meaning it connects all points (vertices) in a network using the cheapest possible set of connections (edges) without forming any loops. Published by mathematician Joseph Kruskal in 1956, it works by sorting all edges from lightest to heaviest and adding them one at a time, skipping any edge that would create a cycle. The result is a tree that spans every vertex with the lowest total cost.

The Problem It Solves

Imagine a country needs to lay communication lines between several cities. Every possible connection between two cities has a different cost. The goal is to connect all cities, directly or indirectly, while spending as little as possible. This is the minimum spanning tree problem: given a weighted graph, find the subset of edges that connects every vertex with the minimum total weight.

A spanning tree must include every vertex, use exactly one fewer edge than the number of vertices, and contain no cycles. Among all possible spanning trees of a graph, the minimum spanning tree (often abbreviated MST) is the one with the smallest sum of edge weights. Kruskal’s algorithm guarantees it will find this optimal tree.

How the Algorithm Works Step by Step

Kruskal’s algorithm follows a straightforward greedy strategy. It always picks the cheapest available edge, as long as that edge doesn’t create a cycle. Here’s the process:

  • Start with isolated vertices. Every vertex begins in its own group (component). If the graph has 6 vertices, you start with 6 separate components.
  • Sort all edges by weight. Arrange every edge in the graph from lowest cost to highest cost.
  • Process edges in order. Take the cheapest edge. If it connects two vertices that are in different components, add it to the tree and merge those two components into one. If both vertices are already in the same component, skip the edge because adding it would form a cycle.
  • Repeat until one component remains. Keep processing edges until all vertices belong to a single connected component. At that point, you have your minimum spanning tree.
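The steps above can be sketched in Python. This is a minimal illustration, not a library implementation; the edge format (weight, u, v) and the example graph are assumptions made for the demo:

```python
# A minimal sketch of Kruskal's algorithm. Edges are (weight, u, v) tuples;
# vertices are numbered 0..num_vertices-1.

def kruskal(num_vertices, edges):
    """Return the MST edges of a connected graph as (weight, u, v) tuples."""
    parent = list(range(num_vertices))  # each vertex starts in its own component

    def find(x):
        # Walk up to the component's root (its "leader").
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    mst = []
    for weight, u, v in sorted(edges):       # 1. process edges cheapest first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                 # 2. endpoints in different components?
            parent[root_u] = root_v          # 3. merge the two components
            mst.append((weight, u, v))
            if len(mst) == num_vertices - 1: # a spanning tree has V - 1 edges
                break
    return mst

# Example: 4 vertices, 5 candidate edges with different costs.
edges = [(4, 0, 1), (1, 1, 2), (3, 0, 2), (2, 2, 3), (5, 1, 3)]
tree = kruskal(4, edges)
print(tree)                         # [(1, 1, 2), (2, 2, 3), (3, 0, 2)]
print(sum(w for w, _, _ in tree))   # 6
```

Note the early exit: once the tree holds V − 1 edges, every vertex is connected and the remaining (heavier) edges can be ignored.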

The key insight is the cycle check. Every time you consider an edge, you ask: “Are these two vertices already connected through edges I’ve already picked?” If yes, adding this edge would create a loop, so you skip it. If no, the edge safely bridges two separate pieces of the growing tree.

How Cycle Detection Works

The cycle check relies on a data structure called Union-Find (also known as a disjoint-set data structure). It tracks which vertices belong to which component and supports two operations: checking whether two vertices are in the same group, and merging two groups into one.

At the start, each vertex is its own group. When the algorithm adds an edge between vertices A and B, it merges their groups with a “union” operation. When it considers a new edge, it runs a “find” operation on both endpoints. If they share the same root (the same group leader), they’re already connected, and the edge is skipped. If they have different roots, the edge is safe to add.

This structure makes the cycle check extremely fast. With two common optimizations called path compression and union by rank, each find or union operation runs in nearly constant time (amortized, bounded by the very slowly growing inverse Ackermann function). Without Union-Find, you’d need to search through the partially built tree every time you considered an edge, which would be far slower.
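Here is a small Union-Find sketch with both optimizations mentioned above; the class and method names are illustrative, not from a standard library:

```python
# A minimal disjoint-set (Union-Find) structure with path compression
# in find() and union by rank in union().

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))  # every element is its own root at first
        self.rank = [0] * n           # rough upper bound on each tree's height

    def find(self, x):
        # Path compression: point x (and its ancestors) directly at the root.
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False              # already in the same group: would form a cycle
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

uf = UnionFind(4)
uf.union(0, 1)
print(uf.find(0) == uf.find(1))  # True: 0 and 1 now share a root
print(uf.find(0) == uf.find(3))  # False: 3 is still its own group
```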

Why the Greedy Approach Works

It might seem too simple: always pick the cheapest edge and hope for the best. But Kruskal’s algorithm is provably correct, and the justification comes from something called the cut property.

The cut property says: if you divide the vertices into two groups and look at all edges crossing between those groups, the lightest crossing edge is guaranteed to belong to some minimum spanning tree. When Kruskal’s algorithm considers an edge that connects two different components, that edge is the lightest one crossing the boundary between those components (because all lighter edges were already processed). So every edge the algorithm adds satisfies the cut property, and the final result is a valid MST.

More concretely, suppose the algorithm picked an edge that doesn’t appear in some particular MST. Adding that edge to the MST creates a cycle, and the cycle must contain another edge crossing the same cut whose weight is at least as large (any lighter crossing edge would have been processed first). Swapping the heavier edge out and the algorithm’s edge in produces a spanning tree with equal or lower total weight. Since the MST already has minimum weight, the two trees must weigh the same. Either way, you end up with a minimum spanning tree.

Time and Space Complexity

The most expensive step in Kruskal’s algorithm is sorting the edges. Sorting E edges takes O(E log E) time. After sorting, the algorithm processes each edge once, performing Union-Find operations that run in nearly O(1) each, so the total for that phase is effectively O(E). The overall time complexity is dominated by the sort: O(E log E), which is equivalent to O(E log V) because E is at most V², so log E is at most 2 log V.

For space, the algorithm stores the edge list (O(E)) and the Union-Find structure (O(V)). The total memory requirement is O(E + V), which for most graphs simplifies to O(E) since a connected graph always has at least V − 1 edges.

Kruskal’s vs. Prim’s Algorithm

Prim’s algorithm solves the same problem but takes a different approach. Instead of sorting all edges globally, Prim’s starts from a single vertex and grows the tree outward, always adding the cheapest edge that connects a new vertex to the existing tree.

The practical difference comes down to graph density. Kruskal’s algorithm works best on sparse graphs, where the number of edges is relatively low compared to the number of vertices. Sorting a short edge list is fast, and the Union-Find operations add minimal overhead. Prim’s algorithm, on the other hand, performs better on dense graphs (many edges) because it uses a priority queue tied to the adjacency list and avoids sorting every edge upfront.

If you’re working with a graph where vertices have only a few connections each, Kruskal’s is typically the better choice. If nearly every vertex connects to every other vertex, Prim’s tends to be more efficient.
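For contrast, here is a minimal sketch of Prim’s approach using Python’s heapq as the priority queue. The adjacency-list format (each vertex maps to (weight, neighbor) pairs) and the example graph are assumptions for illustration:

```python
import heapq

# A minimal sketch of Prim's algorithm. graph[u] is a list of
# (weight, neighbor) pairs; the tree grows outward from `start`.

def prim(graph, start=0):
    visited = {start}
    heap = list(graph[start])          # candidate edges leaving the tree
    heapq.heapify(heap)
    total = 0
    while heap and len(visited) < len(graph):
        weight, v = heapq.heappop(heap)  # cheapest edge out of the tree so far
        if v in visited:
            continue                     # both endpoints already in the tree: skip
        visited.add(v)
        total += weight
        for edge in graph[v]:
            if edge[1] not in visited:
                heapq.heappush(heap, edge)
    return total

# The same 4-vertex example a Kruskal run would solve, as an adjacency list.
graph = {
    0: [(4, 1), (3, 2)],
    1: [(4, 0), (1, 2), (5, 3)],
    2: [(3, 0), (1, 1), (2, 3)],
    3: [(2, 2), (5, 1)],
}
print(prim(graph))  # 6: the same total weight Kruskal would find
```

Both algorithms return the same minimum total weight; they differ only in the order edges are examined and the bookkeeping that requires.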

Real-World Applications

The most natural application is network design. When telecommunications companies plan fiber optic routes between cities, they model the problem as a weighted graph: cities are vertices, possible cable routes are edges, and weights represent the cost of laying each cable. Running Kruskal’s algorithm on this graph produces the cheapest way to connect every city. Studies evaluating this approach on real fiber optic networks have found it significantly reduces cost compared to unoptimized layouts.

The same logic applies to electrical grid planning, road networks, and computer network infrastructure, where nodes represent devices like routers and edges represent physical connections between them. Any scenario where you need to connect a set of points at minimum total cost is a candidate for a minimum spanning tree algorithm.

Beyond infrastructure, Kruskal’s algorithm appears in clustering problems. If you run the algorithm but stop before all components merge, the remaining separate components form natural clusters of closely related points. This technique is used in image segmentation, taxonomy, and data analysis.
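That early-stopping idea can be sketched directly: merge the closest pairs until the desired number of clusters remains. The 2D points and the use of squared Euclidean distance as edge weights are made up for illustration:

```python
# Single-linkage clustering via Kruskal: stop merging once k clusters remain.

def kruskal_clusters(points, k):
    """Merge closest point pairs until k clusters remain; return a label per point."""
    n = len(points)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Build every pairwise "edge", weighted by squared distance, cheapest first.
    edges = sorted(
        ((points[i][0] - points[j][0]) ** 2 + (points[i][1] - points[j][1]) ** 2, i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    clusters = n
    for _, i, j in edges:
        if clusters == k:
            break                          # stop early: the components are the clusters
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            clusters -= 1
    return [find(i) for i in range(n)]

# Two tight groups far apart should land in two different clusters.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels = kruskal_clusters(points, 2)
print(labels[0] == labels[1])  # True: first pair clustered together
print(labels[2] == labels[3])  # True: second pair clustered together
print(labels[0] != labels[2])  # True: the two groups stay separate
```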

Behavior on Disconnected Graphs

If the input graph isn’t fully connected (some vertices have no path between them), Kruskal’s algorithm can’t produce a single spanning tree. Instead, it produces a minimum spanning forest: a collection of minimum spanning trees, one for each connected component of the graph. The algorithm runs the same way, processing edges from cheapest to heaviest, but it finishes with multiple components rather than one. This is useful when you want to optimize connections within each separate cluster of a network without requiring that every cluster be linked.
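A small sketch of this behavior: with no edge between the two pairs of vertices below, the same loop simply finishes with two separate trees (the graph data is made up for the demo):

```python
# Kruskal on a disconnected graph yields a minimum spanning forest:
# one tree per connected component, no special handling needed.

def kruskal_forest(num_vertices, edges):
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    forest = []
    for weight, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((weight, u, v))
    return forest

# Vertices {0, 1} and {2, 3} have no edge between them.
edges = [(1, 0, 1), (2, 2, 3)]
print(kruskal_forest(4, edges))  # [(1, 0, 1), (2, 2, 3)]: one edge per component
```

Note there is no V − 1 stopping condition here: the loop simply runs out of edges, leaving two components with one edge each rather than the three edges a spanning tree of four vertices would need.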