What Is a Sankey Diagram and How Does It Work?

A Sankey diagram is a type of flow diagram where the width of each connecting line is proportional to the quantity it represents. If one flow carries ten times as much as another, its line is ten times as wide. This simple visual rule makes it immediately obvious where the biggest flows, losses, or transfers are happening in any system.

How a Sankey Diagram Works

Every Sankey diagram is built from two basic elements: nodes and links. Nodes are the blocks or bars representing stages, categories, or entities. Links are the flowing bands that connect one node to another. The critical feature is that each link’s width is strictly determined by the value it represents, so your eye is drawn to the largest quantities first without needing to read any numbers.
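
The width rule can be sketched in a few lines of Python. This is a minimal illustration, not how any particular charting library does it: the flow values, the function name link_widths, and the 80-pixel maximum are all assumptions made up for the example.

```python
# Hypothetical flows as (source, target, value); the numbers are illustrative only.
flows = [
    ("Natural Gas", "Residential", 40.0),
    ("Natural Gas", "Industrial", 20.0),
    ("Solar", "Industrial", 4.0),
]

def link_widths(flows, max_width_px=80.0):
    """Scale every link so that its width is proportional to its value.

    The largest flow gets max_width_px; all other links share the same
    scale factor, which is what keeps the proportions honest.
    """
    largest = max(value for _, _, value in flows)
    scale = max_width_px / largest
    return {(src, dst): value * scale for src, dst, value in flows}

widths = link_widths(flows)
for (src, dst), width in widths.items():
    print(f"{src} -> {dst}: {width:.1f} px")
```

Because one scale factor is applied to every link, a flow ten times larger than another always comes out ten times wider, whatever maximum width is chosen.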

Imagine a diagram showing a country’s energy supply. On the left side, nodes represent energy sources: coal, natural gas, nuclear, renewables. On the right, nodes represent where that energy ends up: residential, commercial, industrial, transportation. The colored bands flowing between them show how much energy goes from each source to each destination. A thick band from natural gas to residential use tells you instantly that homes consume a large share of gas. A thin band from solar to industry tells you the opposite. The proportions do the talking.

This proportionality also means the diagram is largely self-checking. At any intermediate node, the total of the flows entering should equal the total of the flows leaving, with losses drawn as their own outgoing band. A node whose incoming and outgoing sides differ visibly in height signals that the underlying numbers do not add up. That built-in accountability is one reason Sankey diagrams are trusted for serious analytical work.
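
The same balance check can be run on the data before anything is drawn. The sketch below assumes losses are modeled as an ordinary node named "Losses"; the flows, the node names, and the function node_imbalances are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical flows as (source, target, value) in arbitrary units.
# "Losses" is modeled as an ordinary node so that wasted energy stays visible.
flows = [
    ("Coal", "Power Plant", 100.0),
    ("Power Plant", "Electricity", 35.0),
    ("Power Plant", "Losses", 65.0),
    ("Electricity", "Residential", 20.0),
    ("Electricity", "Industrial", 15.0),
]

def node_imbalances(flows, tolerance=1e-9):
    """Return {node: inflow - outflow} for intermediate nodes that do not balance.

    Pure sources (no inflow) and pure sinks (no outflow) are skipped,
    because they have only one side to compare.
    """
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for src, dst, value in flows:
        outflow[src] += value
        inflow[dst] += value

    imbalances = {}
    for node in set(inflow) & set(outflow):
        diff = inflow[node] - outflow[node]
        if abs(diff) > tolerance:
            imbalances[node] = diff
    return imbalances

# An empty dict means every intermediate node balances.
print(node_imbalances(flows))
```

A negative difference means a node sends out more than it receives, which usually points to a missing input row or a double-counted output.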

Where Sankey Diagrams Are Used

The most iconic application is energy flow analysis. Many countries and international agencies represent their entire energy systems as Sankey diagrams, tracing supply from raw sources all the way to end-use sectors. The U.S. Department of Energy, for example, publishes annual energy flow diagrams in this format. These diagrams make it easy to spot where energy is lost (often as waste heat) and where efficiency improvements would have the biggest impact. International energy standards recognize the Sankey diagram as a tool for energy review, consumption analysis, and performance planning.

Beyond energy, Sankey diagrams show up in website analytics (tracking how users navigate from page to page), supply chain management (following materials from suppliers through manufacturing to customers), budget visualization (showing where revenue comes from and where spending goes), and environmental science (mapping water usage, carbon emissions, or nutrient cycles through ecosystems). Any situation where you need to see “how much goes where” is a natural fit.

The Data Behind the Diagram

Building a Sankey diagram requires a surprisingly simple data structure. Each row of data defines a single connection: a source, a target, and a value. For example, one row might say “Natural Gas → Residential → 4.5 million BTUs.” Another might say “Coal → Industrial → 8.2 million BTUs.” Stack enough of these source-target-value rows together and the software draws the full diagram, sizing every link according to its value.

This simplicity is part of the appeal. You don’t need complex coordinate systems or specialized formatting. If you can organize your data into three columns (from, to, how much), you can generate a Sankey diagram.
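
As a rough sketch of what software does with those three columns, the Python below reads source-target-value rows and converts them into a list of node labels plus links that refer to nodes by position. Some libraries, Plotly's Sankey trace among them, expect this label-and-index layout. The column names, the third data row, and the function to_indexed_links are assumptions made for illustration.

```python
import csv
import io

# Three-column data (from, to, how much). The first two rows reuse the example
# values from the text; the third row is made up for illustration.
RAW = """source,target,value
Natural Gas,Residential,4.5
Coal,Industrial,8.2
Natural Gas,Industrial,3.1
"""

def to_indexed_links(csv_text):
    """Turn source-target-value rows into a node list plus index-based links."""
    labels = []   # node names in order of first appearance
    index = {}    # node name -> position in labels
    links = {"source": [], "target": [], "value": []}

    for row in csv.DictReader(io.StringIO(csv_text)):
        for name in (row["source"], row["target"]):
            if name not in index:
                index[name] = len(labels)
                labels.append(name)
        links["source"].append(index[row["source"]])
        links["target"].append(index[row["target"]])
        links["value"].append(float(row["value"]))
    return labels, links

labels, links = to_indexed_links(RAW)
print(labels)
print(links)
```

Each node name is registered once, no matter how many rows mention it, so a source that feeds several targets still appears as a single block in the finished diagram.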

Tools for Creating Sankey Diagrams

You can build Sankey diagrams with code or without it. On the code side, D3.js (a JavaScript library) has a dedicated Sankey plugin that gives full control over layout, colors, and interactivity. Python users commonly turn to Plotly or to Matplotlib, which ships its own sankey module. Google Charts offers a built-in Sankey chart type that works in a web browser with minimal setup.

If you’d rather skip the coding entirely, browser-based tools like SankeyMATIC let you type in your data as plain text and generate a diagram immediately, with no installation or extensions required. Business intelligence platforms like Tableau and Power BI can also produce Sankey visualizations, though typically through add-on visuals or extensions rather than a default chart type, and often with drag-and-drop interfaces.
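
For plain-text tools, the three-column data only needs to be written out one flow per line. The sketch below assumes the "Source [amount] Target" line syntax used by SankeyMATIC; check the tool's current manual before relying on it. The function name to_flow_lines is hypothetical.

```python
# Flows as (source, target, value), reusing the example values from the text.
flows = [
    ("Natural Gas", "Residential", 4.5),
    ("Coal", "Industrial", 8.2),
]

def to_flow_lines(flows):
    """Format each flow as 'Source [amount] Target', one flow per line.

    The bracketed-amount syntax is an assumption about the target tool's
    input format, not something this function can verify.
    """
    return "\n".join(f"{src} [{value:g}] {dst}" for src, dst, value in flows)

text = to_flow_lines(flows)
print(text)
```

The printed text can then be pasted into the tool's input box in place of its sample data.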

Origins of the Name

The diagram is named after Captain Matthew Henry Phineas Riall Sankey, an Irish engineer. In an 1898 article published in the Minutes of Proceedings of the Institution of Civil Engineers, Sankey used what is widely regarded as the first energy flow diagram of this kind to illustrate the thermal efficiency of a steam engine. His innovation was drawing arrows whose widths were proportional to the energy carried or lost at each stage, so readers could immediately see where energy was being wasted. That core idea, letting width carry the data, has remained unchanged for over 125 years.

When a Sankey Diagram Is the Right Choice

Sankey diagrams excel when you want to show transfers between categories, especially when flows split, merge, or get lost along the way. They’re ideal for answering questions like “where does the majority go?” or “where are the biggest losses?” They work best with a manageable number of nodes, typically under 20 or so. With too many nodes and links, the diagram becomes a tangle of overlapping bands that’s harder to read than a simple table.

They’re less useful for showing change over time (a line chart does that better) or for comparing individual values precisely (bar charts win there). The strength of a Sankey is showing the structure of a system: how inputs become outputs, how resources get allocated, and where things disappear along the way.