How to Interpret a Dendrogram in Hierarchical Clustering

A dendrogram is a tree-shaped diagram that shows how data points (or groups of data points) are related to each other based on similarity. Reading one comes down to understanding three things: what the branches connect, how tall they are, and where you draw a horizontal line to define your clusters. Once you know those basics, the rest falls into place quickly.

The Basic Anatomy

A dendrogram has two axes. The horizontal axis lists the individual items being compared, sometimes called “leaves.” These might be genes, survey respondents, product samples, species, or any set of things you want to group. The vertical axis represents distance or dissimilarity. Every time two items (or two groups) merge, the merge point sits at a height on the vertical axis that corresponds to how different those items are from each other.

The merges look like upside-down U shapes. Each U connects two branches, and the height of that U tells you the distance between whatever is being joined. Short Us near the bottom of the diagram mean the items being merged are very similar. Tall Us near the top mean the groups being merged are quite different. This is the single most important thing to pay attention to: the vertical height of each connection, not the left-to-right arrangement of the leaves.

In a small dataset, each leaf in the dendrogram typically corresponds to a single data point. With larger datasets, plotting tools often truncate the tree for readability, so a single leaf may stand for a group of points that has already been merged; the leaf label usually indicates how many points it contains.
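The anatomy described above can be sketched in a few lines. This is a minimal example using SciPy (the library choice and sample data are assumptions, not from the article); the linkage matrix `Z` records every merge, and `dendrogram` turns it into the tree:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

rng = np.random.default_rng(0)
# Two loose groups of 2-D points: one near (0, 0), one near (8, 8)
data = np.vstack([rng.normal(0, 1, (5, 2)), rng.normal(8, 1, (5, 2))])

# Each row of Z records one merge: [cluster_a, cluster_b, height, new_size]
Z = linkage(data, method="average")

# no_plot=True returns the tree's drawing coordinates instead of plotting;
# call dendrogram(Z) inside a matplotlib figure to actually draw it
tree = dendrogram(Z, no_plot=True)
print(tree["leaves"])  # left-to-right leaf order (one entry per data point)
print(Z[:, 2])         # merge heights: one per upside-down U, low to high
```

The third column of `Z` is exactly the “height of each U” discussed above, and for the common linkage methods it is nondecreasing from the first merge to the last.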

Why Left-to-Right Order Can Be Misleading

One of the most common mistakes is assuming that two leaves sitting next to each other on the horizontal axis are the most similar items in the dataset. That’s not necessarily true. The horizontal position of leaves is somewhat arbitrary because at every merge point, the two branches can be flipped left or right without changing the meaning of the tree. Think of each internal node as a hinge: you can rotate it freely and the dendrogram remains equally valid.

The only reliable indicator of similarity is the height at which two items first connect through their shared branch. Two leaves on opposite ends of the horizontal axis might actually merge at a very low height (meaning they’re quite similar), while two leaves sitting side by side might not share a direct low-level connection at all. Always trace the branches upward and check the vertical height of the join.
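That “height of first connection” has a name: the cophenetic distance. As a sketch (assuming SciPy; the data are illustrative), you can compute it for every pair of leaves and read off exactly where any two items first share a branch:

```python
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, cophenet

rng = np.random.default_rng(6)
data = rng.normal(size=(6, 2))
Z = linkage(data, method="average")

# cophenet(Z) returns, for every pair of leaves, the height at which
# they first share a branch -- the reliable measure of tree similarity
merge_heights = squareform(cophenet(Z))
i, j = 0, 5
print(merge_heights[i, j])  # height where leaves 0 and 5 first connect
```

Comparing `merge_heights` for two side-by-side leaves versus two distant ones makes the point concrete: horizontal adjacency tells you nothing, merge height tells you everything.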

Choosing the Number of Clusters

To extract distinct groups from a dendrogram, you draw a horizontal line across it at some height. Every branch that the line crosses becomes its own cluster, and everything below each crossing is grouped together. The lower you cut, the more clusters you get. The higher you cut, the fewer.

The most practical approach is to look for the largest vertical gap between successive merges. If there’s a long stretch of height where no new merges happen, that’s a natural breaking point: the items below that gap are meaningfully more similar to each other than they are to anything above it. Drawing your horizontal line through that gap gives you a clustering that reflects a genuine structure in the data rather than an arbitrary cutoff.

For example, imagine a dendrogram where several items merge at heights between 1 and 3, then nothing merges again until height 8, and the final merge happens at 10. That big jump from 3 to 8 suggests a natural division. Cutting at height 5 or 6 would capture the tight, low-level groups as separate clusters before they get lumped together at the higher, less meaningful merge.
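The gap-finding recipe above can be automated. This sketch (assuming SciPy; the two-group sample data are an assumption) finds the largest jump between successive merge heights and cuts through the middle of it with `fcluster`:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
# Two tight, well-separated groups of five points each
data = np.vstack([rng.normal(0, 0.5, (5, 2)), rng.normal(10, 0.5, (5, 2))])
Z = linkage(data, method="complete")

# Merge heights, sorted low to high, are the third column of Z
heights = Z[:, 2]
gaps = np.diff(heights)
k = gaps.argmax()
cut = (heights[k] + heights[k + 1]) / 2  # horizontal line through the gap

labels = fcluster(Z, t=cut, criterion="distance")
print(labels)  # one cluster id per data point
```

With data this cleanly separated, the largest gap sits just below the final merge, so the cut recovers the two original groups.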

How Linkage Methods Change the Shape

The shape of a dendrogram depends heavily on how “distance between two clusters” is defined during the clustering process. This choice is called the linkage method, and different methods produce noticeably different trees from the same data.

  • Single linkage measures the distance between the two closest members of each cluster. This tends to produce long, chain-like dendrograms where items get added one at a time to an existing group. It’s good at detecting elongated or irregular cluster shapes but can merge groups that shouldn’t be merged if they happen to have one pair of nearby points.
  • Complete linkage measures the distance between the two farthest-apart members of each cluster. This produces more compact, evenly sized groups and avoids the chaining problem, but it can split natural clusters that have even a few outlying points.
  • Average linkage takes the average of all pairwise distances between members of the two clusters. It’s a compromise between single and complete linkage, producing moderately compact groups.
  • Ward’s method works differently from the others. Instead of measuring distances directly, it merges whichever two clusters produce the smallest increase in overall variance. This tends to create the most evenly sized, spherical clusters and is one of the most widely used methods in practice.

If you’re comparing dendrograms or trying to reproduce someone else’s analysis, knowing which linkage method was used is essential. The same dataset can look dramatically different depending on this choice.
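To see the effect directly, you can run the same data through each linkage method and compare the resulting trees. A minimal sketch, assuming SciPy and an arbitrary random dataset:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(2)
data = rng.normal(size=(12, 3))

# Same data, four linkage methods: the merge heights (and therefore
# the tree's shape) differ, even though the inputs are identical
results = {}
for method in ("single", "complete", "average", "ward"):
    Z = linkage(data, method=method)
    results[method] = Z
    print(method, round(Z[-1, 2], 3))  # height of the final merge
```

Single linkage typically produces the lowest merge heights (it uses nearest members) and complete linkage the highest (farthest members), which is why the same dataset can look so different across methods.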

How Distance Metrics Matter

Before clusters are even formed, you need a way to measure how far apart individual data points are. The two most common options are Euclidean distance (straight-line distance between points) and Manhattan distance (the sum of absolute differences along each dimension, like navigating a city grid). The choice of metric determines which points count as close neighbors, and that in turn shapes the clusters themselves. Two data points that appear close under one metric can appear far apart under another, so the dendrogram’s structure shifts accordingly.

Euclidean distance works well when your variables are measured on similar scales and you care about overall magnitude of difference. Manhattan distance is more robust when your data has outliers, since it doesn’t square the differences. For specialized applications like comparing gene expression profiles or text documents, other metrics like correlation-based distances are common. The key point for interpretation is that the “height” axis on your dendrogram is always measured in whatever distance metric was chosen, so the absolute numbers only make sense in that context.
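A quick sketch of both points, assuming SciPy: the same pair of data points gets a different distance under each metric, and a precomputed distance matrix can be fed straight into the clustering, so the dendrogram’s height axis inherits whichever units you chose:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage

# The same two points, measured two ways
pts = np.array([[0.0, 0.0], [3.0, 4.0]])
e = pdist(pts, metric="euclidean")  # sqrt(3^2 + 4^2) -> [5.0]
m = pdist(pts, metric="cityblock")  # |3| + |4|       -> [7.0]
print(e, m)

# Cluster on a precomputed condensed distance matrix: the merge
# heights in Z are now in Manhattan units, not Euclidean ones
rng = np.random.default_rng(3)
data = rng.normal(size=(8, 4))
Z = linkage(pdist(data, metric="cityblock"), method="average")
```

Note that SciPy’s name for Manhattan distance is `cityblock`; the random `data` array is an illustrative assumption.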

Checking Whether the Dendrogram Is Reliable

Not every dendrogram faithfully represents the actual structure in your data. The cophenetic correlation coefficient is a standard way to check this. It compares the original distances between all pairs of data points to the distances implied by the dendrogram (the height at which each pair of points first merges). A cophenetic correlation close to 1.0 means the dendrogram is a faithful representation of the underlying distances. Lower values mean the tree is distorting the relationships, and you might want to try a different linkage method or distance metric.

There’s no universal cutoff for “good enough,” but values above 0.7 or 0.8 are generally considered acceptable. If your cophenetic correlation is low, the visual groupings in the dendrogram may not reflect real patterns, and conclusions drawn from it should be treated cautiously.
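Computing the check is straightforward. A sketch assuming SciPy, with well-separated sample data chosen (as an assumption) so the tree should score highly:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, cophenet

rng = np.random.default_rng(4)
# Two well-separated groups: the tree should preserve distances faithfully
data = np.vstack([rng.normal(0, 1, (10, 2)), rng.normal(12, 1, (10, 2))])

d = pdist(data)                     # original pairwise distances
Z = linkage(d, method="average")

# Passing the original distances returns (correlation, cophenetic distances)
c, coph_d = cophenet(Z, d)
print(round(c, 3))
```

If `c` came back low, the practical next step suggested above is to re-run `linkage` with a different method or `pdist` with a different metric and compare.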

Dendrograms in Biology vs. Data Science

If you’ve seen dendrograms in two very different contexts and wondered whether they mean the same thing, the answer is: mostly, but not entirely. In biology, dendrograms often appear as phylogenetic trees, where the leaves represent species or genes and the branching pattern represents evolutionary relationships. The branch lengths reflect estimated evolutionary time or genetic change. In data science and statistics, dendrograms represent hierarchical clustering, where the leaves are data points (customers, samples, documents) and the branch heights represent statistical distance.

Both share the same visual structure: known items at the leaves, a branching pattern showing relationships, and branch lengths encoding some measure of difference. The interpretive logic is the same: shorter connections mean greater similarity. The difference is mainly in what the distances represent and how they’re validated. Phylogenetic trees are often assessed using bootstrap resampling, where the analysis is repeated many times with slightly varied data, and the percentage of times a given branch appears is recorded as a confidence measure. Hierarchical clustering trees can be validated the same way, checking whether the same groupings appear consistently when the data is resampled.
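One way to sketch that resampling idea for a clustering tree (assuming SciPy; the helper function, sample data, and agreement measure are all illustrative assumptions, not a standard named procedure): recluster bootstrap resamples and check how often pairs of points keep the same co-membership they had in the original clustering.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(5)
data = np.vstack([rng.normal(0, 1, (15, 2)), rng.normal(10, 1, (15, 2))])

def two_cluster_labels(x):
    # Cut the Ward tree into exactly two clusters
    Z = linkage(x, method="ward")
    return fcluster(Z, t=2, criterion="maxclust")

base = two_cluster_labels(data)

agreements = []
for _ in range(50):
    # Resample rows with replacement and recluster
    idx = rng.choice(len(data), size=len(data), replace=True)
    boot = two_cluster_labels(data[idx])
    # Do pairs of resampled points share a cluster in both solutions?
    same_base = base[idx][:, None] == base[idx][None, :]
    same_boot = boot[:, None] == boot[None, :]
    agreements.append((same_base == same_boot).mean())

print(round(float(np.mean(agreements)), 3))
```

For cleanly separated data like this, the agreement rate sits near 1.0; unstable groupings would drop it noticeably, the same signal a low bootstrap percentage gives on a phylogenetic branch.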

Circular and Radial Layouts

When a dendrogram has hundreds or thousands of leaves, the standard rectangular layout becomes hard to read because the labels crowd together. A radial or circular layout places the root at the center and fans the leaves outward along the outer ring. This produces more uniform spacing between leaf nodes and makes better use of available space. The interpretation is identical to a standard dendrogram: connections closer to the outer ring (farther from the center) represent low-distance, high-similarity merges, while connections near the center represent high-distance merges. If you encounter a circular dendrogram, just remember that “height” now runs from the outside inward rather than from bottom to top.