What Is a Centroid? From Triangles to Data Science

A centroid is the exact center point of a shape, calculated as the average position of all the points within it. Think of it as the spot where you could balance the shape perfectly on the tip of a pencil. While the concept comes from geometry, centroids show up everywhere: in data science algorithms, structural engineering, population mapping, and computer vision.

The Core Idea

The centroid of any shape is the point whose coordinates equal the arithmetic average of the coordinates of every point in that shape. For a flat, uniform cutout of cardboard, the centroid coincides with the center of mass: the point where the cutout balances because gravity’s turning effects on every side cancel out. If the material has the same density throughout, the centroid depends only on geometry.

For a set of individual points (say, a few dots on a graph), the centroid is simply the average of all the x-coordinates paired with the average of all the y-coordinates. If your points are (2, 4), (6, 8), and (4, 3), the centroid sits at ((2 + 6 + 4)/3, (4 + 8 + 3)/3) = (4, 5). The same logic scales to three dimensions or higher.
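The averaging rule for discrete points can be sketched in a few lines of Python. The function name centroid is chosen here for illustration; it simply takes the mean along each coordinate axis, so it works for two, three, or more dimensions.

```python
def centroid(points):
    """Return the coordinate-wise mean of a non-empty list of equal-length tuples."""
    if not points:
        raise ValueError("need at least one point")
    n = len(points)
    # zip(*points) groups all first coordinates together, all second ones, and so on.
    return tuple(sum(axis) / n for axis in zip(*points))


print(centroid([(2, 4), (6, 8), (4, 3)]))  # (4.0, 5.0)
```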

When shapes have continuous boundaries rather than discrete points, calculus replaces simple averaging. The centroid’s x-coordinate equals the integral of x over the entire area, divided by the total area, and likewise for y. But the underlying principle never changes: you’re finding the average location.
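In symbols, the integral form reads as follows. The notation is an assumption made here for illustration: R is the region of uniform density, A is its area, and (x̄, ȳ) is the centroid.

```latex
\bar{x} = \frac{1}{A}\iint_R x \, dA, \qquad
\bar{y} = \frac{1}{A}\iint_R y \, dA, \qquad
A = \iint_R dA
```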

Centroid of a Triangle

Triangles have the cleanest centroid rule in geometry. If you know the three vertices, the centroid is just the average of those three coordinate pairs. For a triangle with corners at (x₁, y₁), (x₂, y₂), and (x₃, y₃), the centroid lands at ((x₁ + x₂ + x₃)/3, (y₁ + y₂ + y₃)/3).

There’s a visual way to find it too. A median is a line drawn from one corner of a triangle to the midpoint of the opposite side. Every triangle has three medians, and they always intersect at a single point: the centroid. That intersection sits exactly 2/3 of the way from each vertex toward the opposite midpoint. This 2/3 ratio holds for every triangle, regardless of its shape.
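Both descriptions of the triangle centroid can be checked against each other. The sketch below, with hypothetical helper names and an arbitrary example triangle, computes the vertex average and then walks 2/3 of the way along each median to confirm that all three land on the same point.

```python
import math


def triangle_centroid(a, b, c):
    """Average of the three vertex coordinate pairs."""
    return ((a[0] + b[0] + c[0]) / 3, (a[1] + b[1] + c[1]) / 3)


def point_along(p, q, t):
    """Point a fraction t of the way from p to q."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))


a, b, c = (0, 0), (6, 0), (0, 9)
g = triangle_centroid(a, b, c)
print(g)  # (2.0, 3.0)

# Each median runs from a vertex to the midpoint of the opposite side.
for vertex, (p, q) in [(a, (b, c)), (b, (a, c)), (c, (a, b))]:
    midpoint = point_along(p, q, 0.5)
    on_median = point_along(vertex, midpoint, 2 / 3)
    assert math.isclose(on_median[0], g[0]) and math.isclose(on_median[1], g[1])
```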

Other Polygons and Irregular Shapes

For simple polygons with straight edges (rectangles, pentagons, irregular quadrilaterals), you can find the centroid using only the coordinates of the vertices, no calculus needed. When the shape has uniform density, the formulas reduce to sums involving adjacent vertex pairs and the polygon’s total area. Most geometry software and CAD tools handle this automatically.
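One common form of that vertex-based summation is the so-called shoelace formula. The sketch below assumes a simple (non-self-intersecting) polygon of uniform density whose vertices are listed in order around the boundary; the function name polygon_centroid is hypothetical.

```python
def polygon_centroid(vertices):
    """Centroid of a simple polygon given as an ordered list of (x, y) vertices."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    # The standard formula divides by 6 * area, which equals 3 * area2.
    return (cx / (3 * area2), cy / (3 * area2))


print(polygon_centroid([(0, 0), (4, 0), (4, 2), (0, 2)]))  # (2.0, 1.0)

# An L-shaped outline: the result lands near (1.357, 1.357), in the notch
# outside the polygon itself.
print(polygon_centroid([(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]))
```

Because the signed area appears in both the numerator and the denominator, the result is the same whether the vertices run clockwise or counterclockwise.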

For shapes with curved boundaries, or for regions where density isn’t uniform, you return to the integral form, with density acting as a weight inside the integral in the non-uniform case. A heavier region pulls the balance point toward itself, just as a heavy end of a seesaw pulls the balance point in its direction. Strictly speaking, that weighted point is the center of mass, and it may no longer coincide with the purely geometric centroid.

One detail worth noting: the centroid of a shape doesn’t have to be inside the shape. A donut, a crescent, or an L-shaped polygon can have a centroid that falls in empty space.

Why Engineers Care About Centroids

In structural engineering, the centroid of a beam’s cross-section locates its neutral axis, the imaginary line through the section where material neither stretches nor compresses during bending. (For a beam made of one uniform, elastic material, the neutral axis passes through the centroid.) In a beam that sags under a downward load, fibers above the neutral axis shorten while fibers below it elongate; the pattern reverses where the beam bends the other way, as in a cantilever. Getting this location wrong means miscalculating how a beam deflects or where stress concentrates, which directly affects whether a structure is safe and comfortable to use.

Excessive deflection in beams and floors causes visible sagging and occupant discomfort, so building codes set strict limits. Those limits are checked using formulas that depend on the centroid and the moment of inertia (a measure of how the cross-section’s area is distributed around the centroid). Engineers reference tables of pre-calculated centroids for standard shapes like I-beams, channels, and angles, then combine them for complex cross-sections.
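Combining standard shapes into a complex cross-section amounts to an area-weighted average of the individual centroids. The sketch below assumes a hypothetical T-section built from two rectangles with made-up dimensions in millimetres, measures heights from the bottom of the section, and uses the parallel-axis theorem to get the moment of inertia about the combined centroid. The function names are illustrative and not taken from any engineering library.

```python
def composite_centroid_y(parts):
    """parts: list of (width, height, y_bottom) rectangles.

    Returns the height of the combined centroid as an area-weighted average.
    """
    total_area = sum(w * h for w, h, _ in parts)
    first_moment = sum(w * h * (y0 + h / 2) for w, h, y0 in parts)
    return first_moment / total_area


def composite_inertia(parts):
    """Second moment of area about the horizontal axis through the centroid."""
    y_bar = composite_centroid_y(parts)
    total = 0.0
    for w, h, y0 in parts:
        area = w * h
        d = (y0 + h / 2) - y_bar  # offset of this part's own centroid
        total += w * h**3 / 12 + area * d**2  # parallel-axis theorem
    return total


# Hypothetical T-section: a 20 x 80 web with a 100 x 20 flange on top.
t_section = [(20, 80, 0), (100, 20, 80)]
print(f"{composite_centroid_y(t_section):.2f}")  # 67.78
print(f"{composite_inertia(t_section):.0f}")
```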

Centroids in Data Science and Clustering

If you’ve encountered the term centroid in a machine learning context, it almost certainly involves K-means clustering. K-means is one of the most widely used algorithms for grouping data points into clusters, and centroids are its central mechanism.

The algorithm works by placing k centroids (where k is the number of groups you want) and then repeating two steps. First, every data point gets assigned to whichever centroid is closest. Second, each centroid moves to the mean position of all the points assigned to it. These two steps alternate until the centroids stop moving significantly. At that point, each centroid represents the “average member” of its cluster.
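The two alternating steps can be sketched in plain Python. This is a minimal illustration rather than a production implementation: it assumes Euclidean distance, seeds the centroids with the first k points (real libraries typically use random or smarter seeding), and the names kmeans, max_iter, and tol are chosen here.

```python
import math


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def kmeans(points, k, max_iter=100, tol=1e-9):
    # Assumption: use the first k points as starting centroids.
    centroids = [tuple(float(v) for v in p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Step 1: assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 2: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            mean_point(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:  # centroids have stopped moving
            break
    return centroids, clusters


data = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, groups = kmeans(data, k=2)
print(centers)
```

With this toy data the centroids settle near (1.33, 1.33) and (8.33, 8.33), the means of the two visible groups. Different starting centroids can lead to different final clusters, which is why practical implementations usually run several initializations.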

A data point belongs to whatever cluster has the nearest centroid. This makes centroids function as representatives or prototypes: if you want to summarize what a typical member of a group looks like, the centroid gives you that summary in a single set of coordinates. The coordinates don’t have to be spatial. They can be features like age, income, and purchase frequency, making centroids useful for customer segmentation, image compression, and document categorization.

Population Centroids and Geography

The U.S. Census Bureau calculates a “center of population” for the entire country and for each state after every census. The concept is essentially a weighted centroid: imagine a flat, weightless map of the United States, and place one identical weight at the location of every person. The point where that map balances is the population centroid.

This isn’t the geographic center of the country. It’s the average location of all residents, so it shifts over decades as population moves. Tracking it reveals migration trends at a glance. The Census Bureau publishes latitude and longitude coordinates for each state’s population centroid alongside its census data, giving demographers and planners a compact summary of where people actually live.
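The balancing-map idea is a weighted average of locations. The sketch below follows the flat-map picture described above, using planar coordinates and made-up town populations; the Census Bureau’s published method handles details such as the Earth’s curvature that this illustration ignores, and the function name weighted_centroid is hypothetical.

```python
def weighted_centroid(locations):
    """locations: list of (x, y, weight) tuples; returns the weighted mean position."""
    total = sum(w for _, _, w in locations)
    if total <= 0:
        raise ValueError("total weight must be positive")
    x = sum(px * w for px, _, w in locations) / total
    y = sum(py * w for _, py, w in locations) / total
    return (x, y)


# Hypothetical towns on a flat map: (x, y, population).
towns = [(0, 0, 1000), (10, 0, 3000), (0, 10, 1000)]
print(weighted_centroid(towns))  # (6.0, 2.0)
```

The unweighted average of the three town locations would be about (3.33, 3.33); the larger town pulls the population centroid toward itself.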

Computer Vision and Object Tracking

When software needs to track a moving object in video, whether it’s a ball in a sports broadcast, a cell under a microscope, or particles in a fluid, it often starts by finding each object’s centroid in every frame. The centroid gives a single reference point for position, and comparing that point across frames yields speed and trajectory.

NASA developed an algorithm for stereo imaging velocimetry that identifies particle edges in an image, draws a bounding box around each particle, and then calculates the centroid using an intensity-weighted center of mass. Pixels that are brighter (indicating the core of the particle) contribute more to the centroid position than dimmer edge pixels. This approach yields sub-pixel accuracy, meaning the calculated centroid position is more precise than the individual pixels in the image, which is critical for accurate velocity measurements.
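The intensity-weighted step can be illustrated on a tiny pixel patch. This is only a sketch of the weighted-average idea, not NASA’s actual code: edge detection and bounding-box selection are omitted, the patch values are invented, and the function name intensity_centroid is hypothetical.

```python
def intensity_centroid(patch):
    """patch: 2D list of non-negative pixel intensities.

    Returns (row, col) of the intensity-weighted center, with sub-pixel precision.
    """
    total = sum(sum(row) for row in patch)
    if total == 0:
        raise ValueError("patch has no signal")
    row_pos = sum(i * v for i, row in enumerate(patch) for v in row) / total
    col_pos = sum(j * v for row in patch for j, v in enumerate(row)) / total
    return (row_pos, col_pos)


# Hypothetical 3 x 3 patch: bright core in the middle, brighter on the right side.
patch = [
    [0, 1, 0],
    [1, 8, 3],
    [0, 1, 0],
]
r, c = intensity_centroid(patch)
print(f"{r:.3f}, {c:.3f}")  # 1.000, 1.143
```

The column estimate of about 1.143 falls between pixel centers, which is what sub-pixel accuracy means in practice.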

The same principle applies in biology. Researchers studying cell movement track the centroid of each cell over time to quantify migration speed and direction. In one study of mouse embryonic fibroblasts, scientists measured the position of internal cell structures relative to the cell’s centroid to understand how cells organize themselves on different surfaces. The centroid served as a consistent, shape-independent reference point regardless of whether the cells were round, triangular, or irregularly shaped.

Quick Reference for Common Shapes

Rectangle: intersection of the diagonals (the obvious center)
Triangle: average of the three vertex coordinates, located at the intersection of the medians
Circle: the geometric center
Semicircle: on the axis of symmetry, 4r/(3π) ≈ 0.424r from the flat side, where r is the radius (closer to the flat edge than to the top of the arc)
Irregular polygon: calculated from vertex coordinates using a summation formula involving adjacent vertex pairs
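The semicircle entry can be checked numerically. The sketch below slices the half-disc into thin horizontal strips, an approximation of the integral form, and compares the result with 4r/(3π); the function name and the number of strips are arbitrary choices made for illustration.

```python
import math


def semicircle_centroid_height(r, strips=100_000):
    """Approximate the centroid's distance from the flat side of a half-disc."""
    dy = r / strips
    area = 0.0
    moment = 0.0
    for i in range(strips):
        y = (i + 0.5) * dy  # mid-height of this strip
        width = 2 * math.sqrt(r * r - y * y)  # chord length at height y
        area += width * dy
        moment += y * width * dy
    return moment / area


r = 2.0
numeric = semicircle_centroid_height(r)
exact = 4 * r / (3 * math.pi)
print(numeric, exact)
```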

For symmetric shapes, the centroid always lies on every axis of symmetry. A shape with two axes of symmetry (like a rectangle or ellipse) has its centroid right where those axes cross. For shapes with only one axis of symmetry, you know the centroid sits somewhere along that line but still need to calculate exactly where.