What Is Z-Score Normalization? Definition and Formula

Z-score normalization is a method of rescaling data so that every value is expressed as a distance from the average, measured in standard deviations. The formula is simple: subtract the mean from your data point, then divide by the standard deviation. The result tells you how far above or below the average that value sits, using a universal unit that works the same way regardless of what you originally measured.

The Formula and How to Calculate It

The z-score formula is: z = (x − μ) / σ, where x is your data point, μ is the mean of your dataset, and σ is the standard deviation. You follow three steps: identify your value, the mean, and the standard deviation; subtract the mean from your value; then divide by the standard deviation.

Say a class has a mean test score of 85 with a standard deviation of 2, and a student scored 86. The z-score is (86 − 85) / 2 = 0.5. That student scored half a standard deviation above the class average. Now imagine a different class with a mean of 82 and a standard deviation of 4, where a student scored 74. The z-score is (74 − 82) / 4 = −2. That student fell two full standard deviations below the average, a much more significant gap.
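The two worked examples above can be checked in a few lines of Python. The helper name `z_score` is just an illustrative label for the formula:

```python
def z_score(x, mean, std):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / std

# First class: mean 85, standard deviation 2, student scored 86
print(z_score(86, 85, 2))   # 0.5

# Second class: mean 82, standard deviation 4, student scored 74
print(z_score(74, 82, 4))   # -2.0
```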

What the Number Actually Tells You

A z-score of zero means the value is exactly at the mean. Positive z-scores sit above the mean, negative z-scores sit below it. The number itself tells you the distance in standard deviations: a z-score of 1.5 means 1.5 standard deviations above average, while −0.7 means seven-tenths of a standard deviation below.

This matters because raw numbers are often meaningless on their own. Knowing someone scored 74 on a test doesn’t tell you much. Knowing their z-score was −2 tells you they performed far below average relative to the group’s typical spread. The z-score converts any measurement, whether it’s test grades, heights, or stock prices, into the same standardized scale, making comparisons possible across completely different datasets.

If a patient has a z-score of 1 on a verbal memory test and −0.3 on a processing speed test, you can directly compare those results even though the raw scores use different scales. That kind of cross-comparison is impossible with raw numbers alone.

When the Data Isn’t Normally Distributed

You can calculate a z-score for any dataset, but the interpretation changes depending on the shape of the distribution. When data follows a bell curve (normal distribution), z-scores map neatly onto percentages: about 68% of values fall between −1 and +1, about 95% fall between −2 and +2, and roughly 99.7% fall between −3 and +3. Those percentage benchmarks only hold true when the distribution is approximately normal.

With skewed or irregularly shaped data, the z-score still tells you how many standard deviations a point is from the mean, but you can’t reliably convert that into a percentile. A z-score of −2 in a heavily skewed dataset doesn’t necessarily mean the value sits at the 2.3rd percentile the way it would with a bell curve.
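The contrast can be demonstrated with simulated data. This sketch (using NumPy; `frac_within` is an illustrative helper) draws one bell-shaped sample and one heavily right-skewed sample, then measures what fraction of each actually falls within ±1 and ±2 standard deviations:

```python
import numpy as np

rng = np.random.default_rng(0)

def frac_within(data, k):
    """Fraction of values whose z-score falls in [-k, +k]."""
    z = (data - data.mean()) / data.std()
    return np.mean(np.abs(z) <= k)

normal = rng.normal(loc=50, scale=10, size=100_000)
skewed = rng.exponential(scale=10, size=100_000)  # heavily right-skewed

# Close to the 68% and 95% benchmarks for the bell curve
print(frac_within(normal, 1), frac_within(normal, 2))

# Noticeably off for the skewed data: the same z-score range
# covers a different share of the values
print(frac_within(skewed, 1), frac_within(skewed, 2))
```

For the normal sample the first line prints values near 0.68 and 0.95; for the skewed sample, roughly 86% of values land within ±1 standard deviation, which is exactly why z-scores from non-normal data can't be converted into percentiles by the usual table.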

How Outliers Affect the Results

Z-score normalization relies on the mean and standard deviation, and both of those statistics are sensitive to extreme values. A single extreme outlier can inflate the standard deviation, which compresses everyone else’s z-scores toward zero and makes real differences look smaller than they are. It can also drag the mean away from the center of the data, shifting every z-score in one direction.

That said, the mean and standard deviation are somewhat more resilient to outliers than the minimum and maximum used in min-max scaling, where a single extreme point sets the boundary of the scale outright. Some analysts address this by trimming or “winsorizing” extreme values before normalizing, which caps outliers at a set threshold rather than letting them distort the entire scale.
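The compression effect is easy to see with a small example. In this sketch (the helper name `z_scores` is illustrative), adding one extreme value to a set of test scores inflates the standard deviation and pulls every other z-score toward zero:

```python
import statistics

scores = [70, 72, 75, 78, 80]

def z_scores(values):
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]

# Without the outlier, the lowest score sits well below the mean
print(z_scores(scores))

# Add one extreme outlier: the mean shifts upward, the standard
# deviation balloons, and everyone else's z-scores shrink toward zero
print(z_scores(scores + [200]))
```

In the first printout the score of 70 lands around −1.4; after the outlier is added, the same score's z-score shrinks to roughly −0.6, making a genuinely low result look close to average.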

Z-Scores in Machine Learning

In data science and machine learning, z-score normalization is one of the most common forms of feature scaling. The problem it solves is straightforward: if one feature in your data ranges from 0 to 1 and another ranges from 0 to 100,000, many algorithms will treat the larger-ranged feature as more important simply because its numbers are bigger.

Z-score normalization rescales every feature to have a mean of 0 and a standard deviation of 1, putting them on equal footing. This is especially important for algorithms that calculate distances between data points, such as k-nearest neighbors and k-means clustering. Support vector machines and neural networks also benefit because normalized inputs help them converge faster during training. Without normalization, these models can produce poor or unstable results.
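A minimal per-column sketch in NumPy shows the transformation; the sample matrix and its values are invented for illustration, and scikit-learn's `StandardScaler` performs the same computation:

```python
import numpy as np

# Two features on wildly different scales: a 0-1 ratio and a dollar amount
X = np.array([
    [0.2,  45_000.0],
    [0.5,  82_000.0],
    [0.9, 120_000.0],
    [0.4,  61_000.0],
])

# Z-score normalization, applied column by column
mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_scaled = (X - mu) / sigma

print(X_scaled.mean(axis=0))  # ~[0, 0]
print(X_scaled.std(axis=0))   # [1, 1]
```

After scaling, both columns contribute comparably to any distance calculation, regardless of their original units. Note that in a real pipeline the mean and standard deviation should be computed on the training data only, then reused to transform the test data.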

Z-Scores in Medicine

Pediatricians use z-scores to track whether a child’s height and weight are developing normally compared to global standards. The World Health Organization growth charts plot measurements against age and sex norms, with cutoff values at +2 and −2 standard deviations marking the boundaries of typical growth. A child whose weight-for-length falls below −2 (roughly the 2nd percentile) is classified as having low weight-for-length. A child above +2 (roughly the 98th percentile) is classified as having high weight-for-length. Short stature is defined as a length-for-age z-score below −2.

Bone density testing is another common medical application. If you’re a premenopausal woman or a man younger than 50, your bone density scan result is reported as a z-score, comparing your bone mineral density to the average for healthy people of your age, ethnicity, and sex. A z-score of −2.0 or lower signals low bone density that may be caused by medications or other conditions. (People over 50 or postmenopausal women receive a T-score instead, which compares their density to a young adult reference point rather than their age group.)

Z-Scores in Psychological Testing

Standardized psychological and cognitive tests rely heavily on z-scores to make raw results interpretable. A raw score on a memory test has no inherent meaning until you compare it to the broader population. Converting it to a z-score instantly reveals how the individual performed relative to the group average.

IQ scores are a familiar example of this principle in action. IQ tests are standardized with a mean of 100 and a standard deviation of 15, which means an IQ of 115 corresponds to a z-score of +1 (one standard deviation above average), and an IQ of 70 corresponds to a z-score of −2. Other psychological tests use variations of the same idea: stanine scores range from 1 to 9 with a mean of 5 and a standard deviation of 2, while sten scores range from 1 to 10 with a mean of 5.5 and a standard deviation of 2. All of these are just z-scores repackaged into friendlier scales.
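Because all of these scales are linear repackagings of the z-score, converting between them takes one multiplication and one addition. The helper names `to_z` and `from_z` below are illustrative:

```python
def to_z(score, mean, std):
    """Convert a standardized test score back to a z-score."""
    return (score - mean) / std

def from_z(z, mean, std):
    """Express a z-score on another standardized scale."""
    return mean + z * std

# IQ scale: mean 100, standard deviation 15
print(to_z(115, 100, 15))   # 1.0
print(to_z(70, 100, 15))    # -2.0

# The same z-score of +1 expressed as a stanine (mean 5, sd 2)...
print(from_z(1.0, 5, 2))    # 7.0
# ...and as a sten score (mean 5.5, sd 2)
print(from_z(1.0, 5.5, 2))  # 7.5
```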