RMSE, or root mean square error, is a number that tells you how far off a model’s predictions are from the actual values, on average. If a model predicts home prices in dollars, the RMSE is also in dollars, giving you an intuitive sense of how big the typical error is. It’s one of the most widely used metrics for evaluating prediction accuracy in statistics, data science, and machine learning.
How RMSE Is Calculated
The calculation follows four steps, each building on the last:
- Find the errors (residuals). For each data point, subtract the predicted value from the actual observed value. If a home sold for $250,000 and the model predicted $245,000, the error is $5,000.
- Square each error. This eliminates negative signs so that overestimates and underestimates don’t cancel each other out. It also amplifies larger errors, which becomes important later.
- Take the mean. Add up all the squared errors and divide by the number of data points. This gives you the mean squared error (MSE).
- Take the square root. This final step converts MSE back into the original units of the data. The result is RMSE.
In formula terms: square the difference between each observed value and its prediction, average those squared differences across all data points, then take the square root. The whole point of that last step is to undo the squaring so the result is something you can actually interpret. An MSE of 25,000,000 (squared dollars) is hard to think about. An RMSE of $5,000 makes immediate sense.
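The four steps above can be sketched in a few lines of Python. The home prices here are hypothetical, chosen only to illustrate the arithmetic:

```python
import math

# Hypothetical actual and predicted home prices in dollars.
actual = [250_000, 180_000, 310_000, 95_000]
predicted = [245_000, 188_000, 299_000, 101_000]

# Step 1: errors (residuals) = actual value minus predicted value.
errors = [a - p for a, p in zip(actual, predicted)]

# Step 2: square each error (removes signs, amplifies big misses).
squared = [e ** 2 for e in errors]

# Step 3: mean of the squared errors (MSE, in squared dollars).
mse = sum(squared) / len(squared)

# Step 4: square root converts back to dollars (RMSE).
rmse = math.sqrt(mse)
print(round(rmse, 2))
```

Note how `mse` comes out in the tens of millions (squared dollars) while `rmse` lands back on the same scale as the prices themselves.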
What RMSE Actually Tells You
RMSE represents the typical size of your model’s prediction errors, expressed in the same units as whatever you’re predicting. If you’re forecasting temperature and the RMSE is 2.3°F, the model’s predictions are off by roughly 2.3 degrees on a typical day. If you’re predicting revenue and the RMSE is $12,000, that’s the ballpark of how wrong the model tends to be.
This unit-matching property is what makes RMSE more practical than MSE for communicating results. MSE uses squared units (dollars squared, degrees squared), which have no real-world meaning. RMSE puts the error back on the same scale as your data, so you can compare it directly to the values you care about.
What Counts as a “Good” RMSE
There is no universal threshold for a good RMSE. The same value can be excellent in one context and terrible in another. Consider an RMSE of $500. For a model predicting home prices that range from $70,000 to $300,000, a $500 error is remarkably small. For a model predicting monthly spending that ranges from $1,500 to $4,000, a $500 error is huge.
One way to put RMSE in context is to normalize it by dividing by the range of your data (the maximum value minus the minimum). This typically produces a number between 0 and 1, where values closer to 0 indicate better fit. In the home price example, the normalized RMSE would be 500 / 230,000 ≈ 0.002. In the monthly spending example, it would be 500 / 2,500 = 0.2. The first model is clearly performing far better relative to the scale of its data.
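The normalization can be wrapped in a small helper. The function name and the min/max values below are just placeholders taken from the two examples:

```python
def normalized_rmse(rmse, values):
    """Scale RMSE by the data range (max - min)."""
    return rmse / (max(values) - min(values))

# Home prices: range $70,000 to $300,000, RMSE $500.
home_prices = [70_000, 300_000]
print(round(normalized_rmse(500, home_prices), 4))  # → 0.0022

# Monthly spending: range $1,500 to $4,000, RMSE $500.
spending = [1_500, 4_000]
print(round(normalized_rmse(500, spending), 4))     # → 0.2
```

The same $500 error lands two orders of magnitude apart once it is scaled to each dataset's range.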
You can also compare RMSE values across different models trained on the same dataset. Lower is always better when the target variable and data are the same.
Why RMSE Is Sensitive to Outliers
The squaring step gives disproportionate weight to large errors. If most of your predictions are off by 2 or 3 units but one prediction is off by 50, that single outlier gets squared to 2,500 before being averaged in. This pulls the RMSE up significantly, even if the model performs well on the vast majority of data points.
This sensitivity is a feature when large errors are genuinely costly. If you’re predicting the structural load on a bridge, you want a metric that punishes big misses harshly. But it’s a drawback when your data contains noise or anomalies that don’t reflect real model failures. A few corrupt data points can inflate RMSE and make a good model look bad.
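The outlier effect is easy to demonstrate with made-up residuals, most of them 2 or 3 units and one a 50-unit miss:

```python
import math

def rmse(errors):
    """RMSE computed directly from a list of residuals."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

# Hypothetical residuals: ten small errors, then one big miss.
typical = [2, -3, 2, -2, 3, 2, -3, 2, -2, 3]
with_outlier = typical + [50]

print(round(rmse(typical), 2))       # small, steady errors
print(round(rmse(with_outlier), 2))  # one outlier dominates
```

Adding a single 50-unit miss to ten small residuals multiplies the RMSE several times over, even though ten of the eleven predictions are unchanged.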
RMSE vs. MAE
Mean absolute error (MAE) is the most common alternative to RMSE. Instead of squaring the errors, MAE simply takes the absolute value of each error and averages them. This makes MAE less sensitive to outliers, since a big error doesn’t get amplified the way squaring does.
Neither metric is inherently better. RMSE is the theoretically optimal choice when errors follow a normal, bell-curve distribution. MAE performs better when the data contains outliers or when errors follow a heavier-tailed distribution. In practice, real-world data almost always includes some outliers, which is why MAE is often called a “robust” alternative. The statistician Ronald Fisher showed that squared-error methods are more efficient for perfectly normal data, while his contemporary Arthur Eddington argued that absolute-error methods often worked better in practice, precisely because real observations include anomalies.
Both metrics are measured in the same units as the target variable, so both are easy to interpret. If you’re comparing models and care most about avoiding occasional large errors, RMSE is the better choice. If you want a metric that reflects typical performance without being skewed by a few extreme cases, MAE is more informative.
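Computing both metrics on the same residuals makes the difference concrete. The residual list here is invented, with one extreme miss included deliberately:

```python
import math

def rmse(errors):
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

def mae(errors):
    """Mean absolute error: average of |error|, no squaring."""
    return sum(abs(e) for e in errors) / len(errors)

# Hypothetical residuals: four small errors and one 30-unit miss.
errors = [1, -2, 2, -1, 30]

print(round(mae(errors), 2))   # → 7.2
print(round(rmse(errors), 2))  # larger, pulled up by the 30
```

MAE treats the 30-unit miss as just one more term in the average; RMSE squares it to 900 first, so the gap between the two metrics widens as outliers grow.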
Where RMSE Is Commonly Used
RMSE shows up wherever someone is building a model that predicts a continuous number rather than a category. Regression models in machine learning are routinely evaluated with RMSE, whether they’re predicting stock prices, energy consumption, crop yields, or patient outcomes. Weather forecasting models use RMSE to quantify how far off temperature or rainfall predictions are from what actually happened. In geospatial work, RMSE measures how accurately a digital map aligns with real-world coordinates.
Deep learning models typically optimize MSE rather than RMSE during training, since minimizing MSE also minimizes RMSE: the square root is monotonic, so it preserves the ranking of models. But when reporting results to humans, RMSE is preferred because the units are interpretable. You’ll often see MSE used internally and RMSE reported externally for the same model.
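The ranking-preservation point can be checked in two lines. The MSE values below are arbitrary stand-ins for two models evaluated on the same data:

```python
import math

# Hypothetical MSE values for two models on the same validation set.
mse_a, mse_b = 40.0, 90.0

# The square root is monotonic, so the lower-MSE model
# is also the lower-RMSE model: the ranking is preserved.
rmse_a, rmse_b = math.sqrt(mse_a), math.sqrt(mse_b)
assert (mse_a < mse_b) == (rmse_a < rmse_b)

# Report RMSE externally: it is in the target's own units.
print(round(rmse_a, 2), round(rmse_b, 2))
```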
Limitations to Keep in Mind
RMSE tells you the size of the error but nothing about its direction. An RMSE of $5,000 doesn’t reveal whether the model consistently overestimates, underestimates, or errs in both directions equally. For that, you need to look at the residuals themselves or use a metric like mean bias error.
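A minimal sketch of mean bias error, using the same actual-minus-predicted convention as the calculation steps above (the values are hypothetical):

```python
def mean_bias_error(actual, predicted):
    """Average signed error: positive means the model underestimates."""
    return sum(a - p for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical predictions that consistently undershoot the actuals.
actual = [100, 120, 140, 160]
predicted = [95, 114, 133, 152]

print(mean_bias_error(actual, predicted))  # positive → underestimation
```

Unlike RMSE, the signed errors here can cancel: a model that overshoots and undershoots equally can have a mean bias near zero alongside a large RMSE, which is exactly why the two metrics answer different questions.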
RMSE also can’t be compared meaningfully across datasets with different scales. An RMSE of 10 on a dataset where values range from 0 to 20 is very different from an RMSE of 10 on a dataset ranging from 0 to 10,000. Normalized RMSE solves this by scaling the error relative to the data range, making cross-dataset comparison possible.
Finally, RMSE summarizes all errors into a single number, which means it can hide patterns. A model might have excellent RMSE overall but perform poorly on a specific subset of the data. Pairing RMSE with visual tools like residual plots gives a more complete picture of where a model succeeds and where it struggles.

