A risk matrix is a simple grid that maps how likely something is to happen against how bad it would be if it did. One axis shows likelihood (from rare to almost certain), the other shows impact (from negligible to catastrophic), and each risk you’re evaluating gets placed in the cell where those two ratings meet. The result is a visual snapshot that helps teams decide which risks deserve immediate attention and which ones can be monitored over time.
Organizations use risk matrices across industries, from workplace safety and engineering to cybersecurity and project management. The tool is popular because it’s fast, intuitive, and doesn’t require complex statistical modeling. But it also has real limitations worth understanding before you rely on one for important decisions.
How the Grid Works
The most basic risk matrix is a simple table. The vertical axis represents likelihood, broken into categories like “rare,” “unlikely,” “possible,” “likely,” and “almost certain.” The horizontal axis represents impact, with labels like “negligible,” “marginal,” “critical,” and “catastrophic.” Where a given risk falls on each axis determines its cell in the grid, and that cell corresponds to a risk level.
Grid sizes vary widely. A 3×3 matrix offers nine possible risk ratings and works well for quick, high-level assessments. A 5×5 matrix gives 25 cells and allows finer distinctions between risks. Some industries push well beyond that. Chemical engineering and process safety applications sometimes use 7×7 or even 9×9 matrices when they need more resolution at the low-frequency end of the scale, where the difference between “extremely rare” and “almost impossible” actually matters for decision-making.
The Center for Chemical Process Safety published a widely referenced 4×4 matrix in 1992, and many international standards use a 4×5 layout. There’s no single “correct” size. The right choice depends on how many meaningfully different risk levels your team needs to distinguish.
Scoring and Color Coding
Most risk matrices use a color-coded heat map to make priorities immediately visible. Green cells represent low-level risks that typically require monitoring but no urgent action. Yellow or amber cells indicate moderate risks that may need mitigation plans. Red cells flag the highest-priority risks, the ones that demand immediate resources or could halt a project.
Behind the colors, there’s usually a scoring system. In a purely qualitative matrix, you’re simply combining descriptive labels: a “high likelihood, high impact” risk lands in the red zone by definition. Semi-quantitative matrices assign numbers to each category and multiply or add them to produce a risk score. One common approach uses the formula severity = likelihood + N × impact, where N reflects how much more weight you give to consequences versus probability. Setting N to 2, for instance, means impact counts twice as much as likelihood in the final score. With that weighting on a 5×5 grid, scores range from 3 (lowest corner) to 15 (highest corner), with red typically covering scores of 12 to 15, yellow from 8 to 11, and green below 8.
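As a concrete sketch, that scoring scheme takes only a few lines of Python. The function names and the N = 2 default are illustrative choices, not from any standard:

```python
def severity(likelihood: int, impact: int, n: int = 2) -> int:
    """Semi-quantitative score: severity = likelihood + n * impact.

    likelihood and impact are 1-5 category indices on a 5x5 grid;
    n is the impact weighting described above (n=2 counts impact twice).
    """
    return likelihood + n * impact

def zone(score: int) -> str:
    """Map a score to the color bands described above (5x5 grid, n=2)."""
    if score >= 12:
        return "red"
    if score >= 8:
        return "yellow"
    return "green"

# Corner cells match the stated range: 3 in the lowest, 15 in the highest.
print(severity(1, 1), severity(5, 5))   # 3 15
print(zone(severity(3, 4)))             # score 11 -> "yellow"
```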
That weighting factor is a choice, not a mathematical truth. Different values of N shift the boundaries between zones and can change which risks end up flagged as critical. This is one reason two organizations looking at the same set of risks can reach different conclusions using different matrices.
Qualitative vs. Semi-Quantitative Matrices
A purely qualitative risk matrix uses descriptive categories only. Both likelihood and impact are rated on relative scales like low, medium, and high. Risk for each scenario is determined by the combination of those two ratings. A 3×3 qualitative matrix produces nine distinct risk pairs (low-low, low-medium, low-high, and so on), each assigned a relative grade from 1 to 5 or mapped directly to a color zone.
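Because no arithmetic is involved, a qualitative matrix is just a lookup table. A minimal 3×3 sketch, with zone assignments that are purely illustrative (each organization defines its own):

```python
# Qualitative 3x3 matrix: the risk zone is read straight from the cell.
# Keys are (likelihood, impact); zone assignments are illustrative.
QUAL_3X3 = {
    ("low", "low"): "green",     ("low", "medium"): "green",     ("low", "high"): "yellow",
    ("medium", "low"): "green",  ("medium", "medium"): "yellow", ("medium", "high"): "red",
    ("high", "low"): "yellow",   ("high", "medium"): "red",      ("high", "high"): "red",
}

print(QUAL_3X3[("medium", "high")])  # "red"
print(QUAL_3X3[("low", "low")])      # "green"
```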
Semi-quantitative matrices bring in numbers, usually broad frequency ranges on the likelihood axis. Instead of labeling something “unlikely,” you might place it in a bin representing a probability between 0.2% and 1%, or roughly 1 to 5 chances in 500. The UK National Risk Register, for example, uses five likelihood bands ranging from less than 0.2% to greater than 25%. The consequence axis often stays qualitative, describing impacts in terms of injuries, financial loss, or operational disruption rather than assigning precise dollar figures.
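Placing a probability into a likelihood band is a one-line lookup. In the sketch below, the 0.2%, 1%, and 25% boundaries come from the text; the 5% interior boundary and the band labels are illustrative assumptions, not the UK register’s actual figures:

```python
from bisect import bisect_right

# Five likelihood bands. The 0.2%, 1%, and 25% boundaries are from the
# text; the 5% boundary and the labels are illustrative assumptions.
BOUNDS = [0.002, 0.01, 0.05, 0.25]
LABELS = ["rare", "unlikely", "possible", "likely", "almost certain"]

def likelihood_band(p: float) -> str:
    """Place a probability into one of five semi-quantitative bins."""
    return LABELS[bisect_right(BOUNDS, p)]

print(likelihood_band(0.004))  # 0.2%-1% band -> "unlikely"
print(likelihood_band(0.30))   # above 25% -> "almost certain"
```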
Neither approach produces absolute risk values. You can rank risks relative to each other within the same matrix, but you can’t directly compare risk scores between two matrices built on different scales or assumptions.
The ALARP Principle and Risk Thresholds
Some risk matrices build in decision thresholds that go beyond simple red-yellow-green coding. One widely used framework is the ALARP principle, which stands for “as low as reasonably practicable.” It divides the matrix into three zones. Risks in the lowest zone are broadly acceptable and don’t require further action. Risks in the highest zone are unacceptable and must be reduced before work proceeds. The middle zone is where ALARP applies: the risk is tolerable, but only if you can demonstrate that any further reduction would cost disproportionately more than the benefit it provides.
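Under a score-based matrix, the three ALARP zones reduce to two thresholds. A sketch with illustrative defaults that echo the green/red score boundaries used earlier; real matrices set these per organization and per regulator:

```python
def alarp_zone(score: int, broadly_acceptable: int = 7,
               unacceptable: int = 12) -> str:
    """Classify a risk score into the three ALARP zones.

    Threshold defaults are illustrative, echoing the green/red score
    boundaries used earlier in the article.
    """
    if score >= unacceptable:
        return "unacceptable: reduce before work proceeds"
    if score > broadly_acceptable:
        return "tolerable only if ALARP is demonstrated"
    return "broadly acceptable: no further action required"

for s in (5, 10, 14):
    print(s, "->", alarp_zone(s))
```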
In high-hazard industries like oil and gas or chemical processing, the boundaries matter. New facilities are held to a stricter standard than existing ones, because it’s easier and cheaper to design out risk during the planning phase than to retrofit safety measures later. An existing facility might be allowed to operate in a risk zone that would be classified as unacceptable for a new build.
Known Limitations
Risk matrices are widely used, but they’ve also drawn serious criticism from risk analysts. A landmark 2008 paper by Tony Cox, “What’s Wrong with Risk Matrices?”, published in the journal Risk Analysis, identified several structural problems that users should understand.
The most significant is range compression. Because a matrix forces continuous values into a small number of categories, risks that are quantitatively very different can end up in the same cell. A hazard with a 6% probability and one with a 24% probability might both fall into a “likely” category, even though one is four times more probable than the other. This flattening of distinctions can lead teams to treat genuinely different risks as equivalent.
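Range compression is easy to demonstrate with the 6% vs. 24% example. The bin edges below are assumptions chosen so both values land in the same band:

```python
def category(p: float) -> str:
    """Coarse likelihood bins (the 5% and 25% edges are illustrative)."""
    if p < 0.05:
        return "unlikely"
    if p < 0.25:
        return "likely"
    return "almost certain"

# A fourfold difference in probability disappears into one cell:
print(category(0.06), category(0.24))  # both "likely"
```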
A related problem is rank reversal, where the matrix assigns a higher qualitative rating to a risk that is quantitatively smaller. This happens because the category boundaries are arbitrary. A risk that sits just above a threshold on one axis gets bumped into a higher category, while a numerically larger risk that falls just below the threshold on both axes stays in a lower zone.
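A small numeric sketch makes rank reversal concrete. All bin edges and dollar figures below are invented for illustration: Risk A sits just above the likelihood threshold with a small loss, while Risk B sits just below the threshold on both axes with a much larger expected loss.

```python
def likelihood_cat(p: float) -> int:
    """1-3 likelihood category (the 5% and 25% edges are illustrative)."""
    return 1 if p < 0.05 else (2 if p < 0.25 else 3)

def impact_cat(loss: float) -> int:
    """1-3 impact category by monetary loss (edges are illustrative)."""
    return 1 if loss < 100_000 else (2 if loss < 1_000_000 else 3)

def matrix_score(p: float, loss: float, n: int = 2) -> int:
    return likelihood_cat(p) + n * impact_cat(loss)

# Risk A: barely over the 5% threshold, small loss within its bin.
# Risk B: barely under the threshold, large loss within its bin.
a_score, a_expected = matrix_score(0.051, 10_000), 0.051 * 10_000
b_score, b_expected = matrix_score(0.049, 99_000), 0.049 * 99_000

# The matrix ranks A above B, yet B's expected loss is nearly 10x larger.
print(a_score > b_score, a_expected < b_expected)  # True True
```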
Subjectivity in Scoring
Even a well-designed matrix depends entirely on the judgment of the people filling it in. When a team sits down to rate a risk’s likelihood and impact, cognitive biases shape the results. People tend to anchor on the first number suggested, overweight recent or dramatic events, and struggle to distinguish between probability levels they haven’t personally experienced. A risk that was in the news last month often gets rated as more likely than one that’s statistically more common but less memorable.
The weighting between likelihood and impact introduces another layer of subjectivity. Choosing to weight impact twice as heavily as likelihood (N equals 2 in the severity formula) reflects a belief, not a measurement. Organizations rarely examine or justify this assumption explicitly, yet it directly determines which risks cross into the red zone. Changing the weighting shifts zone boundaries and can reclassify risks from moderate to critical or vice versa.
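The sensitivity to N can be shown directly. In the sketch below, zone boundaries are set as fixed fractions of the achievable score range (itself an illustrative choice): a high-impact, low-likelihood risk is promoted to red as N grows, while a high-likelihood, low-impact risk is demoted to green.

```python
def severity(likelihood: int, impact: int, n: int) -> int:
    return likelihood + n * impact

def zone(score: int, n: int) -> str:
    """Bands as fixed fractions of the 5x5 score range for a given n
    (the 75%/40% cut points are illustrative assumptions)."""
    lo, hi = 1 + n, 5 + 5 * n
    frac = (score - lo) / (hi - lo)
    return "red" if frac >= 0.75 else "yellow" if frac >= 0.4 else "green"

for risk in ((2, 5), (5, 1)):  # (likelihood, impact)
    for n in (1, 2, 3):
        s = severity(*risk, n)
        print(f"risk={risk} n={n}: score={s} zone={zone(s, n)}")
```

With these cut points, the (2, 5) risk moves from yellow at N = 1 to red at N = 2, and the (5, 1) risk moves from yellow to green over the same change.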
Studies on how people interpret risk matrices have found that matrix design itself affects decisions. The number of categories, the color scheme, and the labels all influence how users assign ratings and read results. This means two teams using differently designed matrices can reach different conclusions about the same set of risks, not because they disagree about the underlying facts, but because the tool itself steered them toward different answers.
When a Risk Matrix Is Most Useful
Risk matrices work best as a communication and prioritization tool, not as a precision instrument. They’re effective for getting a cross-functional team to discuss risks in a structured way, for creating a shared visual that leadership can quickly scan, and for establishing a starting point when detailed quantitative data isn’t available.
They’re less appropriate when you need to compare risks with high precision, when the stakes involve catastrophic but extremely rare events (where the category bins are too coarse to be meaningful), or when you need to justify specific resource allocations with defensible numbers. In those cases, quantitative risk assessment methods that use actual probability distributions and consequence modeling provide more reliable results.
If you’re building a risk matrix for your team, the key decisions that shape its usefulness are the number of categories on each axis, the definitions behind each category label, the weighting between likelihood and impact, and the thresholds that separate risk zones. Documenting those choices explicitly makes the matrix more consistent across users and easier to update as your understanding of each risk evolves.
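One way to make those choices explicit is to record them in a single structure that travels with the matrix. The field names and defaults below are illustrative, not a standard (the “serious” impact label is an assumed fifth category added to the four named earlier):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RiskMatrixSpec:
    """Documents a matrix's design choices in one place (illustrative)."""
    likelihood_labels: tuple = ("rare", "unlikely", "possible",
                                "likely", "almost certain")
    impact_labels: tuple = ("negligible", "marginal", "serious",
                            "critical", "catastrophic")
    impact_weight: int = 2     # the N in severity = likelihood + N * impact
    red_threshold: int = 12    # scores at or above this are red
    yellow_threshold: int = 8  # scores at or above this (below red) are yellow

    def zone(self, likelihood: int, impact: int) -> str:
        score = likelihood + self.impact_weight * impact
        if score >= self.red_threshold:
            return "red"
        if score >= self.yellow_threshold:
            return "yellow"
        return "green"

spec = RiskMatrixSpec()
print(spec.zone(3, 4))  # score 11 -> "yellow"
```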

