What Is Process Variation? Types, Causes, and Costs

Process variation is the fluctuation in outputs that occurs every time a process runs, whether it arises from the process itself or from an outside disturbance. No process produces perfectly identical results every time, whether you’re manufacturing car parts, brewing coffee, or treating patients in a hospital. Understanding what causes that variation, and how much is acceptable, is the foundation of quality control across virtually every industry.

Two Types of Process Variation

All process variation falls into two categories, a distinction first formalized by Walter Shewhart and later expanded by W. Edwards Deming.

Common cause variation is built into the process itself. It’s the small, random fluctuation that will always be present no matter how well things are running. Think of an industrial oven controlled by a thermostat: the temperature drifts slightly up and down around the set point, but stays within an acceptable range. That drift is common cause variation. Individual human performance also falls here. Two workers following the same procedure will naturally produce slightly different results, and even a single worker’s output varies a little from hour to hour. When only common cause variation is present, results generally stay within three standard deviations of the average; for roughly normally distributed data, that band covers about 99.7% of outputs.
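The three-standard-deviation band can be checked numerically. The sketch below assumes the process output is normally distributed, which is an idealization that real processes only approximate; the function name normal_coverage is illustrative.

```python
import math


def normal_coverage(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


# Coverage of the 1-, 2-, and 3-sigma bands under the normality assumption.
for k in (1, 2, 3):
    print(f"within {k} sigma: {normal_coverage(k):.2%}")
```

Under that assumption, about 99.7% of outputs land inside the three-sigma band, which is why a point outside it is treated as unusual.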

Special cause variation comes from an external or assignable source, something that isn’t part of normal operations. Using the same oven example, if the thermostat fails and the temperature spikes suddenly, that’s a special cause. An untrained employee placed on a production line with no instruction is another classic example. Special causes push results outside the expected range and signal that the process is no longer in statistical control. Unlike common cause variation, which can only be reduced by changing the process itself, a special cause can be traced to a specific source and eliminated.

The distinction matters because each type requires a different response. Trying to fix common cause variation by hunting for a specific problem wastes time and can actually make things worse. Ignoring special cause variation by treating it as normal lets real problems go unaddressed. Deming argued that most variation stems from the system itself, not from individual workers, which means improving common cause variation requires changing the process design rather than blaming people.

How Variation Is Measured

The most straightforward measure of variation is standard deviation, which tells you how spread out your process results are around their average. A small standard deviation means outputs are tightly clustered; a large one means they’re scattered.
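As a rough illustration, the sketch below compares two hypothetical sets of oven temperature readings that share the same average but differ in spread. The readings are invented for the example, and the sample standard deviation from Python's statistics module is one reasonable choice of estimator.

```python
import statistics

# Hypothetical oven temperatures (degrees C); both runs average 200.
tight_run = [199.0, 200.0, 201.0, 200.0, 200.0]
loose_run = [190.0, 210.0, 195.0, 205.0, 200.0]

# Sample standard deviation (n - 1 in the denominator).
tight_sd = statistics.stdev(tight_run)
loose_sd = statistics.stdev(loose_run)

print(f"tight run: mean={statistics.mean(tight_run):.1f}, sd={tight_sd:.2f}")
print(f"loose run: mean={statistics.mean(loose_run):.1f}, sd={loose_sd:.2f}")
```

The averages are identical, so only the standard deviation reveals that the second run is far less consistent.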

But standard deviation alone doesn’t tell you whether your process is actually meeting requirements. That’s where process capability comes in. Capability compares the spread of your process to the width of your specification limits, the boundaries your customer or design requires.

The simplest capability metric, called Cp, divides the total allowable spread (the gap between your upper and lower specification limits) by the actual spread of your process (six standard deviations). A Cp of 1.0 means your process spread exactly fills the specification window, leaving no margin for error. A Cp of 2.0 means your process uses only half the available space. One useful analogy: think of Cp as how much wider your garage door is than your car.
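The Cp calculation described above can be sketched in a few lines. The specification limits and standard deviations below are hypothetical values chosen to reproduce the two cases in the text; the function name cp is illustrative.

```python
def cp(usl: float, lsl: float, sigma: float) -> float:
    """Cp: specification width divided by the process spread (six standard deviations)."""
    if usl <= lsl:
        raise ValueError("upper specification limit must exceed the lower one")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (usl - lsl) / (6 * sigma)


# Hypothetical specification window of 94 to 106 units.
cp_fills_window = cp(usl=106.0, lsl=94.0, sigma=2.0)  # spread of 12 fills the 12-unit window
cp_half_window = cp(usl=106.0, lsl=94.0, sigma=1.0)   # spread of 6 uses half the window

print(cp_fills_window, cp_half_window)
```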

Cp has a limitation, though. It ignores where the process is centered, so it reflects real performance only when the process average sits exactly midway between the specification limits, which it rarely does. Cpk adjusts for this by measuring the distance from your process average to the nearest specification limit, expressed in units of three standard deviations. A process can have a high Cp (tight variation) but a low Cpk (off-center), meaning it’s consistent but consistently drifting toward one edge. Many customers require a Cpk of at least 1.33, which means the nearest specification limit sits about four standard deviations from the process average, roughly a four-sigma process.
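A minimal sketch of the difference between the two metrics follows, using the common definition of Cpk as the distance from the process average to the nearer specification limit divided by three standard deviations. The specification limits, averages, and function names are hypothetical.

```python
def cp(usl: float, lsl: float, sigma: float) -> float:
    """Cp: specification width over six standard deviations (ignores centering)."""
    return (usl - lsl) / (6 * sigma)


def cpk(usl: float, lsl: float, mean: float, sigma: float) -> float:
    """Cpk: distance from the mean to the nearer spec limit over three standard deviations."""
    return min(usl - mean, mean - lsl) / (3 * sigma)


# Hypothetical window of 94 to 106 with sigma = 1, so Cp is 2.0 in both cases.
centered = cpk(usl=106.0, lsl=94.0, mean=100.0, sigma=1.0)
off_center = cpk(usl=106.0, lsl=94.0, mean=103.0, sigma=1.0)

print(f"Cp={cp(106.0, 94.0, 1.0)}, centered Cpk={centered}, off-center Cpk={off_center}")
```

With the average shifted three units toward the upper limit, Cp is unchanged while Cpk drops by half, which is the "consistent but off-center" situation described above.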

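At the extreme end, a Six Sigma process produces no more than 3.4 defects per million opportunities, representing extraordinarily tight control over variation. By convention, that figure allows for the process average drifting by up to 1.5 standard deviations over the long term.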
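The 3.4 figure can be reproduced with a short calculation. The sketch below assumes normally distributed output and the conventional Six Sigma allowance of a 1.5-standard-deviation long-term shift in the process average, and it counts only the tail beyond the nearer specification limit (the far tail is negligible). The function names are illustrative.

```python
import math


def upper_tail(z: float) -> float:
    """Probability that a standard normal value exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def dpmo(sigma_level: float, shift: float = 1.5) -> float:
    """Defects per million opportunities beyond the nearer spec limit.

    Assumes the mean has drifted `shift` standard deviations toward that limit.
    """
    return upper_tail(sigma_level - shift) * 1_000_000


print(f"six sigma: {dpmo(6.0):.2f} DPMO")
```

Under these assumptions the six-sigma result comes out at roughly 3.4 defects per million.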

Detecting Variation With Control Charts

Control charts are the primary visual tool for monitoring process variation in real time. A control chart plots data points over time against a center line (the process average) and upper and lower control limits set at three standard deviations from the center. Points that fall within the limits and show no pattern suggest only common cause variation is present.
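A minimal sketch of the limit calculation follows. It assumes the limits are set from a baseline sample using the sample standard deviation; in practice, many control charts estimate sigma differently (for example, from moving ranges or subgroup ranges). The data and function names are hypothetical.

```python
import statistics


def control_limits(baseline: list[float]) -> tuple[float, float, float]:
    """Return (lower limit, center line, upper limit) at three standard deviations."""
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return center - 3 * sigma, center, center + 3 * sigma


def points_outside(points: list[float], lcl: float, ucl: float) -> list[int]:
    """Indices of points that fall outside the control limits."""
    return [i for i, x in enumerate(points) if x < lcl or x > ucl]


# Hypothetical baseline readings taken while the process was running normally.
baseline = [199.0, 200.0, 201.0, 200.0, 200.0]
lcl, center, ucl = control_limits(baseline)

# New readings monitored against the baseline limits.
new_points = [200.5, 199.2, 203.0, 200.1]
flagged = points_outside(new_points, lcl, ucl)

print(f"LCL={lcl:.2f}, center={center:.2f}, UCL={ucl:.2f}, flagged={flagged}")
```

Keeping the baseline separate from the monitored points means a later disturbance cannot widen the limits it is being judged against.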

Four classic rules, known as the Western Electric rules, flag potential special causes:

One point beyond a 3-sigma limit. A single data point outside the control limits is an immediate signal.
Two out of three successive points beyond 2-sigma on the same side of the center line. Even if points haven’t crossed the control limit, clustering near the edge is suspicious.
Four out of five successive points beyond 1-sigma on the same side of the center line. A sustained drift toward one side suggests something is shifting.
Eight or more successive points on one side of the center line. Even if all points are within limits, a long run on one side indicates a process shift rather than random scatter.

When any of these patterns appear, the process needs investigation. Something specific has changed, and the goal is to find it and correct it before it produces defective output.
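The four rules listed above can be sketched as a simple scan over the plotted points. This is one illustrative implementation, not a reference one: it assumes the center line and sigma are already known, applies the two-of-three and four-of-five rules per side of the center line, and uses hypothetical data and a hypothetical function name.

```python
def western_electric_signals(points: list[float], center: float, sigma: float) -> list[tuple[int, int]]:
    """Return (index, rule_number) pairs for each point where a rule fires."""
    z = [(x - center) / sigma for x in points]
    signals = []
    for i in range(len(z)):
        # Rule 1: one point beyond a 3-sigma limit.
        if abs(z[i]) > 3:
            signals.append((i, 1))
        for side in (1, -1):  # check above and below the center line separately
            # Rule 2: two of the last three points beyond 2-sigma on this side.
            if i >= 2 and sum(side * v > 2 for v in z[i - 2:i + 1]) >= 2:
                signals.append((i, 2))
            # Rule 3: four of the last five points beyond 1-sigma on this side.
            if i >= 4 and sum(side * v > 1 for v in z[i - 4:i + 1]) >= 4:
                signals.append((i, 3))
            # Rule 4: eight successive points on this side of the center line.
            if i >= 7 and all(side * v > 0 for v in z[i - 7:i + 1]):
                signals.append((i, 4))
    return signals


# Hypothetical readings with center line 100 and sigma 1.
stable = [100.2, 99.7, 100.5, 99.9, 100.1, 99.6, 100.3, 99.8]
spike = [100.0, 100.4, 103.5, 99.8]
long_run = [100.3, 100.2, 100.5, 100.1, 100.4, 100.2, 100.6, 100.3]

print(western_electric_signals(stable, 100.0, 1.0))
print(western_electric_signals(spike, 100.0, 1.0))
print(western_electric_signals(long_run, 100.0, 1.0))
```

The long_run series shows why the run rule matters: every point is well inside the limits, yet eight in a row above the center line still raises a signal.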

Why Variation Costs Money

High process variation directly increases costs in ways that compound quickly. Products that fall outside specifications must be scrapped or reworked, both of which consume materials, time, and labor without producing sellable output. When defective products reach customers, the costs escalate further. Warranty expenses alone typically range from 2% to 10% of the sale price, depending on the product and manufacturer. Beyond direct costs, unreliable products erode customer trust and lead to lost future sales, a far harder number to recover.

In healthcare, the stakes are different but equally significant. Clinical variation describes how patient care and outcomes differ across providers, hospitals, and regions. Some of this variation is warranted: treating early-stage prostate cancer differently based on patient preferences is reasonable. But unwarranted variation, like prescribing antibiotics for viral infections despite clear guidelines against it, drives inconsistent outcomes and wasted resources. Studies of Medicare data have found meaningful variation in hospital lengths of stay and specialist visit frequency that can’t be explained by patient characteristics alone, suggesting systemic process inconsistencies.

Six Sources of Variation

Lean Six Sigma identifies six categories of inputs that contribute to process variation, sometimes called the 6 Ms (People and Environment are traditionally labeled Manpower and Mother Nature to complete the set):

Method: How the work is performed. Inconsistent procedures between shifts or departments are a common source.
Materials: Differences between batches of raw materials, even from the same supplier.
Machines: Equipment wear, calibration drift, or mechanical inconsistency.
People: Differences in training, experience, and fatigue among workers.
Measurement: Variation in the measurement system itself, including gauge accuracy and inspector interpretation.
Environment: Temperature, humidity, vibration, and other conditions that affect the process.

Identifying which of these categories is driving your variation is the first step toward reducing it.

Reducing Variation With DMAIC

One of the most widely used frameworks for systematically reducing process variation is DMAIC, a five-phase method from Six Sigma: Define, Measure, Analyze, Improve, and Control.

In the Define phase, you establish exactly what problem you’re solving, what process is involved, and what “good” looks like. In Measure, you collect baseline data on the process and display it visually using control charts, histograms, or box plots to understand your starting point. The Analyze phase uses tools like fishbone diagrams and “5 Whys” analysis to trace variation back to its root causes rather than just treating symptoms.

The Improve phase is where changes happen. Solutions that are embedded directly into process flow, like automated alerts or system-enforced steps, tend to produce more durable results than solutions that rely on human memory, such as policy memos or training alone. Finally, the Control phase builds monitoring into the process so improvements stick over time, often through ongoing control charts and standardized procedures.

Reducing common cause variation typically means redesigning the process itself: tightening tolerances on equipment, standardizing procedures across all operators, or sourcing more consistent materials. Reducing special cause variation is more targeted: fixing the broken thermostat, retraining the new hire, or replacing the worn tool. Both matter, but the approach for each is fundamentally different.