What Is a Stacked CMOS Sensor and How Does It Work?

A stacked CMOS sensor is an image sensor built in multiple layers, with the light-gathering pixel array on top and processing circuitry sandwiched directly beneath it. This design lets the sensor read and process image data dramatically faster than a conventional single-layer sensor, which is why it shows up in flagship cameras and smartphones capable of shooting 30 frames per second or recording 960fps slow-motion video.

How a Stacked Sensor Is Built

Conventional CMOS sensors put everything on one flat chip: the pixels that capture light and the wiring that carries electrical signals to processing circuits around the edges. That means every signal has to travel along strips of wiring all the way to the outside of the sensor before it gets processed, creating a bottleneck.

A stacked sensor separates these jobs into distinct layers bonded together vertically. The top layer holds the pixel array, which converts light into electrical signals. Directly beneath it sits a logic layer whose analog-to-digital converters turn those signals into digital data. Because the processing circuitry lives right underneath each pixel rather than off to the side, signals travel a fraction of the distance. The result is faster readout, a smaller physical footprint, and lower power consumption.

Some sensors add a third layer. Samsung’s ISOCELL Fast 2L3, for example, bonds a 2-gigabit DRAM chip below the logic layer. This built-in memory lets the sensor temporarily store a burst of high-speed frames before sending them to the phone’s main processor. That memory buffer is what makes 960fps super-slow-motion video possible on a smartphone. Sony has also shipped triple-layer sensors in devices like the Xperia 1 V, using that extra silicon for on-chip AI processing and multimodal sensing.

How It Differs From a BSI Sensor

Back-side illuminated (BSI) sensors solved a different problem. Traditional sensors had wiring on the light-facing side, partially blocking incoming photons. BSI flipped the chip so light hits the photodiodes directly, increasing sensitivity and reducing noise. That’s an improvement in image quality.

Stacking is about speed. A BSI sensor can still be a single-layer design with the same readout bottleneck. A stacked BSI sensor combines both advantages: the pixel layer faces the light for maximum sensitivity, while processing happens on a separate layer underneath for maximum speed. Today’s top cameras use “stacked BSI CMOS” sensors that incorporate both innovations.

Why Readout Speed Matters

The speed at which a sensor reads out its pixels affects almost everything about camera performance. Faster readout means higher burst frame rates, less rolling shutter distortion, and better video capabilities.

Rolling shutter is the wobble or skew you see when panning quickly or photographing a fast-moving subject. It happens because a CMOS sensor reads its rows of pixels sequentially from top to bottom. If the subject moves significantly during that scan, the top and bottom of the frame capture slightly different moments. A stacked sensor reads the entire frame so quickly that there’s far less time for the subject to shift between the first row and the last, which virtually eliminates visible distortion in most shooting situations.
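The relationship is simple enough to sketch: the horizontal skew across a frame is just the subject's speed multiplied by the total readout time. The numbers below are illustrative, not taken from any specific sensor's datasheet.

```python
# Rolling-shutter skew sketch: a subject moving horizontally while
# rows are read top to bottom appears slanted. The skew between the
# first and last row is subject speed x total readout time.
# All figures here are illustrative assumptions.

def rolling_shutter_skew(subject_speed_px_per_s: float,
                         readout_time_s: float) -> float:
    """Horizontal displacement (pixels) between first and last row."""
    return subject_speed_px_per_s * readout_time_s

# Suppose a conventional sensor scans the full frame in ~1/30 s,
# while a stacked sensor manages ~1/120 s.
subject_speed = 2400.0  # px/s, e.g. a fast pan

print(rolling_shutter_skew(subject_speed, 1 / 30))   # 80.0 px of skew
print(rolling_shutter_skew(subject_speed, 1 / 120))  # 20.0 px of skew
```

Cutting readout time by 4x cuts the visible skew by the same factor, which is why stacked sensors make rolling-shutter artifacts hard to spot in ordinary shooting.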

In specialized applications the gains are even more striking. Stacked sensors designed for medical endoscopes can deliver over 100 frames per second while consuming just 40 milliwatts, compared to a typical 30fps at a full watt for conventional designs. That combination of speed and efficiency opens doors in fields where size and heat are serious constraints.

Image Quality Improvements

Speed isn’t the only benefit. Having dedicated processing circuitry directly beneath the pixels enables techniques that improve dynamic range, noise, and color accuracy.

Each pixel’s output can be split across multiple processing taps and handled in parallel. This makes it practical to capture several rapid exposures and combine them into a single high dynamic range (HDR) image, all within the sensor itself rather than relying on slower external processing. Because the samples are taken so close together in time, motion artifacts between exposures are minimized.
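A minimal sketch of that kind of exposure merge follows. The merge rule and values are illustrative assumptions, not any vendor's actual on-sensor algorithm: keep the long exposure where it hasn't clipped, and substitute the scaled-up short exposure where it has.

```python
# Toy HDR merge of a long and a short exposure, of the kind the
# sensor can perform internally. The rule and numbers are
# illustrative, not a specific manufacturer's implementation.

FULL_WELL = 255  # saturation level in this toy example

def merge_hdr(long_exp, short_exp, ratio):
    """Use the long exposure where it isn't clipped; otherwise fall
    back to the short exposure scaled up by the exposure ratio."""
    merged = []
    for lo, sh in zip(long_exp, short_exp):
        if lo < FULL_WELL:
            merged.append(float(lo))
        else:
            merged.append(sh * ratio)  # recover highlight detail
    return merged

long_exp  = [40, 120, 255, 255]   # bright pixels clip at 255
short_exp = [10, 30, 90, 110]     # captured at 1/4 the exposure time
print(merge_hdr(long_exp, short_exp, 4))  # [40.0, 120.0, 360.0, 440.0]
```

The merged values above 255 represent highlight detail the long exposure alone would have lost; doing this on-sensor, with exposures captured back to back, is what keeps ghosting to a minimum.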

Pixels in a stacked design can also have multiple memory nodes, which allows multi-sampling readout. Reading the same pixel multiple times and averaging the results reduces random noise, particularly in low light. And because the logic layer can be manufactured using a more advanced, smaller process node than the pixel layer, designers can pack in more sophisticated circuitry without stealing space from the light-collecting area.
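The statistics behind multi-sampling are easy to demonstrate: averaging N independent noisy reads of the same value shrinks the random noise by roughly the square root of N. The signal and noise figures below are made-up illustrations, not real sensor numbers.

```python
import random
import statistics

# Multi-sampling readout sketch: reading one pixel N times and
# averaging reduces random read noise by about sqrt(N).
# Signal and noise values are illustrative assumptions.

random.seed(0)
TRUE_SIGNAL = 100.0
READ_NOISE = 10.0  # std-dev of a single read

def read_pixel(n_samples: int) -> float:
    """Average n noisy reads of the same stored pixel value."""
    reads = [random.gauss(TRUE_SIGNAL, READ_NOISE) for _ in range(n_samples)]
    return sum(reads) / n_samples

# Compare the spread of single reads vs 16-sample averages.
single = [read_pixel(1) for _ in range(2000)]
multi = [read_pixel(16) for _ in range(2000)]
print(round(statistics.stdev(single), 1))  # ≈ 10, the single-read noise
print(round(statistics.stdev(multi), 1))   # ≈ 2.5, i.e. 10 / sqrt(16)
```

A 16x oversample buys a 4x noise reduction; the per-pixel memory nodes in a stacked design are what make taking those extra reads fast enough to be practical.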

Where You’ll Find Stacked Sensors

The most visible use today is in flagship mirrorless cameras. The Sony A1 pairs a 50.1-megapixel stacked BSI sensor with 30fps continuous shooting. The Nikon Z9 uses a 45.7-megapixel stacked sensor to deliver 20fps bursts that can run past 1,000 consecutive raw frames. The Canon EOS R3, with a 24-megapixel stacked sensor, hits 30fps and records up to 420 raw frames in a burst. None of these frame rates were achievable at full resolution before stacked architecture arrived.

Smartphones have arguably pushed stacked sensor development even harder, since phone cameras face tighter space and power budgets. Sony’s IMX-series stacked sensors power the main cameras in most flagship phones. The triple-layer designs now appearing in high-end models are shifting the industry’s focus from simply increasing megapixel counts toward intelligent, on-chip processing that improves dynamic range, sensitivity, and power efficiency simultaneously.

Two-Layer vs. Three-Layer Designs

A standard two-layer stacked sensor has the pixel array on top and analog/digital logic below. This is the architecture in most current stacked cameras and delivers the core speed advantage.

Three-layer designs add a functional middle or bottom tier. In Samsung’s approach, the third layer is DRAM, acting as a high-speed frame buffer. The sensor captures a full-frame snapshot in 1/120 of a second and can record 960fps slow-motion video because it dumps frames to that onboard memory faster than any external data path could handle. Research prototypes have gone further, splitting 16-bit counters across middle and lower layers to fit per-pixel analog-to-digital converters into smaller pixels, a design that could eventually bring stacked-sensor advantages to higher-resolution, smaller-pixel chips.
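A back-of-envelope calculation shows why the buffer matters. Assuming uncompressed 720p frames at 10 bits per pixel (illustrative figures, not the sensor's actual format), a 2-gigabit DRAM layer holds a fraction of a second of real time at 960fps, which stretches into several seconds of slow-motion playback:

```python
# Back-of-envelope sketch of the 960fps DRAM buffer. Frame format
# (720p, 10-bit, uncompressed) is an assumption for illustration;
# real sensors may bin, compress, or use other bit depths.

DRAM_BITS = 2 * 1024**3          # 2-gigabit on-chip buffer
WIDTH, HEIGHT, BIT_DEPTH = 1280, 720, 10

bits_per_frame = WIDTH * HEIGHT * BIT_DEPTH
frames_buffered = DRAM_BITS // bits_per_frame
burst_seconds = frames_buffered / 960    # real-time capture window
playback_seconds = frames_buffered / 30  # duration at 30fps playback

print(frames_buffered)             # 233 frames fit in the buffer
print(round(burst_seconds, 2))     # ≈ 0.24 s of real time captured
print(round(playback_seconds, 1))  # ≈ 7.8 s of 30fps slow motion
```

The key point is the data rate: hundreds of frames arriving in a quarter of a second would overwhelm the link to the phone's main processor, but a DRAM layer bonded directly to the sensor can absorb the burst and drain it at leisure.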

Sony has described a roadmap where increasing the processing power at the sensor level improves dynamic range, sensitivity, noise, power efficiency, readout speed, and resolution all at once. The trend is toward sensors that don’t just capture light but interpret it, running AI-driven tasks like scene recognition or object tracking before data ever leaves the chip.