Eye tracking works by bouncing invisible infrared light off your eyes and using cameras to detect exactly where the reflections land. A computer then calculates where you’re looking based on the relationship between two key landmarks: the center of your pupil and the bright spot (called a corneal reflection or “glint”) that the infrared light creates on the surface of your eye. As your gaze shifts, the pupil moves but the glint stays relatively fixed, and the changing distance between these two points tells the system precisely where your eyes are aimed.
The Pupil-Corneal Reflection Method
The most common approach in modern eye trackers is called pupil center corneal reflection, or PCCR. Here’s what happens in real time: one or more infrared light sources shine toward your face. Your cornea, the clear outer layer of your eye, acts like a tiny curved mirror and reflects a small bright dot back toward the camera. At the same time, the camera captures a clear image of your pupil, which appears as a dark circle.
Software identifies both the center of the pupil and the center of that bright corneal reflection in every video frame. When you look straight at the light source, the glint and the pupil center nearly overlap. When you shift your gaze to the left, right, up, or down, the pupil moves away from the glint in the corresponding direction. By measuring this offset, the system computes a gaze vector: essentially a line extending from your eye out into the world, pointing at whatever you’re focused on.
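The core signal here is just the vector between two image landmarks. A toy sketch, with made-up pixel coordinates purely for illustration:

```python
# Toy sketch of the pupil-glint offset at the heart of PCCR.
# Coordinates are pixel positions in the eye-camera image; the
# numbers below are invented for illustration, not real data.

def gaze_offset(pupil_center, glint_center):
    """Vector from the corneal glint to the pupil center, in pixels."""
    return (pupil_center[0] - glint_center[0],
            pupil_center[1] - glint_center[1])

# Looking straight at the light source: pupil and glint nearly overlap.
print(gaze_offset((320, 240), (321, 241)))   # tiny offset
# Gaze shifted left: the pupil moves away from the (fixed) glint.
print(gaze_offset((295, 240), (321, 241)))   # larger horizontal offset
```

Everything downstream (calibration, gaze mapping) is built on turning this offset vector into a point in the world.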
Infrared light is used because it’s invisible to you, so it doesn’t distract or cause discomfort. It also creates a strong contrast between the pupil and the surrounding iris, making it easier for the camera to lock onto the pupil’s edges with high reliability.
What’s Inside an Eye Tracker
A typical remote eye tracker (the kind built into a monitor or sitting on your desk) contains infrared LED emitters, one or more specialized cameras sensitive to near-infrared light, and a processing unit that runs the gaze estimation software. Wearable eye trackers designed for research in natural environments pack similar components into a glasses-like frame. Pupil Labs’ Neon system, for example, fits two infrared eye cameras capturing at 200 frames per second, a forward-facing scene camera, motion sensors, and microphones into a lightweight frame. The eye cameras have small sensors (192 × 192 pixels each) because they only need to image your eye at close range, not an entire scene.
Higher-end research trackers may run at 500 Hz, 1,000 Hz, or faster, capturing that many snapshots of your eye position per second. Consumer-grade systems, like those built into VR headsets and laptops, typically run at lower speeds but still perform well for their intended purpose.
Calibration: Teaching the System Your Eyes
Before an eye tracker can determine where someone is looking on a screen, it needs to learn how that specific person’s eyes behave. This is what calibration does. You’ll typically be asked to stare at a series of dots, usually five or nine, that appear one at a time across the display. While you fixate on each dot, the tracker records the raw pupil and glint positions. Since the system knows the exact screen coordinates of each dot, it builds a mathematical mapping between your eye’s physical signals and specific locations on the display.
This step matters because everyone’s eyes are slightly different. The shape of your cornea, the distance between your eyes and the screen, and the exact geometry of your face all influence the raw data. Calibration accounts for these individual differences, and it’s the reason a well-calibrated tracker can pinpoint gaze to within about 0.5 to 1 degree of visual angle. At a normal viewing distance, that translates roughly to an area the size of your thumbnail held at arm’s length.
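The calibration mapping can be sketched in a few lines. This assumes the simplest plausible model, an affine fit per screen axis solved by least squares; real trackers use richer polynomial or full 3D eye models, and all names and numbers here are illustrative:

```python
# Calibration sketch: fit screen_x ≈ a + b*vx + c*vy (and likewise for
# screen_y) from the pupil-glint offsets recorded while the user stares
# at dots with known screen positions. Affine model chosen for brevity.

def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination."""
    M = [row[:] + [v] for row, v in zip(A, b)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(3):
            if r != i:
                f = M[r][i] / M[i][i]
                M[r] = [a - f * c for a, c in zip(M[r], M[i])]
    return [M[i][3] / M[i][i] for i in range(3)]

def fit_axis(offsets, coords):
    """Least-squares fit of one screen axis via the normal equations."""
    rows = [[1.0, vx, vy] for vx, vy in offsets]
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    Atb = [sum(r[i] * c for r, c in zip(rows, coords)) for i in range(3)]
    return solve3(AtA, Atb)

def calibrate(offsets, targets):
    """One coefficient triple per screen axis, from the dot positions."""
    return (fit_axis(offsets, [t[0] for t in targets]),
            fit_axis(offsets, [t[1] for t in targets]))

def map_gaze(model, offset):
    """Convert a new pupil-glint offset into screen coordinates."""
    (ax, bx, cx), (ay, by, cy) = model
    vx, vy = offset
    return (ax + bx * vx + cx * vy, ay + by * vx + cy * vy)
```

After `calibrate` runs on the nine dots, `map_gaze` converts every subsequent frame’s offset into an on-screen point in real time.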
Two Approaches to Gaze Estimation
Once the hardware captures raw data, software has to convert it into a gaze point. There are two main strategies. Geometric models use the known 3D geometry of the eye, the positions of the cameras and light sources, and the calibration data to calculate a gaze vector mathematically. If the system also knows your head position and orientation, it can extend this vector to find where it intersects a screen or surface. Geometric methods are transparent and predictable, but their accuracy depends heavily on how precise the calibration is.
The alternative is a regression or machine-learning approach. Instead of building a geometric model of the eye, these systems learn directly from data: they take in raw camera images or pupil coordinates and output predicted gaze coordinates, trained on thousands of examples. Research comparing the two has found that regression-based methods deliver comparable accuracy to geometric ones, with the added benefit of flexibility. They can adapt to messier real-world conditions, provide confidence estimates for each prediction, and improve as more training data becomes available. Many modern commercial trackers use a hybrid of both.
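The flavor of the regression approach, including the per-prediction confidence it affords, can be conveyed with a deliberately simple stand-in. This is not any vendor’s method: a k-nearest-neighbor lookup from eye features to gaze points, using the neighbors’ spread as a crude uncertainty estimate:

```python
import math

# Illustrative regression-style gaze estimator (not a real product's
# algorithm): predict gaze by averaging the k most similar examples
# from training data, and report the neighbors' spread as confidence.

def knn_gaze(train, query, k=3):
    """train: list of (eye_feature_xy, gaze_xy) pairs; query: feature."""
    near = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    gx = sum(g[0] for _, g in near) / k
    gy = sum(g[1] for _, g in near) / k
    # How scattered the neighbors' gaze points are ≈ how sure we are.
    spread = max(math.dist(g, (gx, gy)) for _, g in near)
    return (gx, gy), spread
```

Production systems use far more capable learners (neural networks trained on thousands of eye images), but the shape of the problem is the same: features in, gaze coordinates and a confidence out.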
What the Data Actually Looks Like
Raw eye tracking produces a stream of X-Y coordinates, one for every camera frame. But the real insights come from identifying two fundamental eye behaviors: fixations and saccades.
A fixation is when your eyes hold relatively still on a point of interest, typically lasting at least 200 milliseconds. This is when your brain is actually processing visual information. A saccade is the rapid jump between fixations, your eyes repositioning to a new spot. Saccades are fast, finishing in 30 to 120 milliseconds, and you’re essentially blind during them. Your brain suppresses the blurry motion so you never notice.
Software separates these events using velocity. During a fixation, the eyes move slowly (under 100 degrees per second, mostly just tiny drifts). During a saccade, they rocket above 300 degrees per second. By setting a velocity threshold, algorithms can cleanly label each data point as part of a fixation or a saccade, then generate maps and statistics showing where a person looked, for how long, and in what order.
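The velocity-threshold idea above (often called I-VT in the literature) fits in a few lines. A minimal sketch, assuming gaze samples already expressed in degrees of visual angle at a fixed sampling rate:

```python
# Velocity-threshold classification: label each inter-sample interval
# as fixation or saccade. Samples are (x, y) gaze positions in degrees
# of visual angle; hz is the tracker's sampling rate. The 300 deg/s
# threshold follows the saccade velocities quoted above.

def classify_ivt(samples, hz, threshold=300.0):
    labels = []
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        dist = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5  # degrees moved
        velocity = dist * hz                              # degrees/second
        labels.append("saccade" if velocity >= threshold else "fixation")
    return labels

# 500 Hz recording: tiny drifts, then a 10-degree jump, then stillness.
path = [(0.0, 0.0), (0.01, 0.0), (5.0, 0.0), (10.0, 0.0), (10.02, 0.0)]
print(classify_ivt(path, hz=500))
```

Real implementations add smoothing, merge adjacent fixation samples into single fixation events, and discard intervals around blinks, but the velocity test is the core of the algorithm.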
Accuracy and Precision in Practice
Modern eye trackers achieve a mean accuracy of less than 1 degree of visual angle in the central visual field. That’s the average distance between where the system says you’re looking and where you’re actually looking. Precision, which measures how consistently the tracker reports the same position when your eyes are still, typically falls between 0.1 and 0.4 degrees. These numbers come from controlled lab conditions, though. In everyday use, accuracy can degrade depending on several factors.
Eyeglasses can partially block or distort infrared reflections. Drooping eyelids, narrow eye openings, and heavy mascara can interfere with pupil detection. Even blinking causes momentary data loss: when the eyelid passes over the infrared sensor’s light path, the reflected spot deforms and spreads to seven to eight times its normal size, making precise tracking impossible until the eye reopens. Head movement, changing lighting, and shifting distance from the tracker also introduce noise, which is why periodic recalibration helps maintain quality.
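The two metrics are straightforward to compute from a validation recording, where the participant fixates a target whose position is known. A simplified sketch, with everything already in degrees of visual angle:

```python
import math

# Accuracy: mean angular distance between recorded gaze samples and
# the known target. Precision: RMS of successive sample-to-sample
# distances, i.e. spatial jitter while the eye is actually still.
# (Simplified; standard test procedures average over many targets.)

def accuracy(samples, target):
    return sum(math.dist(s, target) for s in samples) / len(samples)

def precision_rms(samples):
    ds = [math.dist(a, b) for a, b in zip(samples, samples[1:])]
    return math.sqrt(sum(d * d for d in ds) / len(ds))
```

A tracker can be precise but inaccurate (tight cluster of samples, offset from the target) or accurate but imprecise (samples scattered around the right spot), which is why both numbers appear on spec sheets.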
Alternatives to Camera-Based Tracking
Not all eye tracking uses cameras. Electrooculography, or EOG, measures tiny electrical signals produced by the eye itself. Your retina generates a small voltage difference between the front and back of the eye, making the eyeball act like a miniature battery. Electrodes placed on the skin around the eyes detect how this electrical field shifts as you look in different directions.
EOG works even when the eyes are closed, which makes it valuable in medical settings where patients may have drooping eyelids, lens implants, or other conditions that make camera-based tracking unreliable. It doesn’t require goggles or headgear, so it works well with small children or anyone who can’t tolerate equipment on their face. The trade-off is lower spatial resolution: EOG can tell you the direction and speed of eye movements, but it’s not precise enough to pinpoint exactly which word on a screen someone is reading.
How Eye Tracking Is Used
In virtual reality headsets, eye tracking enables a technique called foveated rendering. Your eyes can only see fine detail in a small central zone (the fovea), while your peripheral vision is much blurrier. Foveated rendering exploits this by tracking where you’re looking and rendering that spot in full quality while reducing detail everywhere else. This cuts the computing power needed to drive the display, making high-resolution VR feasible without massively expensive hardware.
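The rendering decision reduces to a lookup keyed on angular distance from the gaze point. A toy sketch; the zone radii and shading rates below are illustrative, not taken from any particular headset:

```python
# Toy foveated-rendering policy: pick a shading rate for each screen
# tile based on its angular distance from the tracked gaze point.
# Zone boundaries (degrees) and rates are invented for illustration.

def shading_rate(tile_angle_deg):
    if tile_angle_deg < 5.0:      # foveal zone: render full detail
        return 1.0
    elif tile_angle_deg < 15.0:   # parafoveal ring: half resolution
        return 0.5
    else:                         # periphery: quarter resolution
        return 0.25
```

Because the fovea covers only a few degrees, most of the frame falls into the cheap outer zones, which is where the computing savings come from.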
In assistive technology, eye tracking lets people with severe motor impairments control computers using only their gaze. The user looks at an on-screen button or keyboard key and holds their gaze there for a set amount of time, called a dwell time, to trigger a selection. Typical dwell times tested in research range from 200 milliseconds to 2,000 milliseconds, with longer dwell times reducing accidental selections but slowing down interaction. Most systems let users adjust this threshold to match their comfort and control level.
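The dwell-time logic is essentially a resettable timer. A minimal sketch, with timestamps in milliseconds and hit-testing (is the gaze inside the button?) assumed to happen upstream:

```python
# Dwell-time selection for gaze-controlled interfaces: a target is
# triggered once gaze has stayed inside it for dwell_ms without
# interruption. Smoothing and hit-testing are omitted for brevity.

def dwell_select(events, dwell_ms=800):
    """events: chronological list of (timestamp_ms, inside_target)."""
    start = None
    for t, inside in events:
        if inside:
            if start is None:
                start = t                  # gaze entered: start timer
            elif t - start >= dwell_ms:
                return t                   # dwell met: selection fires
        else:
            start = None                   # gaze left: reset the timer
    return None                            # no selection occurred
```

Raising `dwell_ms` makes accidental selections (the "Midas touch" problem of triggering everything you look at) less likely, at the cost of slower typing, which is exactly the trade-off users tune.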
Researchers in psychology and neuroscience use eye tracking to study attention, reading, decision-making, and cognitive load. Marketers use it to evaluate packaging design and website layouts. Automotive engineers embed trackers in dashboards to detect drowsy or distracted driving. In each case, the core technology is the same: infrared light, camera, pupil, glint, and software converting that signal into a map of human attention.