Your visual system converts light into a rich, continuous perception of the world through a chain of events that starts at the front of the eye and ends deep in the brain. The entire process, from a photon striking your retina to conscious awareness, takes only a fraction of a second. But within that sliver of time, light is bent, translated into electrical signals, sorted into parallel streams of information, and assembled into everything you see: color, shape, motion, and depth.
How the Eye Focuses Light
Light enters the eye through the cornea, the clear dome at the front. The cornea is responsible for roughly 65% to 75% of the eye’s total focusing power. It bends incoming light sharply because of the large difference between the refractive index of air and that of the cornea’s dense tissue. Behind the cornea, the lens provides the remaining focusing power and, critically, can change shape. The ring-shaped ciliary muscle surrounding the lens contracts or relaxes to make the lens thicker or flatter, adjusting focus for objects at different distances. This process, called accommodation, is why you can shift your gaze from a book in your hands to a sign across the street and see both clearly (though not simultaneously).
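To see why the air-to-cornea boundary does most of the bending, consider the power of a single refracting surface, P = (n2 − n1) / r, where n1 and n2 are the refractive indices on either side and r is the surface’s radius of curvature. The sketch below plugs in approximate schematic-eye values (the indices and radii are textbook-style estimates used only for illustration):

```python
# Power of a single refracting surface in diopters: P = (n2 - n1) / r.
# The indices and radii below are approximate schematic-eye values,
# used here purely for illustration.
def surface_power(n1: float, n2: float, radius_m: float) -> float:
    """Refractive power (diopters) of a spherical interface."""
    return (n2 - n1) / radius_m

# Air -> front of cornea: a big jump in refractive index, so most of
# the eye's total bending happens right here.
print(round(surface_power(1.000, 1.376, 0.0077), 1))  # ~48.8 D

# Aqueous humor -> front of lens: the indices are close, so the lens
# contributes much less power; its job is the fine-tuning.
print(round(surface_power(1.336, 1.413, 0.0100), 1))  # ~7.7 D
```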
Together, the cornea and lens project a focused, inverted image onto the retina, a thin layer of tissue lining the back of the eye. The image is genuinely upside down and reversed left to right. Your brain learns to interpret this flipped projection as right-side-up from very early in development.
Turning Light Into Electrical Signals
The retina is where vision truly begins at a cellular level. It contains two main types of photoreceptor cells: rods and cones. Rods are extraordinarily sensitive to light and handle vision in dim conditions, but they don’t detect color. Cones work best in normal to bright light and come in three varieties, each tuned to respond most strongly to short (blue), medium (green), or long (red) wavelengths of light. You have around 6 million cones concentrated near the center of the retina and roughly 120 million rods distributed more toward the periphery.
When a photon of light strikes a rod cell, it is absorbed by a light-sensitive molecule called rhodopsin. That absorption triggers a tiny shape change in the molecule, flipping a small chemical component (a retinal group) from one configuration to another. This shape change sets off a cascade: rhodopsin activates a signaling protein called transducin, which in turn activates an enzyme that breaks down a chemical messenger, cyclic GMP, inside the cell. As levels of that messenger drop, ion channels on the cell’s surface close and the cell’s electrical potential becomes more negative, a shift known as hyperpolarization. That electrical shift is the signal. In cones, the process is nearly identical but uses slightly different light-sensitive molecules tuned to different wavelengths. The entire cascade, from photon absorption to electrical change, unfolds in just a few milliseconds.
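The logic of the cascade can be sketched as a toy pipeline. The gain numbers below are placeholders rather than measured values; only the direction of each step (more light means less messenger, fewer open channels, and a more negative voltage) matches the real biochemistry, and the millivolt figures are rough ballparks:

```python
def rod_signal(photons: float) -> float:
    """Toy phototransduction chain. The gain numbers are placeholders,
    not measurements; only the direction of each step is realistic."""
    transducin = photons * 100          # one rhodopsin activates many transducins
    cgmp_destroyed = transducin * 1000  # the enzyme degrades many cGMP molecules
    open_channels = max(0.0, 1.0 - cgmp_destroyed / 1e8)  # fraction still open
    # Open channels keep the rod relatively depolarized in darkness
    # (roughly -40 mV). Closing them makes the voltage more negative;
    # that decrease is the signal passed to the next retinal layer.
    return -40.0 - 30.0 * (1.0 - open_channels)

print(rod_signal(0))    # -40.0 mV: dark resting state
print(rod_signal(500))  # -55.0 mV: light hyperpolarizes the cell
```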
How Color Vision Works
Color perception relies on two complementary mechanisms. At the photoreceptor level, your three cone types each respond most strongly to a different part of the spectrum, but their sensitivities overlap considerably. The brain determines color by comparing the relative activation levels across all three cone types. A lemon, for example, strongly activates your long-wavelength and medium-wavelength cones but barely triggers the short-wavelength ones, and that ratio is what your brain reads as “yellow.”
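That comparison is easy to sketch. The snippet below approximates each cone type’s sensitivity as a Gaussian curve over wavelength; the peak wavelengths are close to measured values, while the curve shape and width are simplifying assumptions:

```python
import math

# Each cone type's sensitivity modeled as a Gaussian over wavelength.
# Peak wavelengths are close to real values; the Gaussian shape and
# the 40 nm width are simplifying assumptions.
CONE_PEAKS_NM = {"S": 420.0, "M": 534.0, "L": 564.0}
WIDTH_NM = 40.0

def cone_response(cone: str, wavelength_nm: float) -> float:
    """Relative activation of one cone type by monochromatic light."""
    peak = CONE_PEAKS_NM[cone]
    return math.exp(-((wavelength_nm - peak) ** 2) / (2 * WIDTH_NM ** 2))

def cone_triplet(wavelength_nm: float) -> dict:
    """The activation pattern across all three cone types -- the
    ratio the brain actually compares."""
    return {c: round(cone_response(c, wavelength_nm), 3) for c in CONE_PEAKS_NM}

print(cone_triplet(570))  # lemon-like light: L and M high, S near zero -> "yellow"
print(cone_triplet(450))  # S high, M and L low -> "blue"
```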
At the next stage, retinal circuitry reorganizes these three cone signals into opponent pairs: red versus green, blue versus yellow, and light versus dark. Specialized cells compute the difference between these paired inputs. This is why you can imagine a yellowish-red (orange) but can never perceive a reddish-green or a bluish-yellow. Those combinations cancel each other out in the opponent system. It also explains color afterimages: stare at a red square for 30 seconds, look at a white wall, and you’ll see a ghostly green square, because the red side of the red-green channel fatigues and the green side temporarily dominates.
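The opponent recoding amounts to three simple sums and differences. In the sketch below the coefficients are illustrative rather than physiologically fitted; what matters is the structure, since each channel can signal one member of its pair or the other, but never both at once:

```python
def opponent_channels(L: float, M: float, S: float) -> dict:
    """Recode three cone signals into three opponent channels.
    The coefficients are illustrative, not physiologically fitted."""
    return {
        "red_vs_green": L - M,              # positive = reddish, negative = greenish
        "blue_vs_yellow": S - (L + M) / 2,  # positive = bluish, negative = yellowish
        "light_vs_dark": L + M,             # a rough luminance signal
    }

# A lemon-like input: strong L and M, almost no S.
print(opponent_channels(L=0.9, M=0.8, S=0.05))
# red_vs_green ~ 0.1, blue_vs_yellow ~ -0.8 (firmly "yellow"), bright.
# No input can push one channel both ways at once, which is why a
# "reddish green" or "bluish yellow" is unimaginable.
```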
Parallel Pathways to the Brain
After photoreceptors convert light to electrical signals, those signals pass through several layers of neurons within the retina itself. By the time information leaves the eye, it has already been processed and sorted. The final output neurons of the retina are called retinal ganglion cells, and their long fibers bundle together to form the optic nerve. Each optic nerve contains just over one million nerve fibers on average, though this number varies significantly between individuals, ranging from about 400,000 to over 1.5 million.
These ganglion cells aren’t all the same. They divide into two major streams that carry different kinds of visual information in parallel:
- The parvocellular (P) pathway carries information about color and fine detail. P cells have small receptive fields, meaning they respond to tiny, precise areas of the visual scene. Damage to this pathway severely impairs the ability to see sharp detail and color but leaves motion perception intact.
- The magnocellular (M) pathway carries information about motion and rapid changes. M cells have large receptive fields and respond well to quickly moving stimuli but are essentially color-blind. Damage to this pathway sharply reduces the ability to perceive movement but barely affects sharpness or color vision.
This separation means your brain isn’t receiving a single video feed. It’s getting distinct channels of information about what things look like and how they’re moving, processed simultaneously.
The Relay Station: Lateral Geniculate Nucleus
The optic nerves from both eyes partially cross and then project to a structure in the thalamus called the lateral geniculate nucleus (LGN). The LGN is organized into six distinct layers. The bottom two layers contain large magnocellular neurons receiving motion-related input. The top four layers contain smaller parvocellular neurons receiving detail and color input. A third cell type, koniocellular cells, sits between these layers and contributes additional color-related information.
The LGN is not simply a relay. It receives far more input from higher brain areas than it does from the eyes themselves. Feedback signals from the visual cortex and from attention-regulating brain regions modulate what the LGN passes along. This means your brain is already filtering visual information before it reaches the cortex, potentially suppressing irrelevant signals and enhancing important ones. The LGN also keeps information from the left and right eyes separate at this stage, with specific layers dedicated to each eye.
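The layer scheme described above fits naturally into a small lookup table. The cell-type assignments follow the text; the eye-of-origin pattern is the standard anatomical arrangement, added here for concreteness:

```python
# LGN layer organization as a simple lookup, numbered bottom (1) to top (6).
# Cell types follow the description above; the eye-of-origin pattern
# (layers 1, 4, 6 for the opposite-side eye; 2, 3, 5 for the same-side
# eye) is the standard anatomical arrangement.
LGN_LAYERS = {
    1: {"cell_type": "magnocellular", "eye": "contralateral"},
    2: {"cell_type": "magnocellular", "eye": "ipsilateral"},
    3: {"cell_type": "parvocellular", "eye": "ipsilateral"},
    4: {"cell_type": "parvocellular", "eye": "contralateral"},
    5: {"cell_type": "parvocellular", "eye": "ipsilateral"},
    6: {"cell_type": "parvocellular", "eye": "contralateral"},
}
# Koniocellular cells occupy thin bands between these main layers.
```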
The Visual Cortex: Building a Picture
From the LGN, signals travel via a white-matter bundle called the optic radiation to the primary visual cortex (V1), located at the very back of the brain in the occipital lobe. V1 is where the raw, sorted signals from the eye begin to be assembled into recognizable visual features. Neurons in V1 respond preferentially to edges and bars of specific orientations. One neuron might fire strongly in response to a vertical line but barely respond to a horizontal one. Its neighbor might prefer lines tilted at 45 degrees. These orientation-selective neurons are arranged in an organized, smoothly varying pattern across the cortical surface, so that a small patch of V1 collectively covers all possible orientations for one small part of the visual field.
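A standard first-pass model of such a neuron is the Gabor filter: a sinusoidal grating windowed by a Gaussian envelope. The sketch below, with illustrative size and wavelength parameters, reproduces the characteristic tuning, a strong response at the preferred orientation and almost none at the orthogonal one:

```python
import numpy as np

def gabor(size=21, orientation_deg=0.0, wavelength=8.0, sigma=4.0):
    """A Gabor patch: a sinusoidal grating under a Gaussian window,
    the standard first-pass model of a V1 simple cell's receptive
    field. All parameter values here are illustrative."""
    theta = np.deg2rad(orientation_deg)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate the coordinate frame so the grating runs at the preferred angle.
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

def response(image, orientation_deg):
    """Dot product of image and filter: large when the image contains
    a bar or edge at the filter's preferred orientation."""
    return float(np.sum(image * gabor(orientation_deg=orientation_deg)))

stimulus = gabor(orientation_deg=0.0)  # reuse a patch as a vertical grating stimulus
print(response(stimulus, 0.0))   # strong: orientation matches
print(response(stimulus, 90.0))  # near zero: orthogonal orientation
```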
V1 also begins to combine information from both eyes. Neurons here compare the slightly different images arriving from each eye, a process essential for depth perception. Because your eyes are separated by a few centimeters, each one sees the world from a slightly different angle. An object closer to you produces a larger difference between the two retinal images than a distant one. Cells in V1 are tuned to detect these differences, encoding them primarily through variations in the spatial profile of their responses to each eye rather than through simple positional offsets. This is the neural foundation of stereoscopic depth perception.
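The underlying geometry is simple triangulation. In an idealized pinhole-stereo model (a simplification, not a claim about the cortical computation), depth is Z = f · B / d, where B is the separation between viewpoints, d the disparity, and f a focal-length constant. In the sketch below, the baseline approximates the human interocular distance, while the focal constant is invented for illustration:

```python
def depth_from_disparity(disparity_px: float,
                         baseline_m: float = 0.063,
                         focal_px: float = 800.0) -> float:
    """Triangulate depth in a simplified pinhole-stereo model:
    Z = f * B / d. The baseline approximates the ~6.3 cm human
    interocular distance; the focal constant is a made-up value."""
    return focal_px * baseline_m / disparity_px

# Larger disparity between the two images means a nearer object.
print(depth_from_disparity(50.0))  # ~1.0 m away
print(depth_from_disparity(5.0))   # ~10.1 m away
```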
The “What” and “Where” Streams
Beyond V1, visual information splits into two major processing streams that flow through different parts of the brain. The ventral stream runs along the underside of the brain toward the temporal lobe. It processes object properties like shape, texture, and identity. Key areas along this route include the lateral occipital cortex and the fusiform gyrus, regions essential for recognizing faces, objects, and scenes. This is often called the “what” pathway.
The dorsal stream flows upward toward the parietal lobe and processes spatial information: where objects are, how they’re moving, and how to interact with them physically. The intraparietal sulcus and surrounding areas of the posterior parietal cortex are central hubs in this stream. This “where” (or “how”) pathway is what allows you to reach out and catch a ball or navigate through a crowded room without bumping into people.
These two streams are not completely independent. Brain imaging studies show that dorsal regions also contribute to shape perception, and ventral regions show some activity during spatial tasks. The streams interact and share information, but each one carries a heavier load for its specialized function. Together, they allow you to simultaneously recognize your coffee mug (ventral stream) and guide your hand to pick it up (dorsal stream) without consciously thinking about either task.
Depth Perception Beyond Stereopsis
Binocular disparity is the most precise depth cue your brain uses, but it’s not the only one. Your visual system also extracts depth from monocular cues that work even with one eye closed. Objects that overlap others appear closer. Parallel lines that converge toward a point suggest distance. Texture gradients, where a surface’s pattern appears finer as it recedes, signal depth. Atmospheric haze makes distant mountains look paler and bluer than nearby ones. Motion parallax, the way nearby objects sweep past your field of view faster than distant ones when you move your head, provides another powerful depth signal.
Your brain combines all of these cues, weighting each one according to its reliability in the current situation. In a well-lit room with both eyes open, binocular disparity dominates. On a foggy highway, atmospheric cues and motion parallax become more important. This flexible, multi-source approach to depth perception is part of why vision feels so effortlessly three-dimensional even in widely varying conditions.
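This kind of weighting is commonly modeled as reliability-weighted averaging, in which each cue’s weight is proportional to the inverse of its variance. A minimal sketch with illustrative numbers:

```python
def combine_cues(estimates, variances):
    """Reliability-weighted cue combination: each cue's weight is its
    inverse variance, so noisier cues count for less. Standard
    'optimal integration' model; all numbers below are illustrative."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fused = sum(w * e for w, e in zip(weights, estimates)) / total
    fused_variance = 1.0 / total  # the fused estimate beats any single cue
    return fused, fused_variance

# Two depth estimates for the same object: stereo says 2.0 m and is
# precise; motion parallax says 2.4 m but is noisier.
depth, var = combine_cues([2.0, 2.4], [0.01, 0.09])
print(round(depth, 2), round(var, 4))  # 2.04 0.009 -- stereo dominates
```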

