How Autonomous Vehicles Work: Sensors to Software

An autonomous vehicle works by combining sensors that observe the world, software that interprets and plans, and electronic controls that physically steer, brake, and accelerate the car. Every fraction of a second, the vehicle repeats a loop: perceive the surroundings, figure out where it is, decide what to do, then execute that decision through its mechanical systems. The complexity lives in each of those steps and how they work together.
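That loop can be sketched in a few lines of code. The four subsystem names below are hypothetical stand-ins for what each stage of a real stack would provide:

```python
def run_cycle(sensors, localizer, planner, actuator):
    """One pass through the perceive -> localize -> plan -> act loop.
    All four callables are illustrative stand-ins for real subsystems."""
    scene = sensors()               # raw observations of the surroundings
    pose = localizer(scene)         # where the vehicle is and which way it faces
    command = planner(scene, pose)  # steering / throttle / brake decision
    actuator(command)               # hand the decision to the drive-by-wire layer
    return command

# Stub subsystems showing the data flow; a real stack repeats this many times per second.
log = []
cmd = run_cycle(
    sensors=lambda: {"obstacle_ahead_m": 40.0},
    localizer=lambda scene: {"x": 0.0, "y": 0.0, "heading": 0.0},
    planner=lambda scene, pose: {"throttle": 0.2 if scene["obstacle_ahead_m"] > 30 else 0.0},
    actuator=log.append,
)
```

The rest of this article walks through what each of those four stages actually does.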

Sensors: How the Car Sees

Self-driving cars don’t rely on a single type of sensor. They layer multiple technologies on top of each other so each one compensates for the others’ weaknesses. The three primary sensors are cameras, radar, and LiDAR.

Cameras capture color and texture, reading lane markings, traffic lights, signs, and the general shape of the road. They work much like your eyes but struggle in low light, heavy rain, or direct glare. They’re also the cheapest sensor in the stack, which is why every autonomous vehicle uses several of them, often eight to twelve, pointed in every direction.

Radar sends out radio waves with wavelengths between 3 millimeters and 30 centimeters and measures how they bounce back. It excels at detecting how far away an object is and how fast it’s moving, even several hundred meters out. Critically, radar works through fog, rain, snow, dust, and darkness. It’s the most weather-resistant sensor on the car. Newer 4D radar systems add the ability to estimate an object’s elevation and rough shape, not just its distance and speed.
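The two quantities radar measures best follow from simple physics: distance from the pulse's round-trip time, and relative speed from the Doppler shift of the return. A minimal sketch; the 77 GHz carrier is a typical automotive radar band, and the specific numbers are illustrative:

```python
C = 299_792_458.0  # speed of light in m/s

def radar_range(round_trip_s):
    """Distance from a radio pulse's round-trip time: d = c * t / 2."""
    return C * round_trip_s / 2.0

def radial_speed(doppler_shift_hz, carrier_hz):
    """Relative speed from the Doppler shift of the reflected wave.
    A reflection shifts the frequency by 2 * v * f / c, so v = shift * c / (2 * f)."""
    return doppler_shift_hz * C / (2.0 * carrier_hz)

# A pulse returning after ~1 microsecond places the target about 150 m away:
distance = radar_range(2 * 150.0 / C)
# A 77 GHz radar seeing a ~5.13 kHz Doppler shift implies a closing speed near 10 m/s:
speed = radial_speed(5133.0, 77e9)
```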

LiDAR fires rapid pulses of near-infrared laser light (wavelengths around 905 to 1,550 nanometers) and times how long each pulse takes to return. The result is a dense 3D point cloud, essentially a detailed depth map of everything around the vehicle with centimeter-level accuracy at typical operating ranges. LiDAR gives the car precise information about object shapes and drivable space. Its weakness is adverse weather: rain, fog, and wet or very dark surfaces can scatter or absorb the laser pulses, reducing accuracy. The technology is shifting toward solid-state designs without spinning parts, which lowers cost and improves durability.
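Each LiDAR return becomes one point in the cloud the same way: half the round-trip time gives range, and the beam's azimuth and elevation angles place the point in 3D space. A sketch, with illustrative angles:

```python
import math

def pulse_to_point(round_trip_s, azimuth_rad, elevation_rad, c=299_792_458.0):
    """Convert one LiDAR return into an (x, y, z) point: range from
    time of flight, position from the beam's pointing angles."""
    r = c * round_trip_s / 2.0
    x = r * math.cos(elevation_rad) * math.cos(azimuth_rad)
    y = r * math.cos(elevation_rad) * math.sin(azimuth_rad)
    z = r * math.sin(elevation_rad)
    return (x, y, z)

# A return after ~67 nanoseconds, straight ahead, is a point about 10 m away:
x, y, z = pulse_to_point(2 * 10.0 / 299_792_458.0, azimuth_rad=0.0, elevation_rad=0.0)
```

Repeating this for hundreds of thousands of pulses per second is what produces the dense point cloud.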

By fusing data from all three sensor types together, the car builds a rich, layered model of its environment that no single sensor could produce alone. Cameras identify what something is (a pedestrian, a stop sign), radar tracks how fast it’s moving, and LiDAR maps its exact shape and position in 3D space.
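At its simplest, fusion means letting each sensor contribute the attribute it measures best. The field names below are illustrative, not a real stack's schema:

```python
def fuse_detection(camera, radar, lidar):
    """Toy late-fusion step: merge one object's per-sensor measurements
    into a single track, taking each attribute from the sensor best at it."""
    return {
        "label": camera["label"],           # cameras classify what the object is
        "speed_mps": radar["speed_mps"],    # radar measures relative speed
        "position_m": lidar["position_m"],  # LiDAR pins down 3D position...
        "extent_m": lidar["extent_m"],      # ...and physical size
    }

track = fuse_detection(
    camera={"label": "pedestrian"},
    radar={"speed_mps": 1.4},
    lidar={"position_m": (12.0, -2.5, 0.0), "extent_m": (0.6, 0.6, 1.7)},
)
```

Real systems fuse probabilistically and must also decide which sensor to trust when they disagree, but the division of labor is the same.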

Localization: Knowing Exactly Where It Is

GPS alone isn’t accurate enough for driving. Standard GPS can drift by several meters, which is the difference between being in your lane and in oncoming traffic. Self-driving cars need to know their position to within roughly 10 to 25 centimeters.

To achieve this, vehicles compare their real-time sensor data against pre-built high-definition maps. These HD maps contain precise information about lane boundaries, curb heights, traffic signal positions, and road geometry. The car’s LiDAR or camera scans the environment and matches what it sees to the map, calculating its exact position and orientation. Techniques like RTK (Real-Time Kinematic positioning, which corrects GPS signals using a fixed reference station) can reach centimeter-level accuracy, and point-cloud matching algorithms achieve similar precision by aligning laser scans against stored map data.

A related technique called SLAM (Simultaneous Localization and Mapping) estimates the car’s position relative to its previous position, frame by frame, like tracking your own movement through a room. SLAM is useful for filling gaps when map-matching data is momentarily unavailable, but on its own it can’t pinpoint the car’s absolute location on Earth. Most self-driving systems combine map-matching for absolute position with SLAM for smooth, continuous tracking between map updates.
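The relative-tracking idea can be illustrated with simple dead reckoning: integrate speed and turn rate to advance the pose from one frame to the next. A sketch; real systems blend this with the absolute map-based fix whenever one is available:

```python
import math

def dead_reckon(pose, speed_mps, yaw_rate_rps, dt):
    """Advance an (x, y, heading) pose from odometry-style measurements.
    This is the frame-to-frame relative tracking described above; errors
    accumulate over time, which is why an absolute fix is needed too."""
    x, y, heading = pose
    heading += yaw_rate_rps * dt
    x += speed_mps * dt * math.cos(heading)
    y += speed_mps * dt * math.sin(heading)
    return (x, y, heading)

# Driving straight at 10 m/s for one second, updated ten times:
pose = (0.0, 0.0, 0.0)
for _ in range(10):
    pose = dead_reckon(pose, speed_mps=10.0, yaw_rate_rps=0.0, dt=0.1)
# pose is now approximately (10.0, 0.0, 0.0)
```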

Perception: Making Sense of the Scene

Raw sensor data is just points, pixels, and radio reflections. The perception system uses neural networks to turn this data into meaningful information: there’s a cyclist two lanes over, that shape is a parked truck, this patch of road is drivable.

Object detection algorithms classify everything in view (cars, pedestrians, traffic cones, animals) and draw 3D boundaries around them. Object tracking follows each detected item across multiple frames so the car knows not just where things are, but where they’re heading and how fast. Lane detection identifies road boundaries and markings. Traffic light and sign recognition reads signals and posted rules.
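Tracking can be sketched as associating each existing track with its nearest new detection and updating the speed estimate. The version below uses one-dimensional positions and greedy matching for brevity; production systems use Kalman-style filters over full 3D state:

```python
def update_tracks(tracks, detections, dt, max_match_m=2.0):
    """Greedy nearest-neighbour association between tracks and detections.
    tracks: list of (track_id, position_m, speed_mps); detections: positions.
    Illustrative only -- real trackers also spawn and retire tracks."""
    updated = []
    free = list(detections)
    for tid, pos, vel in tracks:
        if not free:
            break
        nearest = min(free, key=lambda d: abs(d - pos))
        if abs(nearest - pos) <= max_match_m:
            free.remove(nearest)
            # New speed estimate from how far the object moved this frame:
            updated.append((tid, nearest, (nearest - pos) / dt))
    return updated

# A track at 10 m matched to a detection at 10.5 m over 0.1 s -> moving ~5 m/s:
tracks = update_tracks([("veh-1", 10.0, 0.0)], [10.5, 40.0], dt=0.1)
```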

All of this processing happens on powerful onboard computers, often using specialized chips designed for running neural networks. The entire perception cycle repeats many times per second to keep the car’s understanding of the world current.

Planning and Decision-Making

Once the car knows where it is and what’s around it, it needs to decide what to do. This happens at two levels.

Global path planning determines the overall route from point A to point B, similar to what a navigation app does. It uses road network data and algorithms (like variations of A*, a classic route-finding method) to pick the most efficient path, accounting for road closures, traffic, and turn restrictions.
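A textbook A* over a toy road network looks like this; the graph and costs are illustrative (with a zero heuristic, as here, A* reduces to Dijkstra's algorithm):

```python
import heapq

def a_star(graph, start, goal, heuristic):
    """Classic A* search. graph maps node -> list of (neighbor, edge_cost);
    heuristic(n) must not overestimate the remaining cost to the goal
    (e.g. straight-line distance on a road network)."""
    frontier = [(heuristic(start), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        for nxt, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(frontier,
                               (new_cost + heuristic(nxt), new_cost, nxt, path + [nxt]))
    return None, float("inf")

# A tiny road network where the route through B beats the route through C:
roads = {"A": [("B", 1.0), ("C", 4.0)], "B": [("D", 1.0)], "C": [("D", 1.0)], "D": []}
path, cost = a_star(roads, "A", "D", heuristic=lambda n: 0.0)
```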

Local path planning handles the moment-to-moment driving: staying in the lane, changing lanes to pass a slow vehicle, yielding to a pedestrian, or navigating around a double-parked delivery truck. These decisions happen in real time and rely on algorithms that rapidly generate and evaluate possible paths. One common family of algorithms, called RRT (Rapidly-exploring Random Trees), works by branching out possible trajectories like a growing tree, checking each one for collisions, then selecting and smoothing the best option using curves that the car can physically follow. The car also predicts what other road users are likely to do in the next few seconds, adjusting its plan accordingly.
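A minimal 2D RRT sketch, with an illustrative disc-shaped obstacle; a real planner would also check entire path segments for collisions and smooth the result into curves the car can follow:

```python
import math
import random

def rrt(start, goal, is_free, step=1.0, goal_tol=2.0, iters=4000, seed=7):
    """Grow a tree from start toward random samples in a 20x20 m area,
    keep only collision-free extensions, stop when a node lands near goal."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {start: None}
    for _ in range(iters):
        sample = (rng.uniform(0.0, 20.0), rng.uniform(0.0, 20.0))
        near = min(nodes, key=lambda n: math.dist(n, sample))  # closest tree node
        d = math.dist(near, sample)
        if d == 0.0:
            continue
        # Extend one step from the nearest node toward the sample:
        new = (near[0] + step * (sample[0] - near[0]) / d,
               near[1] + step * (sample[1] - near[1]) / d)
        if not is_free(new):
            continue  # extension would collide; discard it
        nodes.append(new)
        parent[new] = near
        if math.dist(new, goal) <= goal_tol:
            path = [new]          # walk parent links back to the start
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
    return None

# Free space everywhere except a 2 m disc obstacle at (10, 10):
free = lambda p: math.dist(p, (10.0, 10.0)) > 2.0
path = rrt((1.0, 1.0), (18.0, 18.0), free)
```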

Speed control, gap selection during merges, and the timing of braking all fall under this layer. The planner must balance safety, comfort, traffic rules, and efficiency in every decision.

Modular vs. End-to-End Software

There are two broad philosophies for how all this software is organized. The modular approach, which is more common in industry today, divides the driving task into separate components: perception, localization, planning, and control. Each module has clearly defined inputs and outputs, making the system’s behavior more predictable and easier to debug. If the car makes a mistake, engineers can trace the error to a specific module. The tradeoff is that information gets lost between modules. When the perception system boils a camera image down to a set of bounding boxes around objects, every other detail in those pixels is discarded.

The end-to-end approach trains a single large neural network to go directly from raw sensor input to driving output (steering, braking, acceleration). This lets the system learn to use information that a modular pipeline might throw away. It can also adapt more fluidly to unusual situations. The major downside is interpretability: when the system makes a decision, it’s difficult to understand why, which creates challenges for safety validation and regulatory approval.

Many current self-driving systems use a hybrid, applying neural networks heavily within individual modules while keeping the overall pipeline structured and inspectable.

Drive-by-Wire: Turning Decisions Into Motion

The physical layer that translates the computer’s decisions into actual vehicle movement is called drive-by-wire. In a conventional car, your steering wheel is mechanically connected to the front wheels, and your brake pedal physically pushes hydraulic fluid. In a drive-by-wire system, those mechanical links are replaced with electronic signals.

For steering, the software sends a target wheel angle as an electronic signal to a motor attached to the steering column. A sensor reads the current steering angle, and a control loop continuously adjusts the motor until the wheels reach the desired angle. For acceleration, the software sends a voltage signal directly to the motor controller (in electric vehicles) or to an electronic throttle. The controller adjusts power output to match the requested speed, using an encoder on the motor to continuously verify actual speed against the target. Braking works on the same principle: an electronic signal commands brake actuators rather than relying on a physical pedal connection.
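That "read the angle, adjust the motor, repeat" pattern is a feedback control loop. Here is a toy proportional version, with the steering plant simulated as a variable the motor nudges directly; real systems use full PID control plus rate and torque limits:

```python
def steer_to_angle(target_rad, read_angle, command_motor, kp=4.0, dt=0.01, steps=600):
    """Toy proportional steer-by-wire loop: measure the steering angle,
    command motor effort in proportion to the remaining error, repeat."""
    for _ in range(steps):
        error = target_rad - read_angle()
        command_motor(kp * error * dt)  # incremental motor effort this cycle
    return read_angle()

# Simulated plant: the "motor" simply adds its commanded increment to the angle.
state = {"angle": 0.0}
final = steer_to_angle(
    target_rad=0.3,
    read_angle=lambda: state["angle"],
    command_motor=lambda delta: state.__setitem__("angle", state["angle"] + delta),
)
# final converges to approximately 0.3 rad
```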

This electronic control is what makes autonomy possible. It gives the computer precise, repeatable authority over every aspect of vehicle movement.

Vehicle-to-Everything Communication

Beyond its own sensors, an autonomous vehicle can communicate with the world around it through a technology called V2X, short for vehicle-to-everything. This includes vehicle-to-vehicle (V2V) communication, where cars share speed, position, and braking data with each other, and vehicle-to-infrastructure (V2I), where traffic signals, construction zones, or toll systems send information directly to the car. Vehicle-to-pedestrian (V2P) communication is also under development, potentially alerting the car to nearby smartphone-carrying pedestrians around blind corners.

V2X extends the car’s awareness beyond what its sensors can physically see. A vehicle three cars ahead slamming on its brakes, a traffic light about to change, or an emergency vehicle approaching from two blocks away can all be communicated electronically before the car’s cameras or radar would detect them.
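The payload of such a broadcast can be sketched as a small message type plus a relevance filter. The fields below are illustrative, not the actual SAE J2735 Basic Safety Message schema:

```python
import math
from dataclasses import dataclass

@dataclass
class V2VMessage:
    """Illustrative vehicle-to-vehicle broadcast."""
    sender_id: str
    position_m: tuple   # (x, y) in a shared local frame
    speed_mps: float
    hard_braking: bool

def relevant_warnings(messages, own_position, radius_m=150.0):
    """Keep hard-braking alerts from vehicles within radius_m, including
    ones the car's own sensors cannot yet see."""
    return [m for m in messages
            if m.hard_braking and math.dist(m.position_m, own_position) <= radius_m]

msgs = [
    V2VMessage("car-42", (80.0, 0.0), 25.0, hard_braking=True),   # a few cars ahead
    V2VMessage("car-77", (900.0, 0.0), 30.0, hard_braking=True),  # too far to matter yet
]
alerts = relevant_warnings(msgs, own_position=(0.0, 0.0))
```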

The Six Levels of Automation

The SAE International standard defines six levels of driving automation, numbered 0 through 5, which describe how much the car handles versus how much you do.

  • Level 0: No automation. The car may have warnings like a blind-spot alert, but you do all the driving.
  • Level 1: The car can help with either steering or speed (like adaptive cruise control or lane-keeping), but not both simultaneously. You must constantly supervise and intervene as needed.
  • Level 2: The car can handle both steering and speed at the same time. You still need to stay alert and take over when the system requests it. Most “self-driving” features on consumer cars today, like Tesla’s Autopilot or GM’s Super Cruise, operate here.
  • Level 3: The car drives itself in certain conditions, and you don’t need to monitor the road while the feature is active. However, you must be ready to take over when the system asks, typically with a few seconds’ notice.
  • Level 4: The car handles all driving within its defined operating area (a specific city, a highway corridor, a geo-fenced zone) and will not ask you to take over. If it encounters a situation it can’t handle, it pulls over and stops safely on its own. Robotaxis from companies like Waymo operate at this level.
  • Level 5: Full automation everywhere, under all conditions, with no human intervention needed and no steering wheel required. This level does not exist in any commercially available vehicle today.

The jump from Level 2 to Level 3 is the most significant boundary. Below it, you are always responsible for driving. At Level 3 and above, the automated system performs the entire driving task while it is engaged: the human serves only as a fallback at Level 3, and is not needed at all at Levels 4 and 5.