What Is Packet Coalescing? How It Works and When to Use It

Packet coalescing is a technique where your network adapter collects multiple incoming packets and delivers them to the CPU in a single batch, rather than interrupting the processor for every individual packet that arrives. This reduces CPU overhead, lowers power consumption, and improves throughput, especially during high-bandwidth transfers. The trade-off is a small amount of added latency, typically measured in microseconds.

How Packet Coalescing Works

Every time your computer receives data over a network, the network interface card (NIC) needs to tell the CPU that new information has arrived. It does this by triggering a hardware interrupt, which forces the processor to pause whatever it’s doing and handle the incoming data. Without coalescing, the NIC fires an interrupt for every single packet. At gigabit speeds and above, that can mean tens of thousands of interrupts per second, each one stealing CPU cycles from everything else your system is trying to do.
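You can watch this interrupt load directly on a Linux machine. The snippet below is a rough sketch: it assumes the driver names its IRQ lines after the interface (eth0 here), which is not true of every driver, so adjust the grep pattern for your hardware.

```shell
# Snapshot the NIC's interrupt counters one second apart and report
# the rate. Assumes Linux; the grep pattern must match how your
# driver labels its IRQ lines in /proc/interrupts.
count() { grep eth0 /proc/interrupts | awk '{for (i = 2; i <= NF; i++) if ($i ~ /^[0-9]+$/) s += $i} END {print s + 0}'; }
a=$(count); sleep 1; b=$(count)
echo "interrupts/sec: $((b - a))"
```

On a saturated gigabit link with coalescing disabled, this number can climb into the tens of thousands; with coalescing enabled it drops sharply.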

Packet coalescing (also called interrupt coalescing or interrupt moderation) changes this by having the NIC hold incoming packets in a small buffer. Once the buffer reaches a set number of packets or a set amount of time has passed, the NIC fires one interrupt for the entire batch. The CPU then processes all the buffered packets at once. Research from the University of Michigan found that setting the minimum interval between interrupts to just 10 microseconds was enough to substantially reduce interrupt overhead during high-bandwidth workloads.

Packets belonging to the same data stream are grouped together as a block within each interrupt cycle. This grouping matters because it lets the CPU work through related data in one pass, with better cache locality, instead of bouncing between unrelated streams.

The Latency Trade-Off

The cost of coalescing is delay. When the NIC holds packets in a buffer instead of delivering them immediately, each packet sits waiting a little longer before the CPU sees it. Under light network loads, this added latency is roughly equal to the buffer timer setting. Under heavy loads, the buffer fills quickly and fires interrupts more frequently anyway, so the added delay shrinks.

For most workloads, this trade-off is overwhelmingly positive. Research published through USENIX showed that even with microsecond-scale delays on individual packets, deliberate packet batching increased throughput for a network service chain by up to 84% and reduced page load times for a web server by 11% while boosting its throughput by 20%. The key insight is that a tiny, controlled delay on individual packets often produces large gains in overall system performance.

The applications most sensitive to coalescing delays are those that need the absolute lowest possible latency on every single packet: high-frequency trading systems, real-time audio/video communication, and certain gaming servers. For these workloads, administrators sometimes disable or reduce coalescing to shave off microseconds of delay, accepting higher CPU usage as the cost.

TCP vs. UDP Support

Coalescing packet payloads, as opposed to merely batching interrupt notifications, works with TCP traffic but not UDP. The reason comes down to how the two protocols structure data. TCP is stream-based, meaning there are no rigid boundaries between chunks of data. The NIC can safely merge several smaller TCP segments into one larger segment before handing it to the CPU. This is the basis for features like Large Receive Offload (LRO) on Linux and Receive Segment Coalescing (RSC) on Windows.
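On Linux you can inspect and toggle this with ethtool's offload flags (a sketch; eth0 is a placeholder, and many drivers expose only the kernel's software equivalent, Generic Receive Offload, rather than hardware LRO):

```shell
# Show whether large-receive-offload (or its software cousin,
# generic-receive-offload) is active on this interface.
ethtool -k eth0 | grep -E 'large-receive-offload|generic-receive-offload'

# Turn LRO off, e.g. on a router or bridge, where merged segments
# must not be forwarded onward.
ethtool -K eth0 lro off
```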

UDP, by contrast, treats each packet as a self-contained message with defined boundaries. Merging two UDP packets into one would destroy the structure that applications expect. So while interrupt coalescing (batching the notifications) still works for UDP traffic, the deeper optimization of merging packet contents into larger segments is limited to TCP.

Power Savings on Mobile Devices

Packet coalescing has a significant impact on battery life. Mobile devices spend a lot of energy waking the processor from sleep states to handle incoming network traffic. Real-world network workloads tend to be bursty and random: a push notification here, a background sync there, each one potentially waking the CPU from a low-power state.

By bundling these arrivals together, coalescing lets the processor stay asleep longer between wake events. Research published in IEEE’s Journal on Selected Areas in Communications found that an adaptive traffic coalescing scheme reduced power consumption by around 20% for real-world internet workloads without noticeably affecting performance or user experience. Microsoft’s NDIS 6.30 specification explicitly added packet coalescing support to reduce the processing overhead and power consumption caused by random broadcast and multicast packets, the kind of background network chatter that would otherwise keep waking the CPU for packets the system doesn’t even need.

Configuring Coalescing on Linux

On Linux systems, you control packet coalescing through the ethtool command. To see your current coalescing settings for a network interface, run:

ethtool -c eth0

To change settings, use the -C flag. The two most commonly adjusted parameters are:

  • rx-usecs N: the maximum number of microseconds the NIC waits before firing an interrupt after receiving a packet
  • rx-frames N: the maximum number of packets the NIC collects before firing an interrupt

Whichever threshold is reached first triggers the interrupt. For example, setting rx-usecs to 50 and rx-frames to 64 means the NIC will interrupt the CPU after 50 microseconds or 64 packets, whichever comes first. The same parameters exist for the transmit side (tx-usecs and tx-frames).
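As a concrete sketch of that example (eth0 and the values are placeholders, and not every driver accepts every parameter):

```shell
# Hold interrupts until 50 microseconds have passed or 64 packets
# have arrived, whichever comes first.
ethtool -C eth0 rx-usecs 50 rx-frames 64

# The transmit side is tuned the same way.
ethtool -C eth0 tx-usecs 50 tx-frames 64
```

Run ethtool -c afterwards to confirm the driver actually accepted the values; some drivers silently clamp or ignore settings they don't support.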

Many modern NICs also support adaptive coalescing, which automatically adjusts these thresholds based on current traffic patterns. You enable it with adaptive-rx on or adaptive-tx on. Adaptive mode raises the coalescing thresholds during heavy traffic (to reduce interrupt overhead) and lowers them during light traffic (to reduce latency). For most general-purpose servers, adaptive mode is a reasonable default. If you need per-queue control on multi-queue NICs, ethtool supports that through the -Q flag with a queue mask.
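In ethtool syntax, that looks something like the following (eth0, the queue mask, and the values are placeholders; per-queue support in particular varies by driver):

```shell
# Let the NIC adjust its own thresholds based on traffic load.
ethtool -C eth0 adaptive-rx on adaptive-tx on

# Per-queue tuning: apply rx-usecs 10 only to queues 0 and 1
# (queue_mask 0x3). Requires a driver that implements per-queue
# coalescing.
ethtool -Q eth0 queue_mask 0x3 --coalesce rx-usecs 10
```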

When to Adjust Coalescing Settings

The default coalescing settings on most operating systems work well for typical workloads, but there are situations where tuning helps. If you’re running a high-throughput file server or streaming large datasets, increasing the coalescing thresholds can meaningfully reduce CPU usage by processing more packets per interrupt. On a busy 10-gigabit link, the difference between handling one packet per interrupt and handling dozens per interrupt translates directly into CPU cycles available for actual application work.

Going the other direction, if you’re running a latency-sensitive application and notice that response times have a floor suspiciously close to your coalescing timer, reducing or disabling coalescing may help. Setting rx-usecs to 0 tells the NIC to interrupt immediately on every packet. Your CPU usage will climb, but each packet reaches your application as quickly as the hardware allows.
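The commands for that (eth0 is again a placeholder; some drivers want rx-frames 1 alongside rx-usecs 0, and adaptive mode must be off or it will override the fixed values):

```shell
# Make sure adaptive mode isn't silently overriding fixed values,
# then interrupt on every packet: no timer, no frame batching.
ethtool -C eth0 adaptive-rx off rx-usecs 0 rx-frames 1
```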

One subtle problem to watch for: on very fast integrated NICs, the CPU can process interrupts so quickly that it ends up handling fewer packets per interrupt than intended, which actually increases interrupt overhead rather than reducing it. If you see unexpectedly high interrupt counts despite having coalescing enabled, check whether your coalescing timers are set too low for your hardware’s interrupt processing speed.
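A rough way to estimate packets per interrupt (same assumptions as before: a Linux box whose driver mentions the interface name in /proc/interrupts; the interface name and the window length are placeholders):

```shell
# Estimate received packets per NIC interrupt over a 5-second window.
irqs() { grep eth0 /proc/interrupts | awk '{for (i = 2; i <= NF; i++) if ($i ~ /^[0-9]+$/) s += $i} END {print s + 0}'; }
pkts() { awk '$1 == "eth0:" {print $3}' /proc/net/dev; }
i0=$(irqs); p0=$(pkts)
sleep 5
i1=$(irqs); p1=$(pkts)
if [ "$((i1 - i0))" -gt 0 ]; then
  # A low ratio with coalescing enabled suggests the timers are set
  # too low for how fast this CPU services interrupts.
  echo "packets per interrupt: $(( (p1 - p0) / (i1 - i0) ))"
else
  echo "no NIC interrupts observed; check the grep pattern"
fi
```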