Graphics cards are measured by a combination of specs that describe their processing power, speed, memory, power draw, and real-world performance. No single number tells the whole story. Understanding what each metric actually means helps you compare cards and figure out which one matches your needs.
Processing Cores: CUDA Cores and Stream Processors
The most fundamental measure of a GPU’s power is its core count, but the terminology depends on the manufacturer. NVIDIA calls its parallel processing units CUDA cores, while AMD calls them stream processors. Both serve the same basic purpose: they divide rendering tasks into thousands of tiny pieces and process them simultaneously.
A common mistake is comparing these numbers directly between brands. A card with 4,000 CUDA cores is not equivalent to a card with 4,000 stream processors. The two architectures are fundamentally different at a hardware level, so a 1:1 comparison doesn’t work. AMD cards tend to have higher core counts than similarly priced NVIDIA cards, but that doesn’t automatically mean more performance. Core count is useful for comparing cards within the same brand and generation, not across them.
Clock Speed: Base, Boost, and Game Clocks
Clock speed, measured in megahertz (MHz) or gigahertz (GHz), tells you how many cycles a GPU completes per second. A higher clock speed means the card can crunch through more calculations in the same amount of time, all else being equal.
You’ll typically see two or three clock speed ratings on a spec sheet. The base clock is the minimum guaranteed speed the GPU runs at under load. The boost clock is the maximum speed the card can hit when temperatures and power delivery allow it. AMD also uses a “game clock,” which represents the average speed you can expect during typical gaming rather than the absolute peak. In practice, most modern GPUs spend the majority of their time running near or at their boost clock, throttling down only when they get too hot or hit a power limit.
TFLOPS: Theoretical Computing Power
TFLOPS, or teraflops, measures how many trillion mathematical operations a GPU can perform per second on floating-point numbers. It's calculated from the core count and clock speed, and it gives you a rough sense of a card's raw computational muscle. A card rated at 25 TFLOPS is theoretically capable of 25 trillion calculations per second.
The key word is "theoretically." TFLOPS ratings represent an upper limit that real applications almost never reach. The biggest bottleneck is usually memory bandwidth, meaning the GPU can calculate faster than data can be fed to it. Real workloads also include overhead like thread management, memory allocation, and program control flow that eats into raw throughput. Think of it like a car's top speed: a 200 mph rating tells you something about the engine, but you'll never hit that number in traffic. TFLOPS is most useful for comparing cards within the same architecture, where the gap between theoretical and actual performance tends to be consistent from card to card.
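The back-of-the-envelope math behind an FP32 TFLOPS rating is straightforward: it's usually core count times clock speed, doubled because each core can issue a fused multiply-add (one multiply plus one add) per cycle. A rough sketch, using made-up numbers rather than any specific product:

```python
def fp32_tflops(cores: int, boost_clock_ghz: float) -> float:
    """Theoretical FP32 throughput in TFLOPS.

    Each shader core is assumed to issue one fused multiply-add (FMA)
    per clock cycle, which counts as two floating-point operations.
    """
    ops_per_second = cores * boost_clock_ghz * 1e9 * 2
    return ops_per_second / 1e12

# Illustrative only: a hypothetical 10,000-core GPU boosting to 2.5 GHz
print(fp32_tflops(10_000, 2.5))  # 50.0 TFLOPS
```

Run the formula against a real spec sheet and you'll typically land within rounding distance of the advertised number, which is a useful reminder that the rating is arithmetic, not a measurement.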
Texture Units and Render Output Units
Two lesser-known specs appear on detailed spec sheets: TMUs (texture mapping units) and ROPs (render output units). TMUs handle wrapping textures onto 3D objects. More TMUs generally means faster texture processing, which matters in visually complex scenes. ROPs assemble the final image that gets sent to your display and handle tasks like antialiasing. More ROPs help at higher resolutions where the card needs to output more pixels per frame.
These numbers rarely make or break a purchasing decision on their own, but they explain why two cards with similar core counts can perform differently. A card with a low TMU-to-core ratio can bottleneck on texture-heavy scenes, while a card with fewer ROPs may struggle more at 4K than its core count would suggest.
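TMU and ROP counts translate into the "fill rate" figures you sometimes see on spec sheets: texture fill rate is roughly TMUs times clock speed (in gigatexels per second), and pixel fill rate is roughly ROPs times clock speed (in gigapixels per second). A minimal sketch of that arithmetic, with illustrative numbers rather than a real card's:

```python
def fill_rates(tmus: int, rops: int, clock_ghz: float) -> tuple[float, float]:
    """Theoretical texture and pixel fill rates.

    Assumes one texel per TMU and one pixel per ROP per clock cycle,
    which is the convention spec sheets typically use.
    """
    texture_gtexels = tmus * clock_ghz  # GTexels/s
    pixel_gpixels = rops * clock_ghz    # GPixels/s
    return texture_gtexels, pixel_gpixels

# Hypothetical card: 300 TMUs, 100 ROPs, 2.0 GHz boost clock
print(fill_rates(300, 100, 2.0))  # (600.0, 200.0)
```

Comparing the ratio of these two numbers between cards is one quick way to see which card leans toward texture throughput versus raw pixel output.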
Specialized Cores: RT and Tensor
Modern NVIDIA GPUs include two types of specialized hardware beyond their standard CUDA cores. RT cores accelerate ray tracing, which simulates how light bounces off surfaces to produce realistic reflections, shadows, and lighting. Without dedicated RT cores, ray tracing calculations would fall entirely on the general-purpose cores, dragging performance down dramatically.
Tensor cores handle AI-related tasks, specifically matrix math. Their most visible application in gaming is DLSS, NVIDIA's AI upscaling technology, which uses tensor cores to reconstruct a high-resolution image from a lower-resolution render. In professional workloads like machine learning, tensor cores can achieve orders of magnitude better performance on matrix operations compared to standard CUDA cores alone. AMD has its own ray tracing hardware, called Ray Accelerators, in RDNA 2 and later cards, though it doesn't use a separate "tensor core" branding for AI acceleration.
Memory: VRAM Amount and Bandwidth
VRAM (video memory) stores textures, frame buffers, and other data the GPU needs quick access to. It’s measured in gigabytes, and the amount you need depends on your resolution and the games or applications you run. At 1080p, 8 GB is comfortable for most games. At 4K with high-resolution texture packs, 12 GB or more becomes important.
Just as critical is memory bandwidth, measured in GB/s, which describes how fast data can move between VRAM and the GPU’s processing cores. A card with plenty of VRAM but low bandwidth can still bottleneck, because the cores sit idle waiting for data. This is why the type of memory matters too: GDDR6X is faster than GDDR6, and both are significantly faster than older GDDR5.
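The bandwidth figure on a spec sheet is derived from two other numbers: the memory's effective data rate (in gigabits per second per pin) and the width of the memory bus (in bits). Multiply them and divide by 8 to convert bits to bytes. A minimal sketch:

```python
def memory_bandwidth_gbs(data_rate_gbps_per_pin: float,
                         bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s.

    data_rate_gbps_per_pin: effective data rate per pin (e.g. 21 for
    21 Gbps GDDR6X), bus_width_bits: memory bus width (e.g. 384).
    """
    return data_rate_gbps_per_pin * bus_width_bits / 8  # bits -> bytes

# Example: 21 Gbps GDDR6X on a 384-bit bus
print(memory_bandwidth_gbs(21, 384))  # 1008.0 GB/s
```

This is why a wider bus or faster memory type can matter as much as the raw VRAM amount: doubling either input doubles the bandwidth.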
Power Draw: TDP, TGP, and TBP
Power consumption is measured in watts, but manufacturers use slightly different terminology. NVIDIA uses TGP (Total Graphics Power), which represents the total power the entire graphics card draws, not just the chip itself. AMD uses TBP (Total Board Power), which means essentially the same thing. The older term TDP (Thermal Design Power) refers specifically to how much heat the cooling system needs to handle, and it typically describes the chip alone rather than the full card.
For practical purposes, the TGP or TBP number tells you what your power supply needs to deliver to the card. A card rated at 300W TGP needs a power supply with enough overhead to handle that draw alongside the rest of your system. Power ratings also give you a sense of how much heat the card generates, which affects noise levels from the cooling fans.
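A common rule of thumb for sizing the power supply is to add the card's TGP to an estimate for the rest of the system, then leave headroom so the PSU runs well below its rated maximum. The numbers below (250 W for the rest of the system, targeting roughly 80% load) are assumptions for illustration, not a standard:

```python
import math

def recommended_psu_watts(gpu_tgp_w: int,
                          rest_of_system_w: int = 250,
                          target_load: float = 0.8) -> int:
    """Suggest a PSU rating so peak draw sits near target_load of capacity.

    rest_of_system_w is a rough allowance for CPU, drives, and fans;
    the result is rounded up to the nearest 50 W retail size.
    """
    total_draw = gpu_tgp_w + rest_of_system_w
    needed = total_draw / target_load
    return math.ceil(needed / 50) * 50

# A 300 W TGP card with these assumptions
print(recommended_psu_watts(300))  # 700
```

Transient power spikes on high-end cards can briefly exceed the rated TGP, which is another reason generous headroom is worth having.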
PCIe Interface Bandwidth
Graphics cards connect to your motherboard through a PCIe slot, and the generation of that connection determines maximum data transfer speeds. PCIe 3.0 transfers data at 8 gigatransfers per second per lane, PCIe 4.0 doubles that to 16 GT/s, and PCIe 5.0 doubles again to 32 GT/s. Most current GPUs use a 16-lane connection, so total available bandwidth is the per-lane rate multiplied by 16.
In practice, very few current games saturate even a PCIe 3.0 x16 connection, so the generation of your PCIe slot rarely limits gaming performance. It matters more for professional workloads that move large datasets between system memory and the GPU.
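Converting the per-lane GT/s ratings into usable bandwidth takes one extra detail: PCIe 3.0 through 5.0 use 128b/130b line encoding, so about 1.5% of the raw rate is protocol overhead. A short sketch of the arithmetic:

```python
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b encoding, PCIe 3.0-5.0

def pcie_bandwidth_gbs(gt_per_s: int, lanes: int = 16) -> float:
    """Usable one-direction bandwidth in GB/s for a PCIe link."""
    return gt_per_s * lanes * ENCODING_EFFICIENCY / 8  # bits -> bytes

for gen, rate in {"3.0": 8, "4.0": 16, "5.0": 32}.items():
    print(f"PCIe {gen} x16: {pcie_bandwidth_gbs(rate):.2f} GB/s")
# PCIe 3.0 x16: 15.75 GB/s
# PCIe 4.0 x16: 31.51 GB/s
# PCIe 5.0 x16: 63.02 GB/s
```

The doubling from generation to generation is visible directly in the output, as is why an x8 link on a newer generation can match an x16 link on the previous one.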
Real-World Benchmarks: FPS and 1% Lows
All the specs above describe what a card is capable of on paper. What actually matters to most people is how many frames per second (FPS) it delivers in the games or applications they use. That’s where benchmarks come in.
Average FPS is the primary indicator of overall performance and the number you’ll see most prominently in reviews. But averages can hide stuttering. A card averaging 90 FPS that occasionally dips to 30 FPS will feel much worse than one averaging 80 FPS that never drops below 65. That’s why reviewers also report the 1% low, which captures the worst-case frame rate by showing the performance floor during the roughest moments. When two cards have similar averages, the one with higher 1% lows will feel smoother in actual gameplay.
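The 1% low is computed from per-frame timing data captured during a benchmark run. Reviewers' exact methodologies vary, but one common approach is to take the slowest 1% of frames by frame time, average them, and express that as a frame rate. A minimal sketch under that assumption:

```python
def one_percent_low(frame_times_ms: list[float]) -> float:
    """FPS over the slowest 1% of frames (one common methodology;
    exact definitions differ between benchmarking tools)."""
    slowest_first = sorted(frame_times_ms, reverse=True)
    count = max(1, len(slowest_first) // 100)  # worst 1% of frames
    avg_worst_ms = sum(slowest_first[:count]) / count
    return 1000 / avg_worst_ms

# 99 smooth frames at 10 ms, one 40 ms stutter
frames = [10.0] * 99 + [40.0]
avg_fps = 1000 / (sum(frames) / len(frames))
print(round(avg_fps))                 # 97 average FPS
print(one_percent_low(frames))        # 25.0 FPS at the 1% low
```

The example shows exactly the gap the article describes: a healthy-looking average of roughly 97 FPS hides a 1% low of 25 FPS, which is what you feel as a stutter.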
If you’re comparing two cards and can only look at one metric, skip the spec sheets entirely and go straight to benchmark results at your target resolution. A card with fewer cores and lower clock speeds can outperform a card with better-looking specs if its architecture is more efficient, its drivers are better optimized, or its memory subsystem is faster. Specs tell you what’s under the hood. Benchmarks tell you how fast it actually goes.

