What Is Scanner Data? Uses, Sources, and Limits

Scanner data is the detailed transaction information captured every time a product’s barcode is scanned at a retail checkout. Each record typically includes the product code, the quantity sold, and the total revenue from those sales, allowing an average price per item to be calculated. This data is collected automatically and continuously, making it one of the most comprehensive sources of information about what consumers buy and how much they pay.

Originally a tool for retailers tracking inventory and sales, scanner data has become essential for economists measuring inflation, researchers studying food purchasing habits, and businesses analyzing market trends.

What a Scanner Data Record Contains

At its core, a scanner data record is built around a product identifier. The most common is the Global Trade Item Number (GTIN), the standardized barcode printed on packaged goods. For items without a standard barcode, like loose produce, stores use price look-up codes (PLUs) or their own internal stock-keeping units (SKUs).

Alongside the product code, each record includes the turnover (total sales revenue) and quantity sold for a given time period and store location. From those two numbers, a unit value price is derived: essentially the average price consumers actually paid per item. Records also carry a product description, brand name, package size, unit of quantity, applicable tax rates, and sometimes a retailer’s own product classification. Discount and promotion information may be included as well, though how thoroughly varies by retailer.
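The unit value calculation described above can be sketched in a few lines of Python. This is an illustration only: the `ScannerRecord` fields, the GTIN, the store ID, and the sales figures are all hypothetical, and real retailer feeds use their own layouts.

```python
from dataclasses import dataclass


@dataclass
class ScannerRecord:
    """One aggregated record: a product at a store over a time period (hypothetical layout)."""
    gtin: str         # product identifier
    store_id: str     # store location
    week: str         # time period covered
    turnover: float   # total sales revenue for the period
    quantity: int     # units sold in the period


def unit_value(record: ScannerRecord) -> float:
    """Average price actually paid per item: turnover divided by quantity."""
    if record.quantity <= 0:
        raise ValueError("unit value is undefined when no units were sold")
    return record.turnover / record.quantity


# Made-up week: 50 units sold at the 3.00 shelf price and 25 units at a
# 1.50 promotional price, so turnover = 50 * 3.00 + 25 * 1.50 = 187.50.
promo_week = ScannerRecord(
    gtin="00012345678905",
    store_id="store-001",
    week="2025-W14",
    turnover=187.50,
    quantity=75,
)

print(unit_value(promo_week))  # 2.5, below the 3.00 shelf price
```

The result sits between the shelf price and the promotional price, which is why a unit value reflects what shoppers paid rather than what the label said.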

This level of detail means scanner data captures real purchasing behavior, not just listed shelf prices. If a product was on a multi-buy offer or discounted through a loyalty card, the lower price consumers actually paid shows up in the average. Products returned for a refund are excluded, since they were never truly consumed.

Retail Scanner Data vs. Household Panel Data

There are two distinct types of scanner data, and they answer different questions.

Retail-based scanner data comes directly from store registers. Companies like Circana (formerly IRI) aggregate weekly revenue and quantity sold for each product code at each store, covering major grocery chains, drugstores, and mass merchandisers. This data tells you exactly what moved off shelves and at what price, but it doesn’t tell you anything about the person who bought it.

Household-based scanner data flips that around. A nationally representative panel of more than 120,000 households uses handheld scanners or mobile apps to record every food product they purchase and where they shop. Because it’s tied to individual households, this data includes demographic information like income, household size, and location. A subset of panelists also reports health information and prescription drug purchases. The tradeoff is that it depends on participants consistently scanning their own purchases, which introduces some human error.

Researchers studying pricing trends or market share lean on retail scanner data. Those studying how diet varies by income, or how a policy change affects different demographics, rely on the household panels.

How Scanner Data Measures Inflation

National statistics offices increasingly use scanner data to calculate the Consumer Price Index (CPI) and related inflation measures. The UK’s Office for National Statistics, for example, is introducing grocery scanner data into its consumer price inflation statistics in March 2026, replacing traditional price collection with approximately 300 million price points derived from over a billion units of products sold per month.

That shift is massive. Traditional price collection involves field agents visiting stores and manually recording the price of selected items on a specific day. Scanner data, by contrast, reflects the average price paid over several weeks and covers every product sold, not just a sample. It also captures actual spending volumes, so the index can weight products by how much consumers actually spend on them rather than treating every item equally.
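To see why spending weights matter, here is a small Python sketch comparing an equal-weighted average price change with a spending-weighted one. The two products and their figures are invented, and weighting by a single month's spending is a simplification chosen for brevity, not the weighting scheme of any particular statistics office.

```python
import math

# Hypothetical products: (price last month, price this month, spending this month)
products = {
    "milk_1l": (1.00, 1.10, 9000.0),    # everyday item, price up 10%
    "saffron_1g": (8.00, 6.40, 1000.0),  # niche item, price down 20%
}


def equal_weight_change(items):
    """Geometric mean of price ratios, counting every product equally."""
    logs = [math.log(new / old) for old, new, _ in items.values()]
    return math.exp(sum(logs) / len(logs))


def spend_weighted_change(items):
    """Geometric mean of price ratios, weighted by each product's share of spending."""
    total_spend = sum(spend for _, _, spend in items.values())
    weighted_log = sum(
        (spend / total_spend) * math.log(new / old)
        for old, new, spend in items.values()
    )
    return math.exp(weighted_log)


print(round(equal_weight_change(products), 2))    # about 0.94: prices appear to fall
print(round(spend_weighted_change(products), 2))  # about 1.07: prices rise where the money goes
```

With equal weights the niche product's price cut dominates. Once each product counts in proportion to what shoppers spend on it, the increase in the everyday item drives the result.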

The statistical method preferred for this work is called GEKS-Törnqvist, a multilateral index approach. In plain terms, it handles a problem that is especially acute in scanner data: products constantly appear and disappear from store shelves. New flavors launch, seasonal items rotate in and out, and products get discontinued. The method calculates the price change between every pair of months in a 25-month window, using the products sold in both months of each pair, and then averages those comparisons (as a geometric mean). Because no single base month has to contain every product, the index is not distorted by this constant churn.
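A minimal Python sketch of the idea follows, using the textbook form of the GEKS formula with a bilateral Törnqvist index over matched products. It uses a toy three-month window instead of 25 months, the products and figures are made up, and it leaves out the window-splicing and data-cleaning steps a production index would need. It is not a reproduction of any statistics office's implementation.

```python
import math


def tornqvist(prices_a, qty_a, prices_b, qty_b):
    """Bilateral Törnqvist index from period a to period b, over products sold in both."""
    matched = [k for k in prices_a if k in prices_b]
    if not matched:
        raise ValueError("no products are sold in both periods")
    spend_a = {k: prices_a[k] * qty_a[k] for k in matched}
    spend_b = {k: prices_b[k] * qty_b[k] for k in matched}
    total_a = sum(spend_a.values())
    total_b = sum(spend_b.values())
    log_index = 0.0
    for k in matched:
        # Weight each price ratio by the product's average expenditure share
        avg_share = 0.5 * (spend_a[k] / total_a + spend_b[k] / total_b)
        log_index += avg_share * math.log(prices_b[k] / prices_a[k])
    return math.exp(log_index)


def geks_tornqvist(prices, quantities, base, target):
    """GEKS index from base to target: geometric mean, over every link
    period in the window, of index(base -> link) * index(link -> target)."""
    n_periods = len(prices)
    log_sum = 0.0
    for link in range(n_periods):
        log_sum += math.log(
            tornqvist(prices[base], quantities[base], prices[link], quantities[link])
        )
        log_sum += math.log(
            tornqvist(prices[link], quantities[link], prices[target], quantities[target])
        )
    return math.exp(log_sum / n_periods)


# Toy window of three months; the seasonal product exists only in month 1.
prices = [
    {"tea": 2.00, "coffee": 5.00},
    {"tea": 2.10, "coffee": 5.00, "pumpkin_latte": 4.00},
    {"tea": 2.20, "coffee": 4.50},
]
quantities = [
    {"tea": 100, "coffee": 40},
    {"tea": 90, "coffee": 40, "pumpkin_latte": 30},
    {"tea": 95, "coffee": 50},
]

index_0_to_2 = geks_tornqvist(prices, quantities, base=0, target=2)
print(index_0_to_2)
```

One property worth noting is that the result is transitive: chaining month 0 to month 1 and month 1 to month 2 gives the same answer as going from month 0 to month 2 directly, so the index does not drift depending on the path taken through the window.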

Scanner data also picks up discounts automatically, whenever they occur. Loyalty card promotions, multi-buy offers, and temporary price reductions all show up as decreases in the average price consumers paid, giving a more accurate picture of real-world inflation than a single shelf-price snapshot would.

What Scanner Data Misses

Scanner data has significant blind spots. It only covers products with barcodes or other scannable codes, sold through retailers that use electronic point-of-sale systems. That means it misses farmers’ markets, street vendors, small independent shops without scanning technology, and entire categories of spending like services, rent, or restaurant meals. The U.S. Bureau of Labor Statistics has noted that scanner data does not cover the full universe of items in the CPI, so it needs to be combined with traditional manual price collection for a complete picture.

Fresh items sold by weight, like deli meat or bakery goods priced in-store, are harder to capture cleanly because they often lack standardized barcodes. Online-only retailers have historically been underrepresented, though this is changing as major grocers integrate their in-store and online sales data.

There are also coverage gaps by geography and retailer. Not every chain shares its data with statistics offices or market research firms, and some discount retailers or warehouse clubs have historically opted out of data-sharing agreements. The data you can access depends heavily on which retailers participate.

Who Uses Scanner Data and Why

Government agencies are among the largest users, relying on scanner data to produce more accurate and timely inflation statistics. But the commercial applications are just as significant. Consumer packaged goods companies use retail scanner data to track their market share week by week, measure the effectiveness of promotions, and monitor competitor pricing. A brand can see exactly how a price increase affected unit sales at specific retailers within days.

Academic researchers in economics, public health, and nutrition use both retail and household panel data to study questions like how soda taxes affect purchasing, whether food deserts lead to worse diets, or how inflation hits low-income households differently. The combination of granular product-level detail and (in panel data) household demographics makes scanner data unusually powerful for this kind of research.

Retailers themselves use their own scanner data for inventory management, shelf-space optimization, and demand forecasting. Knowing exactly how many units of each product sell per day at each location allows automated reordering systems to keep shelves stocked without overordering.

How the Technology Is Evolving

The barcode-and-laser-scanner system that generates most scanner data today is decades old, but the ecosystem around it is changing. RFID tags, which can be read without a direct line of sight, are increasingly used alongside traditional barcodes, particularly for inventory tracking. AI-powered computer vision systems can identify products without any barcode at all, which could eventually bring unpackaged and fresh items into the scanner data universe. These technologies are being integrated with real-time analytics platforms, moving retail data from a record of what happened last week to a live feed of what’s happening now.