Experiments can collect two broad categories of data: quantitative data (numbers) and qualitative data (descriptions). Within those categories, the specific types range from simple counts and measurements to behavioral observations, physiological readings, survey responses, and digital logs. The type you collect depends on what you’re measuring, how precisely you need to measure it, and what questions you’re trying to answer.
Quantitative vs. Qualitative Data
Quantitative data is anything expressed as a number. Think of a thermometer reading, a test score, the number of cells under a microscope, or the time it takes someone to press a button. This type of data is well suited for establishing cause-and-effect relationships, testing hypotheses, and drawing conclusions that generalize to larger populations. Because it's numerical, it can be summarized and analyzed with standard statistical methods.
Qualitative data is expressed in words, images, or descriptions rather than numbers. In an experiment, this might look like a researcher’s written notes about how participants behaved, transcripts from interviews about a participant’s experience, or descriptions of color changes in a chemical reaction. Qualitative data captures meanings, experiences, and perspectives that numbers alone can miss. It’s especially useful for understanding processes, like how people make decisions or why they respond to a stimulus in a particular way.
Most well-designed experiments collect both. A psychology experiment might record how many seconds it takes a participant to solve a puzzle (quantitative) alongside their verbal description of their strategy (qualitative).
Discrete and Continuous Data
Quantitative data splits into two further types based on how it’s measured. Discrete data comes from counting. It takes on specific, separated whole numbers with meaningful gaps between them. The number of errors a participant makes, the number of plants that germinated, the number of times a rat presses a lever: these are all discrete. You can’t have 3.7 errors or 12.4 germinated seeds.
Continuous data comes from measuring on a scale where, in theory, any value is possible. Height, weight, temperature, reaction time, blood pressure, and duration all fall here. A participant’s reaction time might be 342 milliseconds, 342.7 milliseconds, or 342.71 milliseconds. The precision is limited only by your instrument, not by the phenomenon itself. Continuous data generally carries more information per data point, which is why researchers often prefer measurement over simple counting when both options exist.
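If you handle experimental data in code, the distinction usually surfaces as the difference between integer counts and floating-point measurements. Here's a minimal Python sketch; the variable names and values are invented purely for illustration:

```python
# Discrete data: counts, stored naturally as integers.
germinated_seeds = [12, 9, 15, 11]        # seeds that sprouted per tray
lever_presses = [34, 41, 28]              # presses per session

# Continuous data: measurements, stored as floats whose precision
# depends on the instrument, not the phenomenon.
reaction_times_ms = [342.71, 298.05, 410.33]
body_temps_c = [36.6, 37.1, 36.9]

# A count of 3.7 errors is meaningless, but a reaction time of
# 342.7 ms is perfectly sensible.
print(sum(germinated_seeds))                             # total stays a whole number
print(sum(reaction_times_ms) / len(reaction_times_ms))   # mean can take any value
```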
The Four Levels of Measurement
Beyond the discrete/continuous split, data also varies in how much mathematical meaning it carries. Researchers describe this using four levels of measurement, ranked from least to most informative.
- Nominal: Data that can only be placed into categories with no inherent order. Examples include blood type, eye color, species of organism, or which experimental group a participant was assigned to. You can count how many fall into each category, but you can’t rank or average them.
- Ordinal: Data that can be categorized and ranked, but the gaps between ranks aren’t necessarily equal. A pain rating of “mild, moderate, severe” is ordinal. So are Likert scale responses like “strongly disagree” through “strongly agree.” You know that severe is worse than moderate, but you can’t say the difference between mild and moderate is the same size as the difference between moderate and severe.
- Interval: Data that can be ranked with equal spacing between values, but has no true zero point. Temperature in Celsius is the classic example: the difference between 20°C and 30°C is the same as between 30°C and 40°C, but 0°C doesn’t mean “no temperature.” IQ scores and standardized test scores also fall here.
- Ratio: Data with equal intervals and a meaningful zero. Height, weight, age, reaction time, and concentration of a chemical solution are all ratio data. Zero means the complete absence of the thing being measured, and you can make meaningful statements like “twice as heavy” or “half as fast.”
The level of measurement matters because it determines which statistical tests you can run. Ratio and interval data open the door to the most powerful analyses, while nominal data limits you to frequency counts and proportions.
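One way to see why the level matters is to look at which summaries are legitimate at each level. The short Python sketch below, with made-up values, works through all four:

```python
from statistics import mean, median
from collections import Counter

# Nominal: categories with no order -- only counts and proportions make sense.
blood_types = ["A", "O", "O", "B", "A", "O"]
print(Counter(blood_types))               # how many fall into each category

# Ordinal: ranked categories -- the median is meaningful, the mean is not.
pain_ratings = ["mild", "moderate", "severe", "moderate"]
rank = {"mild": 1, "moderate": 2, "severe": 3}
print(median(rank[r] for r in pain_ratings))

# Interval: equal spacing but no true zero -- differences and means are fine,
# but ratios ("twice as hot") are not.
temps_c = [20.0, 30.0, 40.0]
print(mean(temps_c))

# Ratio: equal spacing and a true zero -- ratios are meaningful.
weights_kg = [60.0, 120.0]
print(weights_kg[1] / weights_kg[0])      # "twice as heavy" is a valid claim
```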
Physiological and Biological Data
In biology, medicine, and health sciences, experiments often collect data directly from the body. Physiological measurements include blood pressure, heart rate, body temperature, and respiratory rate. Molecular measurements go deeper: cholesterol levels, blood cell counts, hormone concentrations, blood glucose, or the expression level of a specific gene. Imaging data, such as MRI scans or X-rays, captures structural information like tumor size or bone density.
These data points are often collected using automated sensors and instruments. A pulse oximeter clips onto a finger and records oxygen saturation continuously. An electroencephalogram tracks electrical activity in the brain. The instruments convert biological phenomena into numerical data that can be compared across experimental conditions or tracked over time.
Behavioral and Psychological Data
Behavioral experiments typically focus on two core measurements: how quickly someone responds and how accurately they respond. Reaction time measures the speed of a decision, usually in milliseconds. Choice accuracy measures how often participants pick the correct option. Together, these two data types reveal how difficult a task is and how people trade speed against accuracy.
Beyond reaction time and accuracy, behavioral researchers also collect frequency data (how many times a behavior occurs), duration data (how long a behavior lasts), and intensity ratings. Self-report data is common too. Participants might rate their mood, anxiety, confidence, or pain on a numerical scale, often a 7-point scale ranging from “not at all” to “a great deal.” These scales convert subjective experience into quantifiable data, though they carry the limitation of relying on the participant’s own perception.
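To make the two core behavioral measures concrete, here is a minimal Python sketch that summarizes a handful of invented trial records; the field names are assumptions chosen for illustration:

```python
# Trial-level behavioral data: one record per trial (values invented).
trials = [
    {"rt_ms": 412.3, "correct": True},
    {"rt_ms": 388.9, "correct": True},
    {"rt_ms": 525.0, "correct": False},
    {"rt_ms": 441.7, "correct": True},
]

# Choice accuracy: proportion of correct responses.
accuracy = sum(t["correct"] for t in trials) / len(trials)

# Mean reaction time, often computed on correct trials only.
correct_rts = [t["rt_ms"] for t in trials if t["correct"]]
mean_rt = sum(correct_rts) / len(correct_rts)

print(f"accuracy = {accuracy:.2f}, mean RT (correct trials) = {mean_rt:.1f} ms")
```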
Observational data fills in what self-reports and automated measures can miss. A researcher might code video recordings for facial expressions, body language, or social interactions, turning qualitative observations into structured, countable categories.
Survey and Self-Report Data
Surveys are one of the most common data collection tools across disciplines. They can use structured questions with fixed response options (producing quantitative data), semi-structured questions that allow some open-ended responses, or fully open-ended questions (producing qualitative data). Surveys can be administered by mail, phone, email, website, or in person, either individually or in groups.
The data that comes from surveys spans multiple levels of measurement. A question asking participants to select their city of residence yields nominal data. A satisfaction rating from 1 to 5 yields ordinal data. A question asking for exact annual income yields ratio data. Researchers often collect the same variable at different levels of precision depending on their needs. Income, for instance, can be collected as exact figures (ratio) or as brackets like $0–$19,999 and $20,000–$39,999 (ordinal). The more precise version gives you more analytical power, but participants may be more willing to answer the bracketed version.
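The recoding from ratio to ordinal is easy to see in code. The sketch below, with invented incomes and bracket labels, converts exact figures into the kind of brackets described above:

```python
from bisect import bisect_right

# Exact incomes (ratio data) -- values invented for illustration.
incomes = [12500, 34000, 87000, 21000]

# Recode into brackets (ordinal data): each cutoff marks the start of the next bracket.
cutoffs = [20000, 40000, 60000, 80000]
labels = ["$0-$19,999", "$20,000-$39,999", "$40,000-$59,999",
          "$60,000-$79,999", "$80,000+"]

brackets = [labels[bisect_right(cutoffs, x)] for x in incomes]
print(brackets)  # ['$0-$19,999', '$20,000-$39,999', '$80,000+', '$20,000-$39,999']
```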
Digital and Computational Data
Experiments conducted with computers or digital systems generate their own category of data. Event logs record timestamped sequences of actions: what happened, when it started, when it ended, and who or what performed it. In a user experience experiment, this might mean logging every click, scroll, and page transition. In a computational simulation, it means recording the output of each run, including processing times, resource usage, and performance metrics.
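A minimal event logger might look like the Python sketch below; the field names and actions are assumptions chosen for illustration, not a particular logging library's API:

```python
import time

# In-memory event log for a user-experience experiment (illustrative only).
event_log = []

def log_event(participant_id, action, target):
    """Append one timestamped action to the log."""
    event_log.append({
        "timestamp": time.time(),       # when it happened
        "participant": participant_id,  # who performed it
        "action": action,               # what happened (click, scroll, ...)
        "target": target,               # which page or element
    })

log_event("P01", "click", "submit_button")
log_event("P01", "page_transition", "results_page")
print(event_log)
```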
Simulations generate synthetic data by running a model many times with slightly different parameters. The resulting data can include average cycle times, waiting times, branching probabilities, and distributions of outcomes. Machine learning experiments track metrics like prediction accuracy, error rates, and how those change as the model trains. All of this data is quantitative and often continuous, collected automatically at a scale and speed that would be impossible with manual observation.
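The sketch below illustrates that pattern: run a toy model many times while varying one parameter and collect the distribution of outcomes. The model itself is a stand-in (it simply draws waiting times from an exponential distribution), and the parameter values are invented:

```python
import random

random.seed(1)

results = []
for service_rate in (0.8, 1.0, 1.2):        # the parameter we vary between runs
    for run in range(100):                   # many runs per parameter setting
        # Draw 50 synthetic waiting times and record their average.
        waits = [random.expovariate(service_rate) for _ in range(50)]
        results.append({"service_rate": service_rate,
                        "mean_wait": sum(waits) / len(waits)})

# Summarize the distribution of outcomes for each parameter setting.
for rate in (0.8, 1.0, 1.2):
    means = [r["mean_wait"] for r in results if r["service_rate"] == rate]
    print(rate, round(min(means), 3), round(max(means), 3))
```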
Primary vs. Secondary Data
One final distinction worth understanding is where the data comes from. Primary data is information you collect firsthand for your specific experiment. You design the instrument, run the procedure, and record the results. This gives you full control over what’s measured and how.
Secondary data is information originally collected by someone else for a different purpose that you repurpose for your own analysis. Government health databases, published datasets, census records, and hospital admission logs all count. During the COVID-19 pandemic, researchers accessed state health department databases and federal surveillance systems to track infection patterns, rather than collecting all that data from scratch. Secondary data saves time and resources, but you’re limited by someone else’s choices about what to measure and how to measure it.
What Makes Experimental Data Useful
Collecting data isn’t enough. The data needs to be reliable and valid. Reliability means you get consistent results when you repeat the measurement. If you weigh the same sample three times and get three different numbers, your scale isn’t reliable. Reliability is assessed through methods like test-retest (measuring the same thing twice and comparing), split-half (dividing a test in two and checking whether both halves give similar results), and internal consistency checks.
Validity means you’re actually measuring what you think you’re measuring. A thermometer is a valid tool for measuring temperature but not for measuring humidity, even though both involve the atmosphere. Validity is harder to establish than reliability, and it’s evaluated through content validation (does the measure cover the right territory?), criterion validation (does it correlate with other accepted measures of the same thing?), and construct validation (does it behave the way theory predicts it should?). Data that is reliable but not valid gives you consistent wrong answers. Data that is valid but not reliable gives you the right answer on average but with too much noise to be useful. Good experimental data is both.
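Both checks often come down to correlations. The sketch below, using invented scores, computes a test-retest reliability coefficient and a simple criterion-validity check with Python's statistics.correlation (available in Python 3.10+):

```python
from statistics import correlation  # Pearson correlation, Python 3.10+

# Test-retest reliability: the same participants measured twice with the same
# instrument; a high correlation means consistent measurement.
# Criterion validity: the new measure compared against an accepted measure of
# the same construct. All scores below are invented for illustration.
scores_time1     = [12, 18, 25, 31, 40]   # new questionnaire, first administration
scores_time2     = [13, 17, 26, 30, 42]   # same questionnaire, two weeks later
accepted_measure = [10, 20, 24, 33, 38]   # established instrument, same participants

print("test-retest reliability:", round(correlation(scores_time1, scores_time2), 2))
print("criterion validity:", round(correlation(scores_time1, accepted_measure), 2))
```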

