A model in research is a simplified representation of something in the real world, built to help researchers understand, explain, or predict how that thing works. Models come in many forms: a set of mathematical equations, a diagram showing how variables connect, a computer simulation, or even a living organism used to stand in for human biology. What ties them all together is that they deliberately leave out some details so researchers can focus on the parts that matter most for their question.
The word “model” gets used so broadly across scientific fields that it can feel vague. A biologist talking about a “mouse model” means something completely different from a sociologist describing a “theoretical model.” Understanding the main types and what they actually do clears up most of the confusion.
What Makes Something a Model
Every scientific model is a stand-in for a real system. It represents selected parts or aspects of that system, which researchers call the “target.” A climate model represents Earth’s atmosphere. A regression model represents the statistical relationship between variables in a dataset. A fruit fly represents certain genetic processes shared with humans. In each case, the model strips away complexity to make the target system easier to study, test, or communicate about.
Models serve several cognitive functions. They help researchers learn about the target system by generating predictions that can be checked against reality. They provide explanations for why something happens, not just descriptions of what happens. And they build understanding by giving people a mental structure they can reason with. A diagram of how a virus spreads through a population, for instance, lets public health officials think through “what if” scenarios in ways that raw data alone cannot.
Theoretical and Conceptual Models
In social sciences, education, and health research, you’ll often see the terms “theoretical framework” and “conceptual framework” used almost interchangeably, but they do different things. A theoretical framework offers a way to explain and interpret the phenomenon being studied. It draws on established theory to say why something happens. A conceptual framework, by contrast, clarifies the assumptions a researcher is making about the phenomenon and maps out which variables matter and how they relate.
Think of a theoretical model as borrowing an existing lens (say, a psychological theory of motivation) to look at your research question. A conceptual model is more like drawing your own map of the specific variables and connections you plan to study. Many research papers use both: a theoretical model to ground the work in existing knowledge, and a conceptual model to show exactly what this particular study will examine.
Mathematical and Statistical Models
Mathematical models use equations to describe how a system behaves. They’re common in epidemiology, economics, engineering, and ecology, and they range from simple formulas to enormous computer simulations. Several useful distinctions help explain what different mathematical models do.
- Mechanistic vs. phenomenological. A mechanistic model spells out the actual processes driving a system. A compartmental model of influenza transmission, for example, tracks how people move from susceptible to infected to recovered and calculates how vaccination changes those flows. A phenomenological model skips the underlying mechanism and simply fits a curve to observed data, like estimating HIV prevalence trends from surveillance numbers.
- Predictive vs. descriptive. Predictive models forecast future events, such as projecting the impact of a malaria vaccine over the next decade. Descriptive models explain what already happened, like quantifying how malaria control efforts in Africa reduced cases between 2000 and 2015.
- Theory-driven vs. data-driven. Theory-driven models start with assumptions and explore hypothetical scenarios (what would happen if every person with HIV received immediate treatment?). Data-driven models let the data lead, using observed patterns to estimate things like how effective past vaccination programs actually were.
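The mechanistic compartmental model described above can be sketched in a few lines of code. This is a toy SIR (susceptible–infected–recovered) simulation, not any published model; the population size, transmission rate, and recovery rate below are invented purely for illustration.

```python
# Minimal sketch of a mechanistic SIR compartmental model.
# All parameter values are illustrative, not empirical.

def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Integrate the SIR flows with simple Euler steps.

    beta  = transmission rate (contacts x infection probability per day)
    gamma = recovery rate (1 / average infectious period in days)
    """
    s, i, r = s0, i0, r0
    n = s0 + i0 + r0
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / n * dt   # flow S -> I
        new_recoveries = gamma * i * dt          # flow I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
    return s, i, r

# Hypothetical town of 10,000 people with one initial case.
s, i, r = simulate_sir(9999, 1, 0, beta=0.3, gamma=0.1, days=160)
print(f"susceptible={s:.0f}, infected={i:.0f}, recovered={r:.0f}")
```

Changing `beta` here is how such a model asks a mechanistic question: halving it (as a vaccination or distancing policy might) directly changes the simulated flow of people between compartments.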
Statistical models are a class of mathematical models that relate variables to one another while accounting for randomness and uncertainty in the data. Regression models are the most common example: they estimate the association between a predictor (like hours of exercise per week) and an outcome (like blood pressure). More complex statistical approaches, like structural equation modeling, can handle dozens of variables at once and even measure things that aren’t directly observable, like “socioeconomic status” or “job satisfaction,” by combining multiple survey questions into a single latent variable.
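The regression example above can be made concrete with a minimal ordinary least squares fit. The exercise and blood pressure numbers here are invented for illustration; the point is only to show how such a model reduces a dataset to one estimated association.

```python
# Minimal sketch of simple linear regression (ordinary least squares).
# The data points are hypothetical, not from any real study.

def ols_fit(x, y):
    """Return (intercept, slope) minimizing squared prediction error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

hours = [0, 1, 2, 3, 4, 5, 6, 7]              # exercise hours per week
bp    = [138, 135, 134, 130, 129, 126, 124, 121]  # systolic blood pressure
intercept, slope = ols_fit(hours, bp)
print(f"estimated change in blood pressure per extra hour: {slope:.2f}")
```

The fitted slope is the model’s whole answer to the research question: roughly how much the outcome shifts, on average, per unit of the predictor.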
Contemporary research frequently blends mathematical and statistical modeling. A team might build a mechanistic model of disease transmission and then use statistical methods to calibrate it against real-world data, combining theoretical structure with empirical grounding.
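A compact sketch of that blended approach: a bare-bones mechanistic model of weekly transmission, calibrated against data by searching for the transmission rate that best reproduces observed case counts. Everything here is invented for illustration (the model structure, the surveillance numbers, and the grid-search calibration), but it shows the shape of the workflow.

```python
# Sketch of calibrating a mechanistic model to data: pick the
# transmission rate beta whose simulated epidemic best matches
# hypothetical weekly case counts (least squares via grid search).

def weekly_cases(beta, gamma=0.5, n=1000, i0=5, weeks=8):
    """Run a crude weekly-step SIR model; return new cases per week."""
    s, i = n - i0, i0
    cases = []
    for _ in range(weeks):
        new = min(beta * s * i / n, s)  # can't infect more people than remain
        s -= new
        i += new - gamma * i
        cases.append(new)
    return cases

observed = [7, 11, 16, 22, 26, 25, 19, 12]   # hypothetical surveillance data

def sse(beta):
    """Sum of squared errors between model output and observations."""
    return sum((m - o) ** 2 for m, o in zip(weekly_cases(beta), observed))

# Grid search over candidate transmission rates from 0.50 to 1.99.
best_beta = min((b / 100 for b in range(50, 200)), key=sse)
print(f"calibrated beta = {best_beta:.2f}")
```

In practice researchers use far more sophisticated fitting methods (likelihood-based or Bayesian), but the logic is the same: the mechanistic structure comes from theory, and the parameter values come from data.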
Biological and Animal Models
In biomedical research, a “model” often refers to a living system used to study processes that would be difficult or unethical to study directly in humans. Animal models are based on the principle of comparative medicine: because many species share biological pathways with humans, researchers can use them to understand disease, test drugs and vaccines, and develop surgical techniques.
These models don’t always involve whole animals. Researchers also work with isolated cells, tissues, organs, or specific genes that replicate a particular disease process. During the COVID-19 pandemic, for instance, in vivo (live animal) studies helped scientists untangle how the virus caused disease, how the immune system responded, and what side effects proposed vaccines might produce, all before human trials began.
Non-animal alternatives have been growing rapidly. Cell cultures, 3D tissue models, organs-on-chips (tiny devices that mimic the function of a human organ), computer simulations, and stem cell research are all increasingly used alongside or in place of animal models, especially in early-stage research.
How Models Are Built
Model development generally follows three broad phases, regardless of the field. The first phase involves gathering the raw material: reviewing existing models and frameworks, searching the literature, consulting experts, and sometimes conducting interviews or surveys to understand the problem. The second phase is construction, where the researcher synthesizes all that information, extracts key themes or variables, and organizes them into a coherent structure. This is typically iterative, meaning the model goes through multiple rounds of feedback and revision with collaborators and subject-matter experts before anyone considers it ready.
The third phase is evaluation and refinement. Researchers test the model against real data or pilot it through case studies. In some fields, expert panels review the model’s structure and assumptions. The results of this testing feed back into revisions, and the cycle continues until the model performs well enough for its intended purpose. “Well enough” is the key phrase here, because no model is expected to be perfect.
Why All Models Are “Wrong”
The statistician George Box famously wrote that “all models are wrong, but some are useful.” The phrase has become a kind of mantra in science, so it’s worth understanding what Box actually meant, and how the lesson often gets missed in practice.
Box’s point was that no simple model can exactly represent the full complexity of a real-world system. A city map doesn’t show the cracks in the sidewalk or the precise width of every street. But that doesn’t make the map useless. It answers the questions it was designed to answer: how to get from one place to another. A model’s value lies not in being a perfect replica of reality but in how well it helps you think, predict, or decide.
The practical danger, as the statistician Andrew Gelman has noted, is that many people don’t internalize this lesson. They treat model outputs as exact truths rather than useful approximations. Someone might say there’s a 74% probability that a particular statistical model is “correct,” which misses the point entirely. The goal isn’t to prove a model right or wrong. It’s to assess how well the model fits the data and whether it’s useful for the decisions at hand. A model that’s technically “wrong” in its details can still be enormously valuable if it captures the patterns that matter.
Parsimony, the idea that simpler models are generally preferred over unnecessarily complex ones, plays a role here too. Box placed his famous statement directly after a section arguing for simplicity in model building. A model that tries to include every possible variable and mechanism often ends up harder to use, harder to test, and no better at prediction than a leaner version that focuses on the key drivers.
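One common way to put parsimony into numbers is an information criterion such as AIC, which rewards goodness of fit but penalizes every extra parameter. The sketch below uses the least-squares form of AIC with invented residual sums of squares: the complex model fits the data slightly better, but not enough to justify its extra parameters, so the simpler model scores lower (better).

```python
import math

# Toy illustration of parsimony via AIC. For least-squares fits,
# AIC = n * ln(RSS/n) + 2k, where k counts model parameters.
# The RSS values and parameter counts below are invented.

def aic(n, rss, k):
    """Akaike information criterion for a least-squares fit."""
    return n * math.log(rss / n) + 2 * k

n = 50                                  # hypothetical sample size
simple_aic  = aic(n, rss=40.0, k=3)     # lean model, 3 parameters
complex_aic = aic(n, rss=36.0, k=12)    # elaborate model, 12 parameters

print(f"simple model AIC:  {simple_aic:.1f}")
print(f"complex model AIC: {complex_aic:.1f}")
```

The 10% improvement in fit costs nine extra parameters, and the penalty term swamps the gain, which is Box’s argument for simplicity expressed as arithmetic.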

