A random sample is a subset of a larger group (called a population) where every member has an equal chance of being selected. In mathematical terms, it’s a collection of independently chosen observations that all follow the same probability distribution. This concept is the foundation of statistics, because it allows you to draw conclusions about an entire population by studying just a portion of it.
The Formal Definition
In math and statistics, a random sample of size n is a set of n independently selected values that all come from the same probability distribution. “Independently selected” means that choosing one value doesn’t affect which other values get chosen. “Same probability distribution” means every observation is drawn under identical conditions, so no part of the population is favored over another.
Think of it this way: if you wanted to know the average height of students at a school with 1,000 students, you could measure all 1,000. Or you could randomly select 50 students, measure them, and use that smaller group to estimate the school’s average. For this to work, every student needs the same likelihood of being picked, and picking one student can’t influence whether another gets picked. That’s a random sample.
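The school scenario above can be sketched in a few lines of Python with the standard library's `random` module. The heights here are simulated with made-up numbers (a mean of 170 cm is an assumption for illustration), not real data:

```python
import random

random.seed(42)  # fixed seed so the example is reproducible

# Hypothetical population: 1,000 student heights in cm
population = [random.gauss(170, 10) for _ in range(1000)]

# Simple random sample of 50 students, drawn without replacement
sample = random.sample(population, 50)

sample_mean = sum(sample) / len(sample)
population_mean = sum(population) / len(population)
print(f"population mean: {population_mean:.1f} cm")
print(f"sample mean:     {sample_mean:.1f} cm")
```

The sample mean of 50 students lands close to the true mean of all 1,000, which is exactly the estimation shortcut the paragraph describes.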
With Replacement vs. Without Replacement
There are two ways to draw a random sample. Sampling with replacement means each selected item goes back into the pool before the next selection, so the same item could theoretically be chosen more than once. Each draw is completely independent of the others. Sampling without replacement means once something is selected, it’s removed from the pool. Every selected item is unique, but the draws are no longer fully independent since removing one item slightly changes the odds for the remaining ones.
In practice, most real-world sampling is done without replacement (you wouldn’t survey the same person twice). When the population is large relative to the sample size, the difference between the two methods becomes negligible, and statisticians treat them as roughly equivalent.
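The two drawing methods map directly onto two standard-library functions: `random.choices` draws with replacement, `random.sample` without. A small sketch:

```python
import random

random.seed(1)
pool = ["A", "B", "C", "D", "E"]

# With replacement: each pick goes back in, so repeats are possible
with_replacement = random.choices(pool, k=5)

# Without replacement: every selected item is unique
without_replacement = random.sample(pool, k=5)

print(with_replacement)     # duplicates possible
print(without_replacement)  # always a permutation of the pool
```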
Types of Random Sampling
Simple random sampling is the most straightforward version: assign every member of the population a number, then use a random process to pick your sample. But there are several variations designed for different situations.
- Systematic sampling involves choosing every kth item from a list after a random starting point. You calculate k by dividing the population size (N) by your desired sample size (n). So if you have 10,000 people and want a sample of 200, you’d select every 50th person.
- Stratified sampling splits the population into groups that share a characteristic (like age brackets or income levels), then randomly selects some individuals from each group. This guarantees every subgroup is represented.
- Cluster sampling divides the population into naturally occurring groups (like neighborhoods or classrooms), then randomly selects entire groups and includes everyone in them. Unlike stratified sampling, each cluster contains a mix of different types of people.
All of these are probability sampling methods, meaning every member of the population has a known, nonzero chance of being selected. That’s what separates them from convenience samples or volunteer-based samples, which are prone to bias.
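Two of the variations above, systematic and stratified sampling, are simple enough to sketch directly. This is a minimal illustration, not a production implementation; the population here is just a list of ID numbers, and the age brackets are invented:

```python
import random

random.seed(0)

def systematic_sample(population, n):
    """Every k-th item after a random start, with k = N // n."""
    k = len(population) // n
    start = random.randrange(k)
    return population[start::k][:n]

def stratified_sample(strata, per_stratum):
    """Randomly draw the same number of members from each subgroup."""
    picked = []
    for group in strata:
        picked.extend(random.sample(group, per_stratum))
    return picked

# Systematic: 10,000 people, sample of 200 -> every 50th person
people = list(range(10_000))
print(len(systematic_sample(people, 200)))

# Stratified: hypothetical age brackets, 20 drawn from each
brackets = {"18-29": list(range(300)),
            "30-49": list(range(300, 700)),
            "50+":   list(range(700, 1000))}
print(len(stratified_sample(brackets.values(), 20)))
```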
Why Randomness Matters
Random sampling exists to prevent bias, which in statistics means the systematic favoring of certain outcomes. Without randomness, samples tend to over-represent some groups and under-represent others. Surveying people at one specific store on a Tuesday morning, for example, misses anyone who works during the day. Recruiting volunteers skews toward people who are already motivated or interested in the topic. Advertising a survey on a single social media platform limits responses to users of that platform.
A properly random sample avoids all of this by giving every member of the population the same shot at being included. The result is a sample that looks like a miniature version of the whole population, which is what makes it useful for drawing broader conclusions.
The Math Behind Random Samples
Two major theorems in mathematics explain why random samples work so well.
The Law of Large Numbers says that as you increase the size of a random sample, the sample average gets closer and closer to the true population average. If you flip a fair coin 10 times, you might get 7 heads. Flip it 10,000 times, and you’ll land very close to 50% heads. The more data you collect randomly, the more accurate your estimate becomes.
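The coin-flip claim is easy to check by simulation. A quick sketch, flipping a simulated fair coin at increasing sample sizes:

```python
import random

random.seed(7)

def heads_fraction(flips):
    """Fraction of heads in a run of simulated fair-coin flips."""
    return sum(random.random() < 0.5 for _ in range(flips)) / flips

# Small samples wander; large samples settle near 0.5
for n in (10, 100, 10_000):
    print(f"{n:>6} flips: {heads_fraction(n):.3f} heads")
```

With 10 flips the fraction can easily stray to 0.3 or 0.7; by 10,000 flips it reliably sits within a couple of percentage points of 0.5.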
The Central Limit Theorem takes this further. It states that if you take many random samples from any population and calculate each sample’s average, those averages will form a bell curve (normal distribution), regardless of what the original population looks like. This holds as long as the population has finite variance and the sample size is sufficiently large, typically 30 or more. This is why so much of statistics relies on the normal distribution: random sampling essentially creates it.
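You can watch the Central Limit Theorem at work with a short simulation. Here the population is deliberately non-normal (an exponential distribution, heavily skewed with mean 1), yet the averages of many size-30 samples still cluster symmetrically around that mean:

```python
import random
import statistics

random.seed(3)

# A decidedly non-normal population: exponential, heavily right-skewed
population = [random.expovariate(1.0) for _ in range(100_000)]

# Take 2,000 random samples of size 30 and record each sample's mean
means = [statistics.mean(random.sample(population, 30))
         for _ in range(2_000)]

# The sample means pile up tightly around the population mean (~1.0),
# even though the raw data is nothing like a bell curve
print(f"mean of sample means: {statistics.mean(means):.2f}")
print(f"spread of means:      {statistics.stdev(means):.2f}")
```

Plotting `means` as a histogram would show the familiar bell shape, despite the skewed source.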
Sample Size and Precision
A random sample’s usefulness depends heavily on how large it is. The margin of error, which measures how far your sample estimate might be from the true population value, shrinks as the sample size grows. Specifically, the margin of error is inversely proportional to the square root of the sample size. That means quadrupling your sample size cuts the margin of error in half.
This is why polls and surveys always report a margin of error alongside their results. A national poll of 1,000 randomly selected people typically has a margin of error around 3 percentage points at the standard 95% confidence level. To cut that to 1.5 points, you’d need roughly 4,000 people. There are diminishing returns: each additional reduction in error requires a disproportionately larger sample.
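The inverse-square-root relationship comes from the standard formula for the margin of error of a proportion, z·√(p(1−p)/n). A short sketch using the worst case p = 0.5 and the 95% critical value z = 1.96:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion; p=0.5 is the worst case."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (1_000, 4_000):
    print(f"n = {n}: margin of error = {margin_of_error(n):.1%}")
# n = 1,000 gives about 3.1%; quadrupling to 4,000 halves it to about 1.5%
```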
How Random Samples Are Selected
In practice, selecting a truly random sample requires a deliberate process. The most common methods include random number generators (built into calculators, spreadsheets, and programming languages) and random number tables, which are printed grids of digits with no pattern. To use either method, you assign a number to every member of the population, then let the random process pick which numbers make the cut.
The key steps are: define your population clearly, decide on your sample size, assign each member a unique identifier, and use a genuinely random mechanism to select. Skipping any of these steps, like selecting “at random” by gut feeling or picking the first people who respond, introduces bias and breaks the mathematical guarantees that make random samples powerful.
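The steps above can be sketched end to end. The roster of 1,000 students is invented for illustration; the point is that every member gets a unique identifier and a pseudorandom number generator does the picking:

```python
import random

random.seed(2024)  # in real use you would not fix the seed

# Steps 1-3: a clearly defined population, each member with a unique ID
roster = {i: f"student_{i}" for i in range(1, 1001)}

# Step 4: let a genuinely random mechanism select 50 IDs
chosen_ids = random.sample(sorted(roster), 50)
sample = [roster[i] for i in chosen_ids]

print(sample[:5])
```

Note that the selection never consults anything about the students themselves, which is precisely what keeps gut feeling and self-selection out of the process.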

