Species richness ($S$) is the simplest measure of biodiversity: the count of different species within a defined area or community. While counting species appears straightforward, measuring richness in a real-world ecological setting is difficult. Field sampling rarely captures every species present, so statistical methods are needed to estimate how many species were missed during the sampling effort.
Defining Species Richness
The initial calculation of species richness begins with a direct census of a community to determine the number of species observed. This raw count is referred to as observed richness, or $S_{obs}$. For instance, a biologist might survey a forest plot and record 45 distinct tree species, making $S_{obs} = 45$.
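As a minimal sketch of this tally, assuming the survey is stored as one species label per recorded individual (the list `survey` and the function name `observed_richness` are illustrative, not taken from any particular tool), observed richness is just the number of distinct labels:

```python
from collections import Counter

def observed_richness(records):
    """Return S_obs: the number of distinct species among the recorded individuals."""
    counts = Counter(records)  # species label -> number of individuals recorded
    return len(counts)

# Hypothetical survey: six individual trees belonging to three species.
survey = ["oak", "maple", "oak", "birch", "maple", "oak"]
s_obs = observed_richness(survey)
print(s_obs)
```

The per-species counts held in `counts` are the same abundance data that the estimators and diversity indices discussed later build on.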
$S_{obs}$ forms the foundational data set for ecological analyses and is the simplest way to quantify biodiversity in a specific location. The magnitude of this count is relevant to assessing habitat health, as a greater number of species often indicates a more complex and stable ecosystem. Conservation efforts frequently rely on $S_{obs}$ to identify high-priority areas, such as biodiversity hotspots.
The Challenge of Complete Counting
The primary obstacle in determining a community’s true richness ($S_{true}$) is the inherent limitation of fieldwork. Sampling efforts are constrained by time, budget, and accessibility, meaning the observed richness ($S_{obs}$) is almost always an underestimate. This disparity is pronounced for rare species, which may exist in low numbers or in hard-to-reach microhabitats.
Ecologists use the Species Accumulation Curve (SAC) to illustrate this challenge and assess sampling adequacy. This curve plots the cumulative number of species recorded against the total number of individuals or samples collected. Initially, the curve rises steeply as many new species are discovered with each sample added.
As sampling continues, the slope of the curve gradually decreases, indicating that fewer new species are being found. An ideal curve would flatten out completely, reaching an asymptote that represents the point where nearly all species in the area have been recorded. However, many field studies conclude before this asymptote is reached, which suggests that some species, typically rare ones, have been missed. This gap necessitates the use of statistical estimation.
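The shape of such a curve can be sketched in a few lines. The example below assumes each sample is stored as a set of species labels; the data and the function name `accumulation_curve` are made up for illustration. It follows the samples in the order given, whereas published curves are often averaged over many random orderings to smooth out the effect of sample order.

```python
def accumulation_curve(samples):
    """Return the cumulative number of distinct species after each successive sample."""
    seen = set()
    curve = []
    for sample in samples:
        seen.update(sample)      # add any species not recorded before
        curve.append(len(seen))  # cumulative richness so far
    return curve

# Hypothetical incidence data: five samples, species labelled A to E.
samples = [
    {"A", "B", "C"},
    {"A", "B", "D"},
    {"B", "C"},
    {"A", "E"},
    {"A", "B"},
]
curve = accumulation_curve(samples)

# New species contributed by each sample (the slope of the curve).
gains = [curve[0]] + [b - a for a, b in zip(curve, curve[1:])]
print(curve, gains)
```

In this toy data set the gains shrink as sampling proceeds, but a new species still appears in the fourth sample, so the curve has not clearly levelled off.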
Estimating True Richness
Ecologists rely on statistical models, known as non-parametric estimators, to produce an estimate ($S_{est}$) of the true richness ($S_{true}$). These models use the observed dataset to extrapolate the number of species that were present but remained undetected. The fundamental principle is that the frequency of rare species in the sample indicates how many species were missed entirely.
The most widely used approach focuses on species represented by only one individual, known as singletons ($f_1$), and those represented by exactly two individuals, called doubletons ($f_2$). If a sample contains many singletons relative to doubletons, it suggests that many more species exist that were not sampled even once. Conversely, if the number of singletons is low, the sampling effort was likely more thorough.
The Chao1 Index is a common estimator that quantifies this relationship. It adds a predicted number of unseen species to the observed count, computed as the squared number of singletons divided by twice the number of doubletons ($f_1^2 / 2f_2$). This index is applied to abundance data, where the number of individuals for each species is recorded.
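In its classic form the Chao1 estimate is $S_{est} = S_{obs} + f_1^2 / (2 f_2)$. The sketch below computes it from a list of per-species abundances. The function name `chao1` and the example counts are illustrative, and the fallback used when no doubletons are present is the commonly cited bias-corrected form $f_1(f_1 - 1)/2$, which is an assumption about how one might handle that case rather than something specified in the text above.

```python
def chao1(abundances):
    """Chao1 richness estimate from per-species individual counts."""
    counts = [n for n in abundances if n > 0]  # ignore species with zero individuals
    s_obs = len(counts)                        # observed richness
    f1 = sum(1 for n in counts if n == 1)      # singletons
    f2 = sum(1 for n in counts if n == 2)      # doubletons
    if f2 > 0:
        # Classic form: S_obs + f1^2 / (2 * f2)
        return s_obs + f1 * f1 / (2 * f2)
    # No doubletons: bias-corrected form avoids dividing by zero (assumed fallback).
    return s_obs + f1 * (f1 - 1) / 2

# Hypothetical sample: 9 observed species, 4 singletons, 2 doubletons.
abundances = [12, 7, 5, 2, 2, 1, 1, 1, 1]
print(chao1(abundances))  # 9 + 16 / 4 = 13.0
```

Here the four singletons and two doubletons add four predicted unseen species to the nine observed. With no singletons at all, the estimate equals $S_{obs}$.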
When only incidence data are available (the presence or absence of each species across multiple samples), ecologists may use the Jackknife Estimator. The Jackknife method estimates richness by systematically removing one sample unit at a time and observing how many species drop out of the census. The first-order Jackknife estimator specifically uses the number of species found in exactly one sample to predict the number of missed species. These estimators provide a statistically defensible estimate of $S_{true}$, which is necessary for making comparisons between ecological communities sampled with varying levels of effort.
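The first-order Jackknife is usually written as $S_{est} = S_{obs} + Q_1 (m - 1)/m$, where $m$ is the number of samples and $Q_1$ is the number of species found in exactly one sample. The symbols $Q_1$ and $m$, the function name `jackknife1`, and the data are introduced here for illustration only. A minimal sketch, assuming each sample is a set of species labels:

```python
from collections import Counter

def jackknife1(samples):
    """First-order Jackknife richness estimate from incidence (presence/absence) data."""
    m = len(samples)
    if m == 0:
        return 0.0
    incidence = Counter()
    for sample in samples:
        incidence.update(set(sample))  # count each species at most once per sample
    s_obs = len(incidence)                             # observed richness
    q1 = sum(1 for k in incidence.values() if k == 1)  # species in exactly one sample
    return s_obs + q1 * (m - 1) / m

# Hypothetical incidence data: five samples, species labelled A to E.
samples = [
    {"A", "B", "C"},
    {"A", "B", "D"},
    {"B", "C"},
    {"A", "E"},
    {"A", "B"},
]
print(jackknife1(samples))  # 5 observed + 2 * 4 / 5, approximately 6.6
```

Species D and E each occur in a single sample, so they drive the correction. If every species appeared in at least two samples, the estimate would equal $S_{obs}$.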
Richness versus Diversity
While species richness provides a straightforward count of species, it is frequently confused with the broader concept of species diversity. Richness is only one component of diversity; true species diversity indices incorporate both the number of species and their relative abundance, known as evenness. The Shannon Index and the Simpson Index are two common indices used to calculate this more complex measure.
These diversity indices move beyond a simple tally by weighting the contribution of each species based on its abundance within the community. For example, a community of 10 equally abundant species is considered more diverse than a community that also has 10 species but in which one species accounts for 90% of the individuals. The Shannon and Simpson indices quantify this difference in evenness, resulting in a single value that reflects the distribution of individuals among species. Since richness is exclusively a count, these indices, while calculated from the same species data, serve a different purpose than merely estimating the total number of species.
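The two-community comparison can be checked numerically. The sketch below uses the standard definitions: the Shannon index $H = -\sum p_i \ln p_i$ and the Simpson index in its Gini-Simpson form $1 - \sum p_i^2$, where $p_i$ is the proportion of individuals belonging to species $i$. The Simpson index is reported in several variants, so the choice of this form is an assumption, and the abundance counts are invented to match the example.

```python
import math

def shannon(abundances):
    """Shannon index H = -sum(p_i * ln p_i) over species with nonzero abundance."""
    total = sum(abundances)
    return -sum((n / total) * math.log(n / total) for n in abundances if n > 0)

def gini_simpson(abundances):
    """Gini-Simpson index 1 - sum(p_i^2): chance two random individuals differ in species."""
    total = sum(abundances)
    return 1 - sum((n / total) ** 2 for n in abundances)

# Two hypothetical communities, each with 10 species and 900 individuals.
even = [90] * 10             # every species equally abundant
dominated = [810] + [10] * 9 # one species holds 90% of the individuals

for name, community in [("even", even), ("dominated", dominated)]:
    print(name, round(shannon(community), 3), round(gini_simpson(community), 3))
```

Both communities have the same richness, yet both indices come out markedly higher for the even community, whose Shannon value equals $\ln 10$, the maximum possible for 10 species.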

