Short Tandem Repeats (STRs), also known as microsatellites, are small, repeated sequences of DNA found scattered throughout the human genome. These genetic elements consist of a short sequence of base pairs duplicated immediately next to itself. Their high degree of variability between individuals makes them valuable for genetic analysis. By analyzing the number of times a sequence is repeated at specific genomic locations, scientists can generate a unique molecular signature for modern DNA profiling.
The Structure of Short Tandem Repeats
A Short Tandem Repeat consists of a core sequence of DNA bases, typically two to six base pairs long, repeated multiple times in a row. For instance, the sequence “AGAT” might be repeated ten, twelve, or fifteen times at a specific location. Most STRs are located in non-coding regions of the genome, meaning they do not contain instructions for building proteins. This location allows them to tolerate high variation in length without causing harm.
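As a rough illustration of what counting repeats means, the following Python sketch finds the longest uninterrupted run of a core motif in a DNA string. The function name `max_tandem_repeats` and the flanking bases in the example are invented for illustration; laboratories determine alleles by measuring fragment lengths or by sequencing, not by a string search like this one.

```python
def max_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    sequence, motif = sequence.upper(), motif.upper()
    best = 0
    for start in range(len(sequence)):
        count = 0
        position = start
        # Keep stepping one motif length at a time while the motif repeats.
        while sequence.startswith(motif, position):
            count += 1
            position += len(motif)
        best = max(best, count)
    return best


# Hypothetical allele: twelve copies of AGAT between made-up flanking bases.
allele = "GGCTA" + "AGAT" * 12 + "CCGTA"
print(max_tandem_repeats(allele, "AGAT"))  # 12
```

The brute-force scan from every start position is deliberate: it stays correct even for motifs that overlap themselves, and STR regions are short enough that speed is not a concern.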
The entire repetitive stretch is generally less than 100 nucleotides long. This short length makes STR markers robust and easier to analyze from small or degraded DNA samples. The specific location of an STR on a chromosome is referred to as a locus. Many loci are named using a standard nomenclature. In D3S1358, for example, D stands for DNA, 3 for chromosome 3, S for a single-copy sequence, and 1358 for the order in which the marker was catalogued on that chromosome.
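Names of this form can be split into their parts mechanically. The sketch below assumes the pattern "D, chromosome, S, serial number"; the function name `parse_d_number` is hypothetical, and loci named after genes (TH01 or FGA, for instance) do not follow this pattern and simply return `None`.

```python
import re

# Assumed pattern: D + chromosome (1-2 digits, X, or Y) + S + serial number.
_D_NUMBER = re.compile(r"^D(\d{1,2}|X|Y)S(\d+)$")


def parse_d_number(name: str):
    """Split a name like 'D3S1358' into chromosome and serial number, or return None."""
    match = _D_NUMBER.match(name.strip().upper())
    if match is None:
        return None
    chromosome, serial = match.groups()
    return {"chromosome": chromosome, "serial": int(serial)}


print(parse_d_number("D3S1358"))  # {'chromosome': '3', 'serial': 1358}
print(parse_d_number("TH01"))     # None
```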
The Mechanism of Individual Variation
The utility of STRs for identification stems from their high degree of polymorphism, meaning the length of the repeat sequence varies widely among people. Every person inherits two copies of each chromosome, one from each parent, and thus possesses two versions, or alleles, for every autosomal STR locus. These alleles are defined by counting the exact number of times the core sequence is repeated at that specific location.
If an individual inherits alleles with ten repeats from one parent and twelve repeats from the other, their genotype at that locus is written “10, 12” and is called heterozygous. If both inherited alleles have the same number of repeats, as in “12, 12,” the genotype is called homozygous. Repeat numbers change relatively often from one generation to the next, mainly through strand slippage during DNA replication, in which the newly made strand mispairs with the template and gains or loses a repeat unit. This comparatively high mutation rate continually generates new allele lengths in the population.
Inheritance follows Mendelian rules: at every locus, a child receives one allele from each biological parent. When multiple unlinked STR loci across the genome are analyzed, the combination of all inherited alleles forms a genetic profile that is, for practical purposes, unique. This overall profile serves as an individual’s personal genetic signature, making it highly improbable that two unrelated people share the identical pattern.
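The one-allele-from-each-parent rule can be made concrete with a short Python sketch that lists every genotype a child could inherit at a single locus. The function name and the parental genotypes are made up for illustration, and the sketch ignores mutation, which occasionally produces an allele neither parent carries.

```python
from itertools import product


def possible_child_genotypes(mother, father):
    """All genotypes formed by taking one allele from each parent."""
    # Sort each pair so that (12, 10) and (10, 12) count as the same genotype.
    return {tuple(sorted(pair)) for pair in product(mother, father)}


# Hypothetical parents at one locus.
mother = (10, 12)
father = (12, 15)
print(sorted(possible_child_genotypes(mother, father)))
# [(10, 12), (10, 15), (12, 12), (12, 15)]
```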
STRs in Forensic Identification and Databases
The primary application of STR analysis is creating a DNA profile, or genetic fingerprint, for human identification. This process involves examining a standardized set of STR loci known to be highly variable within the population. In the United States, the Combined DNA Index System (CODIS) was initially built upon thirteen autosomal STR loci, though modern forensic kits analyze twenty or more loci for greater statistical power.
To build a profile, forensic scientists determine the exact number of repeats for the two alleles at each standardized locus. This set of numbers constitutes the DNA profile, which is then entered into national DNA databases, such as the one maintained by the Federal Bureau of Investigation (FBI). These databases house profiles from convicted offenders, crime scene evidence, and missing persons. Law enforcement can then compare a new profile against existing entries; a match that points to a person not previously connected to the case is called a “cold hit.”
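In its simplest form, a database comparison checks whether two profiles carry the same pair of alleles at every locus they have in common. The Python sketch below shows that idea only; the profile IDs, allele values, and function names are invented, and real database searches apply more elaborate matching rules than this exact comparison.

```python
def normalize(profile):
    """Sort each allele pair so that (16, 15) and (15, 16) compare equal."""
    return {locus: tuple(sorted(alleles)) for locus, alleles in profile.items()}


def find_matches(query, database):
    """Return IDs of stored profiles that agree with query at every shared locus."""
    query = normalize(query)
    hits = []
    for profile_id, stored in database.items():
        stored = normalize(stored)
        shared = query.keys() & stored.keys()
        if shared and all(query[locus] == stored[locus] for locus in shared):
            hits.append(profile_id)
    return hits


# Hypothetical database entries and evidence profile.
database = {
    "offender-001": {"D3S1358": (15, 16), "TH01": (7, 9), "FGA": (21, 24)},
    "offender-002": {"D3S1358": (14, 15), "TH01": (6, 9), "FGA": (22, 22)},
}
evidence = {"D3S1358": (16, 15), "TH01": (9, 7), "FGA": (24, 21)}
print(find_matches(evidence, database))  # ['offender-001']
```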
The statistical power of STR profiling derives from the product rule of probability. The expected frequency of the genotype at each locus is first estimated from the known population frequencies of its two alleles. Because unlinked loci are inherited independently, the frequency of the entire profile is then obtained by multiplying the per-locus genotype frequencies together. The probability of two unrelated individuals randomly sharing the same profile across twenty or more loci is astronomically small, often less than one in a quadrillion. This low probability provides a high degree of confidence in the identification.
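The multiplication can be sketched in a few lines of Python. This version assumes Hardy-Weinberg proportions at each locus (p² for a homozygote with allele frequency p, 2pq for a heterozygote with allele frequencies p and q), which is a common textbook simplification; the allele frequencies and function names below are invented, and casework calculations typically add corrections that are not shown here.

```python
from math import prod


def genotype_frequency(allele_a, allele_b, freqs):
    """Expected genotype frequency at one locus under Hardy-Weinberg proportions."""
    p = freqs[allele_a]
    q = freqs[allele_b]
    # Homozygote: p squared. Heterozygote: 2pq.
    return p * p if allele_a == allele_b else 2 * p * q


def random_match_probability(profile, allele_freqs):
    """Multiply per-locus genotype frequencies across independent loci."""
    return prod(
        genotype_frequency(a, b, allele_freqs[locus])
        for locus, (a, b) in profile.items()
    )


# Invented allele frequencies, for illustration only.
allele_freqs = {
    "D3S1358": {15: 0.25, 16: 0.20},
    "TH01": {7: 0.10},
}
profile = {"D3S1358": (15, 16), "TH01": (7, 7)}
print(f"{random_match_probability(profile, allele_freqs):.4f}")  # 0.0010
```

Here the heterozygous locus contributes 2 × 0.25 × 0.20 = 0.1 and the homozygous locus 0.10² = 0.01, for a combined 0.001 from just two loci. If every locus contributed a genotype frequency near 0.1, twenty loci would give about 10⁻²⁰, which shows how quickly the product shrinks.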
The databases store only the numerical profile, representing the lengths of the STR fragments, along with administrative data. This design focuses strictly on identity and avoids storing information that could reveal personal traits or medical conditions. For crimes where a male perpetrator is suspected, analysts may also examine Y-chromosome STRs (Y-STRs). Because Y-STRs are passed down exclusively from father to son, they are useful for tracing male lineage.
Other Essential Uses of STR Analysis
Beyond criminal forensics, STR analysis is the standard method for establishing biological relationships, known as kinship testing. Paternity and maternity tests compare the STR profiles of the child and alleged parents to confirm the child inherited one allele from each parent. This application is also used in disaster victim identification and the identification of human remains. A sample from the deceased is compared to profiles of close family members to confirm identity.
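The comparison behind a parentage test can be sketched as a check, locus by locus, that one of the child's alleles could have come from the mother and the other from the alleged father. The Python example below uses invented genotypes and a hypothetical function name. It is a simplification: it treats any inconsistency as flat, whereas real testing weighs the evidence statistically and allows for occasional mutations.

```python
def inconsistent_loci(child, mother, alleged_father):
    """Return loci where the child's alleles cannot be split one per parent."""
    flagged = []
    for locus, (c1, c2) in child.items():
        m = mother[locus]
        f = alleged_father[locus]
        # Either allele may be the maternal one, so test both assignments.
        fits = (c1 in m and c2 in f) or (c2 in m and c1 in f)
        if not fits:
            flagged.append(locus)
    return flagged


# Hypothetical genotypes at two loci.
child = {"D3S1358": (15, 17), "TH01": (7, 9)}
mother = {"D3S1358": (15, 16), "TH01": (7, 8)}
man_a = {"D3S1358": (17, 18), "TH01": (9, 9)}
man_b = {"D3S1358": (14, 18), "TH01": (6, 9)}

print(inconsistent_loci(child, mother, man_a))  # []
print(inconsistent_loci(child, mother, man_b))  # ['D3S1358']
```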
Specific types of STRs, particularly Y-STRs, are also used in genetic ancestry tracing and genealogical research. While Y-STRs track the direct paternal line, other STR markers can be used to study ancient human migration patterns. This is achieved by comparing allele frequencies across different ethnic groups and geographic regions.
In clinical genetics, certain STRs, particularly trinucleotide repeats, are studied as markers for inherited neurological disorders. The abnormal expansion of these repeats beyond a normal length causes several conditions, including Huntington’s disease and Fragile X syndrome. Analyzing the number of repeats at these specific loci allows clinicians to diagnose these conditions and assess transmission risk within a family.

