A Manhattan plot is a specialized scatter plot used in genetics to visually represent findings from large-scale studies. Its primary function is to display the relationship between specific locations across the entire human genetic code and a particular biological characteristic or disease. The plot organizes an immense number of test results into a single graphic that reveals potential genetic associations. This visualization allows researchers to quickly identify regions of the genetic code statistically related to the trait under investigation.
Why It Is Called a Manhattan Plot
The unique name of the plot is purely descriptive, stemming from its distinct visual appearance, which resembles the iconic skyline of Manhattan, New York. When the data points are plotted, statistically significant genetic markers shoot upward, creating dramatic vertical spikes. These tall, isolated spikes stand in stark contrast to the vast majority of points clustered near the bottom of the graph, mimicking the way skyscrapers tower over smaller buildings. This visual metaphor immediately draws the eye to the most important results, separating them from millions of non-significant findings.
Understanding the Plot’s Anatomy
Reading a Manhattan plot requires understanding the specific information encoded on both the horizontal and vertical axes.
The X-Axis (Horizontal)
The X-axis represents the entire genome, laid out sequentially. It is segmented by chromosome, typically displaying the 22 non-sex chromosomes (autosomes) and sometimes the X and Y sex chromosomes. Researchers commonly use alternating colors for the data points of adjacent chromosomes, creating visually distinct blocks. Within each chromosome block, data points are positioned according to their physical location, or genomic coordinate. Each individual data point signifies a single genetic marker, most often a Single Nucleotide Polymorphism (SNP), which is a variation at a single position in the DNA sequence.
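The horizontal layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular plotting tool: the function name `genome_wide_x`, the toy markers, and the use of the largest observed position as a stand-in for chromosome length are all assumptions made for the example.

```python
def genome_wide_x(markers, chrom_order):
    """Turn (chromosome, position) pairs into (x, color_band) pairs.

    x is a single genome-wide coordinate; color_band alternates 0/1
    between adjacent chromosomes so they can be drawn in two colors.
    """
    # Assumption: the largest observed position approximates chromosome length.
    chrom_span = {}
    for chrom, pos in markers:
        chrom_span[chrom] = max(chrom_span.get(chrom, 0), pos)

    # Each chromosome starts where the previous one ended.
    offsets = {}
    running_total = 0
    for chrom in chrom_order:
        offsets[chrom] = running_total
        running_total += chrom_span.get(chrom, 0)

    return [
        (offsets[chrom] + pos, chrom_order.index(chrom) % 2)
        for chrom, pos in markers
    ]


# Hypothetical toy data: (chromosome, base-pair position).
toy_markers = [("1", 100), ("1", 250), ("2", 50), ("2", 300), ("3", 10)]
layout = genome_wide_x(toy_markers, chrom_order=["1", "2", "3"])
print(layout)
```

Real plotting code would normally use the known length of each chromosome and may leave a small gap between blocks, but the idea of stacking chromosomes end to end is the same.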
The Y-Axis (Vertical)
The Y-axis measures the strength of the association between each genetic marker and the trait being studied. This axis uses a specific mathematical transformation: the negative base-10 logarithm of the \(P\)-value, written as \(-\log_{10}(P)\). The \(P\)-value is the probability of seeing an association at least as strong as the one observed if the marker had no real effect on the trait. Since a very small \(P\)-value, such as \(10^{-10}\), indicates strong evidence of association, transforming it with \(-\log_{10}\) turns it into a large, positive number (10). Consequently, the higher a data point rises on the Y-axis, the stronger the statistical evidence for its association with the trait.
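To make the transformation concrete, the short Python sketch below converts a few example P-values into Y-axis heights. The helper name `neg_log10` and the sample values are illustrative assumptions.

```python
import math


def neg_log10(p_value):
    """Return -log10(P), the height of a marker on the Y-axis."""
    if not 0 < p_value <= 1:
        raise ValueError("a P-value must be greater than 0 and at most 1")
    return -math.log10(p_value)


# Illustrative P-values, from unremarkable to very strong evidence.
for p in (0.05, 1e-5, 1e-10):
    print(f"P = {p:g}  ->  height = {neg_log10(p):.2f}")
```

Each tenfold drop in the P-value raises the point by exactly one unit, which is why the strongest signals stand so far above the rest.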
Using the Plot to Find Genetic Links
The Manhattan plot serves as the standard visual output for Genome-Wide Association Studies (GWAS), which are large-scale investigations that scan the entire genetic code for markers associated with a disease or characteristic. The primary step in interpreting the plot is identifying the significance threshold, represented by a horizontal line drawn across the graph. This line establishes the boundary between associations considered statistically meaningful and those likely to be random noise.
The most commonly accepted threshold for genome-wide significance is a \(P\)-value of \(5 \times 10^{-8}\), which corresponds to a \(-\log_{10}(P)\) value of about 7.3. This stringent boundary is necessary because a GWAS performs millions of separate statistical tests, requiring a correction to account for the high chance of finding false positive results. Any data point that rises above this threshold line is considered a statistically significant association, suggesting the genetic marker is genuinely linked to the trait.
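One common way to arrive at this figure is a Bonferroni-style correction: divide the usual 0.05 cutoff by roughly one million independent tests. The sketch below works through that arithmetic and then filters a handful of markers against the result. The test count is an assumed round number, and the marker names and P-values are invented for illustration.

```python
import math

ALPHA = 0.05          # conventional single-test significance level
N_TESTS = 1_000_000   # assumed number of independent tests

threshold_p = ALPHA / N_TESTS            # about 5e-8
threshold_y = -math.log10(threshold_p)   # height of the horizontal line

# Hypothetical results: marker name -> P-value.
results = {"marker_a": 3e-9, "marker_b": 4e-6, "marker_c": 0.2}

# A marker is significant when its P-value falls below the threshold,
# which is the same as its point rising above the line on the plot.
significant = [name for name, p in results.items() if p < threshold_p]

print(f"threshold P = {threshold_p:.1e}, line drawn at y = {threshold_y:.2f}")
print("significant markers:", significant)
```

Note that `marker_b`, with a P-value of 4e-6, would look impressive in a single-test setting but falls well short once the number of tests is taken into account.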
When multiple significant markers cluster together and rise above the threshold, they form a distinct “peak” on the plot. A peak signifies a specific region of the genetic code, or locus, that is likely to contain the actual gene or genes influencing the trait. The highest points within these peaks represent the markers with the strongest statistical association, providing researchers with a narrow focus for further investigation.
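The idea of a peak can also be sketched in code: keep the markers that pass the threshold, group neighbors on the same chromosome into one locus, and report the strongest marker in each group. This is a simplified illustration rather than an established procedure. The function name `find_peaks`, the 500,000 base-pair grouping window, and the toy data are all assumptions.

```python
def find_peaks(markers, threshold_p=5e-8, window=500_000):
    """Group significant markers into loci and return each locus's top marker.

    markers: list of (name, chromosome_number, position, p_value) tuples.
    window:  assumed maximum gap (in base pairs) between neighbors in a locus.
    """
    # Keep only significant markers, ordered along the genome.
    hits = sorted(
        (m for m in markers if m[3] < threshold_p),
        key=lambda m: (m[1], m[2]),
    )

    loci = []
    for hit in hits:
        previous = loci[-1][-1] if loci else None
        same_locus = (
            previous is not None
            and previous[1] == hit[1]            # same chromosome
            and hit[2] - previous[2] <= window   # close to the previous hit
        )
        if same_locus:
            loci[-1].append(hit)
        else:
            loci.append([hit])

    # The smallest P-value marks the top of each peak.
    return [min(locus, key=lambda m: m[3]) for locus in loci]


# Hypothetical toy data: (name, chromosome, position, P-value).
toy = [
    ("m1", 1, 1_000_000, 1e-9),
    ("m2", 1, 1_200_000, 1e-12),
    ("m3", 1, 1_300_000, 2e-8),
    ("m4", 1, 9_000_000, 3e-10),
    ("m5", 2, 1_100_000, 4e-9),
    ("m6", 2, 5_000_000, 0.3),
]
lead_markers = find_peaks(toy)
print([name for name, *_ in lead_markers])
```

In this toy data the first three markers sit close together and form one peak, while the remaining significant markers are far enough apart to count as separate loci.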

