The METLIN database is a massive, publicly accessible repository of small molecule data, serving as a spectral library for researchers worldwide. Developed and maintained by the Siuzdak laboratory at Scripps Research, METLIN was created to overcome bottlenecks in metabolite identification. Its fundamental purpose is to archive, visualize, and analyze data on metabolites, which are the thousands of small chemical entities that drive all biological processes. By providing a centralized, high-quality resource, METLIN enables scientists to accurately identify unknown compounds detected in biological samples like blood, urine, or tissue. This capability transforms raw laboratory data into meaningful biological information, accelerating discovery across medicine and biology.
The Field of Metabolomics
Metabolomics is the large-scale study of the entire complement of small molecules, known as the metabolome, within a cell, tissue, or organism. While genomics examines an organism’s genetic blueprint and proteomics studies its proteins, metabolomics focuses on the downstream functional output of cellular activity. Metabolites, such as sugars, amino acids, lipids, and vitamins, are the substrates, intermediates, and products of metabolism, and they provide a direct snapshot of the organism’s current physiological state.
This functional readout makes the metabolome a sensitive barometer for health, disease, and environmental influences. The chemical space is vast, encompassing molecules created by the body (endogenous metabolites) and those introduced from external sources, such as drugs (xenobiotics) and compounds derived from food and the gut microbiome. Profiling these molecules can reveal subtle changes that indicate the onset of disease or a response to a new drug. Interpreting the complex chemical signatures found in biological samples, however, requires a comprehensive reference database.
Data Stored in METLIN
The core value of METLIN lies in its extensive catalog of identification data for hundreds of thousands of metabolites and other chemical entities, including drugs and their breakdown products. Its most distinctive feature is high-resolution tandem mass spectrometry (MS/MS) data, which acts as a characteristic chemical fingerprint for each compound. An MS/MS spectrum is the pattern of fragment ions produced when a molecule is isolated and broken apart inside a mass spectrometer.
METLIN systematically acquires this spectral data from pure, commercially available standard compounds, so each reference spectrum comes from a compound of known structure. The fragmentation spectra are generated at multiple collision energies (such as 0, 10, 20, and 40 eV) and in both positive and negative ion modes. Because different compounds can share the same mass, this set of spectra gives researchers a signature that is more reliable for identification than matching a compound’s mass alone. Each entry is also annotated with structural information and chemical identifiers, such as CAS Registry Numbers.
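To see why mass alone is a weak identifier, consider a minimal sketch of a mass-only lookup. The three-compound library, the function names, and the 5 ppm tolerance are assumptions made for illustration; none of this is METLIN's actual interface or search logic. The monoisotopic masses are standard values computed from each molecular formula.

```python
PROTON_MASS = 1.007276  # Da; added to a neutral mass to get the [M+H]+ ion

# Hypothetical mini-library: (name, monoisotopic neutral mass in Da).
# Leucine and isoleucine are isomers (both C6H13NO2), so their masses are identical.
LIBRARY = [
    ("leucine", 131.09463),
    ("isoleucine", 131.09463),
    ("creatine", 131.06948),
]


def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def candidates_by_mass(observed_mz, tolerance_ppm=5.0):
    """Return every library compound whose [M+H]+ m/z lies within the tolerance."""
    hits = []
    for name, neutral_mass in LIBRARY:
        theoretical_mz = neutral_mass + PROTON_MASS
        if abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm:
            hits.append(name)
    return hits


# An observed positive-mode ion at m/z 132.1019 matches two compounds.
ambiguous_hits = candidates_by_mass(132.1019)
```

Here the lookup returns both leucine and isoleucine. An accurate mass narrows the list of candidates (creatine is excluded), but it cannot separate isomers, which is where fragmentation spectra become necessary.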
How METLIN Advances Discovery
The ability to accurately and rapidly identify metabolites using METLIN’s reference spectra accelerates scientific discovery across several fields. One major application is in Biomarker Identification, where researchers search for unique metabolic signatures associated with a specific disease state. For instance, a scientist studying cardiovascular disease might identify an unusual pattern of lipids; matching the MS/MS spectra to the METLIN library confirms the identity of the compounds, transforming an unknown chemical signal into a potential diagnostic marker.
In Drug Discovery, METLIN provides a means to understand how new pharmaceutical compounds are processed by the body. Researchers must identify the specific metabolites the body produces after a drug is ingested to understand its efficacy and toxicity. By searching experimental fragmentation patterns against the database, scientists can quickly characterize the drug’s metabolic pathway and identify any potentially harmful breakdown products. This speeds up the screening and validation process for new drug candidates.
The database is also an important tool in Personalized Medicine, which aims to tailor treatments to an individual’s unique biology. A patient’s metabolic profile is highly individual, influenced by genetics, diet, and lifestyle, and these differences can affect how they respond to medication. By analyzing a patient’s unique metabolic data against the comprehensive library, clinicians can predict whether a specific treatment will be effective or if a patient is at risk for adverse side effects, leading to more precise and individualized care.
Why Centralized Data is Critical
Metabolomics faces a challenge because the “chemical space” is immense, with estimates suggesting hundreds of thousands of different metabolites may exist in the human body, including those derived from the environment and diet. When an experiment detects a signal for a previously uncharacterized molecule, identifying its structure without a reference is a time-consuming and often impossible task. METLIN solves this problem by providing a standardized reference library.
The database offers a central, high-quality resource that allows researchers globally to compare their experimental data to standardized, experimentally derived spectral fingerprints. This standardization ensures that a metabolite identified in one laboratory can be reliably confirmed by another, improving the reproducibility of scientific findings. By eliminating the need for every laboratory to individually generate and validate spectral data, METLIN reduces the bottleneck of identification, accelerating the pace of untargeted metabolomics research.
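Comparing an experimental spectrum with reference fingerprints can be illustrated with a cosine similarity score, a common generic way to compare two fragmentation spectra. The sketch below is an assumption for illustration and not METLIN's own scoring method. The compound names, peak lists, and the 0.01 m/z tolerance are invented, and real tools often add intensity weighting or noise filtering.

```python
import math


def cosine_score(query, reference, mz_tol=0.01):
    """Cosine similarity between two spectra given as lists of (mz, intensity).

    Each query peak is paired with the closest unused reference peak
    within mz_tol; unmatched peaks contribute nothing to the dot product.
    """
    used = set()
    dot = 0.0
    for q_mz, q_int in query:
        best = None
        for i, (r_mz, _) in enumerate(reference):
            if i in used or abs(q_mz - r_mz) > mz_tol:
                continue
            if best is None or abs(q_mz - r_mz) < abs(q_mz - reference[best][0]):
                best = i
        if best is not None:
            used.add(best)
            dot += q_int * reference[best][1]
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_r = math.sqrt(sum(i * i for _, i in reference))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0
    return dot / (norm_q * norm_r)


def rank_matches(query, library, mz_tol=0.01):
    """Score the query against every library spectrum, best match first."""
    scores = [(name, cosine_score(query, spec, mz_tol)) for name, spec in library.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Invented reference spectra; the m/z values and intensities are not real data.
REFERENCE_LIBRARY = {
    "compound_A": [(60.0, 100.0), (85.0, 40.0), (110.0, 10.0)],
    "compound_B": [(60.0, 100.0), (72.0, 50.0), (95.0, 30.0)],
}

# Invented experimental spectrum with small m/z and intensity deviations.
QUERY = [(60.002, 90.0), (85.001, 45.0), (110.003, 8.0)]

ranked = rank_matches(QUERY, REFERENCE_LIBRARY)
```

In this toy case both reference compounds share the most intense fragment, yet compound_A scores close to 1 while compound_B scores noticeably lower because its other fragments have no counterpart in the query. Scoring against shared, experimentally acquired reference spectra is what lets different laboratories reach the same identification.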

