How to Report Mass Spec Data in Research Papers

Reporting mass spectrometry data properly means documenting every step from sample preparation to data analysis with enough detail that someone else could evaluate or reproduce your work. The specific requirements vary depending on whether you’re working in proteomics, metabolomics, or small molecule characterization, but the core principles are the same: describe your instrument parameters, explain how you processed and identified your analytes, report your statistical thresholds, and deposit your raw data in a public repository.

What Every MS Methods Section Needs

Regardless of your application, your methods section should cover three components that mirror the instrument itself: the ion source, the mass analyzer, and the detector. In practice, this means reporting the ionization technique (electrospray ionization, electron ionization at 70 eV, atmospheric pressure chemical ionization, etc.), the type of mass analyzer (time-of-flight, orbitrap, quadrupole, ion trap), and the detection mode used. These aren’t optional details: many journals screen manuscripts specifically for sufficient information about mass spectrometry procedures and data analysis.

Beyond the hardware, include the key acquisition parameters: polarity (positive or negative ion mode), scan range in m/z, resolution, collision energy for tandem MS experiments, and source conditions like capillary voltage and temperature. If you used chromatographic separation before MS, describe the column, mobile phases, gradient, and flow rate. Think of this section as a recipe: another researcher should be able to follow it without emailing you for clarification.
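One way to keep track of these parameters while drafting is a small structured record that flags anything still unreported. The sketch below is illustrative only: the class name, field names, and example values are assumptions, not a schema required by any journal.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass
class AcquisitionParameters:
    """Hypothetical checklist of acquisition settings to report."""
    ionization: Optional[str] = None               # e.g. "ESI"
    analyzer: Optional[str] = None                 # e.g. "orbitrap"
    polarity: Optional[str] = None                 # "positive" or "negative"
    scan_range_mz: Optional[Tuple[float, float]] = None
    resolution: Optional[float] = None
    collision_energy: Optional[str] = None         # for tandem MS experiments
    capillary_voltage_kv: Optional[float] = None
    source_temperature_c: Optional[float] = None


def missing_fields(params: AcquisitionParameters) -> list:
    """Return the names of parameters that have not been filled in yet."""
    return [f.name for f in fields(params) if getattr(params, f.name) is None]


# Example values are invented for illustration only.
example = AcquisitionParameters(
    ionization="ESI",
    analyzer="orbitrap",
    polarity="positive",
    scan_range_mz=(100.0, 1000.0),
    resolution=70000,
)
print(missing_fields(example))
```

Anything the function lists is a detail a reader would still have to ask you for.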

Using Correct Terminology

Small details in notation matter. IUPAC recommendations define m/z (italicized m, forward slash, italicized z) as the mass of an ion in unified atomic mass units divided by its charge number. It is therefore dimensionless and should not be reported in daltons. Mass accuracy should be given in parts per million (ppm), and you should state whether you’re reporting monoisotopic or average masses. Getting the terminology right signals to reviewers that you understand the technique and makes your data interpretable across labs.
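Mass accuracy in ppm is the difference between the observed and calculated m/z, divided by the calculated m/z and multiplied by one million. A minimal sketch follows; the function name and the example values are illustrative assumptions.

```python
def ppm_error(observed_mz: float, calculated_mz: float) -> float:
    """Signed mass error in parts per million, relative to the calculated m/z."""
    return (observed_mz - calculated_mz) / calculated_mz * 1e6


# Invented example: an ion calculated at m/z 195.0877 and observed at 195.0880.
error = ppm_error(observed_mz=195.0880, calculated_mz=195.0877)
print(f"{error:.1f} ppm")  # prints "1.5 ppm"
```

Keeping the sign shows whether the instrument reads high or low, which a reader cannot recover from an absolute value.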

When reporting peaks, specify whether a signal represents the protonated molecule ([M+H]+), the deprotonated molecule in negative mode ([M-H]-), or a common adduct formed with sodium, potassium, formate, or acetate. In-source fragments and adducts need to be identified and grouped so that each analyte is represented by a single quantified ion. Failing to account for adducts is one of the most common sources of confusion in published MS data.
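To check whether several signals belong to the same analyte, it can help to compute the m/z expected for each common ion form from the neutral monoisotopic mass. The sketch below assumes singly charged ions; the mass shifts are monoisotopic values rounded to five decimals, and the dictionary and function names are hypothetical.

```python
# Monoisotopic mass shifts in Da for singly charged ions (rounded values).
ADDUCT_SHIFTS = {
    "[M+H]+": 1.00728,        # add a proton
    "[M+Na]+": 22.98922,      # add a sodium cation
    "[M+K]+": 38.96316,       # add a potassium cation
    "[M-H]-": -1.00728,       # lose a proton
    "[M+HCOO]-": 44.99820,    # add a formate anion
    "[M+CH3COO]-": 59.01385,  # add an acetate anion
}


def adduct_mz(neutral_monoisotopic_mass: float) -> dict:
    """Expected m/z of each singly charged ion form for one neutral mass."""
    return {
        name: neutral_monoisotopic_mass + shift
        for name, shift in ADDUCT_SHIFTS.items()
    }


# Glucose (C6H12O6), monoisotopic mass 180.06339 Da.
for name, mz in adduct_mz(180.06339).items():
    print(f"{name:<12} {mz:.4f}")
```

Signals that co-elute and match several rows of this table are candidates for grouping under one analyte.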

Reporting Identifications in Proteomics

For protein and peptide identification, journals expect you to report the database you searched against (including version and number of entries), the search engine and its version, the enzymatic digestion parameters, allowed modifications (fixed and variable), precursor and fragment mass tolerances, and your false discovery rate (FDR) threshold. Commonly used FDR cutoffs are 1% and 5%, with 1% being the most widely accepted threshold for confident identifications. State the level at which the threshold was applied (peptide-spectrum match, peptide, or protein).
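The text above does not say how the FDR is estimated. One widely used approach is the target-decoy strategy, in which the number of decoy hits above a score threshold estimates the number of false target hits. The sketch below assumes that approach and a score where higher is better; the function name and the data are invented for illustration.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate FDR as decoys / targets among PSMs scoring >= threshold.

    `psms` is a list of (score, is_decoy) pairs. This is a simplified
    target-decoy estimate, not the output of any particular search engine.
    """
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


# Invented peptide-spectrum matches: (score, is_decoy).
psms = [(95, False), (90, False), (88, False),
        (80, True), (75, False), (70, True)]

for cutoff in (85, 75, 70):
    print(cutoff, fdr_at_threshold(psms, cutoff))
```

Whatever estimator your software uses, name it in the methods alongside the threshold itself.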

The HUPO Proteomics Standards Initiative developed the MIAPE (Minimum Information About a Proteomics Experiment) guidelines as a consensus framework for what must be reported. For quantitative proteomics specifically, the MIAPE Quant guidelines describe the minimum metadata needed for a quantitative dataset to be critically evaluated or a data analysis pipeline to be reproduced. These guidelines were developed with input from a broad range of stakeholders and are intended to reflect a consensus view of the field.

Output files should conform to community data standards. The mzIdentML format, released by the Proteomics Standards Initiative, is the standard exchange format for identification results. For quantitative data, the corresponding format is mzQuantML. Even if your search engine produces native output files, converting to these standard formats ensures interoperability and long-term accessibility.

Identification Levels in Metabolomics

Metabolomics has its own reporting framework, established by the Metabolomics Standards Initiative (MSI), which defines four levels of confidence for compound identification. Getting this right is critical because many published metabolomics studies overstate their identification confidence.

  • Level 1: Identified compounds. These require a minimum of two independent and orthogonal measurements matched against an authentic reference standard analyzed under identical experimental conditions. For example, retention time and mass spectrum must both match, or accurate mass and tandem MS data must both match. Literature values from other laboratories are not sufficient for Level 1.
  • Level 2: Putatively annotated compounds. These are annotated without chemical reference standards, based on spectral similarity with public or commercial spectral libraries. If you rely on external laboratory data or literature values rather than your own reference standards, your identifications fall here.
  • Level 3: Putatively characterized compound classes. These are assigned based on characteristic physicochemical properties of a chemical class or spectral similarity to known compounds within that class.
  • Level 4: Unknown compounds. These are unidentified but can still be differentiated and quantified based on their spectral features.

You should clearly state the identification level for every metabolite in your results. A single m/z value alone is insufficient to claim a confident identification. If you used spectral matching, describe or make available the reference spectra. If the spectra come from commercial libraries like NIST or Wiley, name the library and version. If you choose not to provide experimental evidence supporting your identifications, report them as putative.
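The four levels can be read as a simple decision sequence. The sketch below encodes that sequence in a deliberately simplified form: the function name and the boolean inputs are assumptions made for illustration, and real assignments need the judgment described above.

```python
def msi_level(matched_authentic_standard: bool,
              orthogonal_matches: int,
              library_spectral_match: bool,
              class_level_evidence: bool) -> int:
    """Return a simplified MSI identification level (1 to 4)."""
    # Level 1: at least two orthogonal properties matched against an
    # authentic standard run under identical conditions.
    if matched_authentic_standard and orthogonal_matches >= 2:
        return 1
    # Level 2: no in-house standard, but a spectral library match.
    if library_spectral_match:
        return 2
    # Level 3: evidence supports only a compound class.
    if class_level_evidence:
        return 3
    # Level 4: unknown, but still a distinguishable, quantifiable feature.
    return 4


# Invented cases for illustration.
print(msi_level(True, 2, True, True))     # standard, RT plus MS/MS match
print(msi_level(False, 0, True, False))   # library match only
print(msi_level(False, 0, False, False))  # unidentified feature
```

Recording the inputs next to the resulting level in a supplementary table lets readers see why each metabolite received its level.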

Fragmentation and Structural Evidence

Fragmentation patterns combined with accurate mass provide some of the strongest evidence for compound identification in mass spectrometry. When reporting tandem MS (MS/MS) data, include the precursor ion selected, the collision energy, and the key product ions that support your structural assignment. For novel or unexpected identifications, show annotated fragmentation spectra either in the main text or as supplementary figures.

If you’re characterizing a small molecule, report the observed m/z, the calculated exact mass, the mass error in ppm, the molecular formula, and the adduct form. For each compound, explain which fragments correspond to which structural features. This level of detail lets reviewers and readers independently assess whether your assignment is reasonable.
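The calculated exact mass comes directly from the molecular formula. The following sketch sums monoisotopic masses for a simple formula string; it assumes formulas without parentheses or charges, covers only the six elements listed, and uses a hypothetical function name.

```python
import re

# Monoisotopic masses in Da of the most abundant isotopes (rounded values).
MONOISOTOPIC = {
    "C": 12.0000000,
    "H": 1.0078250,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.9737616,
    "S": 31.9720710,
}


def monoisotopic_mass(formula: str) -> float:
    """Neutral monoisotopic mass of a simple formula such as 'C8H10N4O2'.

    Raises KeyError for elements missing from the MONOISOTOPIC table.
    """
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


caffeine = monoisotopic_mass("C8H10N4O2")
print(f"neutral M: {caffeine:.4f}")
print(f"[M+H]+:    {caffeine + 1.00728:.4f}")  # add a proton for the ion
```

The neutral mass and the ion m/z differ by the adduct shift, which is why the adduct form has to be reported alongside the formula.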

Figures and Spectra

Mass spectra and chromatograms should have clearly labeled axes. For spectra, put m/z on the x-axis and relative abundance (or intensity) on the y-axis; for chromatograms, put retention time on the x-axis and intensity on the y-axis. Annotate key peaks directly on the figure with their m/z values or compound assignments. If you’re showing extracted ion chromatograms, state the m/z window used for extraction.
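The extraction window determines which signals contribute to an extracted ion chromatogram, so it is worth computing explicitly. This sketch assumes a symmetric window expressed in ppm and a made-up in-memory scan structure; it is not the data model of any vendor or library.

```python
def extracted_ion_chromatogram(scans, target_mz, tol_ppm):
    """Sum intensities within target_mz +/- tol_ppm for every scan.

    `scans` is a list of (retention_time, peaks) pairs, where `peaks`
    is a list of (mz, intensity) pairs. Returns the window and the trace.
    """
    half_width = target_mz * tol_ppm / 1e6
    low, high = target_mz - half_width, target_mz + half_width
    trace = []
    for retention_time, peaks in scans:
        intensity = sum(i for mz, i in peaks if low <= mz <= high)
        trace.append((retention_time, intensity))
    return (low, high), trace


# Invented scans for illustration.
scans = [
    (1.0, [(181.0705, 100.0), (203.0526, 50.0)]),
    (1.1, [(181.0709, 400.0), (181.0800, 30.0)]),
    (1.2, [(203.0526, 20.0)]),
]
window, trace = extracted_ion_chromatogram(scans, target_mz=181.0707, tol_ppm=10)
print(f"window: {window[0]:.4f} to {window[1]:.4f}")
print(trace)
```

The printed window is the value to quote in the figure legend.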

For tandem MS spectra used to support structural assignments, label the fragment ions with their proposed structures or standard nomenclature (b/y ions for peptides, for example). Figures should be interpretable on their own without requiring the reader to cross-reference a separate table to understand what each peak represents.
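For peptides, the b and y labels can be checked against theoretical fragment masses. The sketch below assumes unmodified peptides and singly charged fragments, with residue masses rounded to five decimals; the function name is hypothetical.

```python
# Monoisotopic residue masses in Da (amino acid minus water), rounded.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728
WATER = 18.01056


def by_ions(peptide: str):
    """Singly charged b and y ion m/z values for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    # b ions hold the N-terminal residues; y ions hold the C-terminal
    # residues plus one water.
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, n)]
    return b, y


b_ions, y_ions = by_ions("PEPTIDE")
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} {b:9.4f}   y{i} {y:9.4f}")
```

Observed peaks that fall within your stated fragment tolerance of these values are the ones to label on the spectrum.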

Statistical Reporting for Quantitative Data

Quantitative MS experiments need the same statistical rigor as any other analytical measurement. Report the number of biological and technical replicates, how you handled missing values, your normalization method, and which statistical tests you applied. For differential abundance analyses, state your significance thresholds and whether you corrected for multiple testing.
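The text does not name a specific correction. One common choice for multiple testing is the Benjamini-Hochberg procedure, which controls the false discovery rate across all tested features. The sketch below assumes that procedure, with a hypothetical function name and invented p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay
    # monotone and never exceed 1.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented p-values for four features.
raw = [0.01, 0.04, 0.03, 0.005]
print([round(q, 4) for q in benjamini_hochberg(raw)])
```

Report both the method and the adjusted threshold, since a raw p-value cutoff and an adjusted one are not comparable.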

Calibration curves for targeted quantitation should include the concentration range, number of calibration points, linearity (R² value), limits of detection, and limits of quantitation. If you used internal standards, identify them and explain how they were applied. For label-free or labeled quantitative proteomics, describe the software, its version, and the specific parameters used for quantification.
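The calibration figures of merit can be derived from an ordinary least-squares fit. The sketch below assumes an unweighted linear model and one common convention for the limits (3.3 and 10 times the residual standard deviation divided by the slope); other conventions exist, so state which one you used. The function name and the data are invented.

```python
import math


def calibration_summary(concentrations, responses):
    """Unweighted linear fit with R squared, LOD, and LOQ estimates."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(concentrations, responses))
    ss_tot = sum((y - mean_y) ** 2 for y in responses)
    # Residual standard deviation with n - 2 degrees of freedom.
    residual_sd = math.sqrt(ss_res / (n - 2))
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - ss_res / ss_tot,
        "lod": 3.3 * residual_sd / slope,
        "loq": 10 * residual_sd / slope,
    }


# Invented five-point calibration (concentration, detector response).
summary = calibration_summary([1, 2, 5, 10, 20],
                              [10.2, 19.8, 50.5, 99.1, 201.0])
for key, value in summary.items():
    print(f"{key}: {value:.4f}")
```

If you used weighted regression or a different limit definition, the same table of outputs still applies, but the formulas need to be described.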

Depositing Raw Data

Many journals now require you to deposit raw mass spectrometry files and associated metadata in a public repository by the time you submit your manuscript. For proteomics, the preferred destination is ProteomeXchange, typically through the PRIDE Archive at the European Bioinformatics Institute. Both raw instrument output files and metadata describing the experimental context are mandatory for submission.

The Journal of Proteome Research, for instance, requires authors to provide the repository link and reviewer credentials directly in the manuscript, formatted with the dataset identifier and a temporary login. Hosting data on your own website or institutional server is not acceptable. A typical deposition statement looks like: “The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXDxxxx.” For metabolomics, MetaboLights and the Metabolomics Workbench serve a similar role.

Depositing your data is not just a checkbox requirement. It makes your work verifiable, allows reanalysis with new tools, and increasingly factors into how reviewers and readers assess the credibility of your findings. Prepare your submission early in the writing process, as formatting metadata and uploading large raw files can take longer than expected.