How to Present Flow Cytometry Data for Publication

Presenting flow cytometry data effectively comes down to a few core decisions: choosing the right plot type, showing your gating strategy clearly, picking appropriate summary statistics, and scaling your axes so populations are visible. Whether you’re assembling a figure for a journal manuscript or a conference poster, the principles are the same.

Show Your Gating Strategy Sequentially

The gating strategy figure is the backbone of any flow cytometry presentation. Gating is the sequential process of narrowing down to your cell population of interest using a series of markers, and your figure should walk the reader through each step in order. Start with the broadest gate (typically a scatter plot to exclude debris), then show each subsequent refinement as its own panel arranged left to right or top to bottom. Connect panels with arrows to indicate the parent-child relationship between gates.

Each panel should display the percentage of events relative to its parent gate, not the total events. This lets readers evaluate how selective each step is. For example, a typical immunophenotyping figure might start with a forward scatter vs. side scatter plot to isolate lymphocytes, then gate on a lineage marker like CD3 to identify T cells, and finally separate CD4 and CD8 populations. Label every axis with the marker name, and include the fluorochrome if you’re writing for a technical audience that needs to evaluate spillover or compensation choices.
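The percent-of-parent arithmetic can be sketched in a few lines. This is a minimal illustration, not any analysis package's API: the Gate class, the gate names, and the event counts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gate:
    name: str
    count: int              # events falling inside this gate
    parent: Optional[str]   # name of the parent gate; None for the root


def percent_of_parent(gates):
    """Return {gate name: percentage of its parent gate}; the root is 100%."""
    counts = {g.name: g.count for g in gates}
    result = {}
    for g in gates:
        if g.parent is None:
            result[g.name] = 100.0
        else:
            result[g.name] = 100.0 * g.count / counts[g.parent]
    return result


# Hypothetical event counts, for illustration only.
hierarchy = [
    Gate("All events", 100_000, None),
    Gate("Lymphocytes", 40_000, "All events"),
    Gate("CD3+", 28_000, "Lymphocytes"),
    Gate("CD4+", 17_500, "CD3+"),
    Gate("CD8+", 9_100, "CD3+"),
]

pct = percent_of_parent(hierarchy)
for name, value in pct.items():
    print(f"{name}: {value:.1f}% of parent")
```

With these made-up counts, CD4+ cells are 62.5% of CD3+ T cells but only 17.5% of all events, which is why the reference population has to be stated alongside each number.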

A common mistake is showing only the final gate without the upstream hierarchy. Reviewers and readers need to see the full path to assess whether your population is clean. If your gating involved excluding dead cells with a viability dye or dumping out unwanted lineages, show those steps explicitly.

Choosing the Right Plot Type

Dot plots, contour plots, and pseudocolor (density) plots each serve different purposes, and the choice matters more than most people realize.

Dot plots work well for small event counts (under roughly 5,000 events per panel) where individual cells are distinguishable. At higher event counts, dots pile on top of each other and obscure dense regions.
Pseudocolor or density plots solve the overplotting problem by coloring each dot based on local event density. These are the best default choice for most gating figures because they let readers see where the bulk of events fall.
Contour plots draw lines around regions of equal density, similar to a topographic map. They’re clean and visually appealing, but they can hide small outlier populations that sit outside the lowest contour line. Adding a 5% outlier display (showing the sparse events as individual dots) fixes this.
Histograms are the standard for single-parameter comparisons, such as overlaying a stained sample against an unstained control. When overlaying multiple histograms, use distinct colors and consider offsetting them slightly so peaks don’t obscure each other.
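To make the idea behind pseudocolor plots concrete, here is a rough sketch of density coloring: each event is assigned the number of events that share its two-dimensional bin, and that number would then drive the dot color. The function name and bin width are assumptions for illustration, and analysis packages generally use smoother density estimates than this simple binning.

```python
from collections import Counter


def density_per_event(xs, ys, bin_width):
    """Crude local density: how many events share each event's 2D bin."""
    bins = [(int(x // bin_width), int(y // bin_width)) for x, y in zip(xs, ys)]
    counts = Counter(bins)
    return [counts[b] for b in bins]


# Hypothetical events: three close together and one isolated.
xs = [0.1, 0.2, 0.3, 5.5]
ys = [0.1, 0.2, 0.3, 5.5]

density = density_per_event(xs, ys, bin_width=1.0)
print(density)  # the three clustered events score 3, the isolated one scores 1
```

Mapping these values onto a continuous color scale is what lets a reader see where the bulk of events sit even when the dots overlap.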

Use the Right Axis Transformation

How you scale your axes has a dramatic effect on how your data looks. Traditional logarithmic scales work fine for brightly stained populations, but they distort or compress events near zero. Unstained or dimly stained cells often produce values at or below zero (due to compensation), and a standard log scale can’t display negative numbers. This pushes those events against the axis, creating an artificial pile-up that obscures the true distribution.

Biexponential transformations (of which “logicle” is the most widely used variant) and hyperbolic arcsine transformations fix this by behaving linearly near zero and logarithmically at higher intensities, with a smooth transition between the two. This gives a more accurate visual representation of dim and negative populations. Most modern analysis software offers biexponential scaling as a default, and it’s now the expected standard for published figures. If your software still defaults to log scaling, switch to biexponential for any channel where you see data compressed against an axis.
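The difference is easy to see numerically. The sketch below compares a plain log10 scale with a hyperbolic arcsine scale on a handful of hypothetical compensated intensities. The cofactor of 150 is an assumption (a value often quoted for fluorescence data), not a recommendation from this article, and the function names are illustrative.

```python
import math


def arcsinh_transform(value, cofactor=150.0):
    """Hyperbolic arcsine scaling: roughly linear near zero,
    roughly logarithmic once |value| is much larger than the cofactor."""
    return math.asinh(value / cofactor)


def log10_transform(value):
    """Plain log scaling; returns None where the log is undefined (value <= 0)."""
    return math.log10(value) if value > 0 else None


# Hypothetical compensated intensities, including negative values.
intensities = [-200.0, -20.0, 0.0, 20.0, 200.0, 20_000.0]

for x in intensities:
    print(x, log10_transform(x), round(arcsinh_transform(x), 3))
```

The log column has no value for the first three entries, which is the pile-up against the axis described above, while the arcsinh column places them symmetrically around zero.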

Reporting Fluorescence Intensity

When you need to quantify how bright a population is, you’ll report a fluorescence intensity value. The abbreviation “MFI” gets used loosely in the literature, but the specific type of average you choose changes your results significantly.

The median fluorescence intensity is the preferred measure for data displayed on a logarithmic scale. Flow cytometry data typically follows a log-normal distribution, meaning it’s skewed to the right. An arithmetic mean exaggerates that skew, pulling the average toward the bright tail and misrepresenting where most cells actually sit. The geometric mean is the second-best option, as it accounts for log-normal behavior, but the median remains the most robust choice because it’s less sensitive to outliers and off-scale events.
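A quick simulation shows how far the three averages can drift apart on right-skewed data. This is a sketch on synthetic log-normal values (the distribution parameters and seed are arbitrary assumptions), not real cytometry data.

```python
import random
import statistics

random.seed(0)

# Synthetic right-skewed intensities: log-normal with a true median of e^6 (about 403).
events = [random.lognormvariate(6.0, 1.0) for _ in range(10_000)]

arithmetic_mean = statistics.fmean(events)
geometric_mean = statistics.geometric_mean(events)
median = statistics.median(events)

print(f"arithmetic mean: {arithmetic_mean:.0f}")
print(f"geometric mean:  {geometric_mean:.0f}")
print(f"median:          {median:.0f}")
```

On data like this, the geometric mean and median land close together near the center of the population, while the arithmetic mean is pulled well above both by the bright tail.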

There’s one important exception: if your population is bimodal (two distinct peaks), no single average is meaningful. Any single summary value assumes the population has one central tendency, and a bimodal distribution has two. In that case, gate each subpopulation separately and report the percentage of cells in each peak. This gives readers far more useful information than a single misleading number sitting between two peaks.
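The bimodal case can be illustrated the same way. In this sketch, two synthetic log-normal peaks (centered near e^4, about 55, and e^8, about 2981) are mixed in equal parts. The threshold of 400 stands in for a gate boundary and is a hypothetical value chosen to fall between the peaks.

```python
import random
import statistics

random.seed(1)

# Two synthetic subpopulations of equal size.
dim = [random.lognormvariate(4.0, 0.3) for _ in range(5_000)]
bright = [random.lognormvariate(8.0, 0.3) for _ in range(5_000)]
events = dim + bright

threshold = 400.0  # hypothetical gate boundary between the two peaks

overall_median = statistics.median(events)
pct_dim = 100.0 * sum(x < threshold for x in events) / len(events)
pct_bright = 100.0 - pct_dim
median_dim = statistics.median(x for x in events if x < threshold)
median_bright = statistics.median(x for x in events if x >= threshold)

print(f"overall median: {overall_median:.0f}")
print(f"dim peak:    {pct_dim:.1f}% of events, median {median_dim:.0f}")
print(f"bright peak: {pct_bright:.1f}% of events, median {median_bright:.0f}")
```

The overall median falls in the gap between the peaks, where almost no cells actually sit, whereas the per-peak percentages and medians describe both subpopulations.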

Always specify which type of MFI you’re reporting. “MFI” alone is ambiguous. Write “median fluorescence intensity” on first use.

Summary Statistics and Bar Graphs

For comparing populations across conditions or time points, the standard approach is a bar graph or scatter plot showing the percentage of positive cells or the median fluorescence intensity, with individual data points overlaid. Show every replicate as a dot rather than hiding them behind a bar. Error bars should represent standard deviation or standard error of the mean, clearly labeled. If you’re reporting statistical comparisons, place significance brackets between the groups being compared and label them with p-values or significance symbols defined in the legend.
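Because standard deviation and standard error give very different bar lengths for the same data, it helps to compute both explicitly before deciding which to show. The following is a minimal sketch; the summarize function and the replicate values are hypothetical.

```python
import math
import statistics


def summarize(replicates):
    """Return n, mean, sample standard deviation, and standard error of the mean."""
    n = len(replicates)
    mean = statistics.fmean(replicates)
    sd = statistics.stdev(replicates)   # sample SD (n - 1 in the denominator)
    sem = sd / math.sqrt(n)             # SEM shrinks as replicates are added
    return {"n": n, "mean": mean, "sd": sd, "sem": sem}


# Hypothetical percent-positive values from five replicates.
pct_positive = [18.0, 22.0, 20.0, 24.0, 16.0]

stats = summarize(pct_positive)
print(f"n = {stats['n']}, mean = {stats['mean']:.1f}, "
      f"SD = {stats['sd']:.2f}, SEM = {stats['sem']:.2f}")
```

Here the SD is more than twice the SEM, so an unlabeled error bar would leave readers guessing about the spread of the replicates.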

When presenting percentages, clarify the denominator. “20% CD4+ cells” is meaningless without knowing 20% of what: total live cells, total lymphocytes, or total CD3+ T cells. State the parent population in every figure legend and axis label.

High-Dimensional Visualization With UMAP and t-SNE

For panels with many markers, dimensionality reduction plots like UMAP and t-SNE compress the full marker space into a two-dimensional map where similar cells land near each other. These have become a standard complement to traditional gating, especially for exploratory analysis or for visualizing complex phenotypes that are hard to capture in bivariate plots.

UMAP is often preferred over t-SNE because it tends to preserve more of the global structure of the data, so the relative placement of clusters is somewhat more informative. When presenting a UMAP figure, include several versions of the same plot: one colored by cluster identity (from an algorithm like FlowSOM), and additional panels colored by individual marker expression using a continuous color scale. This lets readers see which markers define each cluster. Heatmaps showing median expression of each marker per cluster are a useful companion figure that provides the quantitative detail the UMAP alone can’t convey.
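The table behind such a heatmap is just a per-cluster median for each marker. The sketch below assumes events are already transformed and labeled by a clustering step. The function name, marker values, and cluster labels are hypothetical, and no clustering algorithm is run here.

```python
import statistics
from collections import defaultdict


def median_expression_by_cluster(events, labels, markers):
    """Return {cluster label: {marker: median expression}}."""
    grouped = defaultdict(list)
    for row, label in zip(events, labels):
        grouped[label].append(row)
    return {
        label: {m: statistics.median(r[m] for r in rows) for m in markers}
        for label, rows in grouped.items()
    }


# Hypothetical transformed expression values and cluster assignments.
events = [
    {"CD3": 5.0, "CD19": 0.2},
    {"CD3": 5.4, "CD19": 0.1},
    {"CD3": 4.8, "CD19": 0.3},
    {"CD3": 0.2, "CD19": 4.9},
    {"CD3": 0.1, "CD19": 5.1},
]
labels = [0, 0, 0, 1, 1]

table = median_expression_by_cluster(events, labels, ["CD3", "CD19"])
for cluster, medians in table.items():
    print(cluster, medians)
```

Each row of the resulting table becomes one row of the heatmap, with cluster 0 reading as CD3-high and cluster 1 as CD19-high in this toy example.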

Label your axes as “UMAP1” and “UMAP2” (or “tSNE1” and “tSNE2”), but note in the legend that the axis values are arbitrary and distances are not directly interpretable as biological differences. Report the key parameters you used to generate the plot, including the number of neighbors and the minimum distance for UMAP, or the perplexity for t-SNE, since these settings change the appearance of the output substantially.

Color Choices and Accessibility

About 8% of men have some form of color vision deficiency, so your color choices directly affect whether a significant portion of your audience can read your figures. The most common form makes red and green hard to distinguish, which is unfortunate because red-green combinations are widespread in flow cytometry overlays.

Replace red and green with blue and orange, or blue and red, which remain distinguishable to nearly all viewers. For continuous color maps on density plots or UMAP expression overlays, use a perceptually uniform palette like Viridis, which transitions from purple through teal to yellow and has been tested for color-blind accessibility. If you must use red and green (for example, to match a well-known convention), differentiate them by brightness so they’re distinguishable even in grayscale.
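One way to check whether two colors differ in brightness is to compare their relative luminance. The sketch below uses the standard sRGB relative luminance formula (the same one used in web accessibility contrast checks). The color values are arbitrary examples, and the function names are illustrative.

```python
def srgb_to_linear(channel):
    """Convert one 0-255 sRGB channel to linear light."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an sRGB color: 0.0 for black, 1.0 for white."""
    r, g, b = (srgb_to_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


# Arbitrary example colors.
colors = {
    "pure red": (255, 0, 0),
    "pure green": (0, 255, 0),
    "dark green": (0, 128, 0),
}

for name, rgb in colors.items():
    print(f"{name}: {relative_luminance(rgb):.3f}")
```

Pure red and pure green already differ substantially in luminance, but pure red and a darker green come out much closer, so a pairing like that would be harder to separate in grayscale.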

What to Include in Your Figure Legend

A complete figure legend for flow cytometry data should cover the gating hierarchy, the markers and fluorochromes used, the transformation applied to each axis, the type of plot (contour with 5% outlier display, pseudocolor, etc.), and what the percentages represent. For summary graphs, state the number of replicates, the error bar type, and the statistical test used.

The MIFlowCyt standard, adopted by the International Society for Advancement of Cytometry (ISAC), provides a formal checklist organized into four sections: experiment overview (including hypothesis and quality control measures), sample description (specimen source, processing, and reagents), instrument details (laser configuration, detectors, fluidics), and data analysis (compensation method, gating definitions, and summary statistics). Many journals now expect manuscripts to address these categories, either in the methods section or supplementary materials. Even if your target journal doesn’t explicitly require MIFlowCyt compliance, following it will preempt most reviewer questions.

Sharing Raw Data

Several major journals, including those from Nature Publishing Group and PLOS as well as Cytometry Part A, now recommend or require authors to deposit their underlying .fcs files in FlowRepository, a public database supported by ISAC. Depositing raw data lets other researchers verify your gating, reanalyze with different tools, or use your data as a reference. When you deposit, you’ll receive a repository ID to include in your manuscript. For clinical data, de-identification is required before upload, and FlowRepository provides guidance on stripping patient identifiers while preserving analytical value.

Even when not required, linking your figures to deposited raw data strengthens your paper’s credibility and is increasingly a baseline expectation from reviewers.