How to Make a Graph Comparing Two Sets of Data

The best graph for comparing two sets of data depends on what kind of data you have and what question you’re trying to answer. A bar chart works for comparing categories, a line chart works for comparing trends over time, and a scatter plot works for seeing whether two numeric variables are related. Once you pick the right type, building it is straightforward in most charting tools.

Pick the Right Chart Type First

Before you open any software, ask yourself two questions: what type of data do you have, and what are you trying to show? The answers point you to a specific chart type.

Bar chart: Use this when you’re comparing categories. For example, sales figures for Product A versus Product B across different months. Bar charts are easy to read because your eye naturally compares the length of each bar. If you have two data sets with the same categories, a grouped bar chart places bars side by side so you can compare values directly. A stacked bar chart layers them on top of each other, which is better when you care about the combined total and how each data set contributes to it.

Line chart: Use this when both data sets share the same time axis. Plotting two lines on the same chart lets you compare how each set rises, falls, or stays flat over the same period. This is the go-to choice for things like comparing monthly revenue between two years or tracking two stock prices.

Scatter plot: Use this when you want to see if two numeric variables are related. Each point on the plot represents a pair of values, one from each data set. If the points cluster along a line sloping upward, the two variables tend to increase together (a positive correlation). If they slope downward, one tends to decrease as the other increases. A correlation value of 1 means the points fall on a perfect upward line; a value of negative 1 means a perfect downward line.
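As a rough illustration of the scatter plot idea, here is a minimal Matplotlib sketch. The data and variable names (hours_studied, exam_score) are made up for this example, and the correlation is computed with NumPy’s corrcoef, which returns the Pearson correlation coefficient.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up paired measurements, for illustration only.
hours_studied = np.array([1, 2, 3, 4, 5, 6, 7, 8])
exam_score = np.array([52, 55, 61, 64, 70, 74, 79, 85])

# Pearson correlation: +1 is a perfect upward line, -1 a perfect downward line.
r = np.corrcoef(hours_studied, exam_score)[0, 1]

fig, ax = plt.subplots()
ax.scatter(hours_studied, exam_score)  # one point per (x, y) pair
ax.set_xlabel("Hours studied")
ax.set_ylabel("Exam score")
ax.set_title(f"Exam score vs. hours studied (r = {r:.2f})")
plt.show()
```

Because these sample points rise almost in a straight line, the computed value should land close to 1.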

Box plot: Use this when you want to compare the spread and center of two data sets rather than individual values. Each box shows the median (the middle value) and the range covered by the middle 50% of the data, while the “whiskers” show how far the rest of the data extends. Many tools draw extreme outliers as separate points beyond the whiskers. Placing two box plots side by side lets you quickly see which data set has a higher center, which is more spread out, and whether either is skewed.
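A side-by-side box plot can be sketched in Matplotlib as follows. The two groups are invented sample values, chosen so that one group is visibly more spread out than the other.

```python
import matplotlib.pyplot as plt

# Invented sample values: Group B is deliberately more spread out than Group A.
group_a = [12, 15, 14, 10, 18, 20, 16, 13, 15, 17]
group_b = [22, 9, 30, 14, 25, 8, 27, 19, 11, 35]

fig, ax = plt.subplots()
# Passing a list of two data sets draws two boxes at x positions 1 and 2.
result = ax.boxplot([group_a, group_b])
ax.set_xticks([1, 2])
ax.set_xticklabels(["Group A", "Group B"])
ax.set_ylabel("Value")
ax.set_title("Spread and center: Group A vs. Group B")
plt.show()
```

Note that by default Matplotlib extends each whisker to the farthest data point within 1.5 times the box height and draws anything beyond that as an individual point.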

How to Build a Comparison Chart in Excel

Excel is one of the most common tools for this, and the process is simple. Start by organizing your data so that both sets share the same column of labels. For example, put your months in column A, your first data set’s values in column B, and your second data set’s values in column C. Give each column a header.

Select all three columns (labels and both value columns), then go to the Insert tab and choose your chart type. Excel will automatically create a chart with two data series, color-coded and labeled in a legend.

If you already have a chart showing one data set and want to add a second, type the new data in cells directly next to your existing source data. Click anywhere on the chart, then drag the blue sizing handles on the worksheet to include the new column. Excel will pull in the second series automatically. If your chart lives on a separate chart sheet, right-click the chart, choose “Select Data,” then click and drag across all the data you want, including the new series. The new data set will appear under Legend Entries in the dialog box.

How to Build a Comparison Chart in Python

If you’re working in Python with Matplotlib, plotting two data sets on the same axes takes just a few lines. Call plt.plot() twice before calling plt.show(), once for each data set, and assign each a different color or label.
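A minimal sketch of that two-call pattern might look like this. The revenue figures and month labels are placeholders, not real data.

```python
import matplotlib.pyplot as plt

# Placeholder monthly revenue for two years (same units, same months).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue_2023 = [20, 22, 25, 24, 28, 30]
revenue_2024 = [23, 24, 24, 29, 31, 35]

# One plt.plot() call per data set, each with its own color and label.
plt.plot(months, revenue_2023, color="tab:blue", label="2023")
plt.plot(months, revenue_2024, color="tab:orange", label="2024")

plt.xlabel("Month")
plt.ylabel("Revenue (thousands)")
plt.legend()  # uses the label= text given above
plt.show()
```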

When both data sets use the same units, this is all you need. When they use different units or vastly different scales, you can create a second y-axis with the twinx() method. This generates a second vertical axis on the right side of the chart that shares the same x-axis. You plot one data set against the left axis and the other against the right. Color-code each line to match its corresponding axis label so readers can tell which scale belongs to which data set. Here’s the core pattern:

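import matplotlib.pyplot as plt

# Placeholder data for illustration; substitute your own values.
x_values = [1, 2, 3, 4, 5]
data1 = [10, 12, 15, 13, 17]       # small-scale series (left axis)
data2 = [200, 340, 280, 410, 390]  # much larger scale (right axis)

fig, ax1 = plt.subplots()
ax1.set_ylabel('Data Set 1', color='tab:red')
ax1.plot(x_values, data1, color='tab:red')

# Second y-axis on the right side that shares the same x-axis
ax2 = ax1.twinx()
ax2.set_ylabel('Data Set 2', color='tab:blue')
ax2.plot(x_values, data2, color='tab:blue')

plt.show()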

Keep the Axes Consistent

The single most important rule when comparing two data sets visually: keep your axes the same. If you place two charts side by side where one y-axis runs from 0 to 100 and the other from 0 to 1,000, your reader will misjudge the comparison. A change of 10 in the first chart looks exactly as tall as a change of 100 in the second. Always fix both axes to the same range when the data uses the same units.
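In Matplotlib, one way to enforce a common range is to create the side-by-side charts with sharey=True. The sketch below assumes two invented stores measured in the same units. The names and numbers are placeholders.

```python
import matplotlib.pyplot as plt

# Invented quarterly figures for two stores, in the same units.
quarters = ["Q1", "Q2", "Q3", "Q4"]
store_a = [40, 55, 60, 72]
store_b = [20, 25, 31, 38]

# sharey=True ties the two y-axes together, so both panels use one range.
fig, (ax_left, ax_right) = plt.subplots(1, 2, sharey=True, figsize=(8, 3))
ax_left.bar(quarters, store_a, color="tab:blue")
ax_left.set_title("Store A")
ax_left.set_ylabel("Units sold")
ax_right.bar(quarters, store_b, color="tab:orange")
ax_right.set_title("Store B")

# Setting the limits on one panel applies to the other as well.
ax_left.set_ylim(0, 100)
plt.show()
```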

If your two data sets have genuinely different scales (say, temperature in degrees versus rainfall in millimeters), you might be tempted to use a dual-axis chart. Be cautious here. The UK’s Office for National Statistics has flagged dual-axis charts as a common source of confusion, noting they can make unrelated trends look connected or hide real differences. A better approach in many cases is to split the data into two separate charts stacked vertically, aligned along the same x-axis. This lets the reader see timing relationships without being misled by scale differences.
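The stacked-panel alternative can be sketched in Matplotlib with sharex=True. The temperature and rainfall values below are invented, and months are given as plain numbers to keep the example short.

```python
import matplotlib.pyplot as plt

# Invented monthly readings with different units and scales.
months = [1, 2, 3, 4, 5, 6]
temperature_c = [4, 6, 10, 14, 18, 21]
rainfall_mm = [80, 60, 55, 40, 45, 30]

# Two panels stacked vertically; sharex=True keeps the x-axes aligned.
fig, (ax_temp, ax_rain) = plt.subplots(2, 1, sharex=True, figsize=(6, 5))

ax_temp.plot(months, temperature_c, color="tab:red")
ax_temp.set_ylabel("Temperature (°C)")

ax_rain.bar(months, rainfall_mm, color="tab:blue")
ax_rain.set_ylabel("Rainfall (mm)")
ax_rain.set_xlabel("Month")
ax_rain.set_xticks(months)

plt.show()
```

Each panel keeps its own y-axis and units, so the reader compares timing along the shared x-axis instead of comparing line heights across unrelated scales.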

Make the Two Data Sets Easy to Tell Apart

Color is your primary tool for distinguishing two data sets, but it needs to be chosen deliberately. Pick two colors that contrast strongly with each other rather than, say, light blue and a slightly darker blue. Red and blue, orange and navy, or teal and coral all work well. Avoid red and green together, since roughly 8% of men have red-green color blindness.

Include a legend that clearly labels which color represents which data set, and position it where it won’t overlap with the data, typically above or to the right of the chart. If your chart has only two series, labeling the lines or bars directly (placing the name right next to the data) is often cleaner and can replace the separate legend box.

Title your chart with the comparison you’re making, not just a description of the data. “Sales: Online vs. In-Store, 2020–2024” tells the reader what to look for. “Sales Data” does not.
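Putting the color, labeling, and title advice together, a Matplotlib sketch might look like the following. The sales numbers are invented, and the label positions (a small offset past the last year) are an arbitrary choice for this example.

```python
import matplotlib.pyplot as plt

# Invented yearly sales figures for two channels.
years = [2020, 2021, 2022, 2023, 2024]
online = [120, 150, 185, 210, 240]
in_store = [200, 180, 175, 170, 160]

fig, ax = plt.subplots()
# Two strongly contrasting colors (orange and navy).
ax.plot(years, online, color="tab:orange")
ax.plot(years, in_store, color="navy")

# Direct labels at the right-hand end of each line, colored to match.
ax.text(years[-1] + 0.1, online[-1], "Online", color="tab:orange", va="center")
ax.text(years[-1] + 0.1, in_store[-1], "In-Store", color="navy", va="center")

ax.set_xticks(years)
ax.set_xlim(years[0], years[-1] + 1)  # leave room on the right for the labels
ax.set_ylabel("Sales (thousands)")
# The title names the comparison, not just the data.
ax.set_title("Sales: Online vs. In-Store, 2020–2024")
plt.show()
```

If you would rather use a legend, pass label="Online" and label="In-Store" to the two plot calls and call ax.legend() instead of adding the text labels.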

Grouped vs. Stacked Bar Charts

This choice trips people up, so it’s worth a closer look. A grouped bar chart places the bars for each data set side by side within every category. This makes it easy to compare the two values at each point. Use it when the individual values matter most, like comparing energy consumption between two sectors each quarter.

A stacked bar chart layers one data set on top of the other. The total height of the combined bar shows the overall value, and the colored segments show how much each data set contributes. Use it when the total matters more than the individual comparison, like showing total monthly expenses broken down by two departments.

If you need both, the grouped layout is usually the safer default. Stacked charts make it harder to compare the upper segments because they don’t share a common baseline.
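To see the difference, the sketch below draws the same two invented data sets both ways in Matplotlib. The grouped version offsets each bar by half a bar width, and the stacked version uses the bottom parameter to start the second series where the first ends.

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented quarterly energy consumption for two sectors.
quarters = ["Q1", "Q2", "Q3", "Q4"]
sector_a = np.array([30, 45, 38, 50])
sector_b = np.array([25, 40, 42, 35])

x = np.arange(len(quarters))  # one x position per category
width = 0.4                   # bar width

fig, (ax_grouped, ax_stacked) = plt.subplots(1, 2, figsize=(9, 3.5))

# Grouped: shift each series half a bar width left or right of the category.
ax_grouped.bar(x - width / 2, sector_a, width, label="Sector A")
ax_grouped.bar(x + width / 2, sector_b, width, label="Sector B")
ax_grouped.set_title("Grouped")

# Stacked: draw Sector B on top of Sector A using bottom=.
ax_stacked.bar(x, sector_a, width, label="Sector A")
ax_stacked.bar(x, sector_b, width, bottom=sector_a, label="Sector B")
ax_stacked.set_title("Stacked")

for ax in (ax_grouped, ax_stacked):
    ax.set_xticks(x)
    ax.set_xticklabels(quarters)
    ax.legend()

plt.show()
```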

Adding a Trend Line to a Scatter Plot

When you use a scatter plot to compare two numeric variables, adding a trend line (also called a line of best fit) makes the relationship concrete. In Excel, click on any data point in your scatter plot, then right-click and select “Add Trendline.” Choose “Linear” for a straight-line fit. The line’s slope shows the direction of the relationship at a glance: an upward line means the variables rise together, and a downward line means one falls as the other rises.

A steep slope means one variable changes a lot for each unit change in the other, but steepness depends on the units you chose and does not by itself show how strong the relationship is. A nearly flat line means there’s little linear relationship between them. To judge strength, display the R-squared value on the chart, which tells you how well the line fits the data. An R-squared of 0.9 means 90% of the variation in one variable is explained by a linear fit on the other. An R-squared of 0.2 means the relationship is weak and other factors are at play.
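Outside Excel, the same trend line and R-squared can be computed by hand. The sketch below uses NumPy’s polyfit for a straight-line least-squares fit on invented data. The variable names are placeholders, and R-squared is calculated from its usual definition (1 minus the residual sum of squares over the total sum of squares).

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented paired values, for illustration only.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.3, 8.9])

# Degree-1 polynomial fit: returns the slope and intercept of the best-fit line.
slope, intercept = np.polyfit(x, y, 1)
predicted = slope * x + intercept

# R-squared: share of the variation in y accounted for by the fitted line.
ss_res = np.sum((y - predicted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

fig, ax = plt.subplots()
ax.scatter(x, y, label="Data")
ax.plot(x, predicted, color="tab:red", label=f"Linear fit (R² = {r_squared:.2f})")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
plt.show()
```

For a straight-line fit like this one, R-squared equals the square of the correlation coefficient, which is a handy cross-check.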