How to Read a Molecular Formula Step by Step

A molecular formula is a shorthand that tells you exactly which elements are in a compound and how many atoms of each. Once you learn to recognize three things (element symbols, subscripts, and parentheses), you can decode almost any formula you encounter. Here’s how each piece works.

Element Symbols and Capitalization

Every element on the periodic table has a one- or two-letter symbol. The first letter is always capitalized, and if there’s a second letter, it’s always lowercase. This capitalization rule is how you tell where one element ends and the next begins. H is hydrogen, He is helium, and they are not the same thing. In a formula like NaCl (table salt), the capital N and capital C each signal a new element: Na is sodium, Cl is chlorine.

This matters most when you’re reading formulas with back-to-back letters. Co means cobalt (one element), while CO means one carbon atom and one oxygen atom (two elements). If you miss the capitalization, you’ll misread the entire compound.
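The capitalization rule is mechanical enough to express as a pattern. The sketch below is a minimal illustration, not a validator: the function name split_symbols is made up for this example, and it only checks letter case, not whether a symbol is a real element.

```python
import re

def split_symbols(formula: str) -> list[str]:
    """Split a formula into element symbols using the capitalization rule."""
    # One capital letter, optionally followed by one lowercase letter.
    return re.findall(r"[A-Z][a-z]?", formula)

print(split_symbols("Co"))    # ['Co']       one element: cobalt
print(split_symbols("CO"))    # ['C', 'O']   two elements: carbon and oxygen
print(split_symbols("NaCl"))  # ['Na', 'Cl']
```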

What Subscripts Tell You

Subscripts are the small numbers written to the lower right of an element symbol. They tell you how many atoms of that element are present in a single molecule. In H₂O, the subscript 2 after H means there are two hydrogen atoms. The O has no subscript, which means there’s exactly one oxygen atom. When no number appears, it always means one.

A subscript applies only to the element symbol directly before it (or, as covered in the next section, to the parenthesized group directly before it). In CO₂ (carbon dioxide), the 2 belongs to the oxygen, not the carbon. So you have one carbon atom and two oxygen atoms.
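Reading subscripts can be sketched the same way: pair each symbol with the number that follows it, and default to one when nothing follows. This is a rough sketch that assumes a formula without parentheses. The helper name read_subscripts and the conversion of Unicode subscript digits to ordinary digits are choices made for this example.

```python
import re

# Map Unicode subscript digits to ordinary digits so "H₂O" and "H2O" read the same.
SUBSCRIPTS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")

def read_subscripts(formula: str) -> list[tuple[str, int]]:
    """Return (symbol, count) pairs for a formula without parentheses."""
    plain = formula.translate(SUBSCRIPTS)
    pairs = []
    for symbol, digits in re.findall(r"([A-Z][a-z]?)([0-9]*)", plain):
        # A missing subscript means exactly one atom.
        pairs.append((symbol, int(digits) if digits else 1))
    return pairs

print(read_subscripts("H₂O"))  # [('H', 2), ('O', 1)]
print(read_subscripts("CO₂"))  # [('C', 1), ('O', 2)]
```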

How Parentheses Work

Parentheses group atoms together, and a subscript outside the closing parenthesis acts as a multiplier for everything inside. This comes up frequently in ionic compounds where a group of atoms (called a polyatomic ion) repeats more than once.

Take calcium phosphate: Ca₃(PO₄)₂. The subscript 2 outside the parentheses means you multiply every atom inside by 2. So PO₄ becomes 2 phosphorus atoms and 8 oxygen atoms (4 × 2). Add the 3 calcium atoms out front, and the full count is 3 calcium, 2 phosphorus, and 8 oxygen.

Another example: Mg(NO₃)₂. Inside the parentheses you have one nitrogen and three oxygens. The outside subscript of 2 doubles both, giving you 1 magnesium, 2 nitrogen atoms, and 6 oxygen atoms. If a formula has no parentheses, you don’t need this step.
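The multiplication in both examples can be written out as a short sketch. Here the atoms inside the parentheses are entered by hand as a dictionary. The function name expand_group is hypothetical and chosen only for this illustration.

```python
def expand_group(inside: dict[str, int], multiplier: int) -> dict[str, int]:
    """Multiply every atom count inside a parenthesized group by the outside subscript."""
    return {symbol: count * multiplier for symbol, count in inside.items()}

# Ca₃(PO₄)₂: the PO₄ group is counted twice, then the 3 calcium atoms are added.
phosphate = expand_group({"P": 1, "O": 4}, 2)
print({"Ca": 3, **phosphate})  # {'Ca': 3, 'P': 2, 'O': 8}

# Mg(NO₃)₂: the NO₃ group is counted twice.
nitrate = expand_group({"N": 1, "O": 3}, 2)
print({"Mg": 1, **nitrate})    # {'Mg': 1, 'N': 2, 'O': 6}
```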

Counting Atoms Step by Step

For any formula, you can find the total number of each element’s atoms by working left to right:

1. Identify each element by looking for capital letters. Each capital letter starts a new element symbol.
2. Read the subscript directly after each symbol. No number means one.
3. Distribute through parentheses by multiplying each subscript inside by the subscript outside.
4. Add up repeated elements. If the same element appears in more than one place in the formula, sum the counts. In acetic acid written as CH₃COOH, carbon appears twice (1 + 1 = 2), hydrogen twice (3 + 1 = 4), and oxygen twice (1 + 1 = 2).

Putting the steps together for Ca(OH)₂, you get 1 calcium, 2 oxygen atoms (1 × 2), and 2 hydrogen atoms (1 × 2), for a total of 5 atoms.

Practice with glucose, C₆H₁₂O₆: six carbons, twelve hydrogens, six oxygens, for 24 atoms total.
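The counting steps above can be combined into one small routine. This is a sketch under stated assumptions rather than a complete formula parser: it handles element symbols, subscripts, and round parentheses only (no coefficients, hydrate dots, or charges), and the name count_atoms is invented for this example. The acetic acid check (CH₃COOH) illustrates summing an element that appears in more than one place.

```python
import re
from collections import Counter

SUBSCRIPTS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")
# A token is: a symbol with optional digits, "(", or ")" with optional digits.
TOKEN = re.compile(r"([A-Z][a-z]?)([0-9]*)|(\()|\)([0-9]*)")

def count_atoms(formula: str) -> dict[str, int]:
    """Count atoms of each element, working left to right."""
    plain = formula.translate(SUBSCRIPTS)
    stack = [Counter()]  # one Counter per open parenthesis level
    pos = 0
    while pos < len(plain):
        match = TOKEN.match(plain, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}: {plain[pos]!r}")
        symbol, digits, open_paren, group_digits = match.groups()
        if symbol:
            # Steps 1, 2, and 4: a capitalized symbol, its subscript, summed if repeated.
            stack[-1][symbol] += int(digits) if digits else 1
        elif open_paren:
            stack.append(Counter())
        else:
            # Step 3: closing parenthesis, so multiply the group by the outside subscript.
            if len(stack) == 1:
                raise ValueError("closing parenthesis without a matching opening one")
            group = stack.pop()
            multiplier = int(group_digits) if group_digits else 1
            for sym, n in group.items():
                stack[-1][sym] += n * multiplier
        pos = match.end()
    if len(stack) != 1:
        raise ValueError("opening parenthesis without a matching closing one")
    return dict(stack[0])

print(count_atoms("Ca(OH)₂"))                 # {'Ca': 1, 'O': 2, 'H': 2}
print(sum(count_atoms("C₆H₁₂O₆").values()))  # 24
print(count_atoms("CH₃COOH"))                 # {'C': 2, 'H': 4, 'O': 2}
```

A stack is used so that the atoms of a parenthesized group are collected separately and multiplied only once the closing parenthesis and its subscript are seen.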

Coefficients vs. Subscripts

In a chemical equation, you’ll sometimes see a full-sized number in front of a formula, like 2H₂O. That number is a coefficient, and it means something different from a subscript. The coefficient tells you how many separate molecules (or formula units) of that substance are involved in the reaction. So 2H₂O means two water molecules. Each water molecule still has 2 hydrogens and 1 oxygen, giving you 4 hydrogen atoms and 2 oxygen atoms total across both molecules.

The key distinction: subscripts are part of the compound’s identity, so changing one changes the substance (H₂O is water, but H₂O₂ is hydrogen peroxide). Coefficients describe quantity in a reaction and can be adjusted to balance an equation.
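A short sketch makes the difference concrete: the coefficient is peeled off the front of the term and multiplies the whole molecule, while the per-molecule counts stay untouched. The helper split_coefficient is hypothetical. It assumes the coefficient is written with ordinary full-sized digits and the subscripts with Unicode subscript characters, as in this article.

```python
import re

def split_coefficient(term: str) -> tuple[int, str]:
    """Separate a leading full-sized coefficient from the formula that follows it."""
    match = re.fullmatch(r"([0-9]*)(.+)", term.strip())
    if match is None:
        raise ValueError("empty term")
    digits, formula = match.groups()
    # No leading number means a coefficient of one.
    return (int(digits) if digits else 1, formula)

coefficient, formula = split_coefficient("2H₂O")
per_molecule = {"H": 2, "O": 1}  # atoms in one water molecule, read from the subscripts
totals = {symbol: coefficient * n for symbol, n in per_molecule.items()}

print(coefficient, formula)  # 2 H₂O
print(totals)                # {'H': 4, 'O': 2}
print(per_molecule)          # {'H': 2, 'O': 1}  (the molecule itself is unchanged)
```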

The Order Elements Are Listed

You might wonder why carbon always seems to come first in organic formulas. Most chemical databases and textbooks follow the Hill system: carbon is listed first, hydrogen second, and all remaining elements in alphabetical order. If a compound contains no carbon, every element, hydrogen included, is simply listed alphabetically by symbol. That’s why ethanol is written C₂H₆O (carbon, then hydrogen, then oxygen). Applied strictly, the same rule writes sodium chloride as ClNa, since Cl sorts before Na, so the familiar NaCl comes from a different convention.

This convention is not universal. Inorganic chemistry usually places the more metallic element first (NaCl, not ClNa), and acids traditionally start with hydrogen (HCl, not ClH). But for most formulas you’ll encounter in databases, the Hill system is standard.
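The Hill ordering rule is simple enough to sketch in a few lines. The function name hill_formula is made up for this illustration, and it takes atom counts that have already been worked out. Note that strict Hill order gives ClNa and ClH for the two inorganic examples, which is exactly where the traditional conventions differ.

```python
TO_SUBSCRIPT = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

def hill_formula(counts: dict[str, int]) -> str:
    """Write a formula in Hill order from a dictionary of atom counts."""
    symbols = sorted(counts)  # alphabetical by symbol
    if "C" in counts:
        # With carbon present: C first, then H, then everything else alphabetically.
        rest = [s for s in symbols if s not in ("C", "H")]
        symbols = ["C"] + (["H"] if "H" in counts else []) + rest
    parts = []
    for symbol in symbols:
        n = counts[symbol]
        # A count of one is written without a subscript.
        parts.append(symbol + (str(n).translate(TO_SUBSCRIPT) if n > 1 else ""))
    return "".join(parts)

print(hill_formula({"O": 1, "H": 6, "C": 2}))  # C₂H₆O
print(hill_formula({"Na": 1, "Cl": 1}))        # ClNa
print(hill_formula({"H": 1, "Cl": 1}))         # ClH
```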

Molecular Formula vs. Empirical Formula

A molecular formula shows the actual number of each atom in a single molecule. An empirical formula shows only the simplest whole-number ratio. These are sometimes identical, but not always. Benzene’s molecular formula is C₆H₆, but its empirical formula is CH, a 1:1 ratio of carbon to hydrogen. Glucose is C₆H₁₂O₆, while its empirical formula reduces to CH₂O.

The molecular formula is always a whole-number multiple of the empirical formula. For ascorbic acid (vitamin C), the empirical formula is C₃H₄O₃, and doubling every subscript gives the molecular formula C₆H₈O₆. If a compound’s formula can’t be reduced further, the two formulas are the same. Water’s molecular formula H₂O is already in its simplest ratio.
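Reducing a molecular formula to its empirical formula amounts to dividing every subscript by their greatest common divisor. A minimal sketch, with the function name empirical_formula chosen for this example and the atom counts entered by hand:

```python
from functools import reduce
from math import gcd

def empirical_formula(counts: dict[str, int]) -> tuple[dict[str, int], int]:
    """Return the simplest whole-number ratio and the multiple that restores the original."""
    divisor = reduce(gcd, counts.values())
    return {symbol: n // divisor for symbol, n in counts.items()}, divisor

print(empirical_formula({"C": 6, "H": 6}))           # ({'C': 1, 'H': 1}, 6)          benzene
print(empirical_formula({"C": 6, "H": 12, "O": 6}))  # ({'C': 1, 'H': 2, 'O': 1}, 6)  glucose
print(empirical_formula({"C": 6, "H": 8, "O": 6}))   # ({'C': 3, 'H': 4, 'O': 3}, 2)  ascorbic acid
print(empirical_formula({"H": 2, "O": 1}))           # ({'H': 2, 'O': 1}, 1)          water
```

The second value returned is the whole-number multiple: 1 means the molecular and empirical formulas are the same.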

What a Molecular Formula Doesn’t Tell You

A molecular formula tells you the ingredient list of a molecule, not how those ingredients are arranged. Two compounds can share the exact same molecular formula but have completely different structures and properties. These are called isomers. Ethanol (drinking alcohol) and dimethyl ether (a gas used as a propellant) both have the formula C₂H₆O, but their atoms are connected differently.

To see how atoms are actually bonded to each other, you need a structural formula, which uses lines to represent bonds between atoms. Each line represents a pair of shared electrons holding two atoms together. A molecular formula is the starting point; a structural formula is the full blueprint.

Special Notation: The Dot in Hydrates

Some formulas include a centered dot separating two parts, like CuSO₄·5H₂O. This is a hydrate, an ionic compound that has water molecules physically incorporated into its crystal structure. The dot doesn’t mean multiplication. It means “associated with.” CuSO₄·5H₂O tells you that for every unit of copper sulfate, five water molecules are held within the solid. The number before H₂O corresponds to the same Greek prefixes you see in chemistry naming: five water molecules make this a “pentahydrate.”
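To count the atoms in a hydrate, add the water molecules to the rest of the formula. The sketch below does that for copper sulfate pentahydrate. The helper names hydrate_atoms and hydrate_name are invented for this example, the prefix table stops at ten, and the anhydrous part is entered by hand.

```python
from collections import Counter

PREFIXES = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta",
            6: "hexa", 7: "hepta", 8: "octa", 9: "nona", 10: "deca"}
WATER = {"H": 2, "O": 1}

def hydrate_atoms(anhydrous: dict[str, int], waters: int) -> dict[str, int]:
    """Add the atoms of the attached water molecules to the anhydrous formula unit."""
    totals = Counter(anhydrous)
    for symbol, n in WATER.items():
        # The number before H₂O acts like a coefficient on the water only.
        totals[symbol] += waters * n
    return dict(totals)

def hydrate_name(waters: int) -> str:
    """Name suffix for a hydrate with the given number of water molecules."""
    return PREFIXES[waters] + "hydrate"

# CuSO₄·5H₂O
print(hydrate_atoms({"Cu": 1, "S": 1, "O": 4}, 5))  # {'Cu': 1, 'S': 1, 'O': 9, 'H': 10}
print(hydrate_name(5))                              # pentahydrate
```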

Ionic Compounds and Formula Units

Not every chemical formula represents a discrete molecule. Ionic compounds like NaCl don’t exist as individual pairs of atoms. Instead, billions of sodium and chloride ions arrange themselves into a repeating three-dimensional grid called a crystal lattice. The formula NaCl is a “formula unit,” the simplest ratio of ions that produces a neutral compound. You read it the same way you’d read any formula (1 sodium, 1 chlorine), but keep in mind it describes a ratio rather than a standalone molecule.

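This is why ionic compound formulas are almost always already in their simplest ratio. For most of them there’s no distinction between a molecular and an empirical formula, because the formula unit is inherently empirical. The exceptions involve polyatomic ions whose own subscripts are kept intact, such as the peroxide ion in Na₂O₂ or the Hg₂²⁺ ion in Hg₂Cl₂.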