What Is Factor Loading and How Is It Interpreted?

A factor loading is a number that tells you how strongly a survey question, test item, or measured variable is connected to an underlying pattern (called a factor) in your data. It works like a correlation coefficient, ranging from -1 to +1, where values closer to the extremes mean a stronger relationship and values near zero mean the variable has little to do with that factor. If you’re running a factor analysis or reading a study that uses one, factor loadings are the core output you’ll need to interpret.

How Factor Loadings Work

Imagine you give 500 people a 20-question survey about workplace satisfaction. Some questions cluster together in how people answer them: questions about pay, bonuses, and benefits tend to rise and fall together, while questions about coworkers, team culture, and management form a separate cluster. Factor analysis detects these clusters and assigns each question a loading on each factor. A question about salary might load 0.85 on a “compensation” factor and only 0.10 on a “social environment” factor, telling you it belongs squarely in the compensation group.

The loading itself is the correlation between that individual item and the broader factor (strictly true when the factors are uncorrelated; the oblique case is covered under rotation below). A loading of 0.70 means the item tracks closely with the underlying pattern. A loading of 0.15 means it barely relates at all.
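This relationship is easy to verify with a quick simulation. The sketch below (NumPy, with made-up numbers) generates a standardized latent factor and a single item built to load 0.70 on it, then checks that the observed item-factor correlation recovers the loading:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Latent "compensation" factor: one standardized score per respondent.
factor = rng.standard_normal(n)

# Classic factor model: item = loading * factor + unique noise, with the
# noise variance chosen so the item itself has unit variance overall.
loading = 0.70
item = loading * factor + np.sqrt(1 - loading**2) * rng.standard_normal(n)

# The observed item-factor correlation recovers the loading.
r = np.corrcoef(item, factor)[0, 1]
print(round(r, 2))  # close to 0.70
```

With a large sample the correlation lands almost exactly on the loading used to build the item; in a real analysis, sampling noise makes the match approximate.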

What Counts as a Good Loading

The most widely used minimum threshold is 0.30, which corresponds to about 9% shared variance between the item and its factor (0.30² = 0.09). Loadings above 0.30 are generally considered moderate, meaning the item contributes meaningfully to the factor. But many researchers set a stricter bar: a loading of at least 0.40 is often preferred for accepting that an item truly belongs to a given factor. Loadings of 0.50 or higher are considered strong, and anything above 0.70 is excellent.

Here’s a practical way to think about these numbers:

  • Below 0.30: Weak. The item probably doesn’t belong to that factor.
  • 0.30 to 0.39: Moderate but borderline. May be kept depending on context.
  • 0.40 to 0.59: Good. The item fits the factor well enough to retain.
  • 0.60 and above: Strong. The item is a solid marker of the factor.

A satisfactory item will show a good loading (0.40 or higher) on one factor and poor loadings (below 0.30) on all other factors. That clean separation is what researchers look for.
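These rules of thumb are easy to codify. The helper functions below are hypothetical (the cutoffs are conventions from the text, not laws) but show how the bucketing and the clean-separation check work:

```python
def classify_loading(value):
    """Bucket an absolute loading using the rules of thumb above."""
    v = abs(value)
    if v < 0.30:
        return "weak"
    if v < 0.40:
        return "borderline"
    if v < 0.60:
        return "good"
    return "strong"

def is_clean(item_loadings, primary=0.40, others=0.30):
    """True if the item loads well on one factor and weakly everywhere else."""
    ranked = sorted((abs(v) for v in item_loadings), reverse=True)
    return ranked[0] >= primary and all(v < others for v in ranked[1:])

print(classify_loading(0.85))        # strong
print(is_clean([0.85, 0.10, 0.05]))  # True: clean separation
print(is_clean([0.45, 0.38]))        # False: secondary loading too high
```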

Squaring the Loading Tells You Variance Explained

One of the most useful tricks with factor loadings is squaring them. If an item loads at 0.70 on a factor, squaring that gives you 0.49, meaning the factor explains about 49% of the variation in that item’s responses. An item loading at 0.30 shares only 9% of its variance with the factor, which is why 0.30 is considered the bare minimum worth paying attention to.

Adding up the squared loadings across all factors for a single item gives that item's communality: how much of its total variability is captured by the factor solution as a whole. Items with low communality are essentially noise that the model can't explain well.
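In code, the arithmetic is one line each. Assuming a made-up item with loadings across a three-factor solution:

```python
# Made-up loadings for one item across a three-factor solution.
loadings = [0.70, 0.15, 0.10]

# Squared loading: variance in the item explained by each factor alone.
explained = [l**2 for l in loadings]  # ≈ [0.49, 0.0225, 0.01]

# Communality: total item variance captured by the whole factor solution.
communality = sum(explained)
print(round(communality, 4))  # 0.5225
```

A communality of about 0.52 means the three factors together account for roughly half of this item's variance; the rest is unique variance and measurement error.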

Cross-Loadings and Problem Items

Sometimes an item loads at 0.32 or higher on two or more factors at once. This is called a cross-loading, and it signals that the item doesn’t cleanly belong to just one factor. A question about “my manager supports fair pay” might load on both a compensation factor and a management factor, muddying the interpretation of both.

When this happens, you have a few options. If the difference between the highest and second-highest loading is 0.20 or greater, the item can often be kept under the factor where it loads highest. For example, if an item loads 0.65 on one factor and 0.35 on another, that 0.30 gap is large enough to justify keeping it. If the gap is smaller than 0.20, the item is genuinely split between factors and is typically dropped from the analysis. This matters most when each factor already has several strong loaders (0.50 or better), because removing one ambiguous item won’t gut the factor.
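The decision rule above can be sketched as a small function (the 0.32 cross-loading cutoff and 0.20 gap are the conventions mentioned in the text, not fixed laws, and the function name is hypothetical):

```python
def cross_loading_decision(item_loadings, cross=0.32, gap=0.20):
    """Apply the gap heuristic for cross-loading items."""
    ranked = sorted((abs(v) for v in item_loadings), reverse=True)
    if len(ranked) < 2 or ranked[1] < cross:
        return "keep: no cross-loading"
    if ranked[0] - ranked[1] >= gap:
        return "keep on highest-loading factor"
    return "drop: genuinely split"

print(cross_loading_decision([0.65, 0.35]))  # gap 0.30 -> keep on highest
print(cross_loading_decision([0.45, 0.40]))  # gap 0.05 -> drop
print(cross_loading_decision([0.60, 0.10]))  # no cross-loading at all
```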

How Rotation Changes Factor Loadings

Raw factor loadings straight from the initial extraction are often messy and hard to interpret. Rotation is a mathematical step that redistributes the loadings to make them cleaner, pushing high loadings higher and low loadings lower so that each item more clearly belongs to one factor. The total variance explained doesn’t change; rotation just makes the pattern easier to see.
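The invariance is easy to demonstrate with NumPy. The sketch below rotates a made-up two-factor loading matrix by 45 degrees (standing in for the angle a method like varimax would actually find for this pattern) and confirms that each item's total squared loading, and hence the variance explained, is unchanged while the pattern gets cleaner:

```python
import numpy as np

# A messy initial extraction: four made-up items loading on two factors.
L = np.array([
    [0.60,  0.55],
    [0.58,  0.50],
    [0.55, -0.52],
    [0.57, -0.55],
])

# Any orthogonal rotation is multiplication by a rotation matrix.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rotated = L @ R

# Communalities (row sums of squared loadings) are unchanged ...
print(np.allclose((L**2).sum(axis=1), (rotated**2).sum(axis=1)))  # True

# ... but each item now loads strongly on one factor, near zero on the other.
print(np.round(rotated, 2))
```

Before rotation every item loads around 0.5–0.6 on both factors; after rotation the first two items load about 0.81 on factor one and near zero on factor two, with the reverse pattern for the last two.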

There are two families of rotation, and they produce slightly different types of loadings. Orthogonal rotation (the most common being Varimax) assumes the factors are completely unrelated to each other. In this case, the factor loading matrix is straightforward: each number is both the item’s loading and its correlation with the factor. You read one table and you’re done.

Oblique rotation (such as Promax or direct oblimin) allows factors to correlate with each other, which is more realistic for most psychological and health data. When you use oblique rotation, you get two matrices instead of one. The pattern matrix contains the unique contribution of each factor to each item, controlling for overlap between factors. This is the table most researchers report and interpret. The structure matrix shows the simple bivariate correlations between items and factors without adjusting for factor overlap, which is less commonly used. The stronger the correlation between factors, the more these two matrices will differ from each other.
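The relationship between the two matrices is simple: the structure matrix equals the pattern matrix multiplied by the factor correlation matrix (often written Φ). A small sketch with made-up numbers shows how a 0.40 factor correlation inflates the structure loadings of items that are nearly pure in the pattern matrix:

```python
import numpy as np

# Hypothetical oblique solution: pattern matrix P (unique contributions)
# and factor correlation matrix Phi (factors correlate at 0.40).
P = np.array([[0.70, 0.05],
              [0.65, 0.00],
              [0.05, 0.72],
              [0.00, 0.68]])
Phi = np.array([[1.0, 0.4],
                [0.4, 1.0]])

# Structure matrix: simple item-factor correlations, without adjusting
# for the overlap between factors.
S = P @ Phi
print(np.round(S, 2))
# Item 2 has a pattern loading of 0.00 on factor 2, yet its structure
# loading there is 0.65 * 0.40 = 0.26, purely because the factors correlate.
```

With Φ equal to the identity matrix (uncorrelated factors), S equals P, which is exactly why an orthogonal solution needs only one table.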

Oblique solutions almost always fit the data better than orthogonal ones because they additionally estimate the correlations between factors rather than forcing them to zero, an assumption that is rarely realistic. If your factors turn out to be uncorrelated anyway, an oblique rotation will produce results nearly identical to an orthogonal one, so many analysts default to oblique rotation as the safer choice.

Factor Loadings in Practice

Factor loadings are the backbone of construct validity in questionnaire development. When researchers create a new tool to measure something like depression, anxiety, or quality of life, they use factor analysis to check whether the questions actually group into the dimensions they intended. If a depression questionnaire is supposed to measure mood, energy, and appetite as three separate dimensions, factor loadings should show mood questions loading on one factor, energy questions on another, and appetite questions on a third. When items load where they’re expected to, that’s evidence the questionnaire measures what it claims to measure.

This process typically happens in two stages. In the first stage, called exploratory factor analysis, researchers let the data reveal whatever groupings naturally emerge. They examine the loadings to see how many factors exist and which items belong to each. In the second stage, called confirmatory factor analysis, a new sample is used to test whether the factor structure from the first stage holds up. If a questionnaire was originally developed in English and is being adapted for use in another language or culture, confirmatory factor analysis checks whether the same loading pattern replicates.

For a factor to be accepted as meaningful, it typically needs an eigenvalue greater than 1 (meaning it explains more variance than a single item would on its own) and each item assigned to it should load at 0.40 or higher. Factors with only one or two items, or with all weak loadings, are usually not considered reliable enough to interpret.
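The eigenvalue-greater-than-1 rule can be checked directly with NumPy. The simulation below (made-up loadings) builds six standardized items driven by two latent factors and counts how many eigenvalues of the correlation matrix exceed 1:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Two latent factors, three items loading 0.70 on each (plus unique noise
# scaled so every item is standardized).
f1, f2 = rng.standard_normal((2, n))
items = np.column_stack(
    [0.7 * f1 + np.sqrt(1 - 0.49) * rng.standard_normal(n) for _ in range(3)]
    + [0.7 * f2 + np.sqrt(1 - 0.49) * rng.standard_normal(n) for _ in range(3)]
)

# Eigenvalues of the correlation matrix, largest first; the rule keeps
# factors whose eigenvalue exceeds 1 (more variance than a single item).
eigvals = np.linalg.eigvalsh(np.corrcoef(items, rowvar=False))[::-1]
n_factors = int((eigvals > 1).sum())
print(n_factors)  # 2 factors pass the rule here, matching the setup
```

The two retained eigenvalues each sit near 1.98 in this setup, while the remaining four cluster around 0.51, so the cut at 1 cleanly recovers the two factors that generated the data.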