Exploratory Factor Analysis

Factor analysis is a method that aims to uncover structures in large variable sets. If you have a data set with many variables, it is possible that some of them are interrelated, i.e. correlate with each other. These correlations are the basis of factor analysis.

The aim of factor analysis is to divide the variables into groups, so that variables that correlate highly with each other end up in the same group and are separated from variables with which they correlate less strongly.

What is a factor?

In factor analysis, the factor can be seen as a hidden variable that influences several actually observed variables.

Or, in other words, several variables are observable phenomena of fewer underlying factors.

In factor analysis, therefore, the variables that are highly correlated with each other are combined. It is assumed that this correlation is due to a non-measurable variable, which is called a factor.

Example Factor Analysis

Factor analysis can be used to answer the following questions:

What structure can be detected in the data?
How can the data be reduced to a few factors?

The following table gives examples of how factor analysis is used in different fields.

Examples

Field	Question	Variables	Possible factors
Psychology	Can different personality traits be grouped into personality types?	Being sociable, being spontaneous, being curious, being nervous, being aggressive etc.	Neuroticism, Extraversion, Openness to experience, Conscientiousness, Agreeableness
Business Administration	How can different cost types be summarized in cost characteristics?	Material costs, personnel costs, equipment costs, fixed costs etc.	Influenceability, urgency of coverage

Research questions Factor Analysis

A possible research question might be: Can different personality traits such as outgoing, curious, sociable, or helpful be grouped into personality types such as conscientious, extraverted, or agreeable?

You want to find out whether some of the characteristics outgoing, sociable, hard-working, conscientious, warm-hearted and helpful correlate with each other and can be described by an underlying factor. To find out, you created a small survey with DATAtab.

You surveyed 20 people and exported the results to an Excel table. The example data set for the Principal Component Analysis is provided with this tutorial, and you can use it to calculate the example directly online on DATAtab under Factor Analysis Calculator.

Factor loading, eigenvalue, communalities

The important terms or characteristic values for a factor analysis are factor loading, eigenvalue and communality. With their help, it is possible to see how strongly the individual variables are related to the factors.

Factor loading

Correlation between a variable and a factor
Loading of a variable on a factor

Eigenvalue

The variance explained by a factor
Sum of the squared factor loadings of a factor across all variables

Communalities

Variance of a variable that is explained by all factors together
Sum of the squared factor loadings of a variable across all factors
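As a minimal sketch of these definitions, assume a made-up loading matrix with four variables and two factors (the numbers are hypothetical and not taken from the survey example). The sums of squares per column and per row can then be computed with NumPy. Note that the column sums correspond to eigenvalues only for an unrotated solution.

```python
import numpy as np

# Hypothetical factor loadings: rows = variables, columns = factors.
# These values are invented for illustration only.
loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])

squared = loadings ** 2

# Sum of squared loadings per factor (column sums).
eigenvalues = squared.sum(axis=0)

# Communality per variable (row sums): variance explained by all factors.
communalities = squared.sum(axis=1)

print(eigenvalues)     # approx. [1.18 1.22]
print(communalities)   # approx. [0.65 0.53 0.82 0.40]
```

A communality close to 1 means the factors describe the variable almost completely, while a low value means much of its variance is left unexplained.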

Correlation Matrix

The first step in factor analysis is to calculate the correlation matrix. Starting from the correlation matrix, the so-called eigenvalue problem is solved, which is used to calculate the factors.
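The two steps just described can be sketched as follows. The data here is synthetic (20 simulated respondents, six items driven by three hidden factors) and merely stands in for a real survey; the variable names are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for survey data: 20 respondents, 6 items.
# Items 0-1, 2-3 and 4-5 are each driven by one hidden factor plus noise.
n_respondents = 20
latent = rng.normal(size=(n_respondents, 3))
pattern = np.repeat(np.eye(3), 2, axis=0)          # shape (6, 3)
data = latent @ pattern.T + 0.5 * rng.normal(size=(n_respondents, 6))

# Step 1: correlation matrix of the observed variables (columns).
corr = np.corrcoef(data, rowvar=False)

# Step 2: eigenvalue problem of the symmetric correlation matrix.
eigenvalues, eigenvectors = np.linalg.eigh(corr)

# eigh returns ascending order; sort so the largest eigenvalue comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

print(np.round(eigenvalues, 2))
```

Because a correlation matrix has ones on its diagonal, the eigenvalues always add up to the number of variables, six in this sketch.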

Factor Analysis and dimensionality

It is important to note, however, that factor analysis does not give a "clear" answer as to how many factors must be used and how these factors can then be interpreted.

There are two common methods to determine the number of required factors: the eigenvalue criterion (Kaiser criterion) and the scree test.

Eigenvalue criterion (Kaiser criterion)

To determine the number of factors with the eigenvalue criterion (Kaiser criterion), the eigenvalues of the individual factors are needed. Once they have been calculated, all factors with an eigenvalue greater than 1 are retained.
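A short sketch of this rule, assuming a hypothetical set of eigenvalues for six variables (the numbers are invented and only chosen so that they add up to six):

```python
import numpy as np

# Hypothetical eigenvalues of a 6 x 6 correlation matrix (illustrative only).
eigenvalues = np.array([2.6, 1.7, 1.1, 0.3, 0.2, 0.1])

# Kaiser criterion: keep every factor whose eigenvalue is greater than 1.
keep = eigenvalues > 1.0
n_factors = int(keep.sum())

print(n_factors)  # 3
```

The threshold of 1 has a simple reading: each standardized variable contributes a variance of 1, so a factor with an eigenvalue above 1 explains more variance than a single variable does.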

Scree test

To determine the number of factors with the scree test (scree plot), the eigenvalues are sorted by size and displayed in a line chart. The number of factors is read off at the point where the chart bends, the so-called elbow.

Furthermore, the table "Explained total variance" shows the variance explained by each individual factor as well as the cumulative variance.
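The values behind a scree plot and an explained-variance table can be sketched as follows. The eigenvalues are hypothetical, and the printed layout is only an assumption about what such a table might contain, not a reproduction of DATAtab's output.

```python
import numpy as np

# Hypothetical, unsorted eigenvalues for six variables (illustrative only).
eigenvalues = np.array([0.3, 2.6, 0.1, 1.7, 0.2, 1.1])

# Sort by size, largest first: these are the points of a scree plot.
sorted_ev = np.sort(eigenvalues)[::-1]

# Share of total variance per factor and the running total, in percent.
explained = sorted_ev / sorted_ev.sum() * 100
cumulative = np.cumsum(explained)

for i, (ev, pct, cum) in enumerate(zip(sorted_ev, explained, cumulative), start=1):
    print(f"Factor {i}: eigenvalue={ev:.2f}  variance={pct:5.1f}%  cumulative={cum:5.1f}%")
```

In this made-up example the first three factors together account for 90% of the total variance, and the drop after the third eigenvalue is where a bend would show in the chart.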

Communalities

Once the number of factors is determined, the communalities can be calculated. As described above, the communality indicates the variance of a variable that is explained by all factors. If, for example, three factors were selected, the communalities give the share of each variable's variance that can be described by these three factors.

Component matrix

The component matrix contains the factor loadings of the variables on the factors. Since the first factor explains most of the variance, the loadings on the first component or factor tend to be the largest. In this form, however, it is difficult to make a statement about the factors, which is why the matrix is rotated in a further step.
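One common way to obtain such an unrotated component matrix is to scale each eigenvector of the correlation matrix by the square root of its eigenvalue. The sketch below assumes this principal-component approach and uses a hypothetical correlation matrix for four items; whether a given tool uses exactly this extraction method is an assumption.

```python
import numpy as np

# Hypothetical correlation matrix for four items (illustrative values only):
# items 0 and 1 correlate strongly, as do items 2 and 3.
corr = np.array([
    [1.0, 0.7, 0.2, 0.2],
    [0.7, 1.0, 0.2, 0.2],
    [0.2, 0.2, 1.0, 0.6],
    [0.2, 0.2, 0.6, 1.0],
])

# Solve the eigenvalue problem and sort factors by descending eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Unrotated loadings: each eigenvector scaled by the square root of its eigenvalue.
loadings = eigenvectors * np.sqrt(eigenvalues)

n_factors = 2  # assumed number of retained factors (two eigenvalues exceed 1 here)
component_matrix = loadings[:, :n_factors]

# Communality of each item for the retained factors.
communalities = (component_matrix ** 2).sum(axis=1)

print(np.round(component_matrix, 2))
print(np.round(communalities, 2))
```

In this made-up example all four items load noticeably on the first factor, which illustrates why the unrotated matrix is hard to interpret. The sign of a whole column is arbitrary, so a factor may come out with all of its loadings flipped.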

Rotation Matrix

As a consequence of how the component matrix is computed, many variables load highly on the first factor. As a result, the component matrix usually cannot be interpreted meaningfully, so the matrix is rotated. There are different procedures for this rotation, but the most common is the analytical Varimax rotation.

Varimax Rotation

The Varimax rotation analytically ensures that, for each factor, certain variables load as high as possible and the other variables load as low as possible. This is achieved by making the variance of the squared factor loadings within each factor as large as possible.

In the example, the rotated matrix shows that "outgoing" and "sociable" load on extraversion, "hard-working" and "dutiful" load on conscientiousness, and "warm-hearted" and "helpful" load on agreeableness.
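The following is a minimal sketch of a Varimax rotation, using an SVD-based iteration that is commonly seen in NumPy implementations. It is not necessarily the procedure used by DATAtab. The function names `varimax` and `varimax_criterion` and the four-item loading matrix are hypothetical and chosen only for this illustration.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.var(loadings ** 2, axis=0).sum())

def varimax(loadings, max_iter=100, tol=1e-6):
    """Return (rotated loadings, orthogonal rotation matrix)."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = (rotated ** 2).sum(axis=0)
        gradient = loadings.T @ (rotated ** 3 - rotated @ np.diag(col_ss) / n_vars)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                      # nearest orthogonal matrix
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                              # no relevant improvement any more
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical "simple structure": two items per factor (illustrative values).
simple = np.array([
    [0.9, 0.0],   # outgoing
    [0.8, 0.0],   # sociable
    [0.0, 0.7],   # hard-working
    [0.0, 0.6],   # dutiful
])

# Mix the factors by 30 degrees to imitate a hard-to-read unrotated matrix.
angle = np.deg2rad(30)
mix = np.array([[np.cos(angle), np.sin(angle)],
                [-np.sin(angle), np.cos(angle)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)

print(np.round(unrotated, 2))
print(np.round(rotated, 2))
print(varimax_criterion(unrotated), varimax_criterion(rotated))
```

Because the rotation matrix is orthogonal, the communalities of the variables stay the same; only the distribution of the loadings across the factors changes, which is what makes the rotated matrix easier to read.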
