Exploratory Factor Analysis
Factor analysis is a method that aims to uncover structures in large variable sets. If you have a data set with many variables, it is possible that some of them are interrelated, i.e. correlate with each other. These correlations are the basis of factor analysis.
The aim of factor analysis is to divide the variables into groups, so that variables that correlate highly with each other end up in the same group and are separated from those with which they correlate less strongly.
What is a factor?
In factor analysis, the factor can be seen as a hidden variable that influences several actually observed variables.
Or, in other words, several variables are observable manifestations of a smaller number of underlying factors.
In factor analysis, therefore, the variables that are highly correlated with each other are combined. It is assumed that this correlation is due to a variable that cannot be measured directly, which is called a factor.
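The idea of a hidden factor can be illustrated with a small simulation. This is only a sketch: the sample size, the weights 0.8 and 0.6, and the variable names are illustrative assumptions, not values from the example survey.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000  # hypothetical number of respondents

# The factor: a hidden variable that is never observed directly.
factor = rng.normal(size=n)

# Two observed variables, each driven by the factor plus independent noise.
# The weights 0.8 (factor) and 0.6 (noise) are illustrative assumptions.
outgoing = 0.8 * factor + 0.6 * rng.normal(size=n)
sociable = 0.8 * factor + 0.6 * rng.normal(size=n)

# The observed variables correlate only because they share the factor.
observed_corr = np.corrcoef(outgoing, sociable)[0, 1]
print(f"correlation between the two observed variables: {observed_corr:.2f}")
```

Neither observed variable influences the other in this simulation. Their correlation arises entirely from the shared hidden variable, which is the situation factor analysis assumes.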
Example factor analysis
Factor analysis can be used to answer the following questions:
- What structure can be detected in the data?
- How can the data be reduced to a small number of factors?
The following table gives examples of how factor analysis is used in different fields.
|Field||Question||Observed variables||Possible factors|
|Psychology||Can different personality traits be grouped into personality types?||Being sociable, spontaneous, curious, nervous, aggressive, etc.||Neuroticism, Extraversion, Openness to new things, Conscientiousness, Agreeableness|
|Business Administration||How can different cost types be summarized in cost characteristics?||Material costs, personnel costs, equipment costs, fixed costs, etc.||Influenceability, urgency of coverage|
Research questions Factor analysis
A possible research question might be: Can different personality traits such as outgoing, curious, sociable, or helpful be grouped into personality types such as conscientious, extraverted, or agreeable?
You want to find out whether some of the characteristics outgoing, sociable, hard-working, conscientious, warm-hearted and helpful correlate with each other and can be described by an underlying factor. To find out, you created a small survey with DATAtab.
You surveyed 20 people and exported the results to an Excel table. The example data set for the Principal Component Analysis is available on DATAtab, where you can calculate the example directly online with the Factor Analysis Calculator.
Factor loading, eigenvalue, communalities
The important terms or characteristic values for a factor analysis are factor loading, eigenvalue and communality. With their help, it is possible to see how strong the correlation between the individual variables and the factors is.
- Factor loading: the correlation between a variable and a factor, i.e. how strongly the variable loads on the factor.
- Eigenvalue: the variance explained by a factor, calculated as the sum of the squared factor loadings of that factor over all variables.
- Communality: the share of a variable's variance that is explained by all factors together, calculated as the sum of the squared factor loadings of that variable over all factors.
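Written as formulas, the three quantities look as follows. The notation is an assumption introduced here for illustration: $a_{ij}$ is the loading of variable $i$ on factor $j$, with $p$ variables and $k$ factors. The reading of a loading as a correlation assumes standardized variables and uncorrelated factors.

```latex
\begin{align}
a_{ij} &= \operatorname{corr}(x_i, f_j)
  && \text{factor loading of variable } i \text{ on factor } j \\
\lambda_j &= \sum_{i=1}^{p} a_{ij}^{2}
  && \text{eigenvalue of factor } j \\
h_i^{2} &= \sum_{j=1}^{k} a_{ij}^{2}
  && \text{communality of variable } i
\end{align}
```

In a table of loadings with variables as rows and factors as columns, the eigenvalues are therefore the column sums of the squared entries and the communalities are the row sums.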
The first step in factor analysis is to calculate the correlation matrix. Starting from the correlation matrix, the so-called eigenvalue problem is solved, which is used to calculate the factors.
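These two steps can be sketched with NumPy. The data below are synthetic and only stand in for the survey (20 respondents, 6 items generated from 3 assumed factors). Scaling the eigenvectors by the square roots of the eigenvalues gives unrotated principal-component loadings, which is one common extraction method and an assumption here, not necessarily what a particular tool computes.

```python
import numpy as np

# Synthetic stand-in for the survey: 20 respondents, 6 items.
# Items 0-1, 2-3 and 4-5 are each driven by one hypothetical factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 3))
pattern = np.repeat(np.eye(3), 2, axis=0)          # shape (6, 3)
data = latent @ pattern.T + 0.5 * rng.normal(size=(20, 6))

# Step 1: correlation matrix of the observed variables (columns).
corr = np.corrcoef(data, rowvar=False)

# Step 2: solve the eigenvalue problem for the symmetric matrix.
eigenvalues, eigenvectors = np.linalg.eigh(corr)

# eigh returns ascending order; sort from largest to smallest eigenvalue.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Unrotated loadings: each eigenvector scaled by sqrt(eigenvalue).
loadings = eigenvectors * np.sqrt(np.clip(eigenvalues, 0.0, None))

print(np.round(eigenvalues, 2))
```

Because a correlation matrix has ones on its diagonal, the eigenvalues always sum to the number of variables, here 6.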
Factor Analysis and dimensionality
It is important to note, however, that factor analysis does not give a "clear" answer as to how many factors must be used and how these factors can then be interpreted.
There are two common methods to determine the number of required factors: the eigenvalue criterion (Kaiser criterion) and the scree test.
Eigenvalue criterion (Kaiser criterion)
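To determine the dimensions, i.e. the number of factors, with the eigenvalue criterion (Kaiser criterion), the eigenvalues of the individual factors are needed. Once these have been calculated, all factors with an eigenvalue greater than 1 are retained. The reasoning is that each standardized variable has a variance of 1, so a factor with an eigenvalue below 1 explains less variance than a single variable.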
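As a minimal sketch, the criterion is a simple count. The function name `kaiser_criterion` and the eigenvalues are hypothetical and chosen for illustration; they are not results from the example survey.

```python
import numpy as np

def kaiser_criterion(eigenvalues):
    """Return the number of factors with an eigenvalue greater than 1."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    return int(np.sum(eigenvalues > 1.0))

# Hypothetical eigenvalues for six standardized variables (they sum to 6).
example_eigenvalues = [2.4, 1.6, 1.1, 0.5, 0.3, 0.1]

n_factors = kaiser_criterion(example_eigenvalues)
print(n_factors)  # 3
```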
Scree test
To determine the number of factors with the scree test (scree plot), the eigenvalues are sorted by size and plotted as a line chart. The number of factors is read off at the point where the line bends (the "elbow").
Furthermore, the table "Explained total variance" shows the variance explained by each individual factor as well as the cumulative variance.
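The figures in such a table can be derived directly from the eigenvalues. The sketch below again uses hypothetical eigenvalues, and it assumes the analysis is based on a correlation matrix, so that the total variance equals the number of variables. The sorted eigenvalues are also the values a scree plot would display.

```python
import numpy as np

# Hypothetical eigenvalues for six standardized variables.
eigenvalues = np.array([2.4, 1.6, 1.1, 0.5, 0.3, 0.1])

# Sort from largest to smallest, as a scree plot would show them.
eigenvalues = np.sort(eigenvalues)[::-1]

# With a correlation matrix, total variance = number of variables.
n_variables = eigenvalues.size
explained = eigenvalues / n_variables
cumulative = np.cumsum(explained)

for i, (ev, share, cum) in enumerate(zip(eigenvalues, explained, cumulative), start=1):
    print(f"Factor {i}: eigenvalue {ev:.2f}, explained {share:.1%}, cumulative {cum:.1%}")
```

With these made-up numbers, the first three factors together account for 85 % of the total variance.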
Once the number of factors is determined, the communalities can be calculated. As described above, the communality indicates the share of a variable's variance that is explained by all factors. If, for example, three factors were selected, the communalities give the share of each variable's variance that can be described by these three factors.
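A minimal sketch of this calculation for three retained factors follows. The loading matrix is made up for illustration and is not the result for the example survey. Only the row labels follow the six characteristics of the example.

```python
import numpy as np

variables = ["outgoing", "sociable", "hard-working",
             "conscientious", "warm-hearted", "helpful"]

# Hypothetical loadings: rows are variables, columns are three factors.
loadings = np.array([
    [0.85, 0.10, 0.15],
    [0.80, 0.20, 0.10],
    [0.10, 0.82, 0.12],
    [0.15, 0.78, 0.20],
    [0.20, 0.10, 0.80],
    [0.12, 0.18, 0.84],
])

# Communality: sum of squared loadings of a variable over all factors.
communalities = np.sum(loadings**2, axis=1)

# Variance explained by each factor: sum of squared loadings per column.
factor_variance = np.sum(loadings**2, axis=0)

for name, h2 in zip(variables, communalities):
    print(f"{name:14s} communality = {h2:.3f}")
```

A communality close to 1 means the selected factors describe the variable almost completely. A low value means much of its variance is left unexplained.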
The component matrix shows the factor loadings of the variables on the factors. Since the first factor explains most of the variance, the values of the first component or factor are the largest. In this form, however, it is difficult to interpret the factors, which is why the matrix is rotated in a further step.
Because of the way the component matrix is computed, many variables load highly on the first factor. As a result, the component matrix usually cannot be interpreted meaningfully, and the matrix is therefore rotated. Several rotation procedures exist, but the most common is the analytical Varimax rotation.
The Varimax rotation is intended to ensure analytically that, for each factor, certain variables load as highly as possible and the other variables load as low as possible. This is achieved by making the variance of the (squared) factor loadings within each factor as large as possible.
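The following is a sketch of one common SVD-based iteration for the Varimax rotation. It is not necessarily the procedure a particular tool such as DATAtab implements. The function names, the example loadings and the mixing matrix are all assumptions chosen for illustration.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.sum(np.var(loadings**2, axis=0)))

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a loading matrix orthogonally towards simple structure."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        column_ss = np.sum(rotated**2, axis=0)
        gradient = loadings.T @ (rotated**3 - rotated * column_ss / n_vars)
        # The orthogonal matrix closest to the gradient is the next rotation.
        u, singular_values, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        current = float(np.sum(singular_values))
        if current <= previous * (1 + tol):
            break  # no further noticeable improvement
        previous = current
    return loadings @ rotation, rotation

# Hypothetical loadings with a clear structure (two variables per factor).
simple = np.array([
    [0.85, 0.10, 0.15],
    [0.80, 0.20, 0.10],
    [0.10, 0.82, 0.12],
    [0.15, 0.78, 0.20],
    [0.20, 0.10, 0.80],
    [0.12, 0.18, 0.84],
])
# An orthogonal matrix that blurs this structure, imitating an
# unrotated solution in which variables load on several factors.
mixing = np.array([[2, -1, 2], [2, 2, -1], [-1, 2, 2]]) / 3.0
unrotated = simple @ mixing

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
print(varimax_criterion(unrotated), varimax_criterion(rotated))
```

Because the rotation is orthogonal, it changes only how the explained variance is distributed among the factors. The communality of each variable stays the same. The order and sign of the rotated factors are arbitrary.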
In the rotated component matrix it can now be seen that "outgoing" and "sociable" load on extraversion, "hard-working" and "conscientious" load on conscientiousness, and "warm-hearted" and "helpful" load on agreeableness.