Load the BFI data from the ‘psych’ package (??bfi). This contains data on 2800 participants completing items relating to the ‘big five’ from the IPIP pool. You’ll have to subset the variables for your factor analysis.
Conduct a Bartlett’s test & KMO test.
Conduct an exploratory factor analysis (using ‘minres’ as the method). Use parallel analysis, and discuss the scree plot, Very Simple Structure and Velicer’s MAP test.
Extract a five-factor model (use varimax rotation) and export the factor loadings of these five factors. Discuss the RMSEA and TLI for that five-factor model.
Make a plot for the factors.
setwd("~/Dropbox/Teaching_MRes_Northumbria/Lecture7")
require(psych)
## Loading required package: psych
Data<-psych::bfi
big_5<-Data[,c(1:25)]
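Bartlett’s test was significant (K\(^2\)(24) = 1744.7, p < .0001). Note that bartlett.test() from base R tests whether the 25 items have equal variances, not whether the correlation matrix is an identity matrix. The test of sphericity that is relevant for factor analysis is available as cortest.bartlett() in ‘psych’. The null-model test in the fa() output further down (\(\chi^2\)(300) = 20163.79) likewise indicates that the items are substantially correlated, so factor analysis is appropriate.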
bartlett.test(big_5)
##
## Bartlett test of homogeneity of variances
##
## data: big_5
## Bartlett's K-squared = 1744.7, df = 24, p-value < 2.2e-16
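For reference, Bartlett’s test of sphericity can be computed directly from a correlation matrix. The following is a minimal sketch in base R, not the code used for this exercise: the function name bartlett_sphericity and the simulated toy data are assumptions made purely for illustration.

```r
# Bartlett's test of sphericity from a correlation matrix R and sample size n.
# H0: the population correlation matrix is an identity matrix.
bartlett_sphericity <- function(R, n) {
  p <- ncol(R)
  chisq <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
  df <- p * (p - 1) / 2
  list(chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

# Toy data (hypothetical): six items driven by one common factor.
set.seed(1)
n <- 500
f <- rnorm(n)
toy <- as.data.frame(sapply(1:6, function(i) 0.7 * f + rnorm(n, sd = 0.7)))

res <- bartlett_sphericity(cor(toy), n)
res
```

For the exercise data one could call bartlett_sphericity(cor(big_5, use = "pairwise.complete.obs"), nrow(big_5)). With 25 items the test has 25 × 24 / 2 = 300 degrees of freedom, the same df as the null model reported by fa().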
All 25 items showed at least middling sampling adequacy for factor analysis (all MSA \(\geq\) .74), and the overall MSA of .85 falls in the meritorious range.
require(psych)
KMO(big_5)
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = big_5)
## Overall MSA = 0.85
## MSA for each item =
## A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1
## 0.74 0.84 0.87 0.87 0.90 0.83 0.79 0.85 0.82 0.86 0.83 0.88 0.89 0.87 0.89 0.78
## N2 N3 N4 N5 O1 O2 O3 O4 O5
## 0.78 0.86 0.88 0.86 0.85 0.78 0.84 0.76 0.76
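To make the MSA values less of a black box, the sketch below computes the KMO index by hand: squared correlations are compared with squared partial correlations obtained from the inverse correlation matrix. It is an illustration under stated assumptions, not the internals of KMO(): the function name kmo_by_hand and the simulated toy data are hypothetical.

```r
# KMO / MSA from a correlation matrix R.
kmo_by_hand <- function(R) {
  Q <- solve(R)                              # inverse correlation matrix
  P <- -Q / sqrt(outer(diag(Q), diag(Q)))    # partial correlations
  diag(R) <- 0                               # keep off-diagonal elements only
  diag(P) <- 0
  r2 <- R^2
  p2 <- P^2
  list(overall = sum(r2) / (sum(r2) + sum(p2)),
       items   = colSums(r2) / (colSums(r2) + colSums(p2)))
}

# Toy data (hypothetical): six items driven by one common factor.
set.seed(1)
n <- 500
f <- rnorm(n)
toy <- as.data.frame(sapply(1:6, function(i) 0.7 * f + rnorm(n, sd = 0.7)))

kmo_toy <- kmo_by_hand(cor(toy))
kmo_toy
```

Values close to 1 mean that the partial correlations are small relative to the raw correlations, which is what one hopes for before factoring.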
Extract a large number of factors (here, eight) and examine the loadings and fit statistics.
require(psych)
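# Eight-factor minres solution; stored as fa_8 so the object does not share its name with the fa() function.
# The 'fa' argument belongs to fa.parallel() and seems to have no effect in fa(); it is kept so the call matches the output below.
fa_8 <- fa(big_5, 8, fm = 'minres', rotate = 'varimax', fa = 'fa')
fa_8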
## Factor Analysis using method = minres
## Call: fa(r = big_5, nfactors = 8, rotate = "varimax", fm = "minres",
## fa = "fa")
## Standardized loadings (pattern matrix) based upon correlation matrix
## MR2 MR3 MR5 MR1 MR6 MR4 MR8 MR7 h2 u2 com
## A1 0.09 0.04 -0.47 -0.02 0.17 0.22 -0.12 0.00 0.33 0.67 2.0
## A2 0.01 0.14 0.70 0.14 0.06 -0.03 0.14 0.19 0.58 0.42 1.4
## A3 -0.01 0.08 0.66 0.19 0.23 0.02 -0.09 -0.02 0.54 0.46 1.5
## A4 -0.07 0.21 0.43 0.11 0.07 0.11 -0.14 -0.08 0.29 0.71 2.2
## A5 -0.15 0.09 0.51 0.28 0.29 0.07 -0.04 -0.09 0.47 0.53 2.7
## C1 -0.02 0.58 -0.01 0.05 0.17 -0.08 0.12 0.00 0.38 0.62 1.3
## C2 0.06 0.67 0.08 -0.01 0.16 0.01 0.03 -0.06 0.50 0.50 1.2
## C3 -0.04 0.54 0.13 -0.01 0.02 0.03 -0.02 0.07 0.31 0.69 1.2
## C4 0.19 -0.61 -0.10 -0.12 0.15 0.30 0.15 -0.05 0.57 0.43 2.2
## C5 0.26 -0.51 -0.12 -0.12 0.02 0.09 0.37 0.01 0.50 0.50 2.7
## E1 0.00 0.03 -0.14 -0.68 -0.04 0.10 -0.06 -0.05 0.51 0.49 1.2
## E2 0.22 -0.08 -0.17 -0.65 -0.14 0.06 0.16 -0.06 0.55 0.45 1.7
## E3 -0.01 0.08 0.26 0.38 0.51 -0.04 -0.08 -0.04 0.49 0.51 2.6
## E4 -0.12 0.12 0.29 0.60 0.23 0.20 -0.07 -0.12 0.59 0.41 2.4
## E5 0.00 0.31 0.15 0.37 0.33 -0.06 -0.04 0.34 0.49 0.51 4.4
## N1 0.80 -0.07 -0.13 0.01 0.01 0.09 -0.15 0.22 0.74 0.26 1.3
## N2 0.77 -0.04 -0.12 0.00 -0.02 0.02 0.00 0.24 0.66 0.34 1.3
## N3 0.75 -0.05 -0.03 -0.04 0.02 0.04 0.06 -0.12 0.58 0.42 1.1
## N4 0.58 -0.14 -0.06 -0.30 0.00 0.00 0.19 -0.17 0.51 0.49 2.1
## N5 0.53 0.00 0.06 -0.10 -0.09 0.17 0.14 -0.18 0.39 0.61 1.8
## O1 -0.04 0.12 0.04 0.07 0.53 -0.25 0.05 0.08 0.37 0.63 1.7
## O2 0.13 -0.07 0.05 0.00 -0.13 0.54 0.08 0.02 0.34 0.66 1.3
## O3 0.00 0.10 0.10 0.20 0.53 -0.35 0.10 -0.03 0.47 0.53 2.4
## O4 0.19 0.03 0.09 -0.19 0.22 -0.20 0.31 -0.03 0.27 0.73 4.5
## O5 0.05 -0.05 -0.04 -0.03 -0.16 0.57 -0.10 -0.04 0.37 0.63 1.3
##
## MR2 MR3 MR5 MR1 MR6 MR4 MR8 MR7
## SS loadings 2.66 1.97 1.94 1.90 1.36 1.11 0.48 0.39
## Proportion Var 0.11 0.08 0.08 0.08 0.05 0.04 0.02 0.02
## Cumulative Var 0.11 0.19 0.26 0.34 0.39 0.44 0.46 0.47
## Proportion Explained 0.23 0.17 0.16 0.16 0.12 0.09 0.04 0.03
## Cumulative Proportion 0.23 0.39 0.56 0.72 0.83 0.93 0.97 1.00
##
## Mean item complexity = 2
## Test of the hypothesis that 8 factors are sufficient.
##
## df null model = 300 with the objective function = 7.23 with Chi Square = 20163.79
## df of the model are 128 and the objective function was 0.18
##
## The root mean square of the residuals (RMSR) is 0.01
## The df corrected root mean square of the residuals is 0.02
##
## The harmonic n.obs is 2762 with the empirical chi square 274.09 with prob < 1.2e-12
## The total n.obs was 2800 with Likelihood Chi Square = 502.89 with prob < 7.1e-46
##
## Tucker Lewis Index of factoring reliability = 0.956
## RMSEA index = 0.032 and the 90 % confidence intervals are 0.029 0.035
## BIC = -513.09
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## MR2 MR3 MR5 MR1 MR6 MR4
## Correlation of (regression) scores with factors 0.93 0.86 0.85 0.84 0.79 0.78
## Multiple R square of scores with factors 0.86 0.75 0.73 0.70 0.62 0.61
## Minimum correlation of possible factor scores 0.71 0.49 0.46 0.41 0.25 0.22
## MR8 MR7
## Correlation of (regression) scores with factors 0.65 0.66
## Multiple R square of scores with factors 0.42 0.44
## Minimum correlation of possible factor scores -0.16 -0.12
The ‘elbow’ in the scree plot suggests five factors. Parallel analysis suggests a six-factor solution (!). The Kaiser criterion (eigenvalue > 1) suggests three factors when applied to the factor eigenvalues, but six when applied to the component eigenvalues. Velicer’s MAP test suggests five factors. For Very Simple Structure, note that there are two complexity values: complexity 1 peaks at four factors (with three close behind), and complexity 2 peaks at four to five factors.
parallel <- fa.parallel(big_5, fm = 'minres', fa = 'fa')
## Parallel analysis suggests that the number of factors = 6 and the number of components = NA
parallel
## Call: fa.parallel(x = big_5, fm = "minres", fa = "fa")
## Parallel analysis suggests that the number of factors = 6 and the number of components = NA
##
## Eigen Values of
##
## eigen values of factors
## [1] 4.26 1.91 1.26 0.95 0.71 0.26 0.01 -0.02 -0.09 -0.14 -0.16 -0.21
## [13] -0.23 -0.24 -0.26 -0.27 -0.30 -0.31 -0.33 -0.34 -0.37 -0.37 -0.43 -0.44
## [25] -0.57
##
## eigen values of simulated factors
## [1] 0.22 0.16 0.13 0.12 0.10 0.09 0.08 0.06 0.05 0.04 0.03 0.01
## [13] 0.00 -0.01 -0.02 -0.03 -0.04 -0.05 -0.06 -0.08 -0.09 -0.10 -0.12 -0.13
## [25] -0.16
##
## eigen values of components
## [1] 5.04 2.74 2.11 1.83 1.54 1.11 0.85 0.81 0.73 0.70 0.68 0.66 0.63 0.60 0.56
## [16] 0.54 0.52 0.50 0.49 0.45 0.43 0.41 0.41 0.39 0.28
##
## eigen values of simulated components
## [1] NA
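The logic behind the parallel analysis and Kaiser counts can be retraced from the printed eigenvalues. The sketch below copies the first eight values of each series from the output above. It is only an approximation of what fa.parallel() does, since the function works with the full, unrounded set of simulated eigenvalues rather than the rounded summary shown here.

```r
# First eight eigenvalues copied from the fa.parallel() output above.
obs_fa <- c(4.26, 1.91, 1.26, 0.95, 0.71, 0.26, 0.01, -0.02)  # observed factors
sim_fa <- c(0.22, 0.16, 0.13, 0.12, 0.10, 0.09, 0.08, 0.06)   # simulated factors
obs_pc <- c(5.04, 2.74, 2.11, 1.83, 1.54, 1.11, 0.85, 0.81)   # observed components

# Parallel analysis: keep factors until an observed eigenvalue
# no longer exceeds its simulated counterpart.
n_parallel <- which(!(obs_fa > sim_fa))[1] - 1

# Kaiser criterion: count eigenvalues greater than 1.
kaiser_factors    <- sum(obs_fa > 1)
kaiser_components <- sum(obs_pc > 1)

c(parallel = n_parallel,
  kaiser_factors = kaiser_factors,
  kaiser_components = kaiser_components)
```

This reproduces the six factors reported by parallel analysis and shows why the Kaiser criterion gives different answers depending on which eigenvalues it is applied to.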
VSS(big_5, rotate= "varimax", n.obs= 2800, n=8)
##
## Very Simple Structure
## Call: vss(x = x, n = n, rotate = rotate, diagonal = diagonal, fm = fm,
## n.obs = n.obs, plot = plot, title = title, use = use, cor = cor)
## VSS complexity 1 achieves a maximimum of 0.58 with 4 factors
## VSS complexity 2 achieves a maximimum of 0.74 with 5 factors
##
## The Velicer MAP achieves a minimum of 0.01 with 5 factors
## BIC achieves a minimum of -513.09 with 8 factors
## Sample Size adjusted BIC achieves a minimum of -106.39 with 8 factors
##
## Statistics by number of factors
## vss1 vss2 map dof chisq prob sqresid fit RMSEA BIC SABIC complex
## 1 0.49 0.00 0.024 275 11863 0.0e+00 25.9 0.49 0.123 9680 10554 1.0
## 2 0.54 0.63 0.018 251 7362 0.0e+00 18.6 0.63 0.101 5370 6168 1.2
## 3 0.56 0.70 0.017 228 5096 0.0e+00 14.6 0.71 0.087 3286 4010 1.3
## 4 0.58 0.74 0.015 206 3422 0.0e+00 11.5 0.77 0.075 1787 2441 1.4
## 5 0.54 0.74 0.015 185 1809 4.3e-264 9.4 0.81 0.056 341 928 1.5
## 6 0.54 0.71 0.016 165 1032 1.8e-125 8.3 0.84 0.043 -277 247 1.8
## 7 0.51 0.70 0.019 146 708 1.2e-74 7.9 0.85 0.037 -451 13 1.9
## 8 0.51 0.66 0.022 128 503 7.1e-46 7.4 0.85 0.032 -513 -106 2.0
## eChisq SRMR eCRMS eBIC
## 1 23725 0.119 0.124 21542
## 2 12173 0.085 0.093 10180
## 3 7019 0.065 0.074 5209
## 4 3606 0.046 0.056 1971
## 5 1412 0.029 0.037 -57
## 6 649 0.020 0.027 -661
## 7 435 0.016 0.023 -724
## 8 278 0.013 0.020 -738
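A compact way to read the ‘Statistics by number of factors’ table is to ask, for each criterion, which number of factors is best. The sketch below retypes a few columns from the output above (so the values are rounded) and is only meant as an illustration; the object names vss_table and best are hypothetical.

```r
# Selected columns copied from the VSS output above (rounded values).
vss_table <- data.frame(
  factors = 1:8,
  vss1 = c(0.49, 0.54, 0.56, 0.58, 0.54, 0.54, 0.51, 0.51),
  vss2 = c(0.00, 0.63, 0.70, 0.74, 0.74, 0.71, 0.70, 0.66),
  map  = c(0.024, 0.018, 0.017, 0.015, 0.015, 0.016, 0.019, 0.022),
  bic  = c(9680, 5370, 3286, 1787, 341, -277, -451, -513)
)

# Best number(s) of factors per criterion; ties are reported in full.
best <- list(
  vss1 = with(vss_table, factors[vss1 == max(vss1)]),  # higher is better
  vss2 = with(vss_table, factors[vss2 == max(vss2)]),  # higher is better
  map  = with(vss_table, factors[map == min(map)]),    # lower is better
  bic  = with(vss_table, factors[bic == min(bic)])     # lower is better
)
best
```

With the rounded values, four and five factors tie on VSS complexity 2 and on MAP; the printed summary picks five in both cases, presumably because it works with the unrounded values.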
fa_5<-fa(big_5,5, fm = 'minres', rotate='varimax', fa = 'fa')
fa_5
## Factor Analysis using method = minres
## Call: fa(r = big_5, nfactors = 5, rotate = "varimax", fm = "minres",
## fa = "fa")
## Standardized loadings (pattern matrix) based upon correlation matrix
## MR2 MR1 MR3 MR5 MR4 h2 u2 com
## A1 0.12 0.04 0.02 -0.41 -0.08 0.19 0.81 1.3
## A2 0.03 0.21 0.14 0.62 0.07 0.45 0.55 1.4
## A3 0.01 0.32 0.11 0.64 0.06 0.52 0.48 1.6
## A4 -0.06 0.18 0.23 0.42 -0.11 0.28 0.72 2.2
## A5 -0.11 0.39 0.09 0.53 0.06 0.46 0.54 2.0
## C1 0.01 0.06 0.54 0.02 0.20 0.33 0.67 1.3
## C2 0.09 0.03 0.65 0.11 0.11 0.45 0.55 1.2
## C3 -0.02 0.02 0.55 0.12 0.00 0.32 0.68 1.1
## C4 0.25 -0.06 -0.61 -0.04 -0.11 0.45 0.55 1.5
## C5 0.30 -0.17 -0.55 -0.06 0.03 0.43 0.57 1.8
## E1 0.04 -0.57 0.04 -0.10 -0.07 0.35 0.65 1.1
## E2 0.25 -0.68 -0.09 -0.10 -0.04 0.54 0.46 1.4
## E3 0.02 0.54 0.08 0.27 0.27 0.44 0.56 2.0
## E4 -0.10 0.65 0.10 0.30 -0.08 0.53 0.47 1.6
## E5 0.03 0.50 0.32 0.09 0.21 0.40 0.60 2.2
## N1 0.77 0.07 -0.04 -0.22 -0.08 0.65 0.35 1.2
## N2 0.75 0.03 -0.03 -0.19 -0.02 0.60 0.40 1.1
## N3 0.73 -0.06 -0.07 -0.03 0.00 0.55 0.45 1.0
## N4 0.59 -0.33 -0.17 0.00 0.07 0.49 0.51 1.8
## N5 0.54 -0.15 -0.03 0.10 -0.15 0.35 0.65 1.4
## O1 0.01 0.22 0.12 0.06 0.50 0.31 0.69 1.5
## O2 0.19 0.00 -0.10 0.09 -0.45 0.26 0.74 1.5
## O3 0.02 0.30 0.08 0.13 0.59 0.46 0.54 1.7
## O4 0.23 -0.18 -0.01 0.16 0.37 0.25 0.75 2.6
## O5 0.10 -0.01 -0.06 -0.02 -0.54 0.30 0.70 1.1
##
## MR2 MR1 MR3 MR5 MR4
## SS loadings 2.69 2.44 1.98 1.78 1.48
## Proportion Var 0.11 0.10 0.08 0.07 0.06
## Cumulative Var 0.11 0.21 0.28 0.36 0.41
## Proportion Explained 0.26 0.24 0.19 0.17 0.14
## Cumulative Proportion 0.26 0.49 0.69 0.86 1.00
##
## Mean item complexity = 1.5
## Test of the hypothesis that 5 factors are sufficient.
##
## df null model = 300 with the objective function = 7.23 with Chi Square = 20163.79
## df of the model are 185 and the objective function was 0.65
##
## The root mean square of the residuals (RMSR) is 0.03
## The df corrected root mean square of the residuals is 0.04
##
## The harmonic n.obs is 2762 with the empirical chi square 1392.16 with prob < 5.6e-184
## The total n.obs was 2800 with Likelihood Chi Square = 1808.94 with prob < 4.3e-264
##
## Tucker Lewis Index of factoring reliability = 0.867
## RMSEA index = 0.056 and the 90 % confidence intervals are 0.054 0.058
## BIC = 340.53
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy
## MR2 MR1 MR3 MR5 MR4
## Correlation of (regression) scores with factors 0.92 0.88 0.86 0.84 0.82
## Multiple R square of scores with factors 0.85 0.77 0.73 0.70 0.68
## Minimum correlation of possible factor scores 0.69 0.53 0.47 0.40 0.35
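It should be clear which traits the factors map onto: MR2 is Neuroticism, MR1 Extraversion, MR3 Conscientiousness, MR5 Agreeableness and MR4 Openness.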
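The h2, u2 and com columns of the five-factor output can be reproduced from the loadings themselves. The sketch below retypes the rounded loadings of three items from that output; the communality is the row sum of squared loadings, the uniqueness is one minus the communality, and the complexity is computed here with Hofmann’s index, which is assumed to be what the com column reports.

```r
# Rounded loadings for three items, copied from the five-factor output.
loadings_5 <- matrix(
  c(0.12,  0.04,  0.02, -0.41, -0.08,   # A1
    0.77,  0.07, -0.04, -0.22, -0.08,   # N1
    0.23, -0.18, -0.01,  0.16,  0.37),  # O4
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A1", "N1", "O4"), c("MR2", "MR1", "MR3", "MR5", "MR4"))
)

h2  <- rowSums(loadings_5^2)            # communality
u2  <- 1 - h2                           # uniqueness
com <- h2^2 / rowSums(loadings_5^4)     # Hofmann's complexity index

round(cbind(h2, u2, com), 2)
```

Up to rounding, this gives communalities of .19, .65 and .25 and complexities of 1.3, 1.2 and 2.6, in line with the printed table. O4 stands out as the item that spreads its variance over the most factors.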
require(stargazer)
## Loading required package: stargazer
##
## Please cite as:
## Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
require(plyr)
## Loading required package: plyr
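# Convert the loadings to a data frame; this drops the item names, so rows are numbered 1 to 25 in item order (A1 ... O5).
factor_loadings <- as.data.frame(as.matrix.data.frame(fa_5$loadings))
# Columns V1 to V5 follow the factor order in the output above (MR2, MR1, MR3, MR5, MR4).
factor_loadings <- plyr::rename(factor_loadings, c("V1" = "Neuroticism", "V2" = "Extraversion", "V3" = "Conscientiousness", "V4" = "Agreeableness", "V5" = "Openness"))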
stargazer(factor_loadings, summary = FALSE,out= "results_loadings_exercise.html", header=FALSE)
##
## \begin{table}[!htbp] \centering
## \caption{}
## \label{}
## \begin{tabular}{@{\extracolsep{5pt}} cccccc}
## \\[-1.8ex]\hline
## \hline \\[-1.8ex]
## & Neuroticism & Extraversion & Conscientiousness & Agreeableness & Openness \\
## \hline \\[-1.8ex]
## 1 & $0.122$ & $0.036$ & $0.024$ & $$-$0.410$ & $$-$0.083$ \\
## 2 & $0.030$ & $0.206$ & $0.144$ & $0.615$ & $0.069$ \\
## 3 & $0.006$ & $0.320$ & $0.106$ & $0.637$ & $0.058$ \\
## 4 & $$-$0.060$ & $0.183$ & $0.230$ & $0.423$ & $$-$0.106$ \\
## 5 & $$-$0.112$ & $0.394$ & $0.089$ & $0.533$ & $0.062$ \\
## 6 & $0.006$ & $0.062$ & $0.536$ & $0.023$ & $0.195$ \\
## 7 & $0.088$ & $0.030$ & $0.647$ & $0.108$ & $0.105$ \\
## 8 & $$-$0.021$ & $0.020$ & $0.550$ & $0.120$ & $$-$0.004$ \\
## 9 & $0.254$ & $$-$0.063$ & $$-$0.607$ & $$-$0.040$ & $$-$0.113$ \\
## 10 & $0.296$ & $$-$0.172$ & $$-$0.553$ & $$-$0.055$ & $0.034$ \\
## 11 & $0.042$ & $$-$0.574$ & $0.040$ & $$-$0.100$ & $$-$0.068$ \\
## 12 & $0.246$ & $$-$0.680$ & $$-$0.090$ & $$-$0.102$ & $$-$0.036$ \\
## 13 & $0.021$ & $0.540$ & $0.079$ & $0.265$ & $0.266$ \\
## 14 & $$-$0.099$ & $0.645$ & $0.100$ & $0.297$ & $$-$0.084$ \\
## 15 & $0.032$ & $0.501$ & $0.317$ & $0.089$ & $0.207$ \\
## 16 & $0.769$ & $0.075$ & $$-$0.039$ & $$-$0.217$ & $$-$0.085$ \\
## 17 & $0.748$ & $0.032$ & $$-$0.030$ & $$-$0.194$ & $$-$0.019$ \\
## 18 & $0.734$ & $$-$0.057$ & $$-$0.066$ & $$-$0.032$ & $$-$0.0002$ \\
## 19 & $0.585$ & $$-$0.333$ & $$-$0.173$ & $$-$0.003$ & $0.066$ \\
## 20 & $0.541$ & $$-$0.154$ & $$-$0.034$ & $0.103$ & $$-$0.146$ \\
## 21 & $0.005$ & $0.219$ & $0.119$ & $0.064$ & $0.496$ \\
## 22 & $0.189$ & $0.004$ & $$-$0.097$ & $0.086$ & $$-$0.453$ \\
## 23 & $0.024$ & $0.303$ & $0.083$ & $0.132$ & $0.590$ \\
## 24 & $0.231$ & $$-$0.182$ & $$-$0.014$ & $0.156$ & $0.374$ \\
## 25 & $0.096$ & $$-$0.006$ & $$-$0.057$ & $$-$0.018$ & $$-$0.536$ \\
## \hline \\[-1.8ex]
## \end{tabular}
## \end{table}
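The exported table labels its rows 1 to 25 rather than A1 to O5. One possible alternative, sketched below, is to unclass() the loadings before converting them, which keeps the item names as row names. To stay self-contained the sketch uses factanal() from base R on simulated data instead of the fa() object from this exercise; the toy items and the temporary CSV file are assumptions for illustration only.

```r
# Toy data (hypothetical): two factors with three items each.
set.seed(42)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
toy <- data.frame(
  A1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  A2 = 0.7 * f1 + rnorm(n, sd = 0.6),
  A3 = 0.6 * f1 + rnorm(n, sd = 0.6),
  C1 = 0.8 * f2 + rnorm(n, sd = 0.6),
  C2 = 0.7 * f2 + rnorm(n, sd = 0.6),
  C3 = 0.6 * f2 + rnorm(n, sd = 0.6)
)

fit <- factanal(toy, factors = 2, rotation = "varimax")

# unclass() turns the 'loadings' object into a plain matrix with its dimnames,
# so item names survive the conversion to a data frame.
loadings_df <- as.data.frame(unclass(fit$loadings))

# Write the rounded loadings, with item names, to a temporary CSV file.
out_file <- tempfile(fileext = ".csv")
write.csv(round(loadings_df, 3), out_file)
loadings_df
```

The same unclass() step should work on fa_5$loadings, whose columns would then be called MR2, MR1, MR3, MR5 and MR4 and could be renamed by name rather than by position.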
While the five-factor model showed acceptable fit in terms of RMSEA (.056, 90% CI [.054, .058]), it fell short on the TLI (.867), which is conventionally expected to be at least .90.
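Both indices can be retraced from the chi-square values printed in the fa() output. The sketch below uses the standard textbook formulas for TLI and RMSEA; the helper name fit_indices is hypothetical, and the use of N - 1 in the RMSEA denominator is an assumption (using N instead gives the same value to three decimals here).

```r
# TLI and RMSEA from model and null-model chi-squares.
fit_indices <- function(chisq_mod, df_mod, chisq_null, df_null, n_obs) {
  ratio_null <- chisq_null / df_null
  ratio_mod  <- chisq_mod / df_mod
  tli   <- (ratio_null - ratio_mod) / (ratio_null - 1)
  rmsea <- sqrt(max(chisq_mod - df_mod, 0) / (df_mod * (n_obs - 1)))
  c(TLI = tli, RMSEA = rmsea)
}

# Values copied from the five- and eight-factor outputs above.
fit_5 <- fit_indices(1808.94, 185, 20163.79, 300, 2800)
fit_8 <- fit_indices(502.89, 128, 20163.79, 300, 2800)

round(rbind(five_factors = fit_5, eight_factors = fit_8), 3)
```

This recovers TLI = .867 and RMSEA = .056 for five factors, and .956 and .032 for eight, which shows how much of the apparent misfit of the five-factor model is absorbed by the three extra, weakly defined factors.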
Hurray, this sort of looks like the ‘big 5’!
require(GPArotation)
fa.diagram(fa_5, marg=c(.01,.01,1,.01))
require(semPlot)
semplot1<-semPlotModel(fa_5$loadings)
semPaths(semplot1, what="std", layout="circle", nCharNodes = 6)
plot(fa_5,labels=names(big_5),cex=.7, ylim=c(-.1,1))
The end!