Exercise 7 instructions.

Load the BFI data from the ‘psych’ package (??bfi). This contains data on 2800 participants completing items relating to the ‘big five’ from the IPIP pool. You’ll have to subset the variables for your factor analysis.

Conduct Bartlett’s test of sphericity and the KMO test.

Conduct an exploratory factor analysis (using ‘minres’ as the method). Run a parallel analysis and discuss the scree plot, the Very Simple Structure criterion and Velicer’s MAP test.

Extract a five-factor model (use varimax rotation) and export the factor loadings of these five factors. Discuss the RMSEA and TLI for that five-factor model.

Make a plot for the factors.

Load the data and subset all personality items.

setwd("~/Dropbox/Teaching_MRes_Northumbria/Lecture7")
require(psych)
## Loading required package: psych
Data<-psych::bfi 
big_5<-Data[,c(1:25)]
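
Selecting the items by position assumes that they occupy the first 25 columns. A minimal sketch of a name-based alternative follows. The vector bfi_names is a stand-in typed out for illustration (assumed to mirror the column names of psych::bfi: 25 items followed by gender, education and age), so the example runs without the package.

```r
# Stand-in for names(psych::bfi): 25 items followed by three demographic columns
# (typed out here as an assumption so the sketch runs without the package).
bfi_names <- c(paste0(rep(c("A", "C", "E", "N", "O"), each = 5), 1:5),
               "gender", "education", "age")

# Keep only columns named like an item: one trait letter followed by a digit 1-5.
item_cols <- grep("^[ACENO][1-5]$", bfi_names)

length(item_cols)       # number of item columns selected
bfi_names[-item_cols]   # columns left out of the factor analysis
```

With the real data the same pattern would be applied to names(Data), for example Data[, grep("^[ACENO][1-5]$", names(Data))].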

Bartlett’s test and KMO test

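Note that ‘bartlett.test()’ from base R, used below, is Bartlett’s test of homogeneity of variances: it shows that the variances of the 25 items differ significantly (\(\chi^2\)(24) = 1744.7, p < .0001). It is not Bartlett’s test of sphericity, which tests whether the correlation matrix differs from an identity matrix and is the relevant check before a factor analysis. In ‘psych’, the sphericity test is ‘cortest.bartlett()’.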

bartlett.test(big_5)
## 
##  Bartlett test of homogeneity of variances
## 
## data:  big_5
## Bartlett's K-squared = 1744.7, df = 24, p-value < 2.2e-16
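
For reference, Bartlett’s test of sphericity can be computed directly from a correlation matrix. The sketch below is a hand-rolled version of the textbook formula, \(\chi^2 = -(n - 1 - (2p + 5)/6)\,\ln|R|\) with \(p(p-1)/2\) degrees of freedom. The function name bartlett_sphericity and the use of the built-in mtcars data are assumptions made for illustration; no BFI result is shown here.

```r
# Bartlett's test of sphericity from a correlation matrix R and sample size n.
# H0: R is an identity matrix (the variables are uncorrelated).
bartlett_sphericity <- function(R, n) {
  p <- ncol(R)
  chisq <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
  df <- p * (p - 1) / 2
  list(chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

# Illustration on four variables from a built-in data set (not the BFI items).
R_demo <- cor(mtcars[, c("mpg", "disp", "hp", "wt")])
res <- bartlett_sphericity(R_demo, n = nrow(mtcars))
res$df        # 4 * 3 / 2 = 6
res$chisq
res$p.value
```

For the exercise data, the packaged equivalent would be along the lines of cortest.bartlett(cor(big_5, use = "pairwise.complete.obs"), n = nrow(big_5)); with 25 items the test has 25 × 24 / 2 = 300 degrees of freedom.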

All 25 items showed at least middling sampling adequacy for factor analysis (all MSA\(\geq\).74; overall MSA = .85).

require(psych)
KMO(big_5)
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = big_5)
## Overall MSA =  0.85
## MSA for each item = 
##   A1   A2   A3   A4   A5   C1   C2   C3   C4   C5   E1   E2   E3   E4   E5   N1 
## 0.74 0.84 0.87 0.87 0.90 0.83 0.79 0.85 0.82 0.86 0.83 0.88 0.89 0.87 0.89 0.78 
##   N2   N3   N4   N5   O1   O2   O3   O4   O5 
## 0.78 0.86 0.88 0.86 0.85 0.78 0.84 0.76 0.76
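
To make the MSA values less of a black box: the KMO index compares the squared correlations with the squared partial correlations, which are obtained from the inverse of the correlation matrix. The following is a minimal sketch of that calculation. The function name kmo_sketch and the mtcars illustration are assumptions, and the numbers it produces are unrelated to the BFI output above.

```r
# KMO / MSA from a correlation matrix R.
# The off-diagonal elements of the rescaled inverse of R are the partial
# correlations with the sign flipped; the sign vanishes once squared.
kmo_sketch <- function(R) {
  Q <- cov2cor(solve(R))
  diag(Q) <- 0
  diag(R) <- 0
  r2 <- R^2
  q2 <- Q^2
  list(overall = sum(r2) / (sum(r2) + sum(q2)),
       items   = colSums(r2) / (colSums(r2) + colSums(q2)))
}

# Illustration on a built-in data set (not the BFI items).
R_demo <- cor(mtcars[, c("mpg", "disp", "hp", "wt", "qsec")])
msa <- kmo_sketch(R_demo)
round(msa$overall, 2)
round(msa$items, 2)
```

Values close to 1 mean that the partial correlations are small relative to the raw correlations, which is what a common-factor structure would produce.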

Parallel analysis

First extract a deliberately large number of factors (here eight) and examine the solution.

require(psych)
# Store the result under its own name so that it does not mask the fa() function
fa_8 <- fa(big_5,8, fm = 'minres', rotate='varimax', fa = 'fa')
fa_8
## Factor Analysis using method =  minres
## Call: fa(r = big_5, nfactors = 8, rotate = "varimax", fm = "minres", 
##     fa = "fa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##      MR2   MR3   MR5   MR1   MR6   MR4   MR8   MR7   h2   u2 com
## A1  0.09  0.04 -0.47 -0.02  0.17  0.22 -0.12  0.00 0.33 0.67 2.0
## A2  0.01  0.14  0.70  0.14  0.06 -0.03  0.14  0.19 0.58 0.42 1.4
## A3 -0.01  0.08  0.66  0.19  0.23  0.02 -0.09 -0.02 0.54 0.46 1.5
## A4 -0.07  0.21  0.43  0.11  0.07  0.11 -0.14 -0.08 0.29 0.71 2.2
## A5 -0.15  0.09  0.51  0.28  0.29  0.07 -0.04 -0.09 0.47 0.53 2.7
## C1 -0.02  0.58 -0.01  0.05  0.17 -0.08  0.12  0.00 0.38 0.62 1.3
## C2  0.06  0.67  0.08 -0.01  0.16  0.01  0.03 -0.06 0.50 0.50 1.2
## C3 -0.04  0.54  0.13 -0.01  0.02  0.03 -0.02  0.07 0.31 0.69 1.2
## C4  0.19 -0.61 -0.10 -0.12  0.15  0.30  0.15 -0.05 0.57 0.43 2.2
## C5  0.26 -0.51 -0.12 -0.12  0.02  0.09  0.37  0.01 0.50 0.50 2.7
## E1  0.00  0.03 -0.14 -0.68 -0.04  0.10 -0.06 -0.05 0.51 0.49 1.2
## E2  0.22 -0.08 -0.17 -0.65 -0.14  0.06  0.16 -0.06 0.55 0.45 1.7
## E3 -0.01  0.08  0.26  0.38  0.51 -0.04 -0.08 -0.04 0.49 0.51 2.6
## E4 -0.12  0.12  0.29  0.60  0.23  0.20 -0.07 -0.12 0.59 0.41 2.4
## E5  0.00  0.31  0.15  0.37  0.33 -0.06 -0.04  0.34 0.49 0.51 4.4
## N1  0.80 -0.07 -0.13  0.01  0.01  0.09 -0.15  0.22 0.74 0.26 1.3
## N2  0.77 -0.04 -0.12  0.00 -0.02  0.02  0.00  0.24 0.66 0.34 1.3
## N3  0.75 -0.05 -0.03 -0.04  0.02  0.04  0.06 -0.12 0.58 0.42 1.1
## N4  0.58 -0.14 -0.06 -0.30  0.00  0.00  0.19 -0.17 0.51 0.49 2.1
## N5  0.53  0.00  0.06 -0.10 -0.09  0.17  0.14 -0.18 0.39 0.61 1.8
## O1 -0.04  0.12  0.04  0.07  0.53 -0.25  0.05  0.08 0.37 0.63 1.7
## O2  0.13 -0.07  0.05  0.00 -0.13  0.54  0.08  0.02 0.34 0.66 1.3
## O3  0.00  0.10  0.10  0.20  0.53 -0.35  0.10 -0.03 0.47 0.53 2.4
## O4  0.19  0.03  0.09 -0.19  0.22 -0.20  0.31 -0.03 0.27 0.73 4.5
## O5  0.05 -0.05 -0.04 -0.03 -0.16  0.57 -0.10 -0.04 0.37 0.63 1.3
## 
##                        MR2  MR3  MR5  MR1  MR6  MR4  MR8  MR7
## SS loadings           2.66 1.97 1.94 1.90 1.36 1.11 0.48 0.39
## Proportion Var        0.11 0.08 0.08 0.08 0.05 0.04 0.02 0.02
## Cumulative Var        0.11 0.19 0.26 0.34 0.39 0.44 0.46 0.47
## Proportion Explained  0.23 0.17 0.16 0.16 0.12 0.09 0.04 0.03
## Cumulative Proportion 0.23 0.39 0.56 0.72 0.83 0.93 0.97 1.00
## 
## Mean item complexity =  2
## Test of the hypothesis that 8 factors are sufficient.
## 
## df null model =  300  with the objective function =  7.23 with Chi Square =  20163.79
## df of  the model are 128  and the objective function was  0.18 
## 
## The root mean square of the residuals (RMSR) is  0.01 
## The df corrected root mean square of the residuals is  0.02 
## 
## The harmonic n.obs is  2762 with the empirical chi square  274.09  with prob <  1.2e-12 
## The total n.obs was  2800  with Likelihood Chi Square =  502.89  with prob <  7.1e-46 
## 
## Tucker Lewis Index of factoring reliability =  0.956
## RMSEA index =  0.032  and the 90 % confidence intervals are  0.029 0.035
## BIC =  -513.09
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                    MR2  MR3  MR5  MR1  MR6  MR4
## Correlation of (regression) scores with factors   0.93 0.86 0.85 0.84 0.79 0.78
## Multiple R square of scores with factors          0.86 0.75 0.73 0.70 0.62 0.61
## Minimum correlation of possible factor scores     0.71 0.49 0.46 0.41 0.25 0.22
##                                                     MR8   MR7
## Correlation of (regression) scores with factors    0.65  0.66
## Multiple R square of scores with factors           0.42  0.44
## Minimum correlation of possible factor scores     -0.16 -0.12

The ‘elbow’ in the scree plot suggests five factors. Parallel analysis suggests a six-factor solution (!). The Kaiser criterion (eigenvalues > 1), applied to the factor eigenvalues, suggests a three-factor solution. Velicer’s MAP test suggests five factors. Note that Very Simple Structure reports two complexity values: VSS complexity 1 peaks at four factors, whereas VSS complexity 2 reaches its maximum of .74 for both four and five factors (the output reports five).

parallel <- fa.parallel(big_5, fm = 'minres', fa = 'fa')

## Parallel analysis suggests that the number of factors =  6  and the number of components =  NA
parallel
## Call: fa.parallel(x = big_5, fm = "minres", fa = "fa")
## Parallel analysis suggests that the number of factors =  6  and the number of components =  NA 
## 
##  Eigen Values of 
## 
##  eigen values of factors
##  [1]  4.26  1.91  1.26  0.95  0.71  0.26  0.01 -0.02 -0.09 -0.14 -0.16 -0.21
## [13] -0.23 -0.24 -0.26 -0.27 -0.30 -0.31 -0.33 -0.34 -0.37 -0.37 -0.43 -0.44
## [25] -0.57
## 
##  eigen values of simulated factors
##  [1]  0.22  0.16  0.13  0.12  0.10  0.09  0.08  0.06  0.05  0.04  0.03  0.01
## [13]  0.00 -0.01 -0.02 -0.03 -0.04 -0.05 -0.06 -0.08 -0.09 -0.10 -0.12 -0.13
## [25] -0.16
## 
##  eigen values of components 
##  [1] 5.04 2.74 2.11 1.83 1.54 1.11 0.85 0.81 0.73 0.70 0.68 0.66 0.63 0.60 0.56
## [16] 0.54 0.52 0.50 0.49 0.45 0.43 0.41 0.41 0.39 0.28
## 
##  eigen values of simulated components
## [1] NA
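
The Kaiser and parallel-analysis counts can be checked by hand from the eigenvalues printed above. The sketch below copies the first eight values of each vector from that output. Comparing against the printed simulated eigenvalues is a simplification of what fa.parallel() does internally and is assumed here only for illustration.

```r
# First eight eigenvalues copied from the fa.parallel() output above.
factor_eigen    <- c(4.26, 1.91, 1.26, 0.95, 0.71, 0.26, 0.01, -0.02)
simulated_eigen <- c(0.22, 0.16, 0.13, 0.12, 0.10, 0.09, 0.08, 0.06)
component_eigen <- c(5.04, 2.74, 2.11, 1.83, 1.54, 1.11, 0.85, 0.81)

# Kaiser criterion: count eigenvalues greater than 1.
kaiser_factors    <- sum(factor_eigen > 1)      # 3
kaiser_components <- sum(component_eigen > 1)   # 6

# Parallel analysis (simplified): keep factors until an observed eigenvalue
# no longer exceeds the corresponding simulated one.
first_fail <- which(factor_eigen <= simulated_eigen)[1]
parallel_factors <- first_fail - 1              # 6

c(kaiser_factors, kaiser_components, parallel_factors)
```

The Kaiser count depends on which eigenvalues are used: three for the factor eigenvalues, six for the component eigenvalues.
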
VSS(big_5, rotate= "varimax", n.obs= 2800, n=8)

## 
## Very Simple Structure
## Call: vss(x = x, n = n, rotate = rotate, diagonal = diagonal, fm = fm, 
##     n.obs = n.obs, plot = plot, title = title, use = use, cor = cor)
## VSS complexity 1 achieves a maximimum of 0.58  with  4  factors
## VSS complexity 2 achieves a maximimum of 0.74  with  5  factors
## 
## The Velicer MAP achieves a minimum of 0.01  with  5  factors 
## BIC achieves a minimum of  -513.09  with  8  factors
## Sample Size adjusted BIC achieves a minimum of  -106.39  with  8  factors
## 
## Statistics by number of factors 
##   vss1 vss2   map dof chisq     prob sqresid  fit RMSEA  BIC SABIC complex
## 1 0.49 0.00 0.024 275 11863  0.0e+00    25.9 0.49 0.123 9680 10554     1.0
## 2 0.54 0.63 0.018 251  7362  0.0e+00    18.6 0.63 0.101 5370  6168     1.2
## 3 0.56 0.70 0.017 228  5096  0.0e+00    14.6 0.71 0.087 3286  4010     1.3
## 4 0.58 0.74 0.015 206  3422  0.0e+00    11.5 0.77 0.075 1787  2441     1.4
## 5 0.54 0.74 0.015 185  1809 4.3e-264     9.4 0.81 0.056  341   928     1.5
## 6 0.54 0.71 0.016 165  1032 1.8e-125     8.3 0.84 0.043 -277   247     1.8
## 7 0.51 0.70 0.019 146   708  1.2e-74     7.9 0.85 0.037 -451    13     1.9
## 8 0.51 0.66 0.022 128   503  7.1e-46     7.4 0.85 0.032 -513  -106     2.0
##   eChisq  SRMR eCRMS  eBIC
## 1  23725 0.119 0.124 21542
## 2  12173 0.085 0.093 10180
## 3   7019 0.065 0.074  5209
## 4   3606 0.046 0.056  1971
## 5   1412 0.029 0.037   -57
## 6    649 0.020 0.027  -661
## 7    435 0.016 0.023  -724
## 8    278 0.013 0.020  -738
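
The recommendations printed at the top of the VSS output can be read off the ‘Statistics by number of factors’ table. The sketch below retypes four columns of that table (values copied from the output above, rounded as printed) and looks up the best-scoring number of factors for each criterion. The data frame name vss_stats is an assumption for illustration.

```r
# Columns copied from the "Statistics by number of factors" table above.
vss_stats <- data.frame(
  factors = 1:8,
  vss1 = c(0.49, 0.54, 0.56, 0.58, 0.54, 0.54, 0.51, 0.51),
  vss2 = c(0.00, 0.63, 0.70, 0.74, 0.74, 0.71, 0.70, 0.66),
  map  = c(0.024, 0.018, 0.017, 0.015, 0.015, 0.016, 0.019, 0.022),
  BIC  = c(9680, 5370, 3286, 1787, 341, -277, -451, -513)
)

best_vss1 <- vss_stats$factors[vss_stats$vss1 == max(vss_stats$vss1)]  # 4
best_vss2 <- vss_stats$factors[vss_stats$vss2 == max(vss_stats$vss2)]  # 4 5
best_map  <- vss_stats$factors[vss_stats$map  == min(vss_stats$map)]   # 4 5
best_bic  <- vss_stats$factors[which.min(vss_stats$BIC)]               # 8

list(vss1 = best_vss1, vss2 = best_vss2, map = best_map, bic = best_bic)
```

At the printed precision, VSS complexity 2 and MAP are tied between four and five factors; the function resolves the tie using unrounded values and reports five for both.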

Extract a five factor solution.

fa_5<-fa(big_5,5, fm = 'minres', rotate='varimax', fa = 'fa')
fa_5
## Factor Analysis using method =  minres
## Call: fa(r = big_5, nfactors = 5, rotate = "varimax", fm = "minres", 
##     fa = "fa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##      MR2   MR1   MR3   MR5   MR4   h2   u2 com
## A1  0.12  0.04  0.02 -0.41 -0.08 0.19 0.81 1.3
## A2  0.03  0.21  0.14  0.62  0.07 0.45 0.55 1.4
## A3  0.01  0.32  0.11  0.64  0.06 0.52 0.48 1.6
## A4 -0.06  0.18  0.23  0.42 -0.11 0.28 0.72 2.2
## A5 -0.11  0.39  0.09  0.53  0.06 0.46 0.54 2.0
## C1  0.01  0.06  0.54  0.02  0.20 0.33 0.67 1.3
## C2  0.09  0.03  0.65  0.11  0.11 0.45 0.55 1.2
## C3 -0.02  0.02  0.55  0.12  0.00 0.32 0.68 1.1
## C4  0.25 -0.06 -0.61 -0.04 -0.11 0.45 0.55 1.5
## C5  0.30 -0.17 -0.55 -0.06  0.03 0.43 0.57 1.8
## E1  0.04 -0.57  0.04 -0.10 -0.07 0.35 0.65 1.1
## E2  0.25 -0.68 -0.09 -0.10 -0.04 0.54 0.46 1.4
## E3  0.02  0.54  0.08  0.27  0.27 0.44 0.56 2.0
## E4 -0.10  0.65  0.10  0.30 -0.08 0.53 0.47 1.6
## E5  0.03  0.50  0.32  0.09  0.21 0.40 0.60 2.2
## N1  0.77  0.07 -0.04 -0.22 -0.08 0.65 0.35 1.2
## N2  0.75  0.03 -0.03 -0.19 -0.02 0.60 0.40 1.1
## N3  0.73 -0.06 -0.07 -0.03  0.00 0.55 0.45 1.0
## N4  0.59 -0.33 -0.17  0.00  0.07 0.49 0.51 1.8
## N5  0.54 -0.15 -0.03  0.10 -0.15 0.35 0.65 1.4
## O1  0.01  0.22  0.12  0.06  0.50 0.31 0.69 1.5
## O2  0.19  0.00 -0.10  0.09 -0.45 0.26 0.74 1.5
## O3  0.02  0.30  0.08  0.13  0.59 0.46 0.54 1.7
## O4  0.23 -0.18 -0.01  0.16  0.37 0.25 0.75 2.6
## O5  0.10 -0.01 -0.06 -0.02 -0.54 0.30 0.70 1.1
## 
##                        MR2  MR1  MR3  MR5  MR4
## SS loadings           2.69 2.44 1.98 1.78 1.48
## Proportion Var        0.11 0.10 0.08 0.07 0.06
## Cumulative Var        0.11 0.21 0.28 0.36 0.41
## Proportion Explained  0.26 0.24 0.19 0.17 0.14
## Cumulative Proportion 0.26 0.49 0.69 0.86 1.00
## 
## Mean item complexity =  1.5
## Test of the hypothesis that 5 factors are sufficient.
## 
## df null model =  300  with the objective function =  7.23 with Chi Square =  20163.79
## df of  the model are 185  and the objective function was  0.65 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.04 
## 
## The harmonic n.obs is  2762 with the empirical chi square  1392.16  with prob <  5.6e-184 
## The total n.obs was  2800  with Likelihood Chi Square =  1808.94  with prob <  4.3e-264 
## 
## Tucker Lewis Index of factoring reliability =  0.867
## RMSEA index =  0.056  and the 90 % confidence intervals are  0.054 0.058
## BIC =  340.53
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy             
##                                                    MR2  MR1  MR3  MR5  MR4
## Correlation of (regression) scores with factors   0.92 0.88 0.86 0.84 0.82
## Multiple R square of scores with factors          0.85 0.77 0.73 0.70 0.68
## Minimum correlation of possible factor scores     0.69 0.53 0.47 0.40 0.35

Export loadings.

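The pattern of loadings makes it clear which trait each factor maps onto: MR2 is Neuroticism, MR1 Extraversion, MR3 Conscientiousness, MR5 Agreeableness and MR4 Openness.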

require(stargazer)
## Loading required package: stargazer
## 
## Please cite as:
##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
require(plyr)
## Loading required package: plyr
factor_loadings<-as.data.frame(as.matrix.data.frame(fa_5$loadings))
factor_loadings<-plyr::rename(factor_loadings, c("V1"="Neuroticism","V2"="Extraversion", "V3"="Conscientiousness", "V4"="Agreeableness", "V5"="Openness"))
stargazer(factor_loadings, summary = FALSE,out= "results_loadings_exercise.html", header=FALSE)
## 
## \begin{table}[!htbp] \centering 
##   \caption{} 
##   \label{} 
## \begin{tabular}{@{\extracolsep{5pt}} cccccc} 
## \\[-1.8ex]\hline 
## \hline \\[-1.8ex] 
##  & Neuroticism & Extraversion & Conscientiousness & Agreeableness & Openness \\ 
## \hline \\[-1.8ex] 
## 1 & $0.122$ & $0.036$ & $0.024$ & $$-$0.410$ & $$-$0.083$ \\ 
## 2 & $0.030$ & $0.206$ & $0.144$ & $0.615$ & $0.069$ \\ 
## 3 & $0.006$ & $0.320$ & $0.106$ & $0.637$ & $0.058$ \\ 
## 4 & $$-$0.060$ & $0.183$ & $0.230$ & $0.423$ & $$-$0.106$ \\ 
## 5 & $$-$0.112$ & $0.394$ & $0.089$ & $0.533$ & $0.062$ \\ 
## 6 & $0.006$ & $0.062$ & $0.536$ & $0.023$ & $0.195$ \\ 
## 7 & $0.088$ & $0.030$ & $0.647$ & $0.108$ & $0.105$ \\ 
## 8 & $$-$0.021$ & $0.020$ & $0.550$ & $0.120$ & $$-$0.004$ \\ 
## 9 & $0.254$ & $$-$0.063$ & $$-$0.607$ & $$-$0.040$ & $$-$0.113$ \\ 
## 10 & $0.296$ & $$-$0.172$ & $$-$0.553$ & $$-$0.055$ & $0.034$ \\ 
## 11 & $0.042$ & $$-$0.574$ & $0.040$ & $$-$0.100$ & $$-$0.068$ \\ 
## 12 & $0.246$ & $$-$0.680$ & $$-$0.090$ & $$-$0.102$ & $$-$0.036$ \\ 
## 13 & $0.021$ & $0.540$ & $0.079$ & $0.265$ & $0.266$ \\ 
## 14 & $$-$0.099$ & $0.645$ & $0.100$ & $0.297$ & $$-$0.084$ \\ 
## 15 & $0.032$ & $0.501$ & $0.317$ & $0.089$ & $0.207$ \\ 
## 16 & $0.769$ & $0.075$ & $$-$0.039$ & $$-$0.217$ & $$-$0.085$ \\ 
## 17 & $0.748$ & $0.032$ & $$-$0.030$ & $$-$0.194$ & $$-$0.019$ \\ 
## 18 & $0.734$ & $$-$0.057$ & $$-$0.066$ & $$-$0.032$ & $$-$0.0002$ \\ 
## 19 & $0.585$ & $$-$0.333$ & $$-$0.173$ & $$-$0.003$ & $0.066$ \\ 
## 20 & $0.541$ & $$-$0.154$ & $$-$0.034$ & $0.103$ & $$-$0.146$ \\ 
## 21 & $0.005$ & $0.219$ & $0.119$ & $0.064$ & $0.496$ \\ 
## 22 & $0.189$ & $0.004$ & $$-$0.097$ & $0.086$ & $$-$0.453$ \\ 
## 23 & $0.024$ & $0.303$ & $0.083$ & $0.132$ & $0.590$ \\ 
## 24 & $0.231$ & $$-$0.182$ & $$-$0.014$ & $0.156$ & $0.374$ \\ 
## 25 & $0.096$ & $$-$0.006$ & $$-$0.057$ & $$-$0.018$ & $$-$0.536$ \\ 
## \hline \\[-1.8ex] 
## \end{tabular} 
## \end{table}
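
The exported table above labels its rows 1 to 25 because the item names are lost in the conversion. One possible way to keep them is to strip the ‘loadings’ class with unclass() before converting to a data frame. The sketch below demonstrates this on a stand-in model fitted with base R’s factanal() to the built-in ability.cov data, which also returns an object of class ‘loadings’. The stand-in model, the column labels and the temporary CSV file are assumptions for illustration, not part of the exercise.

```r
# Stand-in factor model from base R (not the BFI model), used only because
# its loadings have the same "loadings" class as those returned by psych::fa().
fit_demo <- factanal(factors = 2, covmat = ability.cov)

# unclass() leaves a plain numeric matrix with variable names as row names.
loadings_df <- as.data.frame(unclass(fit_demo$loadings))
names(loadings_df) <- c("FirstFactor", "SecondFactor")   # hypothetical labels

rownames(loadings_df)   # variable names are preserved
round(loadings_df, 3)

# Write the rounded loadings to a temporary CSV file.
out_file <- file.path(tempdir(), "loadings_demo.csv")
write.csv(round(loadings_df, 3), out_file)
```

Applied to the exercise, the same idea would be as.data.frame(unclass(fa_5$loadings)), after which the columns MR2, MR1, MR3, MR5 and MR4 can be renamed to the trait labels.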

TLI and RMSEA.

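The five-factor model showed acceptable fit according to the RMSEA (.056, 90% CI [.054, .058]; values below .05 are usually taken to indicate close fit), but not according to the TLI (.867), which falls below the conventional cut-off of .90.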
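
Both indices can be reproduced from the chi-square values in the five-factor output. The sketch below uses the common textbook formulas; the exact expressions used inside ‘psych’ may differ slightly, so treat this as a check under that assumption rather than as the package’s own code. The input numbers are copied from the output above.

```r
# Values copied from the five-factor fa() output.
chisq_null <- 20163.79   # null model chi-square
df_null    <- 300
chisq_5    <- 1808.94    # likelihood chi-square of the five-factor model
df_5       <- 185
n_obs      <- 2800

# Tucker-Lewis Index: improvement in chi-square/df relative to the null model.
ratio_null <- chisq_null / df_null
ratio_5    <- chisq_5 / df_5
tli <- (ratio_null - ratio_5) / (ratio_null - 1)

# RMSEA: misfit per degree of freedom, adjusted for sample size.
rmsea <- sqrt(max(chisq_5 - df_5, 0) / (df_5 * (n_obs - 1)))

round(tli, 3)     # 0.867
round(rmsea, 3)   # 0.056
```

Running the same arithmetic on the eight-factor output (chi-square 502.89 on 128 df) gives a TLI of about .956 and an RMSEA of about .032, which matches the values printed for that model.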

Plots.

Hurray, this sort of looks like the ‘big 5’!

require(GPArotation)
fa.diagram(fa_5, marg=c(.01,.01,1,.01))

require(semPlot)
semplot1<-semPlotModel(fa_5$loadings)
semPaths(semplot1, what="std", layout="circle", nCharNodes = 6)

plot(fa_5,labels=names(big_5),cex=.7, ylim=c(-.1,1)) 

The end!