Use the following code below to generate a correlation matrix to use (you’ll need to load lavaan).
The data comes from this paper. A study testing the Theory of Planned Behaviour when applying to graduate school for 131 students.
## Loading required package: lavaan
## This is lavaan 0.6-17
## lavaan is FREE software! Please report any bugs.
data<-lav_matrix_lower2full(c(1,.47,1,.66,.50,1,.77,.41,.46,1,.52,.38,.50,.50,1))
rownames(data)<-colnames(data)<-c("att",'sn','pbc','int','beh') # a matrix.
Reconstruct the model from the article. (Note that there are no (!) latent variables)
Discuss the fit. Make a sem plot.
What happens when you remove the indirect path?
This exercise is taken from Huang 2016.
Have a look at the article this is the basic model they propose. The equations are corresponding to Figure 1, they are all manifest variables, even though denoted with ellipses rather than rectangles in the original paper.
Note: We have a correlation matrix, but the command used refers to covariance matrix.
We can also directly request an \(R^2\) measure for this model.
## lavaan 0.6.17 ended normally after 2 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 7
##
## Number of observations 131
##
## Model Test User Model:
##
## Test statistic 1.834
## Degrees of freedom 2
## P-value (Chi-square) 0.400
##
## Model Test Baseline Model:
##
## Test statistic 177.710
## Degrees of freedom 7
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.003
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -282.820
## Loglikelihood unrestricted model (H1) -281.903
##
## Akaike (AIC) 579.640
## Bayesian (BIC) 599.766
## Sample-size adjusted Bayesian (SABIC) 577.626
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.169
## P-value H_0: RMSEA <= 0.050 0.510
## P-value H_0: RMSEA >= 0.080 0.355
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.019
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int ~
## att 0.804 0.075 10.713 0.000 0.804 0.804
## sn 0.090 0.065 1.381 0.167 0.090 0.090
## pbc -0.116 0.076 -1.512 0.131 -0.116 -0.116
## beh ~
## int 0.342 0.080 4.292 0.000 0.342 0.342
## pbc 0.342 0.080 4.292 0.000 0.342 0.342
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .int 0.394 0.049 8.093 0.000 0.394 0.397
## .beh 0.653 0.081 8.093 0.000 0.653 0.658
##
## R-Square:
## Estimate
## int 0.603
## beh 0.342
Let’s make a plot… . Our plot looks different but the values pretty much map onto to those of the original article.
## Loading required package: semPlot
## Loading required package: qgraph
We could calculate the direct, indirect and total paths from this graph. But we could not formally test the mediation.
This model actually explains more variance and is a better fit based on AIC and on BIC. Depending on your view, a model with just main effects would thus be parsimonious to one with the indirect pathway.
direct_model1<-sem(direct_model,sample.cov=data,sample.nobs=131)
summary(direct_model1,standardized=T,fit=T,rsquare=T)
## lavaan 0.6.17 ended normally after 1 iteration
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 5
##
## Number of observations 131
##
## Model Test User Model:
##
## Test statistic 0.000
## Degrees of freedom 0
##
## Model Test Baseline Model:
##
## Test statistic 56.757
## Degrees of freedom 4
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.000
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -157.001
## Loglikelihood unrestricted model (H1) -157.001
##
## Akaike (AIC) 324.001
## Bayesian (BIC) 338.377
## Sample-size adjusted Bayesian (SABIC) 322.563
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.000
## P-value H_0: RMSEA <= 0.050 NA
## P-value H_0: RMSEA >= 0.080 NA
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.000
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## beh ~
## int 0.266 0.112 2.380 0.017 0.266 0.266
## pbc 0.270 0.099 2.735 0.006 0.270 0.270
## att 0.094 0.131 0.718 0.473 0.094 0.094
## sn 0.092 0.084 1.098 0.272 0.092 0.092
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .beh 0.643 0.080 8.093 0.000 0.643 0.648
##
## R-Square:
## Estimate
## beh 0.352
model2<-'
int~att+sn+a*pbc
beh~b*int+c*pbc
indirect := a*b # we care about this one most.
direct := c
total := c + a*b'
Unfortunately we cannot do the bootstrapping without the raw data. The indirect effect is not significant but we are basing this on the correlations… . As you might recall the product-moment approach is not very powerful and our sample is comparatively small.
path_model2<-sem(model2, sample.cov=data, sample.nobs=131)
summary(path_model2,standardized=T,fit=T,rsquare=T)
## lavaan 0.6.17 ended normally after 2 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 7
##
## Number of observations 131
##
## Model Test User Model:
##
## Test statistic 1.834
## Degrees of freedom 2
## P-value (Chi-square) 0.400
##
## Model Test Baseline Model:
##
## Test statistic 177.710
## Degrees of freedom 7
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.003
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -282.820
## Loglikelihood unrestricted model (H1) -281.903
##
## Akaike (AIC) 579.640
## Bayesian (BIC) 599.766
## Sample-size adjusted Bayesian (SABIC) 577.626
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.169
## P-value H_0: RMSEA <= 0.050 0.510
## P-value H_0: RMSEA >= 0.080 0.355
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.019
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int ~
## att 0.804 0.075 10.713 0.000 0.804 0.804
## sn 0.090 0.065 1.381 0.167 0.090 0.090
## pbc (a) -0.116 0.076 -1.512 0.131 -0.116 -0.116
## beh ~
## int (b) 0.342 0.080 4.292 0.000 0.342 0.342
## pbc (c) 0.342 0.080 4.292 0.000 0.342 0.342
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .int 0.394 0.049 8.093 0.000 0.394 0.397
## .beh 0.653 0.081 8.093 0.000 0.653 0.658
##
## R-Square:
## Estimate
## int 0.603
## beh 0.342
##
## Defined Parameters:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## indirect -0.040 0.028 -1.426 0.154 -0.040 -0.040
## direct 0.342 0.080 4.292 0.000 0.342 0.342
## total 0.303 0.088 3.426 0.001 0.303 0.303
We cannot use the more powerful bootstrapping approach based on the correlation matrix.
We can use the ‘gendata’ package to simulate a raw dataset which would give the correlations below.
## X1 X2 X3 X4 X5
## X1 1.0000000 0.4438255 0.6504578 0.7793393 0.6068527
## X2 0.4438255 1.0000000 0.5407487 0.3854558 0.3469281
## X3 0.6504578 0.5407487 1.0000000 0.4609209 0.5539733
## X4 0.7793393 0.3854558 0.4609209 1.0000000 0.5318054
## X5 0.6068527 0.3469281 0.5539733 0.5318054 1.0000000
Add the labels.
Bootstrap. This will take a lot of time.
Contrary, to Huang, we find no evidence for a significant mediation effect.
results_raw<-sem(model2,data=raw,se="bootstrap",bootstrap=1000)
summary(results_raw,standardized=T,fit=T,rsquare=T)
## lavaan 0.6.17 ended normally after 1 iteration
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 7
##
## Number of observations 131
##
## Model Test User Model:
##
## Test statistic 4.804
## Degrees of freedom 2
## P-value (Chi-square) 0.091
##
## Model Test Baseline Model:
##
## Test statistic 198.027
## Degrees of freedom 7
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.985
## Tucker-Lewis Index (TLI) 0.949
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -290.215
## Loglikelihood unrestricted model (H1) -287.812
##
## Akaike (AIC) 594.429
## Bayesian (BIC) 614.555
## Sample-size adjusted Bayesian (SABIC) 592.415
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.103
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.226
## P-value H_0: RMSEA <= 0.050 0.164
## P-value H_0: RMSEA >= 0.080 0.720
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.020
##
## Parameter Estimates:
##
## Standard errors Bootstrap
## Number of requested bootstrap draws 1000
## Number of successful bootstrap draws 1000
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int ~
## att 0.874 0.069 12.703 0.000 0.874 0.817
## sn 0.090 0.067 1.346 0.178 0.090 0.086
## pbc (a) -0.130 0.076 -1.723 0.085 -0.130 -0.117
## beh ~
## int (b) 0.351 0.073 4.782 0.000 0.351 0.351
## pbc (c) 0.434 0.077 5.638 0.000 0.434 0.392
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .int 0.431 0.054 8.040 0.000 0.431 0.384
## .beh 0.668 0.081 8.257 0.000 0.668 0.596
##
## R-Square:
## Estimate
## int 0.616
## beh 0.404
##
## Defined Parameters:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## indirect -0.046 0.027 -1.666 0.096 -0.046 -0.041
## direct 0.434 0.077 5.635 0.000 0.434 0.392
## total 0.389 0.081 4.825 0.000 0.389 0.351
Huang, F. (2016). Path analysis with R. http://web.missouri.edu/~huangf/data/mvnotes/Using_R_for_path_analysis.html, archived here: https://web.archive.org/web/20200716131821/http://web.missouri.edu/~huangf/data/mvnotes/Using_R_for_path_analysis.html
THE END!
## R version 4.3.2 (2023-10-31)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.4
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: Europe/London
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] gendata_1.2.0 qgraph_1.9.8 semPlot_1.1.6 lavaan_0.6-17
##
## loaded via a namespace (and not attached):
## [1] psych_2.4.1 tidyselect_1.2.0 dplyr_1.1.3 fastmap_1.1.1
## [5] OpenMx_2.21.8 XML_3.99-0.15 digest_0.6.35 rpart_4.1.21
## [9] mi_1.1 lifecycle_1.0.4 cluster_2.1.4 magrittr_2.0.3
## [13] compiler_4.3.2 rlang_1.1.3 Hmisc_5.1-1 sass_0.4.9
## [17] tools_4.3.2 igraph_2.0.3 utf8_1.2.4 yaml_2.3.8
## [21] data.table_1.14.10 knitr_1.45 htmlwidgets_1.6.2 mnormt_2.1.1
## [25] plyr_1.8.9 abind_1.4-5 foreign_0.8-85 nnet_7.3-19
## [29] grid_4.3.2 stats4_4.3.2 fansi_1.0.6 xtable_1.8-4
## [33] colorspace_2.1-0 ggplot2_3.4.4 gtools_3.9.4 scales_1.3.0
## [37] MASS_7.3-60 cli_3.6.2 rmarkdown_2.26 generics_0.1.3
## [41] RcppParallel_5.1.7 rstudioapi_0.16.0 reshape2_1.4.4 pbapply_1.7-2
## [45] minqa_1.2.6 cachem_1.0.8 stringr_1.5.1 splines_4.3.2
## [49] parallel_4.3.2 base64enc_0.1-3 vctrs_0.6.5 boot_1.3-28.1
## [53] Matrix_1.6-1.1 jsonlite_1.8.8 carData_3.0-5 glasso_1.11
## [57] Formula_1.2-5 htmlTable_2.4.2 jpeg_0.1-10 jquerylib_0.1.4
## [61] glue_1.7.0 nloptr_2.0.3 stringi_1.8.3 sem_3.1-15
## [65] gtable_0.3.4 quadprog_1.5-8 lme4_1.1-35 munsell_0.5.1
## [69] tibble_3.2.1 lisrelToR_0.1.5 pillar_1.9.0 htmltools_0.5.8.1
## [73] R6_2.5.1 evaluate_0.23 pbivnorm_0.6.0 lattice_0.21-9
## [77] highr_0.10 png_0.1-8 backports_1.4.1 rockchalk_1.8.157
## [81] kutils_1.73 openxlsx_4.2.5.2 arm_1.13-1 corpcor_1.6.10
## [85] bslib_0.7.0 Rcpp_1.0.11 zip_2.3.0 fdrtool_1.2.17
## [89] coda_0.19-4 gridExtra_2.3 nlme_3.1-163 checkmate_2.3.0
## [93] xfun_0.43 pkgconfig_2.0.3