Introduction.

This is a worksheet for use with Lecture 3.

You have a video of me narrating these slides. Note that there are minor discrepancies between the current set of slides and the one in the video. The slide numbers refer to the current set. I do not cover every single slide but you can code along!

If you answer correctly, the colour of the box will change.

Slides

Important: the haven package has been updated and now reads in SPSS factors somewhat differently. Key discrepancy: GRP --> GRP_fact.

ANOVA (Slide 5)

You should have covered ANOVA in your undergraduate studies. Look over your notes or have a look at this summary page.

Other measures of variation. (Slide 6)

Can you name another measure of variation?

If \(\sigma^2\) is the variance, then what does \(\sigma\) denote?

  1. Means
  2. Z-score
  3. Standard Error
  4. Standard Deviation

My answer:

Why would you need an ANOVA? (Slide 7)

When does a researcher risk a Type I error?

  1. Anytime the decision is ‘fail to reject’.
  2. Anytime H0 is rejected.
  3. Anytime H1 is rejected.
  4. All of the above options.

My answer:
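To see why running many separate t-tests becomes a problem, here is a minimal sketch of how the chance of at least one Type I error grows with the number of pairwise comparisons. It assumes independent tests, each at \(\alpha\) = .05. That is a simplification (pairwise tests on the same groups are not strictly independent), so treat the numbers as an approximation. The function name `familywise_error` is made up for this example.

```r
# Familywise error rate for all pairwise comparisons among k groups,
# assuming (for illustration) independent tests, each at level alpha.
familywise_error <- function(k, alpha = 0.05) {
  n_tests <- choose(k, 2)        # number of pairwise comparisons
  1 - (1 - alpha)^n_tests        # P(at least one Type I error)
}

groups <- 2:6
data.frame(
  groups = groups,
  comparisons = choose(groups, 2),
  familywise = round(familywise_error(groups), 3)
)
```

A single omnibus test such as ANOVA keeps the error rate for the overall comparison at the chosen \(\alpha\).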

Assumptions (Slide 11)

Which (Scrabble-winning) word also describes the assumption of homogeneity of variances?

Have a look at this article on assumptions of linear regression. This paper holds a clue.

Time for a new dataset (Slide 12)

It is under 'SPSS data set'. Use right click --> save as.

Load data (Slide 13)

You can get the data from here.

user_na = T --> Explore what this does in the haven manual.

Skim (Slide 14)

Only a small amount of output is printed. Skim the data in your R Markdown document.

What is the maximum for Neuroticism (N)?

p stands for percentile; p0 is the minimum.
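If the percentile labels are unfamiliar, the following base R sketch shows how they relate to summaries you already know. The vector `x` holds made-up scores, not the worksheet data.

```r
# Hypothetical scores, for illustration only (not the worksheet data).
x <- c(12, 15, 9, 22, 18, 30, 11)

# Percentiles at 0, 25, 50, 75 and 100 percent.
q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
names(q) <- c("p0", "p25", "p50", "p75", "p100")
q

# p0 matches the minimum and p50 matches the median.
q[["p0"]] == min(x)
q[["p50"]] == median(x)
```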

What do you think? (Slide 20)

Discuss with your partner: did you reach the same conclusion? If not, why not?

The largest deviation from normality is in ...

  1. Asian Americans
  2. European Americans
  3. Asian Internationals
  4. None of the above - i.e. all show equal deviations.

My answer:

Central limit theorem (CLT) (Slide 21).

Perhaps I skimmed over the CLT too fast. See if you can answer the question below.

The central limit theorem (CLT) for a sample mean is a critical result because ...

  1. it states that for large sample sizes, the population distribution is approximately normal.
  2. it states that for large sample sizes, the sample is approximately normal.
  3. it states that for any population, the sampling distribution is normal regardless of sample size.
  4. it states that for large sample sizes, the sampling distribution is approximately normal regardless of the population distribution.
  5. it states that for any sample size, the sampling distribution is normal.

The central limit theorem should have been covered in your undergraduate course.

You can find a very brief summary here.

My answer:
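If you want to check your reasoning empirically, a small simulation can help. The sketch below is an illustration rather than code from the slides. It draws repeated samples from a right-skewed (exponential) population, which is an arbitrary choice, and summarises the distribution of the sample means for several sample sizes. The helper names `sample_means` and `skewness` are made up for this example.

```r
set.seed(3)  # arbitrary seed so the sketch is reproducible

# Means of `reps` samples of size n drawn from an exponential population
# (a strongly right-skewed distribution with mean 1).
sample_means <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1)))
}

# Moment-based skewness: values near 0 indicate a symmetric distribution.
skewness <- function(x) {
  mean((x - mean(x))^3) / sd(x)^3
}

sizes <- c(2, 10, 50, 200)
means_by_n <- lapply(sizes, sample_means)
names(means_by_n) <- paste0("n = ", sizes)

# Compare the shape of the distribution of sample means across sample sizes.
sapply(means_by_n, skewness)

# Histograms, one panel per sample size.
old_par <- par(mfrow = c(2, 2))
for (label in names(means_by_n)) {
  hist(means_by_n[[label]], main = label, xlab = "Sample mean")
}
par(old_par)
```

Look at how the histograms and the skewness values change as n grows, then revisit the options above.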

Levene's test (Slide 27)

What is the F-value for a Levene's test with Neuroticism as the dependent variable?

(3 decimals)
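It can help to see what Levene's test does under the hood: it is a one-way ANOVA on the absolute deviations of each score from its group centre. The sketch below uses simulated data (not the worksheet data) and centres on the group median, which is the default in `car::leveneTest()`. If the slide uses a different function or centre, the numbers would differ. The object names are made up for this example.

```r
set.seed(27)  # arbitrary seed so the sketch is reproducible

# Simulated data, for illustration only: three groups with unequal spread.
dat <- data.frame(
  group = factor(rep(c("a", "b", "c"), times = c(30, 40, 35))),
  score = c(rnorm(30, mean = 50, sd = 5),
            rnorm(40, mean = 50, sd = 8),
            rnorm(35, mean = 50, sd = 12))
)

# Step 1: absolute deviation of each score from its own group's median.
dat$abs_dev <- abs(dat$score - ave(dat$score, dat$group, FUN = median))

# Step 2: a one-way ANOVA on those absolute deviations.
levene_by_hand <- anova(lm(abs_dev ~ group, data = dat))
levene_by_hand
```

The F value in the `group` row is the Levene statistic for these simulated data; a large F suggests that the spread differs between groups.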

Try it yourself (Slide 30)

Discuss your results with your partner. Are the assumptions upheld?

  • What is the F-value for Levene's test?

(4 decimals)

Ez package (Slide 36)

Conduct an 'ez ANOVA' with Neuroticism as the dependent variable. Which of the following statements is correct?

  1. p = .06048906
  2. (2,202) degrees of freedom are used
  3. F value for ANOVA > 1
  4. \(\eta^2_g\) < .006
  5. Levene's test is statistically significant

My answer:

Store results (Slide 40)

What is the 90% Confidence interval for \(\eta^2_p\) on Neuroticism?

My answer:

(2 decimals and square brackets needed; Example: [.02, .31])

Note that we would usually replace .00 with <.01, as .00 would imply that the interval includes 0.

Try it yourself (Slide 46)

Work with your partner to solve this.

med1way WRS2 (Slide 51)

Run this analysis with Extraversion as the dependent variable.

The F-value = (4 decimals).

Permutation tests (Slide 52)

Note that for the coin package, you might need to install an additional package called TH.data. Once it is installed, the coin package should work just fine.
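To make the logic of a permutation test concrete, here is a hand-rolled sketch in base R. It is not the coin package's implementation, which may use a different test statistic and will give different numbers. The data are simulated, and the names `f_stat`, `observed` and `permuted` are made up for this example.

```r
set.seed(52)  # arbitrary seed so the sketch is reproducible

# Simulated data, for illustration only: three groups, one shifted upwards.
group <- factor(rep(c("a", "b", "c"), each = 20))
score <- rnorm(60, mean = c(0, 0, 1)[as.integer(group)])

# F statistic from a one-way ANOVA.
f_stat <- function(y, g) {
  anova(lm(y ~ g))[["F value"]][1]
}

observed <- f_stat(score, group)

# Shuffle the scores across groups many times to build the distribution
# of F under the null hypothesis of no group differences.
n_perm <- 2000
permuted <- replicate(n_perm, f_stat(sample(score), group))

# Approximate p-value: share of permuted F values at least as large as the
# observed one (the +1 counts the observed arrangement itself).
p_value <- (sum(permuted >= observed) + 1) / (n_perm + 1)
c(F = observed, p = p_value)
```

Because the reference distribution is built from the data themselves, this approach does not rely on the normality assumption of the classical F test.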

Order invariant: SPSS uses Type III (Slide 58)

Run an ANCOVA with group and Extraversion as predictors and Neuroticism as the dependent variable.

What is the p-value for the Extraversion effect (round to the 4th decimal, use scientific rounding)?

You have the code on the slide. Remember that the dependent variable goes before the ~.

Remember not to include the leading 0.
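The "order invariant" part of the slide title can be checked with a short base R sketch on simulated data (not the worksheet data). It contrasts `anova()`, which gives sequential (Type I) sums of squares, with `drop1()`, which tests each term after adjusting for all the others. For a model without an interaction these marginal tests correspond to what Type III reports. This is a swapped-in base R route for illustration; the code on the slide remains the reference for the exercise. All variable names here are made up.

```r
set.seed(58)  # arbitrary seed so the sketch is reproducible

# Simulated data, for illustration only: unbalanced groups and a covariate
# whose mean differs by group, so the two predictors overlap.
group <- factor(rep(c("a", "b", "c"), times = c(15, 30, 45)))
covariate <- rnorm(90, mean = c(0, 0.5, 1)[as.integer(group)])
outcome <- 0.6 * covariate + c(0, 0.3, 0.3)[as.integer(group)] + rnorm(90)
dat <- data.frame(group, covariate, outcome)

fit_group_first <- lm(outcome ~ group + covariate, data = dat)
fit_covariate_first <- lm(outcome ~ covariate + group, data = dat)

# Sequential (Type I) sums of squares: each term is adjusted only for the
# terms entered before it, so the tables differ with the order of entry.
anova(fit_group_first)
anova(fit_covariate_first)

# Marginal tests: each term is adjusted for all other terms, so the result
# is the same whichever order was used.
drop1(fit_group_first, test = "F")
drop1(fit_covariate_first, test = "F")
```

Compare the `group` row across the two `anova()` tables, then across the two `drop1()` tables.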

Homogeneity of covariances: Box's M test. (Slide 69)

Note that the 'biotools' package currently does not work on Mac (it requires BWidget, which is only available on Windows). Here is yet another alternative route.

This is an advanced topic, but should you need it, do check out this paper (paywall; you need to be on campus to access it).

Exercise (Slide 74)

Complete the exercise and submit via Blackboard!

Going further.

Session Info.

Thanks to Lisa DeBruine for the webexercises package. Please see general disclaimer.

sessionInfo()
## R version 4.3.2 (2023-10-31)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.4
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/London
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] webexercises_1.1.0
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.33     R6_2.5.1          fastmap_1.1.1     xfun_0.41        
##  [5] cachem_1.0.8      knitr_1.45        htmltools_0.5.7   rmarkdown_2.25   
##  [9] cli_3.6.1         sass_0.4.7        jquerylib_0.1.4   compiler_4.3.2   
## [13] rstudioapi_0.15.0 tools_4.3.2       evaluate_0.23     bslib_0.5.1      
## [17] yaml_2.3.7        jsonlite_1.8.7    rlang_1.1.1

The end...