Introduction

In this document, I build my PsyArXiv co-authorship network to add to my website. It is a modified version of this tutorial by Matti Vuorre.

You will need to install the packages listed below (not all of them are actually used) and download the data into the same folder as this script. Pay attention to where it says IMPORTANT.

You also need to install psyarxivr, as it is not on CRAN yet (though it might be by the time this document is bouncing around the interwebs). Use: pak::pkg_install("mvuorre/psyarxivr").
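A minimal sketch of how you could check which packages are still missing before running the rest. The helper name missing_pkgs is hypothetical (it is not part of any package), and the vector below is only a subset of the packages loaded in the next section. Note that requireNamespace() loads a package's namespace as a side effect when it is installed.

```r
# Hypothetical helper: return the entries of `pkgs` that are not installed
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Assumption: an illustrative subset of the packages used in this document
needed <- c("tidyverse", "tidygraph", "ggraph", "psyarxivr")
to_install <- missing_pkgs(needed)
to_install

# Uncomment to install; psyarxivr comes from GitHub, the rest from CRAN
# if ("psyarxivr" %in% to_install) pak::pkg_install("mvuorre/psyarxivr")
# cran_pkgs <- setdiff(to_install, "psyarxivr")
# if (length(cran_pkgs) > 0) pak::pkg_install(cran_pkgs)
```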

The corresponding .rmd can be downloaded here. With hindsight, PsyArXiv should have been in the title.

Load packages

library(pak)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(patchwork)
library(jsonlite)
## 
## Attaching package: 'jsonlite'
## 
## The following object is masked from 'package:purrr':
## 
##     flatten
library(psyarxivr)
library(ggthemes)
library(ggrepel)
library(wesanderson)
library(ggsci)
library(RColorBrewer)
library(scales)
## 
## Attaching package: 'scales'
## 
## The following object is masked from 'package:purrr':
## 
##     discard
## 
## The following object is masked from 'package:readr':
## 
##     col_factor
library(sysfonts)
library(showtext)
## Loading required package: showtextdb
library(papaja)
## Loading required package: tinylabels
library(tidygraph)
## 
## Attaching package: 'tidygraph'
## 
## The following object is masked from 'package:stats':
## 
##     filter
library(ggraph)
library(Cairo)

Fonts

IMPORTANT: This code will not run if you don’t have the appropriate font installed.

Ensure you have Roboto installed and available.

On a Mac, add it to Font Book and make sure it is available to R.

Some information here:

https://www.listendata.com/2019/06/create-infographics-with-r.html
https://stackoverflow.com/questions/49482512/waffle-font-family-not-found-in-windows-font-database
https://community.rstudio.com/t/problem-with-fonts-in-mac/119124/4

No warnings or messages are displayed here. This might take some time.

sysfonts::font_add_google("Roboto")
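A small sketch for checking that the font was actually registered after the call above. The helper name font_is_registered is hypothetical. It only looks at the families sysfonts knows about, which is not the same thing as the fonts installed on your system.

```r
# Hypothetical helper: TRUE/FALSE if sysfonts is installed, NA otherwise
font_is_registered <- function(family) {
  if (!requireNamespace("sysfonts", quietly = TRUE)) {
    return(NA)
  }
  family %in% sysfonts::font_families()
}

# Expected to be TRUE only after font_add_google("Roboto") has succeeded
font_is_registered("Roboto")

# Optional: let showtext render registered fonts on graphics devices
# showtext::showtext_auto()
```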

PsyArxiv code

The code below is taken from Matti Vuorre’s tutorial.

# Parse contributors JSON variable into its own table with preprint ids
contributors <- preprints |>
  # Remove preprints with no contributor data and non-latest versions
  filter(contributors != "[]", is_latest_version == 1) |>
  # Select required variables only
  select(id, contributors) |>
  # Convert JSON into data frames in a list-column
  mutate(
    contributors = map(
      contributors,
      fromJSON
    )
  )

# Unnest into a table of contributors and clean
contributors <- contributors |>
  unnest(contributors) |>
  # Only include bibliographic authors
  filter(bibliographic) |>
  # Remove some other contributor variables and rename
  select(id, name = full_name) |>
  # Take out unnamed contributors
  filter(name != "")

# Calculate total number of contributors
contributors_total <- nrow(contributors)
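To see what the parsing step does, here is a toy version you can run without the real data. The three rows in toy_preprints are made up; only the column names (id, is_latest_version, contributors) and the JSON fields (full_name, bibliographic) are taken from the code above.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(jsonlite)

# Made-up stand-in for the real preprints table
toy_preprints <- tibble(
  id = c("aaaaa_v1", "bbbbb_v1", "ccccc_v1"),
  is_latest_version = c(1, 1, 1),
  contributors = c(
    '[{"full_name":"Ada Example","bibliographic":true},{"full_name":"Bob Example","bibliographic":false}]',
    '[{"full_name":"Ada Example","bibliographic":true},{"full_name":"Cy Example","bibliographic":true}]',
    "[]"
  )
)

toy_contributors <- toy_preprints |>
  # Drop the preprint without contributor data
  filter(contributors != "[]", is_latest_version == 1) |>
  select(id, contributors) |>
  # Each JSON string becomes a small data frame in a list-column
  mutate(contributors = map(contributors, fromJSON)) |>
  # One row per contributor per preprint
  unnest(contributors) |>
  # Bob is dropped here because he is not a bibliographic author
  filter(bibliographic) |>
  select(id, name = full_name)

toy_contributors
```

The result has one row per bibliographic author per preprint, which is the shape the rest of the code relies on.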

For some bizarre reason, I have an additional space in my name in this data.

But hey, it’s me!

my_coauthors <- contributors |>
  # Retain all preprints where any of the authors was me
  filter(any(name == "Thomas V.  Pollet"), .by = id)
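An alternative I did not use here: instead of matching the double space, you could squish the whitespace in the names first. This is only a sketch on made-up rows. If you did this on the real data, every later match on my name would need the single-space version too.

```r
library(dplyr)
library(stringr)

# Made-up rows; only the double space in my name mirrors the real data
toy <- tibble(
  id = c("p1", "p1", "p2", "p2"),
  name = c("Thomas V.  Pollet", "Ada Example", "Ada Example", "Bob Example")
)

toy_mine <- toy |>
  # str_squish() collapses repeated whitespace and trims the ends
  mutate(name = str_squish(name)) |>
  # Keep every row of any preprint that lists me as an author
  filter(any(name == "Thomas V. Pollet"), .by = id)

toy_mine
```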

This bit adds the co-authors of my co-authors.

my_coauthors_coauthors <- contributors |>
  # Retain all preprints where any author was my coauthor
  filter(any(name %in% unique(my_coauthors$name)), .by = id)

nrow(my_coauthors_coauthors)
## [1] 1371
head(my_coauthors_coauthors)
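The two filters above are easier to follow on a tiny example. The data below are invented: "Me" shares paper p1 with Ada, Ada shares p2 with Bob, and p3 has nothing to do with either of us.

```r
library(dplyr)

# Invented toy data: one row per author per paper
toy <- tibble(
  id = c("p1", "p1", "p2", "p2", "p3", "p3"),
  name = c("Me", "Ada", "Ada", "Bob", "Cy", "Dee")
)

# Step 1: papers I am on (p1 only)
toy_first <- toy |>
  filter(any(name == "Me"), .by = id)

# Step 2: papers that include anyone from step 1 (p1 and p2)
toy_second <- toy |>
  filter(any(name %in% unique(toy_first$name)), .by = id)

unique(toy_second$name)
```

Bob ends up in the second table even though we never wrote anything together, which is exactly the "collaborator of a collaborator" layer.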

Graph time

# Get all pairs of co-authors for each paper (one edge per pair per paper)
edges <- my_coauthors_coauthors |>
  group_by(id) |>
  # Create all pairwise combinations within each paper
  reframe(expand.grid(
    author1 = name,
    author2 = name,
    stringsAsFactors = FALSE
  )) |>
  # Remove self-loops and order pairs for undirected edges
  filter(author1 < author2) |>
  rename(from = author1, to = author2)
head(edges)
# Create graph with key metrics
graph <- edges |>
  as_tbl_graph(directed = FALSE) |>
  mutate(
    distance = factor(node_distance_from(name == "Thomas V.  Pollet"))
  )
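A toy run of the edge-building step, plus one optional extra that is not in the code above: collapsing repeated pairs into a count. The authors and papers are invented, and the column name n_papers is my own choice.

```r
library(dplyr)

# Invented data: A and B share two papers, C joins them on one
toy <- tibble(
  id = c("p1", "p1", "p1", "p2", "p2"),
  name = c("C", "A", "B", "A", "B")
)

toy_edges <- toy |>
  group_by(id) |>
  # All ordered pairs of authors within each paper
  reframe(expand.grid(
    author1 = name,
    author2 = name,
    stringsAsFactors = FALSE
  )) |>
  # Keep each unordered pair once and drop self-pairs
  filter(author1 < author2) |>
  rename(from = author1, to = author2)

# Optional: one row per pair with the number of shared papers
toy_weights <- toy_edges |>
  count(from, to, name = "n_papers")

toy_weights
```

Without the count() step, a pair with several shared papers shows up as several parallel edges, which is why the graph is reported as a multigraph.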

graph
## # A tbl_graph: 589 nodes and 3496 edges
## #
## # An undirected multigraph with 1 component
## #
## # Node Data: 589 × 2 (active)
##    name                  distance
##    <chr>                 <fct>   
##  1 Connor Malcolm        1       
##  2 Kris McCarty          1       
##  3 Sam G. B. Roberts     1       
##  4 Hans IJzerman         1       
##  5 Jorick Post           2       
##  6 Lison Neyroud         2       
##  7 Michel Schrama        2       
##  8 Rémi Courset          2       
##  9 Kate Ratliff          2       
## 10 Michelangelo Vianello 2       
## # ℹ 579 more rows
## #
## # Edge Data: 3,496 × 3
##    from    to id      
##   <int> <int> <chr>   
## 1     1     3 2583s_v1
## 2     2     3 2583s_v1
## 3     1     2 2583s_v1
## # ℹ 3,493 more rows
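The distance column is the number of steps from my node to each other node. A toy graph (invented names) shows the idea, using the same tidygraph calls as above but keeping the distance numeric rather than turning it into a factor.

```r
library(dplyr)
library(tidygraph)

# Invented edges: Me is linked to A and B, and C is only linked to A
toy_edges <- tibble(
  from = c("Me", "Me", "A"),
  to = c("A", "B", "C")
)

toy_graph <- toy_edges |>
  as_tbl_graph(directed = FALSE) |>
  # Number of steps from the "Me" node to every node
  mutate(distance = node_distance_from(name == "Me"))

toy_nodes <- as_tibble(toy_graph)
toy_nodes
```

In the real graph, 0 is me, 1 are my co-authors, and 2 are their co-authors.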

I prefer 666 over 999 as a seed.

set.seed(666)
graph |>
  # Create a ggplot with appropriate mappings for graph data
  ggraph(layout = "fr") +
  # Show edges
  geom_edge_link() +
  # Show nodes
  geom_node_point() +
  # A blank theme
  theme_graph()

The version below is slightly tweaked, with a different font and different colours.

set.seed(666)
ggraph(graph, layout = "fr") +
  # Make edges less prominent
  geom_edge_link(
    linewidth = 0.2,
    alpha = 0.4,
    color = "gray70"
  ) +
  # Nodes further from me are smaller
  geom_node_point(
    aes(size = distance, color = distance)
  ) +
  # Add text to my (bold) & coauthors' (plain) nodes
  geom_node_text(
    data = . %>% filter(distance != 2),
    aes(
      label = name,
      fontface = ifelse(name == "Thomas V.  Pollet", "bold", "plain")
    ),
    repel = TRUE,
    size = 2.25,
    family = "Roboto"  
  ) +
  # Specify sizes, colors, and theme options
  scale_size_manual(values = c(2, 1, 0.5)) +
  scale_color_manual(
    values = wes_palette("GrandBudapest1", 3, type = "continuous") 
  ) +
  theme_graph() +
  theme(
    legend.position = "none",
    text = element_text(family = "Roboto")  
  )
## Warning: ggrepel: 42 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
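The warning means ggrepel gave up on some labels. If you want every label drawn, one route is ggrepel's session-wide option, sketched below. I did not use it for the figure above, and with this many nodes the result may well be less readable.

```r
# Raise ggrepel's overlap limit for this session (its default is 10)
old_opts <- options(ggrepel.max.overlaps = Inf)
getOption("ggrepel.max.overlaps")

# ... re-run the ggraph() call here ...

# Restore whatever was set before
options(old_opts)
```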

Now export this via Cairo and then convert the .eps into a .pdf via TeXShop or the like.

cairo_ps("Psy_Arxiv_network.eps", height= 10, width= 10, family = "Roboto")
set.seed(666)
ggraph(graph, layout = "fr") +
  # Make edges less prominent
  geom_edge_link(
    linewidth = 0.2,
    alpha = 0.4,
    color = "gray70"
  ) +
  # Nodes further from me are smaller
  geom_node_point(
    aes(size = distance, color = distance)
  ) +
  # Add text to my (bold) & coauthors' (plain) nodes
  geom_node_text(
    data = . %>% filter(distance != 2),
    aes(
      label = name,
      fontface = ifelse(name == "Thomas V.  Pollet", "bold", "plain")
    ),
    repel = TRUE,
    size = 3.2,
    family = "Roboto"  
  ) +
  # Specify sizes, colors, and theme options
  scale_size_manual(values = c(2, 1, 0.5)) +
  scale_color_manual(
    values = wes_palette("GrandBudapest1", 3, type = "continuous") 
  ) +
  theme_graph() +
  theme(
    legend.position = "none",
    text = element_text(family = "Roboto")  
  )
## Warning: ggrepel: 31 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
dev.off()
## quartz_off_screen 
##                 2
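If you would rather skip the .eps to .pdf conversion, writing a PDF directly is one possible alternative. This is a sketch, not what I did above: it uses a base plot as a stand-in for the ggraph call, writes to a temporary file, and leaves out the Roboto family argument. Inside a script or function, a ggplot object also needs an explicit print().

```r
# Write to a temporary file so nothing in the working folder is touched
out_file <- file.path(tempdir(), "toy_network.pdf")

# Use the cairo-based device when this R build supports it
if (capabilities("cairo")) {
  grDevices::cairo_pdf(out_file, width = 10, height = 10)
} else {
  grDevices::pdf(out_file, width = 10, height = 10)
}

# Stand-in for the ggraph() call
plot(1:10)

# Close the device so the file is actually written
dev.off()

file.exists(out_file)
```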

SessionInfo

sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.5
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/London
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] Cairo_1.6-5          ggraph_2.2.2         tidygraph_1.3.1     
##  [4] papaja_0.1.3         tinylabels_0.2.5     showtext_0.9-7      
##  [7] showtextdb_3.0       sysfonts_0.8.9       scales_1.4.0        
## [10] RColorBrewer_1.1-3   ggsci_3.2.0          wesanderson_0.3.7   
## [13] ggrepel_0.9.6        ggthemes_5.1.0       psyarxivr_0.0.0.9000
## [16] jsonlite_2.0.0       patchwork_1.3.2      lubridate_1.9.4     
## [19] forcats_1.0.0        stringr_1.5.1        dplyr_1.1.4         
## [22] purrr_1.1.0          readr_2.1.5          tidyr_1.3.1         
## [25] tibble_3.3.0         ggplot2_3.5.2        tidyverse_2.0.0     
## [28] pak_0.9.0           
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.2.1   viridisLite_0.4.2  farver_2.1.2       viridis_0.6.5     
##  [5] fastmap_1.2.0      TH.data_1.1-3      tweenr_2.0.3       bayestestR_0.16.1 
##  [9] digest_0.6.37      estimability_1.5.1 timechange_0.3.0   lifecycle_1.0.4   
## [13] survival_3.8-3     magrittr_2.0.3     compiler_4.5.1     rlang_1.1.6       
## [17] sass_0.4.10        tools_4.5.1        igraph_2.1.4       utf8_1.2.6        
## [21] yaml_2.3.10        knitr_1.50         labeling_0.4.3     graphlayouts_1.2.2
## [25] curl_6.4.0         multcomp_1.4-28    withr_3.0.2        grid_4.5.1        
## [29] polyclip_1.10-7    datawizard_1.2.0   xtable_1.8-4       emmeans_1.11.2    
## [33] MASS_7.3-65        insight_1.4.0      cli_3.6.5          mvtnorm_1.3-3     
## [37] rmarkdown_2.29     generics_0.1.4     rstudioapi_0.17.1  tzdb_0.5.0        
## [41] parameters_0.28.0  cachem_1.1.0       ggforce_0.5.0      splines_4.5.1     
## [45] effectsize_1.0.1   vctrs_0.6.5        Matrix_1.7-3       sandwich_3.1-1    
## [49] hms_1.1.3          jquerylib_0.1.4    glue_1.8.0         codetools_0.2-20  
## [53] stringi_1.8.7      gtable_0.3.6       pillar_1.11.0      htmltools_0.5.8.1 
## [57] R6_2.6.1           evaluate_1.0.4     lattice_0.22-7     memoise_2.0.1     
## [61] bslib_0.9.0        Rcpp_1.1.0         coda_0.19-4.1      gridExtra_2.3     
## [65] xfun_0.52          zoo_1.8-14         pkgconfig_2.0.3

Packages

We are grateful to the authors of all the packages used.

r_refs(file = "r-analysis-references.bib")
my_citation <- cite_r(file = "r-analysis-references.bib")

R (Version 4.5.1; R Core Team 2025) and the R-packages Cairo (Version 1.6.5; Urbanek and Horner 2025), dplyr (Version 1.1.4; Wickham et al. 2023), forcats (Version 1.0.0; Wickham 2023a), ggplot2 (Version 3.5.2; Wickham 2016), ggraph (Version 2.2.2; Pedersen 2025a), ggrepel (Version 0.9.6; Slowikowski 2024), ggsci (Version 3.2.0; Xiao 2024), ggthemes (Version 5.1.0; Arnold 2024), jsonlite (Version 2.0.0; Ooms 2014), lubridate (Version 1.9.4; Grolemund and Wickham 2011), pak (Version 0.9.0; Csárdi and Hester 2025), papaja (Version 0.1.3; Aust and Barth 2024), patchwork (Version 1.3.2; Pedersen 2025b), psyarxivr (Version 0.0.0.9000; Vuorre 2025), purrr (Version 1.1.0; Wickham and Henry 2025), RColorBrewer (Version 1.1.3; Neuwirth 2022), readr (Version 2.1.5; Wickham, Hester, and Bryan 2024), scales (Version 1.4.0; Wickham, Pedersen, and Seidel 2025), showtext (Version 0.9.7; Qiu et al. 2024b; Qiu et al. 2020), showtextdb (Version 3.0; Qiu et al. 2020), stringr (Version 1.5.1; Wickham 2023b), sysfonts (Version 0.8.9; Qiu et al. 2024a), tibble (Version 3.3.0; Müller and Wickham 2025), tidygraph (Version 1.3.1; Pedersen 2024), tidyr (Version 1.3.1; Wickham, Vaughan, and Girlich 2024), tidyverse (Version 2.0.0; Wickham et al. 2019), tinylabels (Version 0.2.5; Barth 2025) and wesanderson (Version 0.3.7; Ram and Wickham 2023).

References

Arnold, Jeffrey B. 2024. Ggthemes: Extra Themes, Scales and Geoms for ’Ggplot2’. https://doi.org/10.32614/CRAN.package.ggthemes.
Aust, Frederik, and Marius Barth. 2024. papaja: Prepare Reproducible APA Journal Articles with R Markdown. https://doi.org/10.32614/CRAN.package.papaja.
Barth, Marius. 2025. tinylabels: Lightweight Variable Labels. https://doi.org/10.32614/CRAN.package.tinylabels.
Csárdi, Gábor, and Jim Hester. 2025. Pak: Another Approach to Package Installation. https://doi.org/10.32614/CRAN.package.pak.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. https://www.jstatsoft.org/v40/i03/.
Müller, Kirill, and Hadley Wickham. 2025. Tibble: Simple Data Frames. https://doi.org/10.32614/CRAN.package.tibble.
Neuwirth, Erich. 2022. RColorBrewer: ColorBrewer Palettes. https://doi.org/10.32614/CRAN.package.RColorBrewer.
Ooms, Jeroen. 2014. “The Jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” arXiv:1403.2805 [Stat.CO]. https://arxiv.org/abs/1403.2805.
Pedersen, Thomas Lin. 2024. Tidygraph: A Tidy API for Graph Manipulation. https://doi.org/10.32614/CRAN.package.tidygraph.
———. 2025a. Ggraph: An Implementation of Grammar of Graphics for Graphs and Networks. https://doi.org/10.32614/CRAN.package.ggraph.
———. 2025b. Patchwork: The Composer of Plots. https://doi.org/10.32614/CRAN.package.patchwork.
Qiu, Yixuan, and authors/contributors of the included fonts. 2020. Showtextdb: Font Files for the ’Showtext’ Package. https://doi.org/10.32614/CRAN.package.showtextdb.
———. 2024a. Sysfonts: Loading Fonts into R. https://doi.org/10.32614/CRAN.package.sysfonts.
Qiu, Yixuan, and authors/contributors of the included software. 2024b. Showtext: Using Fonts More Easily in R Graphs. https://doi.org/10.32614/CRAN.package.showtext.
R Core Team. 2025. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Ram, Karthik, and Hadley Wickham. 2023. Wesanderson: A Wes Anderson Palette Generator. https://doi.org/10.32614/CRAN.package.wesanderson.
Slowikowski, Kamil. 2024. Ggrepel: Automatically Position Non-Overlapping Text Labels with ’Ggplot2’. https://doi.org/10.32614/CRAN.package.ggrepel.
Urbanek, Simon, and Jeffrey Horner. 2025. Cairo: R Graphics Device Using Cairo Graphics Library for Creating High-Quality Bitmap (PNG, JPEG, TIFF), Vector (PDF, SVG, PostScript) and Display (X11 and Win32) Output. https://doi.org/10.32614/CRAN.package.Cairo.
Vuorre, Matti. 2025. Psyarxivr: PsyArXiv Preprints Data. https://github.com/mvuorre/psyarxivr.
Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
———. 2023a. Forcats: Tools for Working with Categorical Variables (Factors). https://doi.org/10.32614/CRAN.package.forcats.
———. 2023b. Stringr: Simple, Consistent Wrappers for Common String Operations. https://doi.org/10.32614/CRAN.package.stringr.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. Dplyr: A Grammar of Data Manipulation. https://doi.org/10.32614/CRAN.package.dplyr.
Wickham, Hadley, and Lionel Henry. 2025. Purrr: Functional Programming Tools. https://doi.org/10.32614/CRAN.package.purrr.
Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2024. Readr: Read Rectangular Text Data. https://doi.org/10.32614/CRAN.package.readr.
Wickham, Hadley, Thomas Lin Pedersen, and Dana Seidel. 2025. Scales: Scale Functions for Visualization. https://doi.org/10.32614/CRAN.package.scales.
Wickham, Hadley, Davis Vaughan, and Maximilian Girlich. 2024. Tidyr: Tidy Messy Data. https://doi.org/10.32614/CRAN.package.tidyr.
Xiao, Nan. 2024. Ggsci: Scientific Journal and Sci-Fi Themed Color Palettes for ’Ggplot2’. https://doi.org/10.32614/CRAN.package.ggsci.