In this document, I build my co-authorship network to add to my website. It is a modified version of this tutorial by Matti Vuorre.
You will need the packages below to be installed (though not all are used) and the data to be downloaded into the same folder as this script. Pay attention to where it says IMPORTANT.
You also need to install psyarxivr, as it is not on CRAN yet (though it might be by the time this document is bouncing around the interwebs). Use: pak::pkg_install("mvuorre/psyarxivr").
The corresponding .rmd can be downloaded here. With hindsight, it should have had PsyArXiv in the title.
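A minimal sketch of a guarded install, so that re-knitting does not reinstall the package every time. It assumes pak is already installed; `install_if_missing()` is a hypothetical helper written for this example, not part of pak or psyarxivr.

```r
# Hypothetical helper: install a GitHub package only when it is missing.
# `pkg` is the package name, `repo` the "user/repo" string handed to pak.
install_if_missing <- function(pkg, repo) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE)) # already installed, nothing to do
  }
  pak::pkg_install(repo)
  invisible(TRUE)
}

# Not run here, because it needs internet access:
# install_if_missing("psyarxivr", "mvuorre/psyarxivr")
```

The helper returns `FALSE` invisibly when the package was already there and `TRUE` when it had to install it.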
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
##
## Attaching package: 'jsonlite'
##
## The following object is masked from 'package:purrr':
##
## flatten
library(psyarxivr)
library(ggthemes)
library(ggrepel)
library(wesanderson)
library(ggsci)
library(RColorBrewer)
library(scales)
##
## Attaching package: 'scales'
##
## The following object is masked from 'package:purrr':
##
## discard
##
## The following object is masked from 'package:readr':
##
## col_factor
## Loading required package: showtextdb
## Loading required package: tinylabels
##
## Attaching package: 'tidygraph'
##
## The following object is masked from 'package:stats':
##
## filter
IMPORTANT: This code will not run if you don't have the appropriate font installed.
Ensure you have Roboto installed and available.
On Mac, add it to Font Book and make sure it is available to R.
Some information here:
https://www.listendata.com/2019/06/create-infographics-with-r.html
https://stackoverflow.com/questions/49482512/waffle-font-family-not-found-in-windows-font-database
https://community.rstudio.com/t/problem-with-fonts-in-mac/119124/4
No warnings or messages are displayed here. This might take some time.
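One way to check for the font before plotting is sketched below. It relies on `font_files()` from sysfonts (already attached in this document) and assumes the returned data frame has a `family` column, as it does in recent versions. A font file being found this way does not guarantee that every graphics device can use it, so treat this as a first check only.

```r
library(sysfonts)

# List the font files that sysfonts can see in its search paths
available_fonts <- font_files()

# TRUE if any of those files reports "Roboto" as its family
has_roboto <- "Roboto" %in% available_fonts$family

if (!has_roboto) {
  message("Roboto not found; install it before running the plotting code.")
}
```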
The code below is taken from Matti Vuorre…
# Parse contributors JSON variable into its own table with preprint ids
contributors <- preprints |>
  # Remove preprints with no contributor data and non-latest versions
  filter(contributors != "[]", is_latest_version == 1) |>
  # Select required variables only
  select(id, contributors) |>
  # Convert JSON into data frames in a list-column
  mutate(
    contributors = map(
      contributors,
      fromJSON
    )
  )
# Unnest into a table of contributors and clean
contributors <- contributors |>
  unnest(contributors) |>
  # Only include bibliographic authors
  filter(bibliographic) |>
  # Remove some other contributor variables and rename
  select(id, name = full_name) |>
  # Take out unnamed contributors
  filter(name != "")
# Calculate total number of contributors
contributors_total <- nrow(contributors)
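To see what the parsing step does, here is a small sketch on made-up data. The two preprint ids, the author names and the JSON strings are invented for illustration; only the `full_name` and `bibliographic` fields mirror the code above.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(jsonlite)

# Toy stand-in for `preprints`: one JSON string of contributors per preprint
toy_preprints <- tibble(
  id = c("abc12_v1", "def34_v1"),
  contributors = c(
    '[{"full_name":"Ada Example","bibliographic":true},
      {"full_name":"Bo Sample","bibliographic":false}]',
    '[{"full_name":"Cy Placeholder","bibliographic":true}]'
  )
)

toy_contributors <- toy_preprints |>
  # Each JSON array becomes a small data frame inside a list-column
  mutate(contributors = map(contributors, fromJSON)) |>
  # One row per contributor, with the preprint id repeated
  unnest(contributors) |>
  # Keep bibliographic authors only, then tidy the column names
  filter(bibliographic) |>
  select(id, name = full_name)

toy_contributors
```

The non-bibliographic contributor is dropped, leaving one row per listed author.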
For some bizarre reason, I have an additional space in my name.
But hey, it's me!
my_coauthors <- contributors |>
  # Retain all preprints where any of the authors was me
  filter(any(name == "Thomas V. Pollet"), .by = id)
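If stray whitespace in names is a worry, one option is to normalise the names before matching. This is only a sketch and not what the code above does: the raw strings below are hypothetical examples, and `str_squish()` from stringr trims the ends and collapses repeated internal spaces.

```r
library(stringr)

# Hypothetical variants of the same name with stray whitespace
raw_names <- c("Thomas V.  Pollet", " Thomas V. Pollet ", "Thomas V. Pollet")

# Trim leading/trailing whitespace and collapse runs of spaces into one
clean_names <- str_squish(raw_names)

# After squishing, all variants match the same string
all(clean_names == "Thomas V. Pollet")
```

Applied to the real data, this would be a `mutate(name = str_squish(name))` before the filter, at the cost of no longer matching the name exactly as it is stored.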
This bit adds the collaborators of my collabs.
my_coauthors_coauthors <- contributors |>
  # Retain all preprints where any author was my coauthor
  filter(any(name %in% unique(my_coauthors$name)), .by = id)
nrow(my_coauthors_coauthors)
## [1] 1371
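The two filtering steps can be traced on a toy table. The paper ids and author labels below are invented; the `.by` argument needs dplyr 1.1.0 or later.

```r
library(dplyr)

# Toy contributors table: three papers, "Me" appears on p1 only
toy <- tibble(
  id   = c("p1", "p1", "p2", "p2", "p3"),
  name = c("Me", "A",  "A",  "B",  "C")
)

# Step 1: keep every row of any paper that I am on
mine <- toy |>
  filter(any(name == "Me"), .by = id)

# Step 2: keep every row of any paper that has one of my coauthors on it
second <- toy |>
  filter(any(name %in% unique(mine$name)), .by = id)

second
```

Paper p2 is pulled in because "A" is on it, which brings along "B" as a coauthor of a coauthor, while p3 shares nobody with me and drops out.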
# Get all pairs of co-authors for each paper and count collaborations
edges <- my_coauthors_coauthors |>
  group_by(id) |>
  # Create all pairwise combinations within each paper
  reframe(expand.grid(
    author1 = name,
    author2 = name,
    stringsAsFactors = FALSE
  )) |>
  # Remove self-loops and order pairs for undirected edges
  filter(author1 < author2) |>
  rename(from = author1, to = author2)
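A quick sketch of why `author1 < author2` is enough to clean up the pairs: `expand.grid()` returns every ordered pair, including each author paired with themselves, and the string comparison keeps exactly one ordering of each distinct pair. The three single-letter authors are placeholders.

```r
# Three placeholder authors on one paper
authors <- c("C", "A", "B")

# All 3 x 3 = 9 ordered pairs, self-pairs included
all_pairs <- expand.grid(
  author1 = authors,
  author2 = authors,
  stringsAsFactors = FALSE
)

# Keep one ordering per distinct pair; self-pairs fail the strict "<"
undirected <- all_pairs[all_pairs$author1 < all_pairs$author2, ]

# The number of pairs left equals "3 choose 2"
nrow(undirected) == choose(length(authors), 2)
```

Because the comparison is alphabetical, the same two authors always end up in the same `from`/`to` order, whatever order they were listed in on a given paper.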
# Create graph with key metrics
graph <- edges |>
  as_tbl_graph(directed = FALSE) |>
  mutate(
    distance = factor(node_distance_from(name == "Thomas V. Pollet"))
  )
graph
## # A tbl_graph: 589 nodes and 3496 edges
## #
## # An undirected multigraph with 1 component
## #
## # Node Data: 589 × 2 (active)
## name distance
## <chr> <fct>
## 1 Connor Malcolm 1
## 2 Kris McCarty 1
## 3 Sam G. B. Roberts 1
## 4 Hans IJzerman 1
## 5 Jorick Post 2
## 6 Lison Neyroud 2
## 7 Michel Schrama 2
## 8 Rémi Courset 2
## 9 Kate Ratliff 2
## 10 Michelangelo Vianello 2
## # ℹ 579 more rows
## #
## # Edge Data: 3,496 × 3
## from to id
## <int> <int> <chr>
## 1 1 3 2583s_v1
## 2 2 3 2583s_v1
## 3 1 2 2583s_v1
## # ℹ 3,493 more rows
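The `distance` column counts steps from my node: 0 for me, 1 for direct coauthors, 2 for their coauthors. Below is a sketch on a three-node chain with invented names, using the same `node_distance_from()` call as above. It assumes, as the code above does, that a logical vector is accepted for selecting the starting node.

```r
library(tidygraph)

# Toy chain: Me -- A -- B
toy_edges <- data.frame(
  from = c("Me", "A"),
  to   = c("A", "B")
)

toy_graph <- as_tbl_graph(toy_edges, directed = FALSE) |>
  # Shortest-path distance from the node named "Me" to every node
  mutate(distance = node_distance_from(name == "Me"))

# Pull the node table out as a tibble to inspect the distances
toy_nodes <- as_tibble(toy_graph)
toy_nodes
```

In the real graph the distance is wrapped in `factor()`, so that it can be mapped to discrete sizes and colours in the plots.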
I prefer 666 over 999 as the random seed…
set.seed(666)
graph |>
  # Create a ggplot with appropriate mappings for graph data
  ggraph(layout = "fr") +
  # Show edges
  geom_edge_link() +
  # Show nodes
  geom_node_point() +
  # A blank theme
  theme_graph()
Slightly tweaked below, with a different font and colours.
set.seed(666)
ggraph(graph, layout = "fr") +
  # Make edges less prominent
  geom_edge_link(
    linewidth = 0.2,
    alpha = 0.4,
    color = "gray70"
  ) +
  # Nodes further from me are smaller
  geom_node_point(
    aes(size = distance, color = distance)
  ) +
  # Add text to my (bold) & coauthors' (plain) nodes
  geom_node_text(
    data = . %>% filter(distance != 2),
    aes(
      label = name,
      fontface = ifelse(name == "Thomas V. Pollet", "bold", "plain")
    ),
    repel = TRUE,
    size = 2.25,
    family = "Roboto"
  ) +
  # Specify sizes, colors, and theme options
  scale_size_manual(values = c(2, 1, 0.5)) +
  scale_color_manual(
    values = wes_palette("GrandBudapest1", 3, type = "continuous")
  ) +
  theme_graph() +
  theme(
    legend.position = "none",
    text = element_text(family = "Roboto")
  )
## Warning: ggrepel: 42 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
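The warning means ggrepel gave up on labels it could not place without overlaps. A sketch of the `max.overlaps` argument it mentions follows, shown on a plain ggplot with invented data rather than on the network itself. Whether `geom_node_text(repel = TRUE)` forwards this argument unchanged is an assumption worth checking in the ggraph documentation.

```r
library(ggplot2)
library(ggrepel)

# Toy data: 30 labelled points crowded into a small area
set.seed(666)
toy <- data.frame(
  x = runif(30),
  y = runif(30),
  label = paste("Author", 1:30)
)

p <- ggplot(toy, aes(x, y, label = label)) +
  geom_point() +
  # max.overlaps = Inf asks ggrepel to keep every label, however crowded
  geom_text_repel(max.overlaps = Inf, size = 2.25)

p
```

Keeping every label trades the warning for a busier plot, so a smaller text size or a larger output device may be needed to keep it readable.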
Now export this via Cairo and then convert the .eps into a .pdf via TeXShop or the like.
cairo_ps("Psy_Arxiv_network.eps", height = 10, width = 10, family = "Roboto")
set.seed(666)
ggraph(graph, layout = "fr") +
  # Make edges less prominent
  geom_edge_link(
    linewidth = 0.2,
    alpha = 0.4,
    color = "gray70"
  ) +
  # Nodes further from me are smaller
  geom_node_point(
    aes(size = distance, color = distance)
  ) +
  # Add text to my (bold) & coauthors' (plain) nodes
  geom_node_text(
    data = . %>% filter(distance != 2),
    aes(
      label = name,
      fontface = ifelse(name == "Thomas V. Pollet", "bold", "plain")
    ),
    repel = TRUE,
    size = 3.2,
    family = "Roboto"
  ) +
  # Specify sizes, colors, and theme options
  scale_size_manual(values = c(2, 1, 0.5)) +
  scale_color_manual(
    values = wes_palette("GrandBudapest1", 3, type = "continuous")
  ) +
  theme_graph() +
  theme(
    legend.position = "none",
    text = element_text(family = "Roboto")
  )
# Close the device so the .eps file is written to disk
dev.off()
## Warning: ggrepel: 31 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
## quartz_off_screen
## 2
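As an alternative to converting the .eps afterwards, base R can write a PDF directly. This is a sketch of a different route, not what was done above: it uses `cairo_pdf()` from grDevices where Cairo is available and falls back to `pdf()` otherwise. The file name and the placeholder plot are invented, and the `family` argument is left at its default so the example runs without Roboto installed.

```r
# Write to a temporary folder so the example leaves no files behind
out_file <- file.path(tempdir(), "toy_network.pdf")

# Open a PDF device, preferring the Cairo-based one when R supports it
if (capabilities("cairo")) {
  cairo_pdf(out_file, width = 10, height = 10)
} else {
  pdf(out_file, width = 10, height = 10)
}

# Placeholder for the network plot
plot(1:10, main = "Placeholder for the network plot")

# Close the device so the file is written to disk
dev.off()

file.exists(out_file)
```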
## R version 4.5.1 (2025-06-13)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.5
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: Europe/London
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] Cairo_1.6-5 ggraph_2.2.2 tidygraph_1.3.1
## [4] papaja_0.1.3 tinylabels_0.2.5 showtext_0.9-7
## [7] showtextdb_3.0 sysfonts_0.8.9 scales_1.4.0
## [10] RColorBrewer_1.1-3 ggsci_3.2.0 wesanderson_0.3.7
## [13] ggrepel_0.9.6 ggthemes_5.1.0 psyarxivr_0.0.0.9000
## [16] jsonlite_2.0.0 patchwork_1.3.2 lubridate_1.9.4
## [19] forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4
## [22] purrr_1.1.0 readr_2.1.5 tidyr_1.3.1
## [25] tibble_3.3.0 ggplot2_3.5.2 tidyverse_2.0.0
## [28] pak_0.9.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 viridisLite_0.4.2 farver_2.1.2 viridis_0.6.5
## [5] fastmap_1.2.0 TH.data_1.1-3 tweenr_2.0.3 bayestestR_0.16.1
## [9] digest_0.6.37 estimability_1.5.1 timechange_0.3.0 lifecycle_1.0.4
## [13] survival_3.8-3 magrittr_2.0.3 compiler_4.5.1 rlang_1.1.6
## [17] sass_0.4.10 tools_4.5.1 igraph_2.1.4 utf8_1.2.6
## [21] yaml_2.3.10 knitr_1.50 labeling_0.4.3 graphlayouts_1.2.2
## [25] curl_6.4.0 multcomp_1.4-28 withr_3.0.2 grid_4.5.1
## [29] polyclip_1.10-7 datawizard_1.2.0 xtable_1.8-4 emmeans_1.11.2
## [33] MASS_7.3-65 insight_1.4.0 cli_3.6.5 mvtnorm_1.3-3
## [37] rmarkdown_2.29 generics_0.1.4 rstudioapi_0.17.1 tzdb_0.5.0
## [41] parameters_0.28.0 cachem_1.1.0 ggforce_0.5.0 splines_4.5.1
## [45] effectsize_1.0.1 vctrs_0.6.5 Matrix_1.7-3 sandwich_3.1-1
## [49] hms_1.1.3 jquerylib_0.1.4 glue_1.8.0 codetools_0.2-20
## [53] stringi_1.8.7 gtable_0.3.6 pillar_1.11.0 htmltools_0.5.8.1
## [57] R6_2.6.1 evaluate_1.0.4 lattice_0.22-7 memoise_2.0.1
## [61] bslib_0.9.0 Rcpp_1.1.0 coda_0.19-4.1 gridExtra_2.3
## [65] xfun_0.52 zoo_1.8-14 pkgconfig_2.0.3
We are grateful to all authors.
r_refs(file = "r-analysis-references.bib")
my_citation <- cite_r(file = "r-analysis-references.bib")
We used R (Version 4.5.1; R Core Team 2025) and the R-packages Cairo (Version 1.6.5; Urbanek and Horner 2025), dplyr (Version 1.1.4; Wickham et al. 2023), forcats (Version 1.0.0; Wickham 2023a), ggplot2 (Version 3.5.2; Wickham 2016), ggraph (Version 2.2.2; Pedersen 2025a), ggrepel (Version 0.9.6; Slowikowski 2024), ggsci (Version 3.2.0; Xiao 2024), ggthemes (Version 5.1.0; Arnold 2024), jsonlite (Version 2.0.0; Ooms 2014), lubridate (Version 1.9.4; Grolemund and Wickham 2011), pak (Version 0.9.0; Csárdi and Hester 2025), papaja (Version 0.1.3; Aust and Barth 2024), patchwork (Version 1.3.2; Pedersen 2025b), psyarxivr (Version 0.0.0.9000; Vuorre 2025), purrr (Version 1.1.0; Wickham and Henry 2025), RColorBrewer (Version 1.1.3; Neuwirth 2022), readr (Version 2.1.5; Wickham, Hester, and Bryan 2024), scales (Version 1.4.0; Wickham, Pedersen, and Seidel 2025), showtext (Version 0.9.7; Qiu et al. 2024b; Qiu et al. 2020), showtextdb (Version 3.0; Qiu et al. 2020), stringr (Version 1.5.1; Wickham 2023b), sysfonts (Version 0.8.9; Qiu et al. 2024a), tibble (Version 3.3.0; Müller and Wickham 2025), tidygraph (Version 1.3.1; Pedersen 2024), tidyr (Version 1.3.1; Wickham, Vaughan, and Girlich 2024), tidyverse (Version 2.0.0; Wickham et al. 2019), tinylabels (Version 0.2.5; Barth 2025) and wesanderson (Version 0.3.7; Ram and Wickham 2023).