library(poorman)
library(readxl)

# FDR threshold used to call a gene significant
kFdr <- 0.05

df <- local({
  kUrl <- paste0(
    "https://drive.google.com/uc?export=download&",
    "id=1xWVyoSSrs4hoqf5zVRgGhZPRjNY_fx_7"
  )
  temp_file <- tempfile(fileext = ".xlsx")
  download.file(kUrl, temp_file)
  df <- readxl::read_excel(temp_file, sheet = "mlx1 mutant LSD vs. HSD",
                           skip = 3)
  # label each gene by the sign of its fold change, if it passes the FDR cutoff
  df$direction <- with(df, poorman::case_when(
    logFC > 0 & adj.P.Val < kFdr ~ "up",
    logFC < 0 & adj.P.Val < kFdr ~ "down",
    TRUE ~ "n.s."
  ))
  df$direction <- factor(df$direction, levels = c("down", "up", "n.s."))
  return(df)
})
tl;dr
The ggiraph R package is my new favorite way to add interactivity to a ggplot.
Introduction
Last week I explored different ways to create collaborator-friendly volcano plots in R.
This week, a colleague asked me whether I could make it easier for them to identify which genes the points referred to. Luckily, there is no shortage of R packages to create interactive plots, including e.g. the plotly [1] or rbokeh packages. [2]
Both plotly and ggiraph interface with the ggplot2 R package, allowing me to switch between interactive and non-interactive versions of my plots with ease.
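As a minimal sketch of that switch, the same plot object can be returned as a plain ggplot or wrapped in ggiraph::girafe(). The toy data and the render_plot() helper are made up for illustration and are not part of ggiraph.

```r
library(ggplot2)
library(ggiraph)

# Toy data standing in for a real results table (made-up values)
toy <- data.frame(
  gene = c("geneA", "geneB", "geneC"),
  x = c(-1, 0, 2),
  y = c(3, 1, 4)
)

# One plot definition: the interactive geom carries the tooltip aesthetic
p_toy <- ggplot(toy, aes(x = x, y = y)) +
  geom_point_interactive(aes(tooltip = gene), size = 3)

# render_plot() is a hypothetical helper, not a ggiraph function
render_plot <- function(plot, interactive = TRUE) {
  if (interactive) {
    girafe(ggobj = plot)  # interactive htmlwidget
  } else {
    plot                  # plain ggplot object, prints as a static figure
  }
}

static_version <- render_plot(p_toy, interactive = FALSE)
interactive_version <- render_plot(p_toy, interactive = TRUE)
```

Keeping a single plot definition means the static and the interactive figure cannot drift apart.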
First, let’s get some differential gene expression data; please see my previous post for details.
We retrieve a table with the results from a differential gene expression analysis by downloading an Excel file published as supplementary table S2 by Mattila et al., 2015.
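To see how the direction labels are assigned without downloading the spreadsheet, here is a minimal sketch on a made-up table. The column names logFC and adj.P.Val and the 0.05 cutoff follow the code at the top of this post; the gene symbols and values are invented for illustration.

```r
library(poorman)

kFdr <- 0.05  # same FDR cutoff as in the post

# Made-up results for four hypothetical genes
toy <- data.frame(
  Symbol = c("geneA", "geneB", "geneC", "geneD"),
  logFC = c(1.5, -2.0, 2.5, -0.1),
  adj.P.Val = c(0.01, 0.001, 0.20, 0.60)
)

# A gene is "up" or "down" only if it passes the FDR cutoff;
# everything else is labelled "n.s.", however large its fold change
toy$direction <- with(toy, poorman::case_when(
  logFC > 0 & adj.P.Val < kFdr ~ "up",
  logFC < 0 & adj.P.Val < kFdr ~ "down",
  TRUE ~ "n.s."
))
toy$direction <- factor(toy$direction, levels = c("down", "up", "n.s."))

print(toy)
```

Note that geneC has the largest fold change in this toy table but is still labelled "n.s." because its adjusted p-value is above the cutoff.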
A non-interactive volcano plot
Next, we create a volcano plot using ggplot2:
library(ggplot2)
ggplot2::theme_set(theme_linedraw(base_size = 14))

p <- ggplot(
  data = df,
  mapping = aes(x = logFC, y = -log10(P.Value), color = direction)
) +
  scale_color_manual(values = c("up" = "#E41A1C",
                                "down" = "#377EB8",
                                "n.s." = "lightgrey"),
                     guide = "none") +
  geom_point(size = 2, alpha = 0.4) +
  labs(
    x = "Fold change (log2)",
    y = "-log10(p-value)"
  ) +
  theme(panel.grid = element_blank())
print(p)
An interactive volcano plot
Adding a tooltip for each point is as easy as replacing the geom_point() call with its ggiraph::geom_point_interactive() companion.
To see the result, please hover your mouse over a point in the plot below:
library(ggiraph)

p <- ggplot(
  data = df,
  mapping = aes(x = logFC, y = -log10(P.Value), color = direction)
) +
  scale_color_manual(values = c("up" = "#E41A1C",
                                "down" = "#377EB8",
                                "n.s." = "lightgrey"),
                     guide = "none") +
  ggiraph::geom_point_interactive(  # (1)
    aes(
      tooltip = sprintf("%s\nlogFC: %s\nFDR: %s",
                        Symbol, signif(logFC, digits = 3),
                        signif(adj.P.Val, digits = 2)
      )
    ),
    hover_nearest = TRUE,
    size = 3,
    alpha = 0.4) +
  labs(
    x = "Fold change (log2)",
    y = "-log10(p-value)"
  ) +
  theme(panel.grid = element_blank())

ggiraph::girafe(ggobj = p,  # (2)
                options = list(
                  opts_tooltip(use_fill = TRUE),
                  opts_zoom(min = 0.5, max = 5),
                  opts_sizing(rescale = FALSE),  # (3)
                  opts_toolbar(saveaspng = TRUE, delay_mouseout = 2000)
                )
)
- (1) geom_point_interactive() understands the tooltip aesthetic, so we can display the gene symbol, the log2 fold change and the FDR for each gene.
- (2) The ggiraph::girafe() function turns our ggplot object into an interactive graph, and its arguments define additional properties, e.g. the contents of the context menu, or the style of the tooltip information.
- (3) By default, ggiraph plots rescale to the size of the HTML container. To suppress this behavior, we set rescale = FALSE and rely on the fig-width and fig-height defined in this Quarto markdown document instead.
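The tooltip text itself is just a character vector built with sprintf(), so it can be inspected before it goes anywhere near a plot. A minimal sketch, using the same format string as above on two made-up genes (symbols and values are invented):

```r
# Made-up values for two hypothetical genes
toy <- data.frame(
  Symbol = c("geneA", "geneB"),
  logFC = c(1.23456, -0.98765),
  adj.P.Val = c(0.00123456, 0.2)
)

# Same format string as in the volcano plot: one tooltip per row,
# with the fold change rounded to 3 and the FDR to 2 significant digits
tooltips <- with(toy, sprintf(
  "%s\nlogFC: %s\nFDR: %s",
  Symbol, signif(logFC, digits = 3), signif(adj.P.Val, digits = 2)
))

# cat() interprets the "\n" line breaks, as the tooltip will
cat(tooltips, sep = "\n\n")
```

Because sprintf() is vectorized, the result has one element per row of the table, which is what the tooltip aesthetic expects.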
Combining ggrastr and ggiraph
Adding interactivity to the plot increases the size of the html page it is contained in. In case that’s a concern, e.g. when there are many plots on the same page, we can restrict the tool tips to a subset of the points, e.g. only those that pass our significance threshold.
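One way to check how much a given choice matters is to write each variant to a temporary HTML file and compare the file sizes. The sketch below does this on simulated data; the widget_size() helper, the simulated table and the use of htmlwidgets::saveWidget() are my assumptions for illustration, and the actual numbers will depend on the data and the package versions.

```r
library(ggplot2)
library(ggiraph)

set.seed(1)
# Simulated stand-in for a results table (hypothetical values)
sim <- data.frame(
  Symbol = paste0("gene", seq_len(2000)),
  logFC = rnorm(2000),
  P.Value = runif(2000)
)
sim$significant <- sim$P.Value < 0.05

# Hypothetical helper: save a girafe widget to a temporary HTML file
# and return the size of that file in bytes
widget_size <- function(plot) {
  out <- tempfile(fileext = ".html")
  htmlwidgets::saveWidget(girafe(ggobj = plot), file = out,
                          selfcontained = FALSE)
  file.size(out)
}

# Variant 1: every point carries a tooltip
p_all <- ggplot(sim, aes(x = logFC, y = -log10(P.Value))) +
  geom_point_interactive(aes(tooltip = Symbol))

# Variant 2: regular points, plus transparent interactive points
# for the "significant" subset only
p_subset <- ggplot(sim, aes(x = logFC, y = -log10(P.Value))) +
  geom_point() +
  geom_point_interactive(
    data = sim[sim$significant, ],
    aes(tooltip = Symbol),
    alpha = 0
  )

sizes <- c(all = widget_size(p_all), subset = widget_size(p_subset))
print(sizes)
```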
We can also combine ggiraph with the ggrastr package, first plotting all points as a rasterized image (which does not encode the position of each point separately) and then overlaying transparent interactive points for the significant genes.
library(ggrastr)  # (1)
ggplot2::theme_set(theme_linedraw(base_size = 14))

p <- ggplot(
  data = df,
  mapping = aes(x = logFC, y = -log10(P.Value), color = direction)
) +
  scale_color_manual(values = c("up" = "#E41A1C",
                                "down" = "#377EB8",
                                "n.s." = "lightgrey"),
                     guide = "none") +
  ggrastr::geom_point_rast(size = 2, alpha = 0.4) +  # (2)
  ggiraph::geom_point_interactive(
    data = poorman::filter(df, direction != "n.s."),  # (3)
    aes(
      tooltip = sprintf("%s\nlogFC: %s\nFDR: %s",
                        Symbol, signif(logFC, digits = 3),
                        signif(adj.P.Val, digits = 2)
      )
    ),
    hover_nearest = TRUE,
    size = 3,
    alpha = 0) +  # (4)
  labs(
    x = "Fold change (log2)",
    y = "-log10(p-value)"
  ) +
  theme(panel.grid = element_blank())

ggiraph::girafe(ggobj = p,
                options = list(
                  opts_tooltip(use_fill = TRUE),
                  opts_zoom(min = 0.5, max = 5),
                  opts_sizing(rescale = FALSE),
                  opts_toolbar(saveaspng = TRUE, delay_mouseout = 2000)
                )
)
- (1) The ggrastr R package offers drop-in replacements for ggplot2 functions that help reduce the size (and complexity) of graphics.
- (2) We add a rasterized layer with all points.
- (3) Subsetting the data.frame passed as the data argument restricts interactivity to only the significant genes.
- (4) Because the points are already drawn by the ggrastr::geom_point_rast function, we set alpha = 0 to obtain transparent points that will trigger the display of the tooltip.
The ggiraph package comes with excellent documentation - check it out!
Reproducibility
Session Information
sessioninfo::session_info("attached")
─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 4.3.2 (2023-10-31)
os Debian GNU/Linux 12 (bookworm)
system x86_64, linux-gnu
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/Los_Angeles
date 2024-04-13
pandoc 3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
─ Packages ───────────────────────────────────────────────────────────────────
! package * version date (UTC) lib source
P ggiraph * 0.8.9 2024-02-24 [?] RSPM
P ggplot2 * 3.5.0 2024-02-23 [?] RSPM
P ggrastr * 1.0.2 2023-06-01 [?] CRAN (R 4.3.1)
P poorman * 0.2.7 2023-10-30 [?] RSPM
P readxl * 1.4.3 2023-07-06 [?] CRAN (R 4.3.1)
[1] /home/sandmann/repositories/blog/renv/library/R-4.3/x86_64-pc-linux-gnu
[2] /home/sandmann/.cache/R/renv/sandbox/R-4.3/x86_64-pc-linux-gnu/9a444a72
P ── Loaded and on-disk path mismatch.
──────────────────────────────────────────────────────────────────────────────
This work is licensed under a Creative Commons Attribution 4.0 International License.
Footnotes
[1] My previous favorite - also available for other languages including python.
[2] See Robert Kabacoff’s “Modern Data Visualization with R” online book for some great examples.