Interactive volcano plots with the ggiraph R package

R
ggplot2
TIL
Author

Thomas Sandmann

Published

April 11, 2024

tl;dr

The ggiraph R package is my new favorite way to add interactivity to a ggplot.

Introduction

Last week I explored different ways to create collaborator-friendly volcano plots in R.

This week, a colleague asked me whether I could make it easier for them to identify which genes the points referred to. Luckily, there is no shortage of R packages to create interactive plots, including e.g. the plotly [1] or ggiraph [2] packages.

Both plotly and ggiraph interface with the ggplot2 R package, allowing me to switch between interactive and non-interactive versions of my plots with ease.
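To illustrate what switching between the two versions can look like, here is a minimal sketch with ggiraph. The toy data frame and its values are made up for illustration, and the object names (toy, p_toy, widget) are placeholders rather than anything used later in this post.

```r
library(ggplot2)
library(ggiraph)

# Toy data standing in for a differential expression table (hypothetical values)
toy <- data.frame(
  Symbol = c("geneA", "geneB", "geneC"),
  logFC = c(2.1, -1.4, 0.3),
  P.Value = c(1e-6, 1e-4, 0.4)
)

# A regular ggplot object that happens to use an interactive geom
p_toy <- ggplot(toy, aes(x = logFC, y = -log10(P.Value))) +
  ggiraph::geom_point_interactive(aes(tooltip = Symbol), size = 3)

# Non-interactive version: print the ggplot object as usual
print(p_toy)

# Interactive version: wrap the same object into an htmlwidget
widget <- ggiraph::girafe(ggobj = p_toy)
```

The same ggplot object feeds both versions, so the plot only has to be specified once.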

First, let’s get some differential gene expression data. Please see my previous post for details.

Let’s retrieve a table with the results from a differential gene expression analysis by downloading an Excel file published as supplementary table S2 by Mattila et al., 2015.

library(poorman)
library(readxl)
kFdr <- 0.05

df <- local({
  kUrl <- paste0(
    "https://drive.google.com/uc?export=download&",
    "id=1xWVyoSSrs4hoqf5zVRgGhZPRjNY_fx_7"
  )  
  temp_file <- tempfile(fileext = ".xlsx")
  download.file(kUrl, temp_file)
  df <- readxl::read_excel(temp_file, sheet = "mlx1 mutant LSD vs. HSD",
                           skip = 3)
  
  df$direction <- with(df, poorman::case_when(
    logFC > 0 & adj.P.Val < kFdr ~ "up",
    logFC < 0 & adj.P.Val < kFdr ~ "down",
    TRUE ~ "n.s."
  ))
  df$direction <- factor(df$direction, levels = c("down", "up", "n.s."))
  return(df)
})
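To make the labelling step easier to follow, here is a small sketch that applies the same case_when() rules to a handful of made-up genes. The toy_de data frame and its values are hypothetical and only there to show which label each combination of fold change and adjusted p-value receives.

```r
library(poorman)
kFdr <- 0.05

# Made-up results for four genes (hypothetical values)
toy_de <- data.frame(
  Symbol = c("geneA", "geneB", "geneC", "geneD"),
  logFC = c(2.1, -1.4, 0.3, -0.2),
  adj.P.Val = c(0.001, 0.02, 0.6, 0.04)
)

# Same rules as above: the sign of logFC decides the direction,
# and everything that misses the FDR threshold is labelled "n.s."
toy_de$direction <- with(toy_de, poorman::case_when(
  logFC > 0 & adj.P.Val < kFdr ~ "up",
  logFC < 0 & adj.P.Val < kFdr ~ "down",
  TRUE ~ "n.s."
))
toy_de$direction <- factor(toy_de$direction, levels = c("down", "up", "n.s."))

# Number of genes in each category
table(toy_de$direction)
```

Note that geneD is labelled "down" despite its small fold change, because only the FDR threshold is applied here.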

A non-interactive volcano plot

Next, we create a volcano plot using ggplot2:

library(ggplot2)
ggplot2::theme_set(theme_linedraw(base_size = 14))

p <- ggplot(
  data = df, 
  mapping = aes(x = logFC, y = -log10(P.Value), color = direction)
) +
  scale_color_manual(values = c("up" = "#E41A1C",
                                "down" = "#377EB8", 
                                "n.s." = "lightgrey"),
                     guide = "none") +
  geom_point(size = 2, alpha = 0.4) +
  labs(
    x = "Fold change (log2)",
    y = "-log10(p-value)"
  ) +
  theme(panel.grid = element_blank())
print(p)

An interactive volcano plot

Adding a tooltip for each point is as easy as replacing the geom_point() call with its ggiraph::geom_point_interactive() companion.

To see the result, please hover your mouse over a point in the plot below:

library(ggiraph)

p <- ggplot(
  data = df, 
  mapping = aes(x = logFC, y = -log10(P.Value), color = direction)
) +
  scale_color_manual(values = c("up" = "#E41A1C",
                                "down" = "#377EB8", 
                                "n.s." = "lightgrey"),
                     guide = "none") +
  ggiraph::geom_point_interactive(  # (1)
    aes(
      tooltip = sprintf("%s\nlogFC: %s\nFDR: %s", 
                        Symbol, 
                        signif(logFC, digits = 3),
                        signif(adj.P.Val, digits = 2)
                        )
    ),
    hover_nearest = TRUE, 
    size = 3,
    alpha = 0.4) +
  labs(
    x = "Fold change (log2)",
    y = "-log10(p-value)"
  ) +
  theme(panel.grid = element_blank())

ggiraph::girafe(ggobj = p,  # (2)
                options = list(
                  opts_tooltip(use_fill = TRUE),
                  opts_zoom(min = 0.5, max = 5),
                  opts_sizing(rescale = FALSE),  # (3)
                  opts_toolbar(saveaspng = TRUE, delay_mouseout = 2000)
                )
)
  1. geom_point_interactive() understands the tooltip aesthetic, so we can display the gene symbol, the log2 fold change and the FDR for each gene.

  2. The ggiraph::girafe() function turns our ggplot object into an interactive graph, and its arguments define additional properties, e.g. the contents of the context menu, or the style of the tooltip.

  3. By default, ggiraph plots rescale to the size of the HTML container. To suppress this behavior, we set rescale = FALSE and rely on the fig-width and fig-height defined in this Quarto markdown document instead.
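Outside of a Quarto document there is no fig-width or fig-height to fall back on. In that case, the size can be passed to girafe() directly through its width_svg and height_svg arguments (in inches). A minimal sketch, with a made-up toy data frame and arbitrary dimensions:

```r
library(ggplot2)
library(ggiraph)

# Toy data (hypothetical values)
toy <- data.frame(
  Symbol = c("geneA", "geneB", "geneC"),
  logFC = c(2.1, -1.4, 0.3),
  P.Value = c(1e-6, 1e-4, 0.4)
)

p_toy <- ggplot(toy, aes(x = logFC, y = -log10(P.Value))) +
  ggiraph::geom_point_interactive(aes(tooltip = Symbol), size = 3)

# Fix the size of the graphic explicitly instead of rescaling to the container
widget <- ggiraph::girafe(
  ggobj = p_toy,
  width_svg = 6,   # width in inches (arbitrary choice)
  height_svg = 5,  # height in inches (arbitrary choice)
  options = list(ggiraph::opts_sizing(rescale = FALSE))
)
```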

Combining ggrastr and ggiraph

Adding interactivity to the plot increases the size of the HTML page it is contained in. In case that’s a concern, e.g. when there are many plots on the same page, we can restrict the tooltips to a subset of the points, e.g. only those that pass our significance threshold.
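One way to check whether restricting the tooltips pays off is to save both variants to disk and compare the file sizes. The sketch below does that with simulated data. The sim data frame, the 0.05 cutoff on the raw p-value and the widget_size() helper are all made up for illustration, and the numbers will differ for real data.

```r
library(ggplot2)
library(ggiraph)

set.seed(1)
# Simulated results (hypothetical values, not the Mattila et al. data)
n <- 2000
sim <- data.frame(
  Symbol = paste0("gene", seq_len(n)),
  logFC = rnorm(n),
  P.Value = runif(n)
)
sim$significant <- sim$P.Value < 0.05

# Helper (hypothetical name): size in bytes of the HTML file for a plot
widget_size <- function(p) {
  f <- tempfile(fileext = ".html")
  htmlwidgets::saveWidget(ggiraph::girafe(ggobj = p), file = f,
                          selfcontained = FALSE)
  file.size(f)
}

base <- ggplot(sim, aes(x = logFC, y = -log10(P.Value)))

# Variant 1: tooltips for every point
p_all <- base + ggiraph::geom_point_interactive(aes(tooltip = Symbol))

# Variant 2: regular points, plus transparent interactive points
# for the significant subset only
p_subset <- base +
  geom_point() +
  ggiraph::geom_point_interactive(
    data = sim[sim$significant, ],
    aes(tooltip = Symbol),
    alpha = 0
  )

sizes <- c(all = widget_size(p_all), subset = widget_size(p_subset))
print(sizes)
```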

We can also combine ggiraph with the ggrastr package, first plotting all points as a rasterized image (which does not encode the position of each point separately) and then overlaying transparent interactive points for the significant genes.

library(ggrastr)  # (1)
ggplot2::theme_set(theme_linedraw(base_size = 14))

p <- ggplot(
  data = df, 
  mapping = aes(x = logFC, y = -log10(P.Value), color = direction)
) +
  scale_color_manual(values = c("up" = "#E41A1C",
                                "down" = "#377EB8", 
                                "n.s." = "lightgrey"),
                     guide = "none") +
  ggrastr::geom_point_rast(size = 2, alpha = 0.4) +  # (2)
  ggiraph::geom_point_interactive(
    data = poorman::filter(df, direction != "n.s."),  # (3)
    aes(
      tooltip = sprintf("%s\nlogFC: %s\nFDR: %s", 
                        Symbol, 
                        signif(logFC, digits = 3),
                        signif(adj.P.Val, digits = 2)
                        )
    ),
    hover_nearest = TRUE, 
    size = 3,
    alpha = 0) +  # (4)
  labs(
    x = "Fold change (log2)",
    y = "-log10(p-value)"
  ) +
  theme(panel.grid = element_blank())

ggiraph::girafe(ggobj = p,
                options = list(
                  opts_tooltip(use_fill = TRUE),
                  opts_zoom(min = 0.5, max = 5),
                  opts_sizing(rescale = FALSE),
                  opts_toolbar(saveaspng = TRUE, delay_mouseout = 2000)
                )
)
  1. The ggrastr R package offers drop-in replacements for ggplot2 functions that help reduce the size (and complexity) of graphics.

  2. We add a rasterized layer with all points.

  3. Subsetting the data.frame passed as the data argument restricts interactivity to only the significant genes.

  4. Because the points are already drawn by the ggrastr::geom_point_rast() function, we set alpha = 0 to obtain transparent points that will trigger the display of the tooltip.

The ggiraph package comes with excellent documentation - check it out!

Reproducibility

Session Information
sessioninfo::session_info("attached")
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31)
 os       Debian GNU/Linux 12 (bookworm)
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/Los_Angeles
 date     2024-04-13
 pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 ! package * version date (UTC) lib source
 P ggiraph * 0.8.9   2024-02-24 [?] RSPM
 P ggplot2 * 3.5.0   2024-02-23 [?] RSPM
 P ggrastr * 1.0.2   2023-06-01 [?] CRAN (R 4.3.1)
 P poorman * 0.2.7   2023-10-30 [?] RSPM
 P readxl  * 1.4.3   2023-07-06 [?] CRAN (R 4.3.1)

 [1] /home/sandmann/repositories/blog/renv/library/R-4.3/x86_64-pc-linux-gnu
 [2] /home/sandmann/.cache/R/renv/sandbox/R-4.3/x86_64-pc-linux-gnu/9a444a72

 P ── Loaded and on-disk path mismatch.

──────────────────────────────────────────────────────────────────────────────

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Footnotes

  1. My previous favorite - also available for other languages including Python.

  2. See Robert Kabacoff’s “Modern Data Visualization with R” online book for some great examples.