Fitted-model labels in Markdown

‘ggpmisc’ together with ‘ggtext’

Plotting examples
Author

Pedro J. Aphalo

Published

2024-06-01

Modified

2024-06-08

Abstract

Example R code for plots with predictions from fitted models labelled with model fitted equations, parameter estimates, R2, F-value, P-value, etc. Polynomial regression, quantile regression, major-axis regression, robust linear model fits as well as non-linear regression are examplified. Markdown encoded labels make use of statistics from package ‘ggpmisc’ and geometries fron ‘ggtext’.

Keywords

ggplot2 pkg, ggpp pkg, ggpmisc pkg, data labels, plot annotations, model equations

Tip

In this page code chunks are “folded” so as to decrease the clutter when searching for examples. Above each plot you will find one or more “folded” code chuncks signalled by a small triangle followed by “Code”. Clicking on the triangle “unfolds” the code chunk making visible the R code used to produce the plot.

The code in the chunks can be copied by clicking on the top right corner, where an icon appears when the mouse cursor hovers over the code listing.

The </> Code drop down menu to the right of the page title makes it possible to unfold all code chunks and to view the Quarto source of the whole web page.

Names of functions and other R objects are linked to the corresponding on-line help pages. The names of R extension packages are linked to their documentation web sites when available.

1 Introduction

Here you will find examples of ggplots with labels for fitted model equations, and various parameter estimates. The labels have been added to the plots with geometries defined in package ‘ggtext’ and statistics from package ‘ggpmisc’ both available at CRAN.

Currently, the labels are not always correctly rendered by marquee::geom_marquee(), which is at an early stage of development.

The code examples in this page are edited versions on those in a page with equivalent labels using R plotmath expressions.

Warning

The statistics from ‘ggpmisc’ use by default the geometries from ‘ggpp’. These geometries support like ggplot2::geom_text() and ggplot2::geom_label() character string "inward" for justification. We use in the examples in the page explicit arguments for justification and label positions as geom_richtext() does not seem to support these rather recent feature of geometries from ‘ggplot2’ also added to ‘ggpp’.

Important

One needs to always check that annotations do not occlude anything significant, such as observations in the base plot. This needs special care when using annotations together with batch plotting. Either ensure that the scale limits of the base plot are expanded to avoid overlap or that the layer with the equations is the lowest one, i.e., added to the plot first.

2 Preliminaries

The code used is shown on-demand above each plot and can be copied. We first load the packages we will use.

When package ggpmisc is loaded and attached, packages ggpp and ggplot2 are also attached. The only function from ggplot2 that is redefined by ggpp is annotate(), which remains backwards compatible with ggplot2.

3 Linear correlation

Code
# generate artificial data for a linear relationship
# two groups with different standard deviation around a slope = 1
set.seed(94321)
x <- (1:100) / 10
yA <- x + rnorm(length(x), sd = 2)
yB <- x + rnorm(length(x), sd = 8)
df1 <- data.frame(x = rep(x, 2),
                  y = c(yA, yB),
                  group = rep(c("A", "B"), rep(length(x), 2L)))

The examples in this section make use of stat_correlation() from package ggpmisc.

All the examples, except the one below, use the default border and fill of geom_richtext(), setting the label.size = 0 and fill = NA makes the output similar to that of geom_text().

Code
ggplot(subset(df1, group == "A"), 
       aes(x, y)) +
  geom_point() +
  stat_correlation(geom = "richtext", 
                   label.size = 0,
                   fill = NA,
                   hjust = 0, vjust = 1)

Pearson correlation.

Code
ggplot(subset(df1, group == "A"), 
       aes(x, y)) +
  geom_point() +
  stat_correlation(geom = "richtext", hjust = 0, vjust = 1)

Code
ggplot(subset(df1, group == "A"), 
       aes(x, y)) +
  geom_point() +
  stat_correlation(small.r = TRUE, geom = "richtext", 
                   hjust = 0, vjust = 1)

Spearman correlation. (The default is to not compute the confidence interval.)

Code
ggplot(subset(df1, group == "A"), 
       aes(x, y)) +
  geom_point() +
  stat_correlation(method = "spearman", geom = "richtext", 
                   hjust = 0, vjust = 1)

Kendall correlation. (The default is to not compute the confidence interval.)

Code
ggplot(subset(df1, group == "A"), 
       aes(x, y)) +
  geom_point() +
  stat_correlation(method = "kendall", geom = "richtext", 
                   hjust = 0, vjust = 1)

Select which labels to show in plot, number of digits to display.

Code
ggplot(subset(df1, group == "A"), aes(x, y)) +
  geom_point() +
  stat_correlation(mapping = use_label("R", "P", "n", sep = ", "),
                   r.digits = 3, p.digits = Inf, geom = "richtext",
                   hjust = 0, vjust = 1)

Select which labels to show in plot.

Code
ggplot(subset(df1, group == "A"), aes(x, y)) +
  geom_point() +
  stat_correlation(mapping = use_label("R", "t", "P", "n", sep = ", "), 
                   geom = "richtext", 
                   hjust = 0, vjust = 1)

Confidence interval for R (Pearson’s correlation).

Code
ggplot(subset(df1, group == "A"), aes(x, y)) +
  geom_point() +
  stat_correlation(mapping = use_label("R", "R.confint", sep = ", "), 
                   geom = "richtext",
                   hjust = 0, vjust = 1)

Confidence interval for Spearman’s correlation.

Code
ggplot(subset(df1, group == "A"), 
       aes(x, y)) +
  geom_point() +
  stat_correlation(mapping = use_label("R", "R.confint", sep = ", "),
                   r.conf.level = 0.95,
                   method = "spearman", geom = "richtext",
                   hjust = 0, vjust = 1)

Grouping supported.

Code
ggplot(df1, aes(x, y, color = group)) +
  geom_point() +
  stat_correlation(geom = "richtext",
                   vstep = 0.1,
                   hjust = 0, vjust = 1)

Facets supported.

Code
ggplot(df1, aes(x, y)) +
  geom_point() +
  stat_correlation(label.x = "right", label.y = "bottom", 
                   geom = "richtext",
                   hjust = 1, vjust = 0) +
  facet_wrap(~group)

Highlighting based on estimates, here estimated R but it is possible to use other estimates like P-value.

Code
ggplot(df1, aes(x, y)) +
  geom_point() +
  stat_correlation(geom = "richtext", 
                   hjust = 0, vjust = 1,
                   mapping = 
                     aes(color = ifelse(after_stat(cor) > 0.5,
                                        "red", "black"))) +
  scale_color_identity() +
  facet_wrap(~group)

3.1 Alternatives

Tip

Package ‘ggpubr’ provides alternative approaches to correlation testing and reporting.

4 OLS regression with polynomials

Polynomials can be fitted as linear models by the method of ordinary least squares (OLS), but many other linear models exist. Robust and resistant variations also exist, based on a very similar user interface.

The most common use case of a polynomial is linear regression. Polynomials are very frequently used, and in the case of linear regression the role of parameters can be easily recognized.

The examples in this section make use of stat_poly_line() and stat_poly_eq() from package ggpmisc. The model formula accepted follows the same format as accepted by lm(). However, automatic equation labels are generated only for subset of these.

Warning

As is the case in general for model fitting with R, function poly() can be used within model formulae, to describe polynomials. The default behaviour of poly() is to generate orthogonal polynomials by centring the data. This helps with computation, but produces coefficient estimates that are not directly interpretable. Thus, eq.label is set to NA unless raw = TRUE is passed in the call to poly() to disable the generation of orthogonal polynomials.

As predict() methods take into account whether the polynomial is orthogonal or raw, the plotted lines differ only in case of numerical accuracy problems. On the other hand the coefficient estimates are very different.

Tip

In statistics from ‘ggplot2’ and extension packages, model formulas are always defined using the name of aesthetics, x and y, not the names of the mapped variables. By default, equations added with stat_poly_eq() make use of x and y, but this can be altered by passing arguments in the call.

The model corresponding to linear regression, y ~ x, is a first degree polynomial, and it is the default of both statistics. The default method = "lm" fits it as a Linear Model (LM) by OLS.

Code
ggplot(df1, aes(x, y)) +
  geom_point() +
  stat_poly_line() +
  stat_poly_eq(use_label("eq"), 
               label.x = "right", label.y = "bottom", 
                   geom = "richtext",
                   hjust = 1, vjust = 0) +
  facet_wrap(~group)

Most likely the variables mapped onto aesthetics x and y have different names than the aesthetics. If we use a symbol to represent these variables, we will want to use these same symbols in the equation.

Code
ggplot(df1, aes(x, y)) +
  geom_point() +
  stat_poly_line() +
  stat_poly_eq(use_label("eq"), 
               eq.with.lhs = "_t_ = ", 
               eq.x.rhs = "_E_",
               label.x = "right", label.y = "bottom", 
               geom = "richtext",
               hjust = 1, vjust = 0) +
labs(x = expression("Irradiance, "*italic(E)~~(W~m^{-2})),
       y = expression("Temperature, "*italic(t)~~(degree*"C"))) +
  facet_wrap(~group)

Code
set.seed(4321)
# generate artificial data for a 3rd degree polynomial
x <- 1:100
y <- (x + x^2 + x^3) + rnorm(length(x), mean = 0, sd = mean(x^3) / 4)
y <- y / max(y)
df2 <- data.frame(x, 
                  y, 
                  group = c("A", "B"), 
                  y2 = y * c(1, 2) + c(0, 0.2),
                  block = c("a", "a", "b", "b"),
                  wt = sqrt(x))

\(R^2\) is the default label, as its estimate is always available, even when the model formula passed as argument is not that of a full polynomial.

Code
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
  geom_point() +
  stat_poly_line(formula = formula) +
  stat_poly_eq(formula = formula, geom = "richtext", hjust = 0, vjust = 1)

Fitted model equation, available for polynomials with no missing terms. Here we create the mapping with a call to use_label(), a convenience function that can be called instead of aes() for conciseness.

Code
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
  geom_point() +
  stat_poly_line(formula = formula) +
  stat_poly_eq(mapping = use_label("eq"), formula = formula, 
               geom = "richtext", hjust = 0, vjust = 1)

\(R_\mathrm{adj}^2\) and P-value added separately. This is necessary if separate locations or colours are desired for different labels derived from the same model fit.

Code
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
  geom_point() +
  stat_poly_line(formula = formula) +
  stat_poly_eq(mapping = use_label("adj.R2"), formula = formula, 
               geom = "richtext", hjust = 0, vjust = 1) +
  stat_poly_eq(mapping = use_label("P"), label.x = "right", label.y = "bottom", size = 3,
               formula = formula, 
               geom = "richtext", hjust = 1, vjust = 0)

Fitted model equation and \(R_\mathrm{adj}^2\). As seen above, function use_label() pastes together multiple labels. Here we pass an argument to sep.

Code
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
  geom_point() +
  stat_poly_line(formula = formula) +
  stat_poly_eq(mapping = use_label(c("eq", "adj.R2"), sep = "<br>_with_ "),
               formula = formula, geom = "richtext", hjust = 0, vjust = 1)

Equation with no left-hand-side (lhs).

Code
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
  geom_point() +
  stat_poly_line(formula = formula) +
  stat_poly_eq(mapping = use_label("eq"),
               eq.with.lhs = FALSE,
               formula = formula, geom = "richtext", hjust = 0, vjust = 1)

Variable names in the equation set to be informative, and match the axis labels.

Code
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
  geom_point() +
  stat_poly_line(formula = formula) +
  stat_poly_eq(mapping = use_label(c("eq", "R2", "P"), sep = ", "),
               eq.with.lhs = "_h_ = ",
               eq.x.rhs = "_z_",
               formula = formula, 
               geom = "richtext", hjust = 0, vjust = 1) +
  labs(x = expression(italic(z)), y = expression(italic(h)))

Even Greek charactes and maths can be incorporated when needed.

Code
formula <- y ~ poly(x, 2, raw = TRUE)
ggplot(df2, aes(x, log10(y + 1e6))) +
  geom_point() +
  stat_poly_line(formula = formula) +
  stat_poly_eq(mapping = use_label("eq"),
               eq.with.lhs = "log<sub>10</sub>_&delta;_ + 10<sup>6</sup> = ",
               eq.x.rhs = "_&Omega;_",
               formula = formula, 
               geom = "richtext", hjust = 0, vjust = 1) +
  labs(y = expression(plain(log)[10](italic(delta)+10^6)), 
       x = expression(Omega))

Caution

stat_poly_eq() by default assumes that the fitted polynomial has the variable mapped to the x or y aesthetic as explanatory variable depending on the model formula and argument passed to parameter orientation. It does not decode any transformations applied on the fly to x or y. In addition it does not generate eq.label for linear models that are not polynomials with no missing terms and terms entered in increasing order of their powers.

If a transformation is applied or any other deviation from a regular polynomial is passed to formula, it is still possible to use the labels for other fitted parameters. It is also possible to set output.type = "numeric" and manually build the equation label, using sprintf() as shown below, or a combination of paste() and format() or any other R function that can assemble a suitable character string from the numeric values returned.

When using this non-automated approach, any valid linear model can be fitted.

Code
formula <- y ~ I(x^2) # we apply a transformation to x within the formula
ggplot(df2, aes(x, y)) +
  geom_point() +
  stat_poly_line(formula = formula) +
  stat_poly_eq(mapping = aes(label = sprintf_dm("y = %.3g x<sup>2</sup>",
                                                after_stat(b_0))), 
               output.type = "numeric",
               formula = formula, 
               geom = "richtext", hjust = 0, vjust = 1)

In some cases, such as when applying a power as transformation and fitting higher degree polynomials, one needs to manually build the R expression representing the equation to simplify it.

The examples shown above also work with groups. As in ‘ggplot2’, models are fitted separately to the data in each group.

Code
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y2, colour = group)) +
  geom_point() +
  stat_poly_line(formula = formula) +
  stat_poly_eq(use_label("eq"), 
               formula = formula, 
               vstep = 0.1, 
               geom = "richtext", hjust = 0, vjust = 1) +
  theme_bw()

Using colours to indicate the groups to which equations correspond is not always best. Here we use textual labels pasted to the left of each equation.

Code
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y2, linetype = group, grp.label = group)) +
  geom_point() +
  stat_poly_line(formula = formula, color = "black") +
  stat_poly_eq(aes(label = after_stat(paste("**", grp.label, ":**  ", 
                                      eq.label, sep = ""))),
               formula = formula, geom = "richtext", 
               hjust = 0, vjust = 1, vstep = 0.1)

Facets are also supported.

Code
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y2)) +
  geom_point() +
  stat_poly_line(formula = formula) +
  stat_poly_eq(aes(label = after_stat(eq.label)),
               formula = formula, geom = "richtext", hjust = 0, vjust = 1) +
  facet_wrap(facets = vars(group), scales = "free_y", ncol = 1)

In the stats defined in ggpmsic the fit method function can modify the model passed as argument because the one used to build the equation is retrieved, when possible, from the fitted model object. In this example, a linear regression is fitted if the slope differs significantly from zero, but if not, the mean is reported, both graphically and numerically.

Code
# user defined fit method removes the slope if 
# the slope is not significant
poly_or_mean <- function(formula, data, ...) {
   fm <- lm(formula = formula, data = data, ...)
   if (anova(fm)[["Pr(>F)"]][1] > 0.1) {
      lm(formula = y ~ 1, data = data, ...)
   } else {
      fm
   }
}

# we create a plot as usual, but with our method
ggplot(mpg, aes(displ, hwy)) +
   geom_point() +
   stat_poly_line(method = "poly_or_mean") +
   stat_poly_eq(method = poly_or_mean,
                use_label("eq"),
                label.x = "right", 
                geom = "richtext", hjust = 1, vjust = 1) +
   theme(legend.position = "bottom") +
   facet_wrap(~class, ncol = 2)

Faceting works also with free x and/or y scales in panels.

Code
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y2, fill = block)) +
  geom_point(shape = 21, size = 3) +
  stat_poly_line(formula = formula) +
  stat_poly_eq(use_label("R2"), 
               size = 3,
               alpha = 0.33,
               formula = formula,
               geom = "richtext", 
               hjust = 0, vjust = 1, vstep = 0.1) +
  facet_wrap(~group, scales = "free_y") +
  theme(legend.position = "top")

Faceting works also with free x and/or y scales in panels when the position of labels is set by a string.

Code
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y2, fill = block)) +
  geom_point(shape = 21, size = 3) +
  stat_poly_line(formula = formula) +
  stat_poly_eq(use_label("R2"), size = 3,
               alpha = 0.33,
               formula = formula,
               label.x = "right", label.y = "bottom",
               rr.digits = 3,
               geom = "richtext", hjust = 1, vjust = 0, vstep = 0.1) +
  stat_poly_eq(use_label(c("F", "P")),
               size = 3,
               alpha = 0.33,
               formula = formula, 
               geom = "richtext", hjust = 0, vjust = 1, vstep = 0.1) +
  facet_wrap(~group, scales = "free_y") +
  theme(legend.position = "top")

Some artificial data with rather high random variation on both x and y.

Code
set.seed(94321)
a <- (1:100) / 10
x <- a + rnorm(length(x), sd = 1)
y <- a + rnorm(length(x), sd = 2)

df3 <- data.frame(x = x, y = y, y5 = y + 5)

Linear regression through the origin.

Code
# model with intercept = 0
formula <- y ~ x + 0
ggplot(df3, aes(x, y)) +
  geom_point() +
  stat_poly_line(formula = formula) +
  stat_poly_eq(formula = formula,
               mapping = use_label("eq"), 
               geom = "richtext", hjust = 0, vjust = 1)

Caution

These two examples show how to fit only one of the two parameters of a linear regression using the statistics from ‘ggpmisc’. These are unusual use cases but serve as examples of how simple transformations included in the model formula using the identity function I() can be “undone” at the after_stat() stage. Both examples use function stage(), available only in ‘ggplot2’ (>= 3.4.0).

Linear regression with a fixed (instead of fitted) y-intercept (y = 5 at x = 0) and a fitted slope.

Code
ggplot(df3) +
  stat_poly_line(mapping = aes(x = x, 
                               y = stage(start = y5, after_stat = y + 5),
                               ymax = after_stat(ymax + 5),
                               ymin = after_stat(ymin + 5)),
                 formula = I(y - 5) ~ x + 0) +
  stat_poly_eq(mapping = aes(x, y5, 
                             label = after_stat(paste(eq.label, " + 5, ", rr.label))), 
               orientation = "x", 
               formula = I(y - 5) ~ x + 0, 
               geom = "richtext", hjust = 0, vjust = 1) +
  geom_point(mapping = aes(x, y5), size = 2.5) +
  ylab("y")

Linear regression with a fixed slope of 1:1 (instead of fitted) and a fitted y-intercept.

Code
ggplot(df3) +
  stat_poly_line(mapping = aes(x = x,
                               y = stage(start = y5, after_stat = y + x - mean(x)),
                               ymax = after_stat(ymax + x - mean(x)),
                               ymin = after_stat(ymin + x - mean(x))),
                 formula = y ~ 1) +
  stat_poly_eq(mapping = aes(x, y5 - mean(x), 
                             label = sprintf("y = %.3g + _x_", 
                                             after_stat(b_0))), 
               parse = TRUE,
               output.type = "numeric",
               formula = y ~ 1, 
               geom = "richtext", hjust = 0, vjust = 1) +
  geom_point(mapping = aes(x, y5), size = 2.5) +
  ylab("y")

Regressions of y on x and x on y in the same plot. The lines are drawn for the range of the explanatory variable, which differs in this case between the two fits.

Code
# the default for formula is y ~ x
ggplot(df3, aes(x, y)) +
  geom_point() +
  stat_poly_line(color = "blue") +
  stat_poly_eq(mapping = use_label("R2", "eq", sep = ", "), 
               color = "blue", geom = "richtext", hjust = 0, vjust = 1) +
  stat_poly_line(color = "red", 
                 orientation = "y") +
  stat_poly_eq(mapping = use_label("R2", "eq", sep = ", "), 
               color = "red", 
               orientation = "y",
               label.y = 11, geom = "richtext", hjust = 0, vjust = 1)

4.1 Alternatives

Tip

Package ‘ggpubr’ provides a renamed copy of stat_poly_eq() taken from an old version of ggpmisc. The version in ‘ggpubr’ is much more limited in its functionality and even contains bugs. The line drawing code in stat_smooth() from ggplot2 was used as a basis for stat_poly_line(), the main difference is that by default it uses lm() as method irrespective of the data size while stat_smooth() switches to using splines for large numbers of observations.

5 Major axis regression

If x and y are both subject to random errors, and none is clearly the cause of the other, we should use major axis regression instead of linear regression.

The examples in this section make use of stat_ma_line() and stat_ma_eq() from package ggpmisc.

Code
ggplot(df3, aes(x, y)) +
  geom_point() +
  stat_ma_line() +
  stat_ma_eq(mapping = use_label("R2", "eq", sep = ", "), 
             geom = "richtext", hjust = 0)

Tip

No alternatives I know of.

6 Quantile regression

Quantile regression is frequently used is some disciplines like Economics but less frequently in other fields. It is robust and can be rather easily interpreted based on its similarity to the well known box plots.

The examples in this section make use of stat_quant_line(), stat_quant_band() and stat_quant_eq() from package ggpmisc.

Caution

Quantile regression fits are done by numerical approximation, and frequently generate warnings for non-unique solutions or similar problems. Their importance needs to be assessed by users. In the examples below they seem mostly unimportant given the large number of observations and have been silenced in the output.

A median regression with a band limited by the upper and lower quartile regressions.

Code
# the default for formula is y ~ x
ggplot(df3, aes(x, y)) +
  geom_point() +
  stat_quant_band() +
  stat_quant_eq(quantiles = 0.5, geom = "richtext", hjust = 0, vjust = 1)

Median regression through the origin.

Code
# model with intercept = 0
formula <- y ~ x + 0
ggplot(df3, aes(x, y)) +
  geom_point() +
  stat_quant_band(formula = formula) +
  stat_quant_eq(formula = formula, quantiles = 0.5, 
                geom = "richtext", hjust = 0, vjust = 1)

Quantile regressions of y on x and x on y.

Code
# the default for formula is y ~ x
ggplot(df3, aes(x, y)) +
  geom_point() +
  stat_quant_band(color = "blue") +
  stat_quant_eq(quantiles = 0.5, color = "blue", 
                geom = "richtext", hjust = 0, vjust = 1) +
  stat_quant_band(color = "red", 
                  orientation = "y") +
  stat_quant_eq(quantiles = 0.5, color = "red", 
                orientation = "y", label.y = 11, 
                geom = "richtext", hjust = 0, vjust = 1)

Fitting a polynomial by quantile regression.

Code
ggplot(df2, aes(x, y)) +
  geom_point() +
  stat_quant_band(formula = y ~ poly(x, 2, raw = TRUE)) +
  stat_quant_eq(formula = y ~ poly(x, 2, raw = TRUE), quantiles = 0.5,
                geom = "richtext", hjust = 0, vjust = 1)

Two quantiles are by default plotted as lines,

Code
ggplot(df2, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = y ~ poly(x, 2, raw = TRUE), quantiles = c(0.05, 0.95)) +
  stat_quant_eq(formula = y ~ poly(x, 2, raw = TRUE), quantiles = c(0.05, 0.95), 
                geom = "richtext", hjust = 0, vjust = 1, vstep = 0.1)

A single quantile, here the median, is plotted as a line plus a confidence band.

Code
ggplot(df2, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = y ~ poly(x, 2, raw = TRUE), quantiles = 0.5) +
  stat_quant_eq(formula = y ~ poly(x, 2, raw = TRUE), quantiles = 0.5, 
                geom = "richtext", hjust = 0, vjust = 1)

Equations labelled by quantile. (This code works only with ‘ggpmisc’ > 0.5.6, when qtl.label was added. Use grp.label with earlier versions.)

Code
ggplot(df2, aes(x, y)) +
  geom_point() +
  stat_quant_band(formula = y ~ poly(x, 2, raw = TRUE), 
                  color = "black", fill = "grey60") +
  stat_quant_eq(aes(label = paste(after_stat(qtl.label), ": ",
                                  after_stat(eq.label), sep = "")),
                formula = y ~ poly(x, 2, raw = TRUE), 
                geom = "richtext", hjust = 0, vjust = 1, vstep = 0.1) +
  theme_classic()

Equations labelled by group and quantile. (This code works only with ‘ggpmisc’ > 0.5.6, when qtl.label was added. Use only grp.label with earlier versions.)

Code
ggplot(df2, aes(x, y, group = group, linetype = group, 
                    shape = group, grp.label = group)) +
  geom_point() +
  stat_quant_line(formula = y ~ poly(x, 2, raw = TRUE), 
                  quantiles = c(0.1, 0.9), 
                  color = "black") +
  stat_quant_eq(aes(label = paste(after_stat(grp.label), " ",
                                  after_stat(qtl.label), ": ",
                                  after_stat(eq.label), sep = "")),
                formula = y ~ poly(x, 2, raw = TRUE), 
                quantiles = c(0.1, 0.9), 
                geom = "richtext", hjust = 0, vjust = 1, vstep = 0.1) +
  theme_classic()

6.1 Alternatives

Tip

Package ggplot2 defines stat_quantile() which can be used to plot as a line some types of quantile regression fits, but is less flexible than stat_quant_line() and stat_quant_band().

7 Non-linear models

The examples in previous sections were based on ready assembled labels. Here we show examples of how to generate labels for equations and other parameter estimates with an approach applicable to many different types of models.

The examples in this section make use of stat_fit_tidy() from package ggpmisc.

We use as example a fit of the Michaelis-Menthen equation of reaction kinetics, a function that is non-linear in its parameters.

# We use methods from package 'broom'
library(broom)
Code
micmen.formula <- y ~ SSmicmen(x, Vm, K) 
ggplot(Puromycin, aes(conc, rate, colour = state)) +
  geom_point() +
  geom_smooth(method = "nls", 
              formula = micmen.formula,
              se = FALSE) +
  stat_fit_tidy(method = "nls", 
                method.args = list(formula = micmen.formula),
                label.x = "right",
                label.y = "bottom",
                aes(label = paste("V<sub>m</sub> = ", signif(after_stat(Vm_estimate), digits = 3),
                                  "&plusmn;", signif(after_stat(Vm_se), digits = 2),
                                  "  K = ", signif(after_stat(K_estimate), digits = 3),
                                  "&plusmn;", signif(after_stat(K_se), digits = 2),
                                  sep = "")), 
                geom = "richtext", hjust = 1, vstep = 0.1) +
  theme_bw()

7.1 Alternatives

Tip

In all cases it is possible to do the model fitting before creating the plot, assembling labels in one’s own R code and passing them to ggplot() as data. The possibilities are nearly unlimited but implementing them, depending on one’s own familiarity with R and ‘ggplot2’ can become very time consuming. I have in part written ggpp and ggpmisc to avoid as a user having to remind myself of how to write such code repeatedly. My hope is that others will also save time and effort by using the packages I have developed.