R at its simplest
| Edited | 2023-10-28 |
| Abstract | A very simple introduction to R, based on a comparison to calculators and worksheets. |
A selection of articles and code examples
Pedro J. Aphalo
2026-04-30
This Contents page lists pages that I have written for course IPS-003 and for the R-peer-support meetings at the Viikki Campus of the University of Helsinki. The level of difficulty varies from introductory to intermediate. From time to time I update these pages and add new ones.
R is a language and an environment for data analysis and visualisation, and it has become the standard for both in many fields. R can be extended by means of code packages, which are installed locally in a library. R can “talk” with code written in most other programming languages. This is important for code reuse and for performance in numerical and other computations.
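As a minimal sketch of how packages and the library fit together, the base R code below lists the library folders and checks whether a package is installed before loading it. The helper `ensure_package()` is a hypothetical name introduced here for illustration; it is not part of R.

```r
# Folders that R searches for installed packages (the "library")
lib_dirs <- .libPaths()
print(lib_dirs)

# Names of the packages currently installed in those folders
installed <- rownames(installed.packages())

# Hypothetical helper: install a package only if it is missing.
# install.packages() needs internet access and a configured repository,
# so it is called only when the package cannot be found locally.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# 'stats' is distributed with R, so nothing is downloaded here
ensure_package("stats")
library(stats)
```

Installing a package is done once per library, while `library()` has to be called in every R session in which the package is used.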
Articles related to simple data manipulation, basic plotting, ANOVA and regression.
Moderately advanced R learning material is available as a free on-line course at intro2R.
The second edition of my book Learn R: As a Language will be published on 26 April 2024. A dedicated website provides additional information and some free extra chapters. My book does not assume previous programming experience and focuses on the R language itself rather than on using R for specific purposes.
Creating informative and elegant plots for inclusion in publications, reports and theses requires the same kind of approach as writing text: design, drafting and, frequently, several rounds of revision. Plots are also very important for exploration and quality control of data. The requirements for exploration and publication are rather different with respect to the graphical design, but not in relation to highlighting different features of a data set. Articles and code examples in this section describe how to create specific types of plots or even how to add specific features to a plot. They assume familiarity with the basics of plotting in R with package ‘ggplot2’. The ‘ggplot2’ book is available on-line as an open-access web site.
| Edited | 2023-02-25 |
| Abstract | Example R code for plots with labels using position functions from package ggpp that combine the actions of two separate position functions available in package ggplot2, such as stack together with nudge, or dodge together with nudge. The examples show how to easily add labels to stacked and dodged bar plots while retaining full control of the labels’ positioning. |
| Edited | 2023-07-16 |
| Abstract | Example R code for volcano plots and quadrant plots built with packages ggplot2 (>= 3.4.2), ggpp (>= 0.5.3), ggpmisc (>= 0.5.3) and ggrepel (>= 0.9.1). The examples demonstrate the use of different types of annotations and data labels. My packages ggpp and ggpmisc include new geometries, statistics and scale functions that are specific to, or useful for, plotting gene-expression data in volcano and quadrant plots. |
| Edited | 2025-01-16 |
| Abstract | Example R code for plots based on package ggplot2 using geometries defined in package ggpp to add insets. These geometries from package ggpp implement addition of plot layers with plots, tables or other graphical objects as insets to a base plot, through extension of the Grammar of Graphics. |
| Edited | 2023-06-26 |
| Abstract | Example R code for interactive plots based on package ggplot2 using geometries defined in package ggpp and statistics from package ggpmisc together with package plotly. This page is a draft and currently contains a single plot example. |
Data analysis and design of experiments are very tightly dependent on each other. Statistics gives theoretical support to data analysis methods, but efficiently extracting information from observations from experiments and surveys is in many ways like detective work or solving puzzles. Modern data analysis makes heavy use of visual data displays (plots, diagrams, graphs). The pages listed below contain material, in part updated, that I have used for the course IPS-003 at the University of Helsinki and for some earlier courses or individual classes.
Interactive dashboards can help to understand how the amount of uncontrolled variation in observations and the number of replicates affect both tests of significance and parameter estimates when fitting models.
The interactive web page Design of Experiments: Playing with numbers is hosted at the ShinyApps server.
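The same idea can be explored without a dashboard through a small simulation. The sketch below is not taken from the web page mentioned above; it assumes a two-group experiment with normally distributed errors analysed with a t-test, and the function name `simulate_experiments()`, its arguments and the chosen values are all hypothetical.

```r
set.seed(42)  # arbitrary seed, only to make the run repeatable

# Hypothetical helper: simulate 'n_sim' two-group experiments with 'n'
# replicates per group, a true difference 'delta' between group means and
# uncontrolled variation 'sd'. Returns the proportion of significant
# t-tests and the spread of the estimated differences.
simulate_experiments <- function(n, sd, delta = 1, alpha = 0.05, n_sim = 500) {
  sims <- replicate(n_sim, {
    control <- rnorm(n, mean = 0, sd = sd)
    treated <- rnorm(n, mean = delta, sd = sd)
    c(p_value  = t.test(treated, control)$p.value,
      estimate = mean(treated) - mean(control))
  })
  c(power       = mean(sims["p_value", ] < alpha),
    estimate_sd = sd(sims["estimate", ]))
}

few_replicates  <- simulate_experiments(n = 4,  sd = 1)
many_replicates <- simulate_experiments(n = 20, sd = 1)
noisy_data      <- simulate_experiments(n = 20, sd = 3)

print(rbind(few_replicates, many_replicates, noisy_data))
```

Comparing the rows shows the two effects separately: fewer replicates or more uncontrolled variation both lower the proportion of significant tests and widen the spread of the estimated difference around its true value.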
These pages are under a Creative Commons licence and can be reused with certain constraints, mainly that derived work must be licensed in the same way. The source files are available at GitHub.