
To visualize a small data set containing multiple categorical (or qualitative) variables, you can create either a bar plot, a balloon plot or a mosaic plot.

For large multivariate categorical data sets, you need specialized statistical techniques dedicated to categorical data analysis, such as simple and multiple correspondence analysis. These methods make it possible to analyze and visualize the association (i.e. correlation) between a large number of qualitative variables.

Here, you’ll see examples of graphs, in the R programming language, for visualizing the frequency distribution of categorical variables contained in small contingency tables. We also provide the R code for computing a simple correspondence analysis.
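
As a rough illustration of what simple correspondence analysis computes, here is a minimal base R sketch. It is not the article's own code. It assumes the built-in HairEyeColor table, collapsed over sex, as example data, and the variable names (N, P, S, row_coords) are made up for the illustration. The idea is to take the singular value decomposition of the standardized residuals of a two-way table; the squared singular values are the principal inertias, and their sum equals the chi-square statistic divided by the sample size.

```r
# Two-way hair-by-eye table, summed over sex (assumed example data)
N <- unclass(margin.table(HairEyeColor, margin = c(1, 2)))
n <- sum(N)

P  <- N / n          # correspondence matrix (relative frequencies)
r  <- rowSums(P)     # row masses
cm <- colSums(P)     # column masses

# Standardized residuals: (observed - expected) / sqrt(expected), on the P scale
S <- diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm))

# SVD of the residual matrix; squared singular values are the principal inertias
sv <- svd(S)
inertia <- sv$d^2

# Row principal coordinates on the first two dimensions
row_coords <- (sweep(sv$u, 1, sqrt(r), "/") %*% diag(sv$d))[, 1:2]
rownames(row_coords) <- rownames(N)

# Share of the total inertia carried by each dimension
round(inertia / sum(inertia), 3)
```

Plotting the two columns of row_coords against each other gives the kind of map that dedicated correspondence analysis functions produce for the row categories.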

Prerequisites

Load required R packages and set the default theme:

library(ggplot2)
library(ggpubr)
theme_set(theme_pubr())

Bar plots of contingency tables

Demo data set: HairEyeColor (distribution of hair and eye color and sex in 592 statistics students)

data("HairEyeColor")
df <- as.data.frame(HairEyeColor)
head(df)
##    Hair   Eye  Sex Freq
## 1 Black Brown Male   32
## 2 Brown Brown Male   53
## 3   Red Brown Male   10
## 4 Blond Brown Male    3
## 5 Black  Blue Male   11
## 6 Brown  Blue Male   50
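
In this long format, each row holds the count for one combination of hair color, eye color and sex. The short base R sketch below is one way to sanity-check the data before plotting. It is not part of the original tutorial, and the names total and hair_eye are illustrative. It confirms that the counts add up to the 592 students and collapses the data over sex into a two-way table.

```r
data("HairEyeColor")
df <- as.data.frame(HairEyeColor)

# Total number of students across all Hair x Eye x Sex combinations
total <- sum(df$Freq)
total

# Collapse over Sex: two-way contingency table of hair by eye color
hair_eye <- xtabs(Freq ~ Hair + Eye, data = df)
hair_eye

# Row proportions: distribution of eye color within each hair color
round(prop.table(hair_eye, margin = 1), 2)
```

xtabs() sums Freq over every variable left out of the formula, so the same call with Sex added on the right-hand side would rebuild the full three-way table.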