5  Pictures and Numbers

Author: Dr. Devan Becker
Affiliation: Wilfrid Laurier University
Published: 2024-10-09

5.1 Categorical Variables

For the data in class, we are given the counts:
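As a minimal sketch of a bar plot drawn from counts that are already tabulated, the example below uses made-up numbers. The `counts` vector and its category names are hypothetical stand-ins for the class data.

```r
# Hypothetical counts; substitute the counts given in class.
counts <- c(Freshman = 12, Sophomore = 18, Junior = 9, Senior = 6)

# barplot() takes a named vector of counts; the names become the bar labels.
# las = 1 sets the orientation of the axis labels.
bar_mids <- barplot(counts, las = 1, ylab = "Count")
```

barplot() invisibly returns the x-positions of the bar midpoints (stored here in `bar_mids`), which is handy if you later want to add text above the bars.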

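In the code above, las controls the orientation of the axis labels: las = 1 draws them all horizontally, while las = 2 draws them perpendicular to the axis (vertical on the x-axis), so that they’re easy to read. R has an annoying habit of silently dropping names that would overlap.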

We often have the raw data:
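When only the raw observations are available, one common approach is to tabulate them first with table() and then plot the result. The sketch below assumes a made-up character vector (`colours` is hypothetical, not the class data).

```r
# Hypothetical raw data: one entry per observation.
colours <- c("red", "blue", "blue", "green", "red", "blue")

# table() counts how often each category occurs.
colour_counts <- table(colours)
colour_counts

# barplot() accepts the resulting table directly.
barplot(colour_counts, las = 1, ylab = "Count")
```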

I will not be demonstrating pie charts, as they should almost never be used [1].

5.2 Histograms

Play around with the “breaks” parameter. When you give it a single number, R treats it only as a suggestion for the number of bins and instead picks “pretty” (round-numbered) break points nearby.
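To see this behaviour, the sketch below draws two histograms of the same simulated data with different values of breaks and then inspects the break points R actually chose. The data are hypothetical (generated with rnorm()), not the data from class.

```r
# Hypothetical data; set.seed() makes the example reproducible.
set.seed(2024)
x <- rnorm(200, mean = 450, sd = 100)

# A single number for breaks is only a suggestion for the number of bins.
h5  <- hist(x, breaks = 5,  main = "breaks = 5")
h40 <- hist(x, breaks = 40, main = "breaks = 40")

# The break points R actually used are stored in the returned object.
h5$breaks
h40$breaks
```

Comparing `h5$breaks` and `h40$breaks` shows that the break points land on round numbers, even when that means the number of bins differs from what was requested.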

Below is an example where we tell R to use an exact sequence of breaks, from 100 to 800, with 100 units between consecutive breaks.
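A minimal sketch of this, using seq() to build the breaks. The data are hypothetical and are simulated so that every value lies between 100 and 800.

```r
# Hypothetical data, guaranteed to lie between 100 and 800.
set.seed(2024)
x <- runif(200, min = 100, max = 800)

# seq() builds the break points 100, 200, ..., 800.
my_breaks <- seq(100, 800, by = 100)

# Given a vector, hist() uses these break points exactly.
h <- hist(x, breaks = my_breaks, main = "Breaks every 100 units")
h$breaks
```

Note that when breaks is a vector it must span the whole range of the data; hist() stops with an error if any observation falls outside the breaks.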

5.3 Mean and Standard Deviation
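
A minimal sketch with made-up data (the vector `x` is hypothetical): base R's mean() and sd() compute the sample mean and the sample standard deviation. sd() uses the n - 1 denominator, which the manual calculation below reproduces.

```r
# Hypothetical data.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

x_mean <- mean(x)  # sample mean
x_sd   <- sd(x)    # sample standard deviation (n - 1 denominator)

# The same standard deviation, computed by hand.
manual_sd <- sqrt(sum((x - x_mean)^2) / (length(x) - 1))

x_mean
x_sd
manual_sd
```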

5.4 Five-Number Summary and Box Plots

The five-number summary is calculated by the summary() function (which also returns the mean and, if there are any, the number of missing values).
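As a small illustration, the sketch below calls summary() on a made-up vector that contains one missing value (`x` is hypothetical).

```r
# Hypothetical data with one missing value.
x <- c(1, 2, 3, 4, 5, NA)

# Minimum, first quartile, median, mean, third quartile, maximum,
# plus a count of NA values because x contains one.
s <- summary(x)
s
```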

We can calculate these manually with the quantile() function. Note that we need na.rm = TRUE to remove the NA values [2].
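A minimal sketch of the manual calculation, again on a made-up vector with one missing value (`x` is hypothetical):

```r
# Hypothetical data with one missing value.
x <- c(1, 2, 3, 4, 5, NA)

# Minimum, quartiles, and maximum; na.rm = TRUE drops the NA first.
five_num <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)
five_num

# Without na.rm = TRUE the median is NA, because one value is unknown.
median(x)
median(x, na.rm = TRUE)
```

Unlike median(), quantile() does not return NA when the data contain missing values and na.rm is left at its default of FALSE; it stops with an error instead.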

Box plots are made with the boxplot() function, as you might expect. Given two (or more) vectors, it draws their box plots side-by-side for easy comparison.
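The sketch below draws side-by-side box plots for two simulated groups. The vectors `group_a` and `group_b` and their labels are hypothetical.

```r
# Hypothetical data for two groups.
set.seed(2024)
group_a <- rnorm(50, mean = 10, sd = 2)
group_b <- rnorm(50, mean = 12, sd = 3)

# Passing two vectors draws one box per vector; names labels the boxes.
bp <- boxplot(group_a, group_b, names = c("Group A", "Group B"), las = 1)

# One column per box: lower whisker end, lower hinge, median,
# upper hinge, upper whisker end.
bp$stats
```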


  1. I recognize that there are specific situations where they should be used. You need to know a lot about data visualization to know when it’s appropriate, and thus you’ll probably already know how to make a pie chart. I will not teach you how to make a pie chart.

  2. The median of 1, 2, and some third number that I’m not telling you about should be NA, since you can’t calculate it!