Data Representations
Part II

Dr. Mine Dogucu

Data Visualization

In the last lecture we made basic data visualizations. In this lecture we will improve them to make them more accessible.

Labeling Axes

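# ggplot2 provides the plotting functions; palmerpenguins provides the penguins data
library(ggplot2)
library(palmerpenguins)

ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species)) +
  geom_point() +
  labs(x = "Bill Depth (mm)",
       y = "Bill Length (mm)",
       color = "Species",
       title = "Palmer Penguins")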

We can change the axis, legend, and plot labels using the labs() function.
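
Beyond axis labels and a title, labs() also accepts a subtitle and a caption. The sketch below assumes the ggplot2 and palmerpenguins packages are installed; the subtitle and caption wording is made up for illustration and is not part of the lecture.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

labeled_plot <- ggplot(penguins,
                       aes(x = bill_depth_mm,
                           y = bill_length_mm,
                           color = species)) +
  geom_point() +
  labs(x = "Bill Depth (mm)",
       y = "Bill Length (mm)",
       color = "Species",
       title = "Palmer Penguins",
       # illustrative wording, not from the lecture
       subtitle = "Bill length versus bill depth by species",
       caption = "Data: palmerpenguins R package")

labeled_plot
```

A subtitle can state the main message of the plot, and a caption is a natural place to name the data source.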

Themes

ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species)) +
  geom_point() +
  labs(x = "Bill Depth (mm)", 
       y = "Bill Length (mm)", 
       title = "Palmer Penguins") +
  theme_gray()

theme_gray() is the default theme in ggplot2.

ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species)) +
  geom_point() +
  labs(x = "Bill Depth (mm)", 
       y = "Bill Length (mm)", 
       title = "Palmer Penguins") +
  theme_bw()

ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species)) +
  geom_point() +
  labs(x = "Bill Depth (mm)", 
       y = "Bill Length (mm)", 
       title = "Palmer Penguins") +
  theme_dark()

ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species)) +
  geom_point() +
  labs(x = "Bill Depth (mm)", 
       y = "Bill Length (mm)", 
       title = "Palmer Penguins") +
  theme_classic()

ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species)) +
  geom_point() +
  labs(x = "Bill Depth (mm)", 
       y = "Bill Length (mm)", 
       title = "Palmer Penguins") +
  theme_minimal()

Font Size

ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species)) +
  geom_point() +
  labs(x = "Bill Depth (mm)", 
       y = "Bill Length (mm)", 
       title = "Palmer Penguins") +
  theme(text = element_text(size=20))

The theme() function allows you to modify many components of a theme. By typing ?theme in the Console, you can read the documentation of the function to see which components can be modified.
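
As a sketch of what modifying individual components can look like, the code below changes only the axis titles, the plot title, and the legend position. The sizes and the legend position are arbitrary choices for illustration, and the ggplot2 and palmerpenguins packages are assumed to be installed.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

themed_plot <- ggplot(penguins,
                      aes(x = bill_depth_mm,
                          y = bill_length_mm,
                          color = species)) +
  geom_point() +
  labs(x = "Bill Depth (mm)",
       y = "Bill Length (mm)",
       title = "Palmer Penguins") +
  theme(axis.title = element_text(size = 16),                 # both axis titles
        plot.title = element_text(size = 22, face = "bold"),  # plot title only
        legend.position = "bottom")                           # legend below the plot

themed_plot
```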

Font Size

One can also set the default font size of a theme. For instance, if you put the following code in the first chunk of a Quarto document, all subsequent plots will use the gray theme with a base font size of 22.

theme_set(theme_gray(base_size = 22))
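
theme_set() invisibly returns the theme that was active before the call, so the previous default can be saved and restored later. A minimal sketch, assuming ggplot2 is installed:

```r
library(ggplot2)

# Switch the default theme and keep a copy of the previous one
old_theme <- theme_set(theme_gray(base_size = 22))

# ... plots created here use the gray theme with base size 22 ...

# Restore the previous default
theme_set(old_theme)
```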

Using Shapes in Addition to Colors

ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           shape = species,
           color = species)) +
  geom_point(size = 4) 

Previously, the species were distinguishable only to viewers who could tell these colors apart. With shapes added, color-blind viewers can also distinguish the species.
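
When color and shape are mapped to the same variable, ggplot2 draws one combined legend. If the legend title is changed, it has to be changed for both aesthetics; otherwise the legend splits into two. A short sketch, assuming the ggplot2 and palmerpenguins packages are installed:

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

shape_plot <- ggplot(penguins,
                     aes(x = bill_depth_mm,
                         y = bill_length_mm,
                         shape = species,
                         color = species)) +
  geom_point(size = 4) +
  # Same title for both aesthetics keeps a single merged legend
  labs(color = "Species",
       shape = "Species")

shape_plot
```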

Color-Blindness Simulation

species_bills <- 
  ggplot(penguins,
         aes(x = bill_depth_mm,
             y = bill_length_mm,
             color = species)) +
  geom_point(size = 4) 

By storing the plot as an object named species_bills, we will be able to use it in other functions.

Color-Blindness Simulation

colorblindr::cvd_grid(species_bills) 

The cvd_grid() function from the colorblindr package creates a grid of different color-vision-deficiency simulations.

  • Deuteranomaly is reduced sensitivity to green light.
  • Protanomaly is reduced sensitivity to red light.
  • Tritanomaly is reduced sensitivity to blue light.
  • Desaturated shows no color differences.
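
The same kinds of simulation can also be applied to individual colors rather than a whole plot. The sketch below assumes the colorspace package is installed and uses its deutan(), protan(), tritan(), and desaturate() functions with their default settings on the three default ggplot2 colors for a three-level variable.

```r
library(colorspace)  # assumed to be installed

# Default ggplot2 colors for a variable with three levels
default_colors <- c("#F8766D", "#00BA38", "#619CFF")

# Simulate each type of deficiency with the functions' default settings
simulated <- list(
  original    = default_colors,
  deutan      = deutan(default_colors),      # green deficiency
  protan      = protan(default_colors),      # red deficiency
  tritan      = tritan(default_colors),      # blue deficiency
  desaturated = desaturate(default_colors)   # color removed
)

simulated
```

Comparing the simulated hex codes within each row gives a quick sense of whether the colors stay distinct from one another.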

Okabe-Ito Color Palette

In 2008, Masataka Okabe and Kei Ito proposed a color palette that is accessible to people with various color-vision deficiencies. We refer to the palette by their last names.

palette.colors(palette = "Okabe-Ito")
[1] "#000000" "#E69F00" "#56B4E9" "#009E73" "#F0E442" "#0072B2" "#D55E00"
[8] "#CC79A7" "#999999"

Okabe-Ito Color Palette

The codes that begin with a hash sign (#) are called hex color codes. You can use hex codes in R (and in HTML) to specify colors.
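
A hex code packs the red, green, and blue components of a color into three pairs of hexadecimal digits. The base R sketch below takes the second Okabe-Ito color apart and puts it back together; the variable names are chosen for illustration.

```r
# Split a hex code into red, green, and blue components (0 to 255)
orange_rgb <- col2rgb("#E69F00")
orange_rgb

# Build the same hex code from its components
orange_hex <- rgb(230, 159, 0, maxColorValue = 255)
orange_hex
```

Here E6 is 230 (red), 9F is 159 (green), and 00 is 0 (blue).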

Okabe-Ito Color Palette

species_bills + 
  scale_color_manual(values = c("Adelie" = "#E69F00",
                                "Chinstrap" = "#56B4E9",
                                "Gentoo" = "#009E73"))

Okabe-Ito Color Palette

species_bills + 
  colorblindr::scale_color_OkabeIto()
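
A third option, sketched below, avoids typing hex codes by pulling them from palette.colors() in base R. It assumes the ggplot2 and palmerpenguins packages are installed. Any names on the palette are dropped because scale_color_manual() matches named values against the species names.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

species_bills <-
  ggplot(penguins,
         aes(x = bill_depth_mm,
             y = bill_length_mm,
             color = species)) +
  geom_point(size = 4)

# Colors 2 to 4 of the palette, with names removed so they are
# assigned to the species in order
okabe_ito <- unname(palette.colors(palette = "Okabe-Ito"))[2:4]

species_bills +
  scale_color_manual(values = okabe_ito)
```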

Improving Data Visualizations

Top Ten Dos and Don’ts for Charts and Graphs

We cannot learn every single ggplot feature in a single week, but an internet search and a few examples can go a long way in improving your visualizations.

Data Verbalization

Assistive Technology

Assistive Technology is any form of technology (software, device) that helps people with disabilities perform certain activities.

Examples:

  • walking sticks
  • wheelchairs
  • screen readers

Screen reader

A screen reader is an assistive technology that supports blind or visually impaired people in using their computer.

The video briefly shows a screen reader in use.

Alternate Text

  • “Alt text” describes contents of an image.
  • It is used in HTML pages.
  • Screen readers cannot read images but can read alt text.
  • Alt text has to be provided.

Manual Alternate Text Guidelines

  • Chart type

  • Type of data

  • Reason for including the chart

  • Link to data or source (not in alt text but in main text)

Cesal, 2020

  • Description conveys meaning in the data

  • Variables included on the axes

  • Scale described within the description

  • Type of plot is described

Canelón & Hare, 2021
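
One way to combine the first three guidelines above (chart type, type of data, reason for including the chart) into a sentence is sketched below in base R. The helper compose_alt_text() is hypothetical and written only for this illustration; the example wording is based on the penguin scatterplot.

```r
# Hypothetical helper: joins chart type, type of data, and the
# reason for including the chart into one sentence
compose_alt_text <- function(chart_type, data_type, reason) {
  paste0(chart_type, " of ", data_type, " where ", reason, ".")
}

alt_text <- compose_alt_text(
  chart_type = "Scatterplot",
  data_type  = "bill depth and bill length of penguins",
  reason     = "bill length tends to increase with bill depth within each species"
)

alt_text
```

A template like this is only a starting point; the remaining guidelines (axes, scale, meaning in the data) still need to be written by hand.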

Data Verbalization - Automated

The automated alternate text for this plot is on the next slide.

Data Verbalization - Automated

BrailleR::VI(example_barplot)
This is an untitled chart with no subtitle or caption.
It has x-axis 'transmission' with labels automatic and manual.
It has y-axis 'count' with labels 0, 5, 10 and 15.
The chart is a bar chart with 2 vertical bars.
Bar 1 is centered at 1, and length is from 0 to 19.
Bar 2 is centered at 2, and length is from 0 to 13.

A drawback of using VI() is that not every type of plot can be verbalized yet.
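
The object example_barplot is not defined in the text above. Judging from the output (a 'transmission' axis with 19 automatic and 13 manual cars), it may have been built from the built-in mtcars data; the sketch below is a guess at such a plot, not the original code. It assumes ggplot2 is installed, and the VI() call is left as a comment because it requires the BrailleR package.

```r
library(ggplot2)

# Assumption: the plot is based on mtcars, where am = 0 means
# automatic and am = 1 means manual transmission
cars <- mtcars
cars$transmission <- ifelse(cars$am == 0, "automatic", "manual")

example_barplot <- ggplot(cars, aes(x = transmission)) +
  geom_bar()

example_barplot

# Requires the BrailleR package:
# BrailleR::VI(example_barplot)
```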

Alt Text in Quarto

```{r}
#| fig-align: center
#| fig-cap: Relationship between bill depth (mm) and length (mm) for different species of penguins
#| fig-alt: The scatterplot shows bill depth in mm on the x-axis and bill length in mm on the y-axis with points differently colored for different species as Adelie, Chinstrap, and Gentoo. The x axis ranges from about 12.5 mm to 22.5 mm. The y-axis ranges from about 30 to 60 mm. For all species the relationship seems moderately positive. When comparing the three species, Adelie penguins seem to have longer bill depth but shorter bill length. Chinstraps have longer bill depth and longer bill length. Gentoo penguins have shorter bill depth and longer bill length.
ggplot(penguins, aes(x = bill_depth_mm,
                     y = bill_length_mm,
                     color = species)) +
  geom_point(size = 4) 
```

Caption vs. Alt Text

Figure captions (fig-cap) appear on the front-end of a document and are accessible to all readers, whether they read the document directly or via a screen reader.

Figure alternate text (fig-alt) appears only on the back-end of a document and is accessible to screen readers and to those who know how to inspect the source code of an HTML document.

Even though we are using captions and alternate text in Quarto, these features are available in many other software tools (e.g., Google Docs, PowerPoint).
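
ggplot2 is another place where alt text can be stored: recent versions accept an alt argument in labs() and provide get_alt_text() to read it back. The sketch below assumes a sufficiently recent ggplot2 and the palmerpenguins package; it makes no assumption about whether a given publishing tool picks this text up, so the fig-alt chunk option is still needed in Quarto. The alt text wording is illustrative.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

alt_plot <- ggplot(penguins,
                   aes(x = bill_depth_mm,
                       y = bill_length_mm,
                       color = species)) +
  geom_point(size = 4) +
  # Alt text stored on the plot object itself
  labs(alt = "Scatterplot of bill depth on the x-axis and bill length on the y-axis, with points colored by species.")

get_alt_text(alt_plot)
```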

Data Sonification

Data Sonification

Data sonification is the presentation of data as sound.

x <- 1:10
y <- 1:10
plot(x, y)
sonify::sonify(x, y)
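
A common idea behind sonification is to map larger data values to higher pitch. The base R sketch below illustrates that idea with a linear mapping from y to a frequency in hertz; the 440 to 880 Hz range is an assumption chosen for illustration, and the code is not meant to describe how the sonify package computes its sound internally.

```r
x <- 1:10
y <- 1:10

# Assumed frequency range for illustration (in Hz)
freq_range <- c(440, 880)

# Rescale y to the interval [0, 1], then stretch it over the frequency range
y_scaled <- (y - min(y)) / (max(y) - min(y))
frequency <- freq_range[1] + y_scaled * diff(freq_range)

data.frame(x, y, frequency)
```

With this mapping, a line that rises on the plot corresponds to a tone that rises in pitch over time.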

Data Tactualization

Data tactualization refers to rendering a data visualization in a form that can be touched. The video shows the printing of a tactile boxplot.

Data Tactualization

library(tactileR)
brl_begin(file = 'tactile.pdf', 
          pt = 11, 
          paper = 'special', font='BRL')
hist(penguins$bill_length_mm)
brl_end()

A histogram with x- and y-axis labels and a title displayed in Braille.

Learning Tip of The Day

Cell phone usage and academic performance: An experiment