Today’s post is a response to this data science video from EMBL (the European Molecular Biology Laboratory), presented by Hadley Wickham, the creator of ggplot and ggplot2, linked here and included below. The video is about an hour long. At this point, it is a few years old, but the presentation touches on some great fundamentals of data visualization.
He opens the video by talking about the conception of the tidyverse and related R packages, including ggplot2, tibble, dplyr, and shiny, and then introduces the audience to Hans Rosling’s famous animated visualization of countries and life expectancy.
He touches on ggplot2 and explains that he was inspired to create it after reading The Grammar of Graphics by Leland Wilkinson. The gg in ggplot2 stands for “grammar of graphics.” ggplot2 works by adding the elements of a plot in layers, building the image from the ground up in an understandable way, rather than dictating the plot from the top down by simply naming a plot type.
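To make the layering idea concrete, here is a minimal sketch in R. The dataset (`mtcars`, which ships with R) and the choice of variables are my own illustration, not something taken from the talk. Each `+` adds one more piece to the plot.

```r
library(ggplot2)

# Start with the data and the aesthetic mappings. Nothing is drawn yet.
# mtcars is a built-in R dataset, chosen here only for illustration.
p <- ggplot(mtcars, aes(x = wt, y = mpg))

# Layer 1: points, coloured by number of cylinders.
p <- p + geom_point(aes(colour = factor(cyl)))

# Layer 2: one straight-line fit drawn over the points.
p <- p + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Labels are added the same way as layers.
p <- p + labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

print(p)
```

Because every step is its own addition, you can stop after any line, look at the plot, and decide what to add next.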
But ggplot2 isn’t perfect. The standard plots it produces often use scientific notation, which most people have trouble interpreting, and the automatic color choices are often not colorblind-friendly.
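Both defaults can be overridden. The sketch below is one way to do it, not a recommendation from the talk. It assumes the scales package is installed alongside ggplot2, and the data frame is toy data I made up purely for illustration.

```r
library(ggplot2)
library(scales)

# Made-up numbers, large enough to tempt the default axis labels
# into scientific notation.
df <- data.frame(
  year       = rep(2000:2004, times = 2),
  region     = rep(c("North", "South"), each = 5),
  population = c(1.20e9, 1.25e9, 1.31e9, 1.38e9, 1.46e9,
                 0.80e9, 0.84e9, 0.90e9, 0.97e9, 1.05e9)
)

p2 <- ggplot(df, aes(x = year, y = population, colour = region)) +
  geom_line() +
  # Write axis labels out in full with thousands separators.
  scale_y_continuous(labels = label_comma()) +
  # viridis palettes are designed to stay distinguishable under common
  # forms of colorblindness; end = 0.8 avoids the palest yellow.
  scale_colour_viridis_d(end = 0.8)

print(p2)
```

Other palettes and label formats exist too; these two are just the ones I find easiest to remember.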
One thing he said that stuck with me is that “the only way to get a really good visualization is to produce a bunch of not very good visualizations.” That is a comfort when I compare my novice nonsense to the works of the giants on whose shoulders I now stand.
Overall, I appreciated this insight into how ggplot2 works and how all the different packages of the tidyverse really fit together!