Hopefully, most of you already have some experience plotting with base R graphics. This chapter briefly introduces one of the most powerful plotting packages in R, ggplot2, along with its basic grammar and functions. To start, install ggplot2 from the console or in an R chunk.
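A minimal sketch of the installation step, assuming you want to skip the download when the package is already present (the `requireNamespace()` guard is an addition for convenience, not something the chapter requires):

```r
# Install ggplot2 only if it is not already available, then load it.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
library(ggplot2)
```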
Many R beginners ask: why ggplot2? One remarkable feature of ggplot2 is its underlying grammar, which lets you compose graphs by combining different components. You can easily create novel graphics by adding ggplot2 functions that suit your needs and your data.
According to the grammar of graphics, the most important parts of each layer are the data and the mapping, so that is where we start.
The most important part of any plot is the data, which holds the information you want to visualize. The next step is to decide on the mapping, which determines how the variables in the data are mapped to aesthetic attributes of the graphic. Since the data is independent of the other elements, you can add several layers with different data to the same ggplot while keeping the other components the same.
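The idea of data plus mapping can be sketched as follows. The example assumes the `mpg` data set that ships with ggplot2; the choice of variables and the highlighted subset are illustrative only.

```r
library(ggplot2)

# Data: the mpg data set that ships with ggplot2.
# Mapping: displ -> x position, hwy -> y position, class -> colour.
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(aes(colour = class))

# A second layer can bring its own data while the x/y mapping stays the same.
# Here the extra layer circles only the two-seater cars.
two_seaters <- subset(mpg, class == "2seater")
p2 <- p + geom_point(data = two_seaters, shape = 21, size = 4, colour = "black")
p2
```

The second `geom_point()` inherits the `x` and `y` mapping from `ggplot()`, so only the data changes between the two layers.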
The following picture shows the order of ggplot functions:
For more suggestions on function order, and for auto-correction when writing your own ggplot2 code, please refer to the ggformat addin created by Joyce.
The geometric object, statistical transformation, and position adjustment are the components that can be customized in each layer.
Geoms control the type of plot you create. Different types of plot have different aesthetics. For example, a point geom has position, color, shape, and size aesthetics. First decide which kind of plot best explains the data, then choose a geom and use the help function to check which aesthetics can be modified to achieve the effect you want.
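As a rough illustration, the same data and mapping can be drawn with different geoms. The sketch below assumes the `mpg` data set from ggplot2; the specific colour, shape, and size values are arbitrary.

```r
library(ggplot2)

# Same data and mapping, two different geoms.
base <- ggplot(mpg, aes(x = displ, y = hwy))

# A point geom understands position, colour, shape, and size.
scatter <- base + geom_point(colour = "steelblue", shape = 17, size = 2)

# A smoothing geom draws a fitted line with a confidence band instead.
smooth <- base + geom_smooth(method = "lm", formula = y ~ x)

# List the aesthetics a geom understands; the help page (?geom_point)
# documents the same information in more detail.
GeomPoint$aesthetics()
```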
A statistical transformation (stat) transforms the data, for example by counting or summarizing it, before it is drawn. A position adjustment is applied when you need to shift elements on a plot of dense data; otherwise data points might obscure one another.
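A short sketch of both ideas, assuming the `mpg` data set from ggplot2. The bar chart relies on the default counting stat of `geom_bar()`, and the other two plots show two common position adjustments.

```r
library(ggplot2)

# geom_bar() uses stat = "count" by default: it counts the rows in each
# class before drawing, so no y variable is supplied.
bars <- ggplot(mpg, aes(x = class)) + geom_bar()

# position = "dodge" places the bars for each drv side by side
# instead of stacking them.
dodged <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "dodge")

# position = "jitter" adds a little random noise so that overlapping
# points become visible.
jittered <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point(position = "jitter", alpha = 0.5)
```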
A scale controls how data is mapped to aesthetic attributes, so each aesthetic used in a plot needs its own scale.
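For instance, a plot that maps variables to x position and colour can adjust each of those scales separately. This is a sketch assuming the `mpg` data set; the colours and legend labels are arbitrary choices.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point() +
  # Scale for the x aesthetic: set the axis name and breaks.
  scale_x_continuous(name = "Engine displacement (litres)", breaks = 2:7) +
  # Scale for the colour aesthetic: choose the colours and legend labels.
  scale_colour_manual(
    name = "Drive train",
    values = c("4" = "darkorange", "f" = "steelblue", "r" = "forestgreen"),
    labels = c("4" = "Four-wheel", "f" = "Front-wheel", "r" = "Rear-wheel")
  )
p
```

The y aesthetic is left untouched here, so ggplot2 falls back to its default continuous scale for it.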
A coordinate system (coord) maps the position of objects onto the plane of the plot, and controls how the axes and grid lines are drawn. A ggplot can have only one coordinate system.
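A brief sketch of changing the coordinate system, assuming the `mpg` data set. It also shows what happens when a second coordinate system is added to a plot that already has one.

```r
library(ggplot2)

bars <- ggplot(mpg, aes(x = class)) + geom_bar()

# Swap the x and y axes to get horizontal bars.
flipped <- bars + coord_flip()

# Adding a second coordinate system replaces the first one
# (ggplot2 prints a message saying so); the plot keeps only coord_polar().
polar <- flipped + coord_polar()
```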
Faceting splits the data into subsets and displays each subset in its own panel.
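Faceting might look like the following, again assuming the `mpg` data set; the faceting variables are chosen only for illustration.

```r
library(ggplot2)

# One panel per vehicle class; every panel shows the same variables.
by_class <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~ class, nrow = 2)

# facet_grid() lays panels out by two variables: rows by drv, columns by cyl.
by_drv_cyl <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_grid(drv ~ cyl)
```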
Labels include titles, axis labels for x and y, and annotations. Good graphics also need labels that give readers the background they need to understand your data.
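One way to add these labels is sketched below, assuming the `mpg` data set. The title wording and the position of the annotation are illustrative choices.

```r
library(ggplot2)

labelled <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  # labs() sets the title, axis labels, and caption in one call.
  labs(
    title = "Engine size and highway fuel economy",
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    caption = "Data: mpg data set from ggplot2"
  ) +
  # annotate() adds a one-off text label at fixed coordinates.
  annotate("text", x = 6.2, y = 29, label = "Two-seaters")
labelled
```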
- For more implementations and examples, one of the easiest ways is to refer to the ggplot2 cheat sheets provided by RStudio. Follow the steps shown below to find the cheat sheets in RStudio.
The cheat sheets clearly list the basic components of a ggplot, so you can customize your own plot by choosing different functions.
- If you are looking for more detailed explanations and examples with real datasets, here are some useful links: