10 Chart: Bar Chart
This section covers how to make bar charts.
I want a nice example. Not tomorrow, not after breakfast. NOW!
Here’s a bar chart showing the survival rates of passengers aboard the RMS Titanic:
And here’s the code:
    library(datasets) # data
    library(ggplot2)  # plotting
    library(dplyr)    # manipulation

    # Combine Children and Adult stats together
    ship_grouped <- as.data.frame(Titanic) %>%
      group_by(Class, Sex, Survived) %>%
      summarise(Total = sum(Freq))

    ggplot(ship_grouped, aes(x = Survived, y = Total, fill = Sex)) +
      geom_bar(position = "dodge", stat = "identity") +
      geom_text(aes(label = Total), position = position_dodge(width = 0.9),
                vjust = -0.4, color = "grey68") +
      facet_wrap(~Class) +
      # formatting
      ylim(0, 750) +
      ggtitle("Don't Be A Crew Member On The Titanic",
              subtitle = "Survival Rates of Titanic Passengers by Class and Gender") +
      scale_fill_manual(values = c("#b2df8a", "#a6cee3")) +
      # the data come from datasets::Titanic, so the caption cites that source
      labs(y = "Passenger Count", caption = "Source: datasets::Titanic") +
      theme(plot.title = element_text(face = "bold")) +
      theme(plot.subtitle = element_text(face = "bold", color = "grey35")) +
      theme(plot.caption = element_text(color = "grey68"))
For more info on this dataset, type ?datasets::Titanic into the console.
10.3 Simple examples
My eyes were bigger than my stomach. Much simpler please!
Let’s use the HairEyeColor dataset. To start, we will just look at the different categories of hair color among females:
    ## # A tibble: 4 x 2
    ##   Hair  Total
    ##   <fct> <dbl>
    ## 1 Black    52
    ## 2 Brown   143
    ## 3 Red      37
    ## 4 Blond    81
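A summary like the one above can be produced in a few lines. This is a minimal sketch, assuming dplyr is installed; the object name colors_female_hair is our own choice for illustration.

```r
library(datasets) # HairEyeColor
library(dplyr)    # filter, group_by, summarise

# HairEyeColor is a 3-way table; convert it to a data frame with a Freq column
colors <- as.data.frame(HairEyeColor)

# Keep only females and add up the counts for each hair color
colors_female_hair <- colors %>%
  filter(Sex == "Female") %>%
  group_by(Hair) %>%
  summarise(Total = sum(Freq))

# Take a look at the result
print(colors_female_hair)
```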
Now let’s make some graphs with this data.
10.3.1 Bar graph using base R
We recommend using base R only for simple bar graphs for yourself. Like all of base R, it is simple to set up. Note: base R's barplot() expects a vector or matrix, hence the double brackets in the barplot call (double brackets extract a column as a vector, whereas single brackets return a data frame).
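A minimal sketch of such a call follows. To keep it self-contained, the data frame is rebuilt by hand from the totals shown in the table above; the object name colors_female_hair and the axis titles are our own choices.

```r
# Rebuild the summary table from the totals shown earlier
colors_female_hair <- data.frame(
  Hair  = factor(c("Black", "Brown", "Red", "Blond"),
                 levels = c("Black", "Brown", "Red", "Blond")),
  Total = c(52, 143, 37, 81)
)

# [[ ]] pulls each column out as a plain vector, which is what barplot() needs
bar_midpoints <- barplot(
  height    = colors_female_hair[["Total"]],
  names.arg = as.character(colors_female_hair[["Hair"]]),
  xlab      = "Hair color",
  ylab      = "Total"
)
```

barplot() draws the plot as a side effect and returns the x positions of the bar midpoints, which can be handy if you want to add labels afterwards.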
10.3.2 Bar graph using ggplot2
Bar plots are very easy in ggplot2. You pass in a dataframe and let it know which parts you want to map to different aesthetics. Note: In this case, we have a table of values and want to plot them as explicit bar heights. Because of this, we specify the y aesthetic as the Total column, but we also have to specify stat = "identity" in geom_bar() so it knows to plot them correctly. Often you will have datasets where each row is one observation and you want to group them into bars. In that case, the y aesthetic and stat = "identity" do not have to be specified.
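The two cases described above can be sketched side by side. This assumes ggplot2 is installed; the data frame is rebuilt from the totals shown earlier, and raw_hair is a made-up one-row-per-person version created only to illustrate the second case.

```r
library(ggplot2)

# Case 1: pre-computed totals, one row per bar
colors_female_hair <- data.frame(
  Hair  = factor(c("Black", "Brown", "Red", "Blond"),
                 levels = c("Black", "Brown", "Red", "Blond")),
  Total = c(52, 143, 37, 81)
)

p_summary <- ggplot(colors_female_hair, aes(x = Hair, y = Total)) +
  geom_bar(stat = "identity") +
  ggtitle("Bar graph using ggplot2")

# Case 2: one row per observation (expanded here from the totals);
# geom_bar() counts the rows itself, so no y aesthetic or stat is given
raw_hair <- data.frame(
  Hair = rep(colors_female_hair$Hair, times = colors_female_hair$Total)
)

p_raw <- ggplot(raw_hair, aes(x = Hair)) +
  geom_bar()

p_summary
p_raw
```

Both plots should show the same bar heights; only the shape of the input data differs.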
- For more info about plotting categorical data, check out Chapter 4 of the textbook.
10.5 When to use
Bar charts are best for categorical data. Often you will have one or more factors whose levels you want to compare as separate groups.
10.6.1 Not for continuous data
If you are finding that your bar graphs aren’t looking right, make sure your data is categorical and not continuous. If you want to plot continuous data using bars, that is what histograms are for!
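For comparison, here is a minimal histogram sketch. It assumes ggplot2 and uses the built-in faithful dataset, which was picked only for illustration and is not otherwise part of this chapter; the choice of 20 bins is arbitrary.

```r
library(ggplot2)

# eruptions is a continuous variable, so it is binned rather than counted per value
p_hist <- ggplot(faithful, aes(x = eruptions)) +
  geom_histogram(bins = 20, color = "white") +
  labs(x = "Eruption duration (minutes)", y = "Count")

p_hist
```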
These modifications assume you are using
10.7.1 Flip Bars
To flip the orientation, just tack on
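A minimal sketch of flipping the bars, assuming ggplot2 and its coord_flip() function. The data frame is rebuilt from the totals shown earlier and the object names are our own.

```r
library(ggplot2)

colors_female_hair <- data.frame(
  Hair  = factor(c("Black", "Brown", "Red", "Blond"),
                 levels = c("Black", "Brown", "Red", "Blond")),
  Total = c(52, 143, 37, 81)
)

# Same plot as before, with the axes swapped so the bars run horizontally
p_flipped <- ggplot(colors_female_hair, aes(x = Hair, y = Total)) +
  geom_bar(stat = "identity") +
  coord_flip()

p_flipped
```

Horizontal bars tend to be easier to read when the category labels are long.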
10.7.2 Reorder the bars
With both base R and ggplot2, bars are drawn in alphabetical order for character data and in the order of factor levels for factor data. However, since the default order of levels for factor data is alphabetical, the bars will be alphabetical in both cases. Please see this tutorial for a detailed explanation of how bars should be ordered in a bar chart, and how the forcats package can help you accomplish the reordering.
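As one possible approach, the sketch below reorders the bars from largest to smallest with forcats::fct_reorder(). It assumes forcats and ggplot2 are installed; the data frame is rebuilt from the totals shown earlier, and colors_sorted is a name chosen for illustration.

```r
library(ggplot2)
library(forcats)

colors_female_hair <- data.frame(
  Hair  = factor(c("Black", "Brown", "Red", "Blond"),
                 levels = c("Black", "Brown", "Red", "Blond")),
  Total = c(52, 143, 37, 81)
)

# Reorder the factor levels of Hair by Total, largest first
colors_sorted <- colors_female_hair
colors_sorted$Hair <- fct_reorder(colors_sorted$Hair, colors_sorted$Total,
                                  .desc = TRUE)

# ggplot2 draws bars in factor-level order, so the plot follows the new order
p_sorted <- ggplot(colors_sorted, aes(x = Hair, y = Total)) +
  geom_bar(stat = "identity")

p_sorted
```

Sorting by value usually suits nominal categories; for ordinal categories it is generally better to keep their natural order.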