
#BeyondTheMean

  • Matthew B. Courtney, Ed.D.

Creating Faceted Histograms with ggplot2

Updated: Jan 7, 2023



I have written on this page before about my not-so-secret love affair with the histogram. For educators seeking to display their student data to constituents who may not fully understand the underlying numbers, I believe that histograms are the best choice for quickly creating an appealing and meaningful visualization. Imagine my joy when I learned that R would allow me to make faceted histograms that arrange multiple histograms in a grid for analysis!


The faceted histogram is an amazing tool for comparing the performance of student groups, whether it is comparing boys to girls, looking across grade levels, or examining race/ethnicity subgroups. It places all the groups next to each other so you can easily look for changes and gaps in their performance. In this post, I am going to show you the code to make a faceted histogram split by two grouping variables and teach you how to tweak it to make one that is meaningful to you.


Let’s start with the code.

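library(ggplot2)

# data1 is assumed to be a data frame that is already loaded in your R session
ggplot(data = data1, aes(x = Spring.Test.Score)) +
  geom_histogram(binwidth = 10) +        # bins are 10 score points wide
  facet_grid(Gender ~ Grade.Level) +     # rows ~ columns
  ggtitle("Sample Grid") +
  xlab("Spring Test Scores") +
  ylab("Number of Students") +
  theme_bw()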

This code generates this beautiful plot.


Like it? Okay – let’s talk about how to make it your own. We will go one line at a time.


The first line is essential: this is where we load the ggplot2 package. The ggplot2 package is an amazing tool for easily creating stunning visualizations of your data. It can do so many things that it is simply overwhelming, so we will stick to this one graph today. Use the library() function to load ggplot2 into your R session.
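If ggplot2 is not installed yet, library() will fail. Here is a minimal sketch of one way to handle that, assuming you have an internet connection and a CRAN mirror configured; the variable name loaded is just for illustration.

```r
# Install ggplot2 only if it is missing, then load it.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
library(ggplot2)

# TRUE once the package is attached to the current session.
loaded <- "package:ggplot2" %in% search()
```

You only need to install the package once per computer, but you need to call library(ggplot2) in every new R session.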


Next, we want to tell ggplot2 which data to use. We do that with this line of code.

ggplot(data=data1, aes(x=Spring.Test.Score)) +

This code has two features. First, the data= portion tells R which data frame we want to use to build our histogram. In this case, I have named my data frame “data1” as I do in all of my R posts. The second part of this line is the aesthetic, the part that says aes(x=…). This is where you tell R which column to use when building your plot, the column that houses the data you want plotted. We will add in our other variables in a moment. In this case, I want to plot my “Spring.Test.Score” column.
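If you want to follow along without your own data, here is a minimal sketch that builds a stand-in for data1. The column names match the ones used in this post, but the students, the grade labels, and the score distribution are simulated assumptions, not real data.

```r
library(ggplot2)

# Hypothetical stand-in for data1: 200 simulated students.
set.seed(42)
data1 <- data.frame(
  Gender = sample(c("Female", "Male"), 200, replace = TRUE),
  Grade.Level = sample(c("Grade 3", "Grade 4", "Grade 5"), 200, replace = TRUE),
  Spring.Test.Score = round(rnorm(200, mean = 200, sd = 25))
)

# data = picks the data frame; aes(x = ...) picks the column to plot.
# Nothing is drawn yet because no geom has been added.
p <- ggplot(data = data1, aes(x = Spring.Test.Score))
```

With your own data, you would typically replace the data.frame() call with something like read.csv() pointed at your file.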



The next line of code is where we tell R what kind of graph to make. We want a histogram, so we are going to type geom_histogram(). If we stop there, it will build a single histogram with all of the data in our column, not split into groups. But we aren’t stopping there. Within the parentheses we can set our own bin width. Remember that a histogram sorts student scores into bins. The bin width is how big our bins are. We can leave that out and ggplot2 will fall back to its default of 30 bins, but I like to play around with different bin widths to see how it changes my outcome. In this case, we set the bin width to 10, so each bar covers 10 score points.
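To see how much the bin width matters, here is a small sketch that builds the same histogram twice and counts the bars in each version. The scores are simulated, and the names narrow, wide, n_narrow, and n_wide are only for illustration.

```r
library(ggplot2)

# Hypothetical scores standing in for the Spring.Test.Score column.
set.seed(42)
data1 <- data.frame(Spring.Test.Score = round(rnorm(200, mean = 200, sd = 25)))

base <- ggplot(data = data1, aes(x = Spring.Test.Score))

narrow <- base + geom_histogram(binwidth = 5)   # many thin bars
wide   <- base + geom_histogram(binwidth = 20)  # a few wide bars

# layer_data() returns one row per bar that ggplot2 computed.
n_narrow <- nrow(layer_data(narrow))
n_wide   <- nrow(layer_data(wide))
```

Narrow bins show more detail but can look noisy with small groups; wide bins are smoother but can hide gaps. Print narrow and wide to compare them by eye.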


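Now we need to tell R how we want to facet our graph. This is where we list the other variables that we want to use. Here, I have chosen the Gender and Grade.Level columns of my data1 data frame. In facet_grid(), the variable before the ~ defines the rows of the grid and the variable after it defines the columns, so this plot gets one row per gender and one column per grade level. You can choose any columns you want as long as they hold categorical variables, meaning variables that sort students into a small number of distinct groups. R will automatically read those two columns and create your facets for you.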
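Here is a short sketch of how the facet formula changes the layout. The data is a simulated stand-in for data1, built so that every gender and grade combination appears; facet_wrap() is included as one option when you only have a single grouping variable.

```r
library(ggplot2)

# Hypothetical stand-in for data1 with all six Gender/Grade.Level pairs.
set.seed(42)
data1 <- data.frame(
  Gender = rep(c("Female", "Male"), each = 150),
  Grade.Level = rep(c("Grade 3", "Grade 4", "Grade 5"), times = 100),
  Spring.Test.Score = round(rnorm(300, mean = 200, sd = 25))
)

p <- ggplot(data = data1, aes(x = Spring.Test.Score)) +
  geom_histogram(binwidth = 10)

grid_plot <- p + facet_grid(Gender ~ Grade.Level)  # rows ~ columns: 2 rows, 3 columns
flipped   <- p + facet_grid(Grade.Level ~ Gender)  # swapped: 3 rows, 2 columns
wrapped   <- p + facet_wrap(~ Grade.Level)         # one variable only: 3 panels

# Count the panels that received data in each layout.
n_grid    <- length(unique(layer_data(grid_plot)$PANEL))
n_wrapped <- length(unique(layer_data(wrapped)$PANEL))
```

Which variable goes in the rows is a judgment call. Panels in the same column share an x-axis, so it is usually easiest to compare the groups that are stacked on top of each other.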



The next three lines allow you to customize your titles. If you choose not to customize them, the plot will have no main title, the x-axis will be labeled with whatever you called your column in your spreadsheet, and the y-axis will simply say “count”. While that may be okay, I think it is always best to set titles that are clear and easy to understand for the layperson.
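If you would rather set all three labels in one place, ggplot2 also provides labs(). This sketch uses simulated data in place of data1 and should produce the same titles as the ggtitle(), xlab(), and ylab() lines.

```r
library(ggplot2)

# Hypothetical scores standing in for the Spring.Test.Score column.
set.seed(42)
data1 <- data.frame(Spring.Test.Score = round(rnorm(200, mean = 200, sd = 25)))

p <- ggplot(data = data1, aes(x = Spring.Test.Score)) +
  geom_histogram(binwidth = 10) +
  labs(
    title = "Sample Grid",
    x = "Spring Test Scores",
    y = "Number of Students"
  )
```

Either style works. I find labs() handy when I want to see every label at a glance.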


Finally, you get to make your graph pretty with some themes. I like the black and white theme, or theme_bw(), because I have always had irregular access to a reliable color printer in my school and government work stations. Some other themes you might like are theme_gray(), theme_light(), theme_dark(), theme_minimal(), theme_classic(), and theme_void(). Here is what they look like!
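To try several themes quickly, you can save the plot once and add a different theme each time. This is a minimal sketch with simulated data; the list names are only for illustration.

```r
library(ggplot2)

# Hypothetical scores standing in for the Spring.Test.Score column.
set.seed(42)
data1 <- data.frame(Spring.Test.Score = round(rnorm(200, mean = 200, sd = 25)))

p <- ggplot(data = data1, aes(x = Spring.Test.Score)) +
  geom_histogram(binwidth = 10)

# One copy of the plot per theme, stored in a named list.
themes <- list(
  bw      = theme_bw(),
  minimal = theme_minimal(),
  classic = theme_classic()
)
plots <- lapply(themes, function(th) p + th)
```

Typing plots$bw or plots$minimal at the console draws that version, so you can flip between them before you pick one.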


Okay, it's time to wrap this one up. In this post I gave you the code to create a faceted histogram in R using ggplot2. This is my favorite way to create quick descriptive visualizations of student data. They look great in printed reports and projected on the conference room wall. Most importantly, they look IMPRESSIVE to the stakeholders you are sharing your data with. How are you going to use the faceted histogram to enhance your next data presentation? Tell me in the comments below!

