Visualizing Distributions

Updated: Nov 29, 2020

Video Transcript



Hi! My name is Matthew Courtney. I am an education researcher and data consultant specializing in continuous improvement for schools. In this video, I am going to talk about how teachers can use visualizations to help them communicate about student performance.


Creating visualizations is an important element of data analysis. Visualizations make data easier to digest and are often easier to interpret than tables filled with numbers. The shapes of visualizations can help you spot trends in your data, and they make reports more attractive to stakeholders. In this video, we are going to look at two visualizations used for univariate analysis. That means that these visualizations only look at one variable at a time. They are the boxplot and the histogram.


This is a boxplot of the average years of teaching experience in Kentucky public school districts. You can see that there are only numbers on the y axis. That is because this is a univariate plot, a plot with only one variable. A boxplot is made up of three parts. The first part is the box, and it tells you where the middle half of your scores lie. The bold line in the middle of the box is the median. The top and bottom of the box represent the upper and lower quartiles, and the distance between them is the interquartile range. The second part is the whiskers, which extend to the highest and lowest scores that are not outliers. Finally, the dots represent outliers, the scores that sit unusually far from the rest of the distribution.


Let’s take a closer look here. The median here is 12, meaning that the median of the average years of experience in Kentucky’s public schools is 12. Half of the schools have an average above 12 years, and half below. The interquartile range is two, meaning that the middle half of Kentucky’s schools average between 11 and 13 years of experience. There are probably a couple of relatively new schools here as well. We can tell that from the outliers way down at the bottom. You can see that this one visualization holds a lot of information about Kentucky schools.
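For readers who want to see the arithmetic behind a boxplot, here is a minimal Python sketch. It is not the tool used in the video. The `experience` values are made up for illustration (chosen so the median is 12 and the interquartile range is 2, like the plot described above), and the function name `boxplot_summary` is hypothetical. It assumes the common convention that whiskers stop at the most extreme values within 1.5 times the interquartile range of the box, with anything beyond drawn as an outlier dot. The video does not say which rule its plotting tool uses, and other tools may calculate quartiles slightly differently.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Return the pieces a boxplot draws, assuming the 1.5 * IQR outlier rule."""
    # Quartiles: the bottom of the box, the median line, and the top of the box.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range: the height of the box

    # Values beyond these fences are treated as outliers (assumed convention).
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr

    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),    # lowest value that is not an outlier
        "whisker_high": max(inside),   # highest value that is not an outlier
        "outliers": outliers,
    }


# Made-up average years of experience, for illustration only.
experience = [6, 10, 11, 11, 12, 12, 12, 13, 13, 14, 16]
summary = boxplot_summary(experience)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With this made-up data, the single low value falls below the lower fence, so it would be drawn as a dot rather than as the end of a whisker.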


Let’s move on and take a look at a histogram of the same data. Histograms work by placing scores into bins. The height of each bar represents the number of observations that fit into that bin. We can see on this histogram that the bin width is set to one. I did that to make it easier to see. So the lowest bar here sits at six years of experience and the highest at sixteen years of experience. You will notice that this is the same scale used on the boxplot we saw earlier. You can see quickly that few schools have fewer than eight years of experience, with most schools having between 11 and 13, the same as our box on the earlier visualization. The histogram shows us many of the same things but from a slightly different angle. For example, here it is easier to see that those outliers we saw earlier are really very few compared to the rest of the schools in the distribution.
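The binning step described above can be sketched in a few lines of Python. This is an illustration only, not the tool used in the video. The `experience` values are made up, and `histogram_counts` is a hypothetical helper name. Each value is assigned to the bin whose left edge is the nearest multiple of the bin width at or below it, and empty bins between the lowest and highest values are kept so that gaps in the distribution stay visible.

```python
import math
from collections import Counter


def histogram_counts(values, bin_width=1):
    """Count how many values fall into each bin of the given width."""
    # Bin index for each value: 0 covers [0, width), 1 covers [width, 2 * width), ...
    counts = Counter(math.floor(v / bin_width) for v in values)
    first, last = min(counts), max(counts)
    # Include empty bins so the bars line up along a continuous scale.
    return {i * bin_width: counts[i] for i in range(first, last + 1)}


# Made-up average years of experience, for illustration only.
experience = [6, 10, 11, 11, 12, 12, 12, 13, 13, 14, 16]
counts = histogram_counts(experience, bin_width=1)

# Print a sideways text histogram: one row per bin, one mark per observation.
for bin_start, count in counts.items():
    print(f"{bin_start:>3} | {'#' * count}")
```

Changing `bin_width` changes how coarse the picture is. A wider bin smooths the shape but hides detail, which is one reason a bin width of one works well for data on a small whole-number scale.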


Let’s take a look at an exemplar demonstrating how a teacher may use these visualizations. Mrs. Anderson has been asked to prepare a newsletter to parents about the sixth grade’s recent administration of the mid-year math proficiency assessment. She creates a histogram and a boxplot to include in the newsletter along with a brief discussion of what the data means for students.



Here are Mrs. Anderson’s visualizations. It is clear from the scale on the y axis that this test ranged from zero points to one hundred points. The boxplot shows us that the median score here is 70, with half of our students scoring above and half of our students scoring below. We have an interquartile range of thirty, with the middle half of scores ranging from fifty to eighty percent. It also appears from our whiskers that we had at least one kid who scored 100 percent and one kid who completely bombed the test. The histogram tells more of the story. We can see that more of our students scored on the high end of the spectrum than the low end. While we certainly have some students here who need some extra support and re-teaching, we have many students who are probably ready to move on to the next topic. Mrs. Anderson can easily and quickly explain these complex issues to her stakeholders by including these two visualizations in her newsletter.


Take some time to practice interpreting your own data now. Start by accessing some data from a recent assessment you have given. Feed the data into your favorite visualization tool and create a histogram and a boxplot. My free distribution analysis tools will do this for you with only one click. Visit www.matthewbcourtney.com to access them. Once you have your plots, consider the following questions. What do these visualizations tell you about your student performance? Do the visualizations look the way you expected them to? How would you explain these visualizations to a stakeholder?


For more information about how you can use data to enhance student learning, subscribe to my channel or visit www.matthewbcourtney.com.


Copyright 2020 by Courtney Consulting LLC
