#BeyondTheMean

Descriptive Statistics in Education

Updated: Jul 16


Welcome to #BeyondTheMean! Check out this post to see what this blog is all about.


Educators have access to a lot of data. Unfortunately, a raw list of test scores doesn’t do you much good when you are trying to understand a continuous improvement problem or make decisions about a classroom, school, or system. To make use of that data, education decision makers must summarize and describe it using descriptive statistics. In this post, I want to discuss the six descriptive statistics used in education and talk a little about how you might use them to inform your decision making.


Mean

The mean is the arithmetic average of a set of scores. You calculate the mean by adding up all the individual scores and then dividing the sum by the total number of scores. It is a figure that helps you understand how your students performed as a group. The mean, along with the median and mode, is a measure of central tendency, meaning that it summarizes a set of scores with a single value near the center.
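As a quick illustration, here is a minimal Python sketch of that calculation. The scores are made up for the example, and the variable names are my own; the point is only that adding the scores and dividing by the count gives the same answer as the standard library's `statistics.mean`.

```python
from statistics import mean

# Hypothetical test scores for one class (made up for illustration).
scores = [72, 85, 90, 66, 78, 85, 94]

# Add up all the scores, then divide by how many scores there are.
manual_mean = sum(scores) / len(scores)

# The standard library performs the same calculation.
library_mean = mean(scores)

print(round(manual_mean, 2))
print(round(library_mean, 2))
```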


The mean is a valuable statistic to calculate and is often the first statistic calculated when summarizing your data. It is easily understood by most people and easily compared to the mean of previous occurrences. It is also a prerequisite to many more rigorous statistical calculations used by analysts seeking to make inferences about a set of data.


One downside to the mean is that it is heavily swayed by outliers. If you have a few students who scored far above or far below the rest of the students, the mean will be pulled in that direction. Remember teasing the “curve busters” in your classes in high school? Those students “broke the curve” because they scored so far above the rest of us that the average on the test was higher than it otherwise would have been, discouraging our teachers from granting bonus points across the board. This is why I always say that we must look #beyondthemean when analyzing our data.
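To see how much a single outlier can move the mean, consider this small Python sketch. The scores are invented for illustration: five students score in the 70s and low 80s, and then one “curve buster” scores 100.

```python
from statistics import mean

# Hypothetical scores for a class without a "curve buster".
typical = [70, 72, 75, 78, 80]

# The same class with one very high score added.
with_outlier = typical + [100]

mean_before = mean(typical)       # 375 / 5
mean_after = mean(with_outlier)   # 475 / 6

print(mean_before)
print(round(mean_after, 2))
```

In this made-up class, one high score lifts the mean from 75 to a little over 79, even though none of the other five students scored any differently.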


Median

One way that we can look #beyondthemean is by examining the median. The median is the score in the exact middle when you line all the scores up in order. If you have an even number of scores, then the median is the average of the two scores in the middle. You can only calculate the median if your data is at least ordinal, that is, data that can be placed in a logical order, such as test scores.
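The steps above can be sketched in a few lines of Python. The helper name `middle_score` and the scores are hypothetical, chosen only for illustration; the standard library's `statistics.median` gives the same results.

```python
from statistics import median

def middle_score(scores):
    """Return the median: the middle score once the scores are sorted."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of scores: take the single middle value.
        return ordered[mid]
    # Even number of scores: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical scores (made up for illustration).
odd_scores = [88, 62, 75, 91, 70]
even_scores = [88, 62, 75, 91, 70, 80]

print(middle_score(odd_scores), median(odd_scores))
print(middle_score(even_scores), median(even_scores))
```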


In a perfectly symmetric distribution of scores, the mean and median are the same because they both sit in the exact middle of the distribution. Unfortunately, a perfectly symmetric set of scores is a freak of nature that almost never happens. As such, the relationship between the mean and median is important because it can give you a feel for the skew of your data.


The skew describes which way your data leans. If the median is higher than the mean, then the data is said to “skew to the left” or to have a negative skew. That means that the outliers in your data are on the lower end of the spectrum. When your median is lower than the mean, the opposite is true. This data “skews to the right”, or has a positive skew, meaning that the outliers in your data are on the upper end of the spectrum.
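Here is a minimal sketch of that mean-versus-median comparison in Python. The function name `skew_direction` and both score lists are hypothetical. Treat it as a rule-of-thumb check only; it is not a formal skewness statistic, and the comparison can mislead for some unusual distributions.

```python
from statistics import mean, median

def skew_direction(scores):
    """Rule-of-thumb skew check: compare the median to the mean."""
    avg, mid = mean(scores), median(scores)
    if mid > avg:
        return "left (negative)"    # low outliers drag the mean down
    if mid < avg:
        return "right (positive)"   # high outliers pull the mean up
    return "roughly symmetric"

# Hypothetical scores with a couple of very low outliers.
low_outliers = [35, 40, 82, 85, 86, 88, 90]

# Hypothetical scores with a couple of very high outliers.
high_outliers = [60, 62, 64, 65, 66, 95, 100]

print(skew_direction(low_outliers))
print(skew_direction(high_outliers))
```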



Mode

The mode is simply the value, or values, that occur most frequently in a distribution of scores. The mode is especially useful for categorical data, where it quickly shows which category has the highest number of entries. A distribution of scores may be unimodal, containing only one mode, bi-modal, containing two modes, or multi-modal, containing more than two modes. By examining the mode, you can quickly spot clusters in your data that may contain useful information.
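As a small illustration, Python's `statistics.multimode` (available in Python 3.8 and later) returns every value tied for most frequent, so it handles unimodal and bi-modal data alike. The letter grades and scores below are made up for the example.

```python
from statistics import multimode

# Hypothetical categorical data: letter grades on one assignment.
letter_grades = ["B", "A", "C", "B", "A", "B"]

# Hypothetical scores where two values tie for most frequent.
scores = [70, 85, 85, 90, 70, 60]

grade_modes = multimode(letter_grades)   # one mode: unimodal
score_modes = multimode(scores)          # two modes: bi-modal

print(grade_modes)
print(sorted(score_modes))
```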


Range

The range is the difference between the highest and lowest scores in your distribution. Along with the standard deviation and quartiles, it is considered a measure of dispersion. Measures of dispersion help you to see how spread out your scores are, or how wide the gap is between your highest performing student and your lowest performing student.


This is valuable information for educators working to ensure that all students achieve at high levels. The range is a quick calculation that can show you if your students are performing together as a group or if you have some students performing way above or way behind the rest of the pack. Generally speaking, when looking at educational data, we want to see low ranges. That indicates that our students are moving together as a group.
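To make this concrete, here is a minimal Python sketch comparing two hypothetical classes. The scores and the helper name `score_range` are invented for illustration. Both classes have the same mean, but their ranges tell very different stories.

```python
from statistics import mean

def score_range(scores):
    """Range: the highest score minus the lowest score."""
    return max(scores) - min(scores)

# Two hypothetical classes with the same mean (made up for illustration).
class_a = [74, 76, 78, 80, 82]   # students clustered together
class_b = [55, 68, 78, 92, 97]   # students spread far apart

range_a = score_range(class_a)
range_b = score_range(class_b)

print(mean(class_a), range_a)
print(mean(class_b), range_b)
```

In this example the means alone would suggest the two classes are performing identically; the ranges show that one class is moving together as a group while the other has a wide gap between its highest and lowest scores.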


Standard Deviation

The standard deviation is a more rigorous measure of dispersion as it applies some heavy math to the situation. A low standard deviation indicates that most of the scores are clustered around the mean