Updated: Jan 7
In the wake of COVID-19, educators across the nation are turning to data analysis tools that use algorithms to prioritize students for services and predict student outcomes. While tools that are heavily studied and vetted can help teachers direct time and resources to the students who need them most, whenever we turn decision-making processes over to algorithms, we risk introducing algorithmic bias into the mix. In this post, I want to answer some common questions about algorithmic bias and direct you to some helpful resources.
What is an algorithm?
In its simplest form, an algorithm is a set of rules used to solve a problem. People often think of algorithms as a new thing, but some of the earliest known algorithms come from the ancient Middle East, where the Babylonians used a series of if-then math calculations to compute the time and place of important astronomical events. As time progressed, hand-calculated algorithms continued to be used for a wide variety of tasks, including NASA’s initial trips to space!
In today’s data-rich environment, algorithms run our world. They decide how much your plane ticket costs, which advertisements you see on your social media feed, and which shows are recommended to you on your favorite streaming service. Most people would say that the rise of algorithmic decision making has improved their quality of life, and it is virtually impossible to live your life without algorithmic influence.
How are algorithms used in education?
Algorithms are already commonplace in education. Standardized testing is the most familiar example: algorithms help schools and districts process test scores and identify students at risk of falling behind. Risk prediction algorithms can also look at non-academic data, such as attendance and behavioral data, to help identify which students need extra support.
Algorithms can also be found embedded within many modern standardized tests themselves. Computer adaptive tests, for example, run algorithms in the background of the test and update test question difficulty in real time as students get answers right or wrong. Many technological intervention tools use algorithms in a similar way – by assessing students along the way and selecting games and practice activities for students based on their performance.
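To make the idea of updating difficulty in real time concrete, here is a minimal Python sketch. It assumes a simple "one step up, one step down" rule on a 1 to 10 difficulty scale. Real computer adaptive tests typically rely on more sophisticated statistical models, and the function names and the scale here are hypothetical, chosen only for illustration.

```python
def next_difficulty(current, answered_correctly, lowest=1, highest=10):
    """Move one level up after a correct answer and one level down after a miss.

    Assumption for illustration: a 1-10 difficulty scale and a fixed step of 1.
    """
    step = 1 if answered_correctly else -1
    # Keep the new level inside the allowed range.
    return max(lowest, min(highest, current + step))


def run_adaptive_test(answers, start=5):
    """Return the difficulty level presented for each question, in order."""
    level = start
    presented = []
    for answered_correctly in answers:
        presented.append(level)
        level = next_difficulty(level, answered_correctly)
    return presented


# A student answers right, right, wrong, right.
print(run_adaptive_test([True, True, False, True]))  # prints [5, 6, 7, 6]
```

Notice that even this tiny rule contains design choices made by a person, such as the starting level and the step size. Those choices shape what every student sees.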
What is algorithmic bias?
Many people think that computerized data analysis eliminates bias by taking the personal side out of the decision-making process. Unfortunately, that isn’t always the case. Algorithmic bias occurs when repeated and systematic errors result in an unfair outcome. Remember that algorithms are written by human beings, and therefore can reflect the biases of those who develop them. Like most forms of bias, algorithmic bias is rarely intentional, but it can nevertheless cause real harm.
Where does algorithmic bias come from?
Algorithmic bias can be introduced into an algorithm in a few ways. First, algorithms can reflect the existing biases within an organization or society. This type of bias is often difficult to see – it is the kind of bias that exists because “that’s just how things are here.” It is common when algorithms are designed to replicate existing decision-making processes, because the results of the algorithm are designed to match the results of the manual process – and they are tested against prior results. The algorithm is deemed fit if its outcomes match outcomes from the past. In this instance, the bias is baked in from the beginning.
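Here is a small Python sketch of how this can happen. The data is entirely made up for illustration: six hypothetical past decisions in which staff selected high-scoring students from one neighborhood but not the other. A rule built to reproduce those decisions passes the "does it match the past?" test perfectly, and carries the unfair pattern forward.

```python
# Hypothetical past decisions: (test_score, neighborhood, was_selected_by_staff)
past_decisions = [
    (85, "north", True),
    (85, "south", False),
    (90, "north", True),
    (90, "south", False),
    (70, "north", False),
    (70, "south", False),
]


def replicated_rule(score, neighborhood):
    """A rule tuned to reproduce what staff did in the past (illustrative only)."""
    return score >= 80 and neighborhood == "north"


def agreement_with_past(rule, records):
    """Share of past decisions that the rule reproduces exactly."""
    matches = sum(rule(score, hood) == selected for score, hood, selected in records)
    return matches / len(records)


def selection_rate(rule, records, neighborhood, min_score=80):
    """Share of students at or above min_score in a neighborhood that the rule selects."""
    eligible = [(s, h) for s, h, _ in records if h == neighborhood and s >= min_score]
    return sum(rule(s, h) for s, h in eligible) / len(eligible)


print(agreement_with_past(replicated_rule, past_decisions))      # prints 1.0
print(selection_rate(replicated_rule, past_decisions, "north"))  # prints 1.0
print(selection_rate(replicated_rule, past_decisions, "south"))  # prints 0.0
```

A perfect match with the past looks like success, yet equally qualified students from the second neighborhood are never selected. Matching prior outcomes only tells you the algorithm is consistent with history, not that it is fair.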
Another way that bias is introduced into an algorithm is by what is called its “training data”. When computers run algorithmic programs, developers “teach” the computer about the data using a pre-existing data set. The idea is that the computer will be able to recognize patterns in an existing data set and then be able to spot those patterns in future data. The problem is that if bias exists in the training data – such as a data set that does not include data reflective of minority populations – that bias will eventually be amplified as the computer systems run the algorithms in the future.
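The following Python sketch illustrates the training data problem with made-up numbers. It assumes a very simple "model" that learns a single cutoff score from past students, and it assumes that the screening score relates to need differently for a group that was left out of the training data. The groups, scores, and function names are all hypothetical.

```python
from statistics import mean


def learn_cutoff(training_rows):
    """Learn a cutoff halfway between the average scores of the two outcomes."""
    needs = [score for score, needs_support in training_rows if needs_support]
    on_track = [score for score, needs_support in training_rows if not needs_support]
    return (mean(needs) + mean(on_track)) / 2


def accuracy(cutoff, rows):
    """Share of students for whom 'score below cutoff' matches their real need."""
    correct = sum((score < cutoff) == needs_support for score, needs_support in rows)
    return correct / len(rows)


# Each row is (screening_score, actually_needed_support).
# Assumption: the training data contains only students from group A.
training_rows = [(30, True), (40, True), (70, False), (80, False)]

# New students. In this made-up example, group B students who need support
# tend to score higher on this particular screener than group A students do.
group_a_new = [(35, True), (45, True), (75, False), (85, False)]
group_b_new = [(60, True), (65, True), (90, False), (95, False)]

cutoff = learn_cutoff(training_rows)
print(cutoff)                         # prints 55.0
print(accuracy(cutoff, group_a_new))  # prints 1.0
print(accuracy(cutoff, group_b_new))  # prints 0.5
```

The cutoff works perfectly for the group the computer learned from and misses every group B student who needed support. Nothing in the code mentions either group; the gap comes entirely from who was, and was not, in the training data.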
Finally, algorithmic bias can make its way into a system when an algorithm is being used in a new context for the first time. When there isn’t a history of using an algorithm for a given task, then there is less information available to the people monitoring the process to be able to tell if the algorithm is off-base or not.
What is an algorithmic “black box”?
Algorithms are often referred to as “black boxes”. This term reflects the mysterious nature of algorithms that are developed using machine learning techniques (this is when the developers “teach” the computer to find patterns based on an existing data set). The concern with black box algorithms is that developers don’t really know all of the steps the computer is taking when it begins sorting the data. Data goes in – answers come out – and the process remains a mystery.
Has algorithmic bias caused issues in the past?
One of the most commonly told stories of algorithmic bias is the story of Microsoft’s chatbot “Tay”. Tay used algorithms to “learn” to speak like a millennial by reading tweets and interacting directly with real people. Within 24 hours of releasing the chatbot into the world, it found its way to the dark side of the internet and became incredibly racist – even declaring that “Hitler was right” – before being taken down by the development team.
The story of Tay is a cautionary tale, but people have also experienced real harm as the result of algorithmic bias. In 2018, news reports revealed that Amazon had ended an algorithmic recruitment program due to an algorithmic bias against women. The problem is that tech is a male-dominated field, and when the algorithm started reviewing resumes, it picked up on masculine language and traits and gave priority to those elements. This was not an intentional decision on the part of Amazon; it was a bias introduced by the existing data when it entered the algorithm’s black box.
Another example comes to us from the medical field, in which healthcare algorithms have been shown to prioritize white, upper-income patients for care. In this instance, the societal biases that prevent people from minority backgrounds from accessing healthcare created a bias in the training data that was later reflected in decision making.
How can I protect our students from the effects of algorithmic bias?
Algorithms aren’t inherently bad. They can help us help kids, but only if we know how to use them. The first way you can protect students is by heavily vetting algorithmic programs before you purchase them. Any vendor pushing an algorithmic program should be able to show you rigorous, independently produced research that demonstrates its efficacy. They should also be able to show you the results of bias studies designed to sniff out algorithmic bias. Don’t buy a program from a vendor who can’t offer you these documents.
Once you have acquired a new program, you should continue to monitor for bias. When you are looking at algorithmically produced data – such as risk assessment data – take a moment to look beyond the numbers at the kids. Remember my cardinal rule – when we turn kids into numbers, we have to remember to turn the numbers back into kids before we make a decision. Scan through the list produced by your algorithm and check for commonalities. Is your algorithm accurately predicting at-risk students, or is it tapping into a non-academic data point, such as poverty or skin color, to draw conclusions?
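If your data team wants a quick way to check a list for commonalities, a sketch like the following Python snippet is one place to start. It assumes you can export the algorithm's output as a list of records with a yes/no flag and a group label. The field names (`lunch_status`, `flagged`) and the student records are hypothetical and invented for this example.

```python
from collections import Counter


def flag_rates(students, group_key):
    """Return the share of students flagged as at-risk within each group."""
    totals = Counter()
    flagged = Counter()
    for student in students:
        group = student[group_key]
        totals[group] += 1
        if student["flagged"]:
            flagged[group] += 1
    return {group: flagged[group] / totals[group] for group in totals}


# Made-up export from a risk assessment tool.
students = [
    {"name": "Student 1", "lunch_status": "free_reduced", "flagged": True},
    {"name": "Student 2", "lunch_status": "free_reduced", "flagged": True},
    {"name": "Student 3", "lunch_status": "free_reduced", "flagged": True},
    {"name": "Student 4", "lunch_status": "free_reduced", "flagged": False},
    {"name": "Student 5", "lunch_status": "paid", "flagged": True},
    {"name": "Student 6", "lunch_status": "paid", "flagged": False},
    {"name": "Student 7", "lunch_status": "paid", "flagged": False},
    {"name": "Student 8", "lunch_status": "paid", "flagged": False},
]

rates = flag_rates(students, "lunch_status")
print(rates)  # prints {'free_reduced': 0.75, 'paid': 0.25}
```

A gap like this does not prove the algorithm is biased, since the groups may genuinely differ in need. It is a signal to go back to the individual students on the list and ask whether the flags match what their teachers see.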
Finally, run a gut-check. Teachers know their students better than any algorithm ever could. If you are looking at an algorithmically produced list and something feels off about it, trust your instinct and take a deeper look. Ask your colleagues how they feel about the data. Find an external partner, such as a university partner, to help you look more closely at the data and check for bias. Trust your professional gut and be hyper-vigilant.
Where can I find more resources about algorithmic bias?
There are tons of resources online to help you learn more about algorithmic bias. I recommend you check out the work of the Algorithmic Justice League. They have published a variety of academic studies regarding algorithmic bias across many sectors. Don’t miss the amazing TED Talk by Joy Buolamwini.
I also recommend reviewing the resources from The Center for Applied Artificial Intelligence at The University of Chicago Booth School of Business. They curate resources for both developers and users of algorithms, including a four-step playbook to help you avoid algorithmic bias.
Finally, check out the Algorithmic Safety initiative from the Data & Trust Alliance. They have assembled a panel of more than 200 experts in the field to help develop best practices and safeguards for those who are implementing algorithmic decision-making tools.