An R package is a collection of functions, and sometimes data, designed to complete a particular kind of task. An understanding of packages is essential for the rapid analysis of school data. Once you get a couple of packages under your belt, you will be able to quickly and repeatedly analyze your data for decision making. In this article I will explain how to install a package, how to load a package into your script, and which packages I think you should prioritize.

Installing a Package

R is an open source programming language, meaning that anybody, anywhere, can create a package to perform a specified set of steps. There are thousands of packages available to you. To keep R running smoothly, you install only the individual packages that you want to use. To do that, type the following line of code into your console, replacing the placeholder with the name of the package. Model:
install.packages("PACKAGE NAME HERE")
Simply type the code, hit enter, and sit back while your console fills with download and installation messages. When the new package is installed, you will get a lovely message along the lines of: package ‘ggplot2’ successfully unpacked and MD5 sums checked. It will also tell you where your new package is stored on your hard drive.

Loading a Package

R saves memory by only making available the packages that you have loaded in your current session. You install a package once, but you load it every time you start a new session. To load a package, drop in this line of code, again replacing the placeholder with the package name: Model:
library("PACKAGE NAME HERE")
Loading a package isn’t quite as satisfying as installing a package. Your console will simply blink and may display warning messages. If a package requires other packages to work properly, R will load those for you too.
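If you find yourself running both steps often, they can be combined. Here is a minimal sketch of that idea, not an official R feature: the helper name load_or_install is my own invention, and it assumes you are happy to install from your default repository whenever a package is missing. It uses tools, a package that ships with R, so nothing is downloaded when you try it.

```r
# Hypothetical helper (the name is made up for this article):
# install a package only if it is missing, then load it.
load_or_install <- function(pkg) {
  # requireNamespace() returns FALSE when the package is not installed
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE tells library() that pkg holds the name as text
  library(pkg, character.only = TRUE)
  # Report whether the package ended up loaded
  invisible(pkg %in% loadedNamespaces())
}

# "tools" comes with R, so this call loads it without installing anything
loaded <- load_or_install("tools")
```

Swapping in a name such as "ggplot2" would install the package the first time and simply load it on every run after that.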
Packages to Prioritize

As I said earlier, there are far more packages to choose from than you could possibly learn. Your goal should be to learn a handful of packages that you will use frequently, and then teach yourself new packages as you need to learn new skills or deploy new tools. Here are some of the packages I think you should learn first:

Base R – Okay, so this one really isn’t a package, but it is a natural starting point. Base R is the set of functions that comes built into R itself. It has many powerful tools to get you started with your data analysis. It also includes some helpful sample data sets for when you want to learn something new without messing around with your own data. Check out this article about Base R functions for education data.
readxl – readxl allows you to load data into R from Excel. This is the first package that educators should master. Why? Simple. Most of our data comes in Excel formats and you can’t do anything with R if you can’t get any data in there.

dplyr – Your data doesn’t do you any good if the columns don’t make sense. The dplyr package helps you quickly clean up a data set without having to go back to your Excel document every time you find a tiny mistake. It can also help you create filters which make it very easy to compare the performance of student groups.
ggplot2 – This package will help you create AWESOME data visualizations for when you are called to share your impressive classroom data with the school board. There are some helpful visualization tools in Base R too, but ggplot2 really is the gold standard.

googleVis – More and more schools are transitioning from Microsoft tools to Google tools. While I still like ggplot2 better for creating data visualizations, googleVis is a useful tool for creating quick, interactive visualizations with Google Charts from data you have loaded into R.

Keep an eye on this space for future articles that explore how I actually use these packages – and many more – for educational data analysis tasks. In the meantime, hit the search engine and start learning!

What other packages do you think are valuable to educators? Drop me a message below in the comments!
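If you want something to try before the next article, here is a minimal sketch of the filter-and-compare idea from the dplyr description above. I have written it in Base R so that it runs without installing anything; it is not how dplyr itself does the job, although dplyr's filter() and summarise() express the same idea. The scores, group names, and column names are all made up for illustration.

```r
# Made-up scores for illustration only; these are not real student data.
scores <- data.frame(
  student = c("A", "B", "C", "D", "E", "F"),
  group   = c("Tutoring", "Tutoring", "Tutoring",
              "No tutoring", "No tutoring", "No tutoring"),
  score   = c(82, 90, 77, 70, 85, 64)
)

# Filter: keep only the rows for one student group
tutored <- subset(scores, group == "Tutoring")

# Compare: calculate the mean score for each group
group_means <- aggregate(score ~ group, data = scores, FUN = mean)
print(group_means)
```

Once you are comfortable with this version, rewriting it with dplyr is a good first exercise with that package.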