
Technical Directions

Correlation Matrix Generator

A correlation matrix is a quick way to see the strength of the relationship between each pair of variables in a data set, such as the relationship between a demographic group and a student outcome. This correlation matrix generator is an easy and efficient way for teachers to gain better insight into their data. With just a few clicks, the generator will prepare a color-coded visualization that you can use to jump-start your data conversations. This article provides detailed technical directions for using the tool.


About the Tool

The generator is a Shiny web application developed by Matthew Courtney in 2020. It uses the R statistical programming language to read data from a spreadsheet and build a correlation matrix from its columns. The generator is hosted on the server.


Preparing your Data

Prepare your data carefully before uploading it to the generator. The generator will return accurate calculations and visualizations for whatever data you upload, but it cannot account for mistakes in your original data worksheet. If you are pulling your data from a standardized gradebook or testing system, it is likely ready to go with very little preparation.


When preparing your data, you should ensure that you follow the principles of tidy data. This means that each column contains a variable (like a test score) and each row contains an observation (like a student). You should also ensure that the columns you wish to examine contain numerical values. For example, a test score of eighty-nine percent should be recorded in the column as 89 or 0.89 and not 89%.
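
If your scores were exported as text, the cleanup described above can be scripted. The following is a minimal Python sketch, not part of the generator itself; the function name to_number and the sample cells are made up for illustration. It strips a trailing percent sign and returns None for anything that is not a number, so those cells can be reviewed by hand.

```python
def to_number(raw):
    """Convert a cell such as '89', '0.89', or '89%' to a float.

    Returns None for blank or non-numeric cells so they can be
    checked by hand instead of being silently guessed at.
    """
    # Remove surrounding spaces and a trailing percent sign, if any.
    text = raw.strip().rstrip("%").strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


# Hypothetical cells from one spreadsheet column.
cells = ["89", "89%", " 0.89 ", "absent", ""]
cleaned = [to_number(cell) for cell in cells]
print(cleaned)
```

Note that "89%" becomes 89.0 here, not 0.89. The sketch only removes the symbol; choosing one scale for the whole column is still up to you.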


You should also ensure that every score in a column is formatted the same way. If we take the previous test score example, you want to make sure that every test score is recorded either as a whole number, like 89, or as a decimal, like 0.89. The generator cannot tell the difference between these two formats, so mixing them in one column will produce incorrect results.
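
A quick way to spot a column that mixes the two formats is to look for values on both sides of 1. The Python sketch below is a rough heuristic under that assumption, and the function names are hypothetical. Values strictly between 0 and 1 are treated as decimals, values above 1 as whole-number percentages, and exact scores of 0 or 1 are ignored because they could belong to either format.

```python
def split_by_scale(scores):
    """Separate scores that look like decimals from whole-number percentages."""
    decimals = [s for s in scores if 0 < s < 1]
    whole_numbers = [s for s in scores if s > 1]
    return decimals, whole_numbers


def has_mixed_scales(scores):
    """Return True when a column appears to contain both formats."""
    decimals, whole_numbers = split_by_scale(scores)
    return bool(decimals) and bool(whole_numbers)


# Hypothetical column in which one score was typed as 0.75 instead of 75.
scores = [89, 92, 0.75, 64]
if has_mixed_scales(scores):
    decimals, _ = split_by_scale(scores)
    print("Check these values before uploading:", decimals)
```

This check only makes sense for percentage-style scores. A column such as a grade point average or a count of absences can legitimately hold values on both sides of 1.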


Finally, you should make sure that each column has a header that is easily recognizable.


Remember, you should NEVER upload personally identifiable information for yourself or your students to the internet. While the information uploaded into the generator is not permanently stored, personally identifiable data is always vulnerable to cyber-attack. Remove names, ID numbers, and similar identifying columns from your spreadsheet before you upload it.


When your data is clean and ready, save the file as a .CSV file. CSV stands for comma-separated values, a common plain-text format for moving tabular data between programs. The generator will only read a .CSV file.
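
If you prepare your data with a script instead of a spreadsheet program, the Python sketch below shows one way to write a .CSV file that leaves out identifying columns. It uses Python's built-in csv module. The function name, column names, and sample rows are hypothetical, and the example writes to an in-memory buffer so it runs without touching any files.

```python
import csv
import io


def write_deidentified_csv(rows, identifying_columns, out):
    """Write rows (a list of dicts) as CSV to `out`, skipping identifying columns.

    Returns the list of column names that were kept.
    """
    kept = [name for name in rows[0] if name not in identifying_columns]
    # extrasaction="ignore" makes the writer drop the columns not listed in `kept`.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return kept


# Hypothetical records: one identifying column and two numerical columns.
rows = [
    {"student_name": "Student A", "attendance_rate": 0.95, "test_score": 89},
    {"student_name": "Student B", "attendance_rate": 0.80, "test_score": 72},
]

buffer = io.StringIO()
write_deidentified_csv(rows, {"student_name"}, buffer)
print(buffer.getvalue())
```

To write a real file, replace the buffer with a file opened using open("scores.csv", "w", newline=""). Keep in mind that dropping a name column does not guarantee anonymity, since combinations of other columns can still identify a student in a small class.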


Using the Correlation Matrix Generator

Using the generator to analyze your data is simple. First, upload your .CSV file by selecting the “Browse” button in the grey box. This will open a window that will allow you to find the file. Select the file and click “Open”.


The generator will automatically upload your spreadsheet. This process is normally quick, but the time it takes will vary with the size of your file. An upload progress bar will fill in under the browse box.


When the generator has completed its upload, it will automatically display a correlation matrix in the white space.


The graphic created by the generator is static, meaning that you cannot change or customize it. Any changes you wish to make should be made in your original data spreadsheet. You can, however, copy and paste the graphic into a document or slideshow presentation to share the result with your colleagues. You can also save the graphic by right-clicking on it and selecting “Save Image As” from the menu.


Interpreting the Results

The generator will create a single correlation matrix to compare the variables in your spreadsheet. For each pairing, the generator will present a correlation score on a scale of -1 to +1. The closer the score is to either end of that scale, the stronger the relationship between those two variables, while a score near 0 indicates little or no linear relationship. A positive score means the two variables tend to rise together, and a negative score means one tends to fall as the other rises. The matrix will also be color coded, with darker shades of blue aligned to negative correlations and darker shades of red aligned to positive correlations.
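
To make the scores concrete, the Python sketch below computes a small correlation matrix by hand. It assumes the Pearson correlation coefficient, which is the most common choice but is not confirmed in this article as the generator's method. The function names and the sample data are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers.

    Raises ZeroDivisionError if either list has no variation,
    because the coefficient is undefined in that case.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together around their means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


def correlation_matrix(columns):
    """Build a nested dict holding the score for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical data for four students.
data = {
    "attendance": [90, 80, 70, 60],
    "test_score": [88, 79, 72, 61],
    "absences": [2, 4, 6, 8],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(score, 2) for other, score in row.items()})
```

In this made-up data, attendance and absences move in exactly opposite directions, so their score is -1, while attendance and test_score rise together and score close to +1. Every variable paired with itself scores +1, which is why the diagonal of a correlation matrix always shows the strongest positive value.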
