# Data Disaggregation Tool (DDT)

Updated: Nov 29, 2020

The Data Disaggregation Tool (DDT) is designed to let teachers analyze their student data quickly and easily. With just a few clicks, the DDT will automatically read the data from a spreadsheet, disaggregate it by your selected student variable, calculate six descriptive statistics, and create a boxplot so you can compare performance between groups of students. When used regularly, this tool will help save time and promote the use of data to improve instruction. This article provides detailed technical directions for using the tool.

**About the Tool**

The DDT is a __Shiny Web Application__ developed by Matthew Courtney in 2020. It uses the __R statistical programming language__ to read data from a spreadsheet and create a summary of any column. The DDT is hosted on the __shinyapps.io__ server.

**Preparing your Data**

Prepare your data carefully before uploading it to the DDT. The DDT will return accurate calculations and visualizations for whatever data you upload, but it cannot account for mistakes in your original data worksheet. If you are pulling your data from a standardized gradebook or testing system, it is likely ready to go with very little preparation.

When preparing your data, you should ensure that you follow the principles of tidy data. This means that each column contains a variable (like a test score) and each row contains an observation (like a student). You should also ensure that the columns you wish to examine contain numerical values. For example, a test score of eighty-nine percent should be recorded in the column as 89 or 0.89 and not 89%.
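The numeric-value rule above can be checked before you upload. The following is a minimal Python sketch, not part of the DDT itself; the helper name `to_number` and the sample cells are assumptions made for illustration. It strips a trailing percent sign and returns `None` for cells that cannot be read as numbers, so you can find and fix them by hand.

```python
def to_number(raw):
    """Convert a gradebook cell such as '89', '0.89', or '89%' to a float.

    Returns None for blank or non-numeric cells so they can be reviewed by hand.
    """
    # Drop surrounding spaces and a trailing percent sign, if present.
    text = str(raw).strip().rstrip("%").strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


# Hypothetical cells as they might appear in an exported gradebook column.
cells = ["89", "89%", " 92.5 ", "absent", ""]
cleaned = [to_number(c) for c in cells]
print(cleaned)  # [89.0, 89.0, 92.5, None, None]
```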

You should also ensure that every score in a column is formatted the same way. In the previous test score example, each score should be recorded either as a whole number, like 89, or as a decimal, like 0.89, but never a mix of the two. The DDT cannot tell the difference between these formats, so it would treat 89 and 0.89 as two very different scores and return incorrect results.
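One way to catch a column that mixes the two formats is a simple range check. This sketch is a heuristic under stated assumptions, not something the DDT does: it assumes scores are either proportions between 0 and 1 or percentages above 1, and the function name `looks_mixed` is hypothetical. A genuine percentage of 1 or lower would be misread as a proportion, so treat a flag as a prompt to look at the column, not as proof of an error.

```python
def looks_mixed(scores):
    """Flag a column that appears to mix proportions (0-1) with percentages (above 1)."""
    # Zero is skipped because it reads the same on both scales.
    has_proportions = any(0 < s <= 1 for s in scores)
    has_percentages = any(s > 1 for s in scores)
    return has_proportions and has_percentages


print(looks_mixed([89, 0.92, 75]))     # True: 0.92 sits among whole-number scores
print(looks_mixed([89, 92, 75]))       # False: all whole-number percentages
print(looks_mixed([0.89, 0.92, 0.75])) # False: all decimals
```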

Finally, you should make sure that each column has a header that is easily recognizable. You will need this to be able to accurately select the correct column in the DDT.

The DDT will examine whichever column you tell it to, so you do not have to remove columns with text or other variables to use the DDT. Just ensure that the column you want to examine is properly formatted. Having said that, you should NEVER upload personally identifiable information for yourself or your students to the internet. While the information uploaded into the DDT is not permanently stored, personally identifiable data is always vulnerable to cyber-attack, so remove it from your file before you upload anything to the DDT.
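If your export contains identifying columns, you can drop them before the file ever leaves your computer. The sketch below is illustrative only: the column names `student_name` and `student_id`, the sample rows, and the helper `strip_identifiers` are all assumptions, so substitute the headers used in your own file.

```python
import csv
import io

# Hypothetical column names; replace with the identifying headers in your own file.
IDENTIFYING_COLUMNS = {"student_name", "student_id"}


def strip_identifiers(rows, identifying=IDENTIFYING_COLUMNS):
    """Return copies of the rows without the identifying columns."""
    return [
        {key: value for key, value in row.items() if key not in identifying}
        for row in rows
    ]


# Made-up sample data standing in for an exported gradebook.
raw = io.StringIO(
    "student_name,student_id,group,score\n"
    "Example Student A,1001,Group 1,89\n"
    "Example Student B,1002,Group 2,76\n"
)
rows = list(csv.DictReader(raw))
safe_rows = strip_identifiers(rows)
print(safe_rows)
```

Only the `group` and `score` columns remain in `safe_rows`, which is all the DDT needs to disaggregate and summarize the scores.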

When your data is clean and ready, save the file as a .CSV file. CSV stands for comma-separated values, a common plain-text format for transferring large amounts of data quickly and efficiently. The DDT will only read a .CSV file.
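Most spreadsheet programs can save a .CSV file directly from their "Save As" or export menu. If you are preparing the data with a script instead, a minimal sketch using Python's standard `csv` module looks like this. The rows and the helper `write_csv` are made up for illustration, and the example writes to an in-memory buffer so that it runs without creating a file.

```python
import csv
import io

# Made-up rows: one header per column, one row per student.
rows = [
    {"group": "Group 1", "score": 89},
    {"group": "Group 2", "score": 76},
]


def write_csv(rows, handle):
    """Write a list of dictionaries as CSV, using the first row's keys as headers."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, pass a file handle instead of the buffer, for example `open("scores.csv", "w", newline="")`, where `scores.csv` is a placeholder name.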

**Using the DDT**

Using the DDT to analyze your data is simple. First, upload your .CSV file by selecting the “Browse” button in the grey box. This will open a window that will allow you to find the file. Select the file and click “Open”.

The DDT will automatically upload your spreadsheet. This process is normally pretty quick, but the time it takes to upload will vary greatly depending on the size of your file. An upload progress bar will light up under the browse box.

When the DDT has completed its upload, you can generate the statistics by selecting the column with the student demographic information from the dropdown menu. Then, select the column with the outcome variable (such as a test score) that you want to examine. You can change which columns are being reviewed at any time by selecting new variables from the dropdown menu. The DDT will automatically update the statistics and graph for whichever variables you select.
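Conceptually, choosing a demographic column and an outcome column splits the scores into one list per group. The DDT does this in R; the Python sketch below only illustrates the idea, and the column names `group` and `score`, the sample data, and the function `disaggregate` are assumptions.

```python
import csv
import io
from collections import defaultdict


def disaggregate(rows, group_column, outcome_column):
    """Collect the outcome values into one list per value of the grouping column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_column]].append(float(row[outcome_column]))
    return dict(groups)


# Made-up sample data standing in for an uploaded .CSV file.
sample = io.StringIO(
    "group,score\n"
    "Group 1,89\n"
    "Group 2,76\n"
    "Group 1,93\n"
    "Group 2,81\n"
)
by_group = disaggregate(csv.DictReader(sample), "group", "score")
print(by_group)  # {'Group 1': [89.0, 93.0], 'Group 2': [76.0, 81.0]}
```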

All of the information presented by the DDT is static, meaning that you cannot change or customize it. You can, however, copy and paste the information into a document or slideshow presentation to share the results with your colleagues. You can also save the graph by right-clicking on it and selecting “Save Image As” from the menu.

**Interpreting the Results**

The DDT will return six summary statistics and one visualization to help you interpret the meaning of your data set. While the DDT will quickly and accurately summarize your student data, it will not tell you what that data means. It is up to you to apply local context and your own background information about your students to derive meaning from the data. The DDT will present the following outputs:

Mean – The mean is the average of your distribution. It is a measure of central tendency that allows you to summarize a distribution.

Median – The median is the middle number in a distribution. When you compare it to the mean, the median can help you see if your data is skewed.

Mode – The mode is the number that shows up most often within a distribution.

Standard Deviation – The standard deviation is a measure that tells you how spread out your data is. The smaller the standard deviation, the closer together your students scored.

Minimum – The minimum is the lowest number in a distribution.

Maximum – The maximum is the highest number in a distribution.

Boxplot – A boxplot is a visualization that helps you see how your scores fell within the distribution. The bold line in the middle is the median. The top half of the box shows the quartile of scores just above the median, while the bottom half shows the quartile of scores just below it. The whiskers extend to the highest and lowest scores that are not outliers. Any outlier scores are shown as small dots above or below the whiskers.
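To check the DDT's output by hand, the six summary statistics described above can be reproduced for a single group with Python's standard `statistics` module. This is a sketch under stated assumptions, not the tool's actual code: it uses the sample standard deviation (dividing by n − 1), it reports every tied value as a mode, and the scores are made up. The DDT's own choices on those two points may differ.

```python
import statistics


def summarize(scores):
    """Return the six descriptive statistics for one group of scores."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        # multimode returns a list so that ties are all reported.
        "mode": statistics.multimode(scores),
        # Sample standard deviation (n - 1 in the denominator); an assumption.
        "sd": statistics.stdev(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Made-up scores for one group of five students.
scores = [70, 80, 80, 90, 100]
summary = summarize(scores)
for name, value in summary.items():
    print(name, value)
```

For these scores the mean (84) sits above the median (80), a small example of how comparing the two can hint that a distribution is pulled toward its higher values.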