Microsoft Excel is the most popular spreadsheet available today, and part of the reason for its popularity is that Excel comes standard with hundreds of functions and formulas. These functions and formulas allow users to manipulate and analyze data within the spreadsheet quickly and effectively, without necessarily having to understand how the complex calculations behind them work.
A good example of this is the correlation coefficient formula available in Excel. The following tutorial will show you how to use the correlation coefficient within your spreadsheet to work out the correlation between two sets of data. The correlation coefficient is based on the following equation:
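Assuming the equation meant here is the standard Pearson product-moment formula, which is how Microsoft documents the CORREL function, it can be written as below. The symbols are labels chosen for this sketch: x and y are the paired values in the two data sets, and x̄ and ȳ are their averages.

```latex
r = \frac{\sum (x - \bar{x})(y - \bar{y})}
         {\sqrt{\sum (x - \bar{x})^{2} \, \sum (y - \bar{y})^{2}}}
```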
But as mentioned above, as a user of Excel, you don’t have to know how the calculation works or even how to apply the calculation in order to be able to work out the correlation coefficient between two sets of data.
What you do need for this tutorial is a basic understanding of how to work with Excel. Excel for beginners will teach you all of the skills you need to understand and apply the lessons contained within this tutorial.
What the correlation coefficient is and how you can use it
The correlation coefficient is a numerical measure of the relationship between two sets of data. It is a statistical measurement of the degree to which a change in one set of measurements is accompanied by a change in the other.
To understand the value of the correlation coefficient, we will use a simple example. Let us suppose you are a teacher and you would like to measure whether your summer school program is effective or not. You could compare students’ results with the number of hours they have spent in summer school to see if there is a correlation between how many hours they spend in summer school and what their results are.
To interpret the result of the correlation coefficient formula, you need to understand what the values mean. The result always falls between -1 and 1. A result of -1 indicates a perfect negative correlation: in our example, grades would fall in exact proportion to the hours spent in summer school. If there were no linear relationship between summer school hours and grades, the correlation coefficient would be zero. And if grades rose in exact proportion to the hours attended, the formula would return 1. The closer the value is to 1 or -1, the stronger the correlation between the two sets of values, and the sign tells you the direction. For an introduction to statistics, the introductory statistics course will teach you all about correlation and regression.
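To make the three landmark values concrete, here is a minimal Python sketch that computes the coefficient from the standard Pearson formula. The function name `pearson` and all of the numbers are made up for illustration; they are not the data used in this tutorial, and this is not a description of how Excel implements the calculation internally.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("both sequences must have the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of the products of each pair's deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the summed squared deviations.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical values, invented purely to show the extremes.
hours = [1, 2, 3, 4, 5]
rising = [50, 60, 70, 80, 90]     # grades rise by the same amount per hour
falling = [90, 80, 70, 60, 50]    # grades fall by the same amount per hour
unrelated = [70, 60, 80, 60, 70]  # no linear trend against hours

r_rising = pearson(hours, rising)        # perfect positive correlation
r_falling = pearson(hours, falling)      # perfect negative correlation
r_unrelated = pearson(hours, unrelated)  # no linear correlation

print(r_rising, r_falling, r_unrelated)
```

Real data rarely lands exactly on -1, 0, or 1; values in between indicate weaker or stronger tendencies in one direction or the other.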
For this tutorial we will use the following set of fictitious data:
To see if there is a correlation between our students’ grades and summer school attendance, we need to access the correlation coefficient function in Excel.
There are two ways to access functions in Excel. You can access them via the Formulas tab or by typing the formula syntax directly into the cell. For this example we will use the Formulas tab:
The correlation coefficient is part of the statistical formulas available in Excel. The advantage of using the Formulas tab to initiate the function is that the Function Arguments window opens up when you select a function, and the window helps you to enter the required parameters for the function.
The arguments required to work out the correlation coefficient are the two ranges of data that need to be compared. We will specify the grades range as our first parameter and the hours spent at summer school as our second parameter.
Applying the correlation coefficient formula to our set of data gives a value of 0.91. This is very close to one, so there is a strong positive correlation between time spent in summer school and a student’s grades. Correlation on its own does not prove that summer school causes the better grades, but the result is consistent with the program being effective.
Let us add another set of data to illustrate a negative correlation coefficient. Let us add a set of data that shows how many hours the students spent at the beach.
Instead of using the Formulas tab and the Function Arguments window, we will type in the correlation coefficient syntax to create the formula in our spreadsheet. To type a formula in Excel, select the cell where you want the result to appear. Type the equal sign to tell Excel you are going to use a formula to calculate the value of the cell. To start the formula that calculates the correlation coefficient, type CORREL. Excel will automatically display the syntax of the function, CORREL(array1, array2), to show you what parameters are required for the calculation.
This is what our statement will look like:
The above function calculates the correlation coefficient between students’ grades and time spent on the beach. The result of our correlation coefficient calculation is -0.95:
So we can see from the data that the more time the students spent on the beach, the worse their grades tended to be.
Why statistical analysis of data is essential
Statistical analysis of data is a vital tool in almost any industry. Business owners, for example, can use the correlation coefficient to assess the success or failure of a particular advert or business strategy. Financial analysts use the correlation coefficient to track trends within the financial markets. Researchers use it to assess the data they compile in the course of their research. The Advanced Excel for Professionals course, for example, will teach you how to use Excel for MIS reporting and data analysis.
Thanks to the simplicity of Excel, calculating the correlation coefficient is easy. Taking a beginner’s course on Excel can help you to unleash the power that Excel offers. A course like Mastering Excel will teach you all the basics you need to get started with Excel, as well as the more advanced functions contained within the application.