Pearson Product-Moment Correlation: A Relationship Measurement Tool
This correlation is the most popular of all correlation measurement tools. It’s known as the Pearson Product-Moment Correlation coefficient, the Pearson correlation coefficient, or most notably, “the correlation coefficient”. It’s often used to decipher trends in economics and business sectors, however once you learn it, you can apply it to any quantifiable data you may have. If you suspect a relationship between two variables, this is the tool you want to use to better understand it. If everything statistics related freaks you out, I recommend checking out this Introduction to Statistics course before getting started.
What is the Pearson product-Moment Correlation?
In statistics, it’s a measuring tool to determine whether there is a linear relationship between two variables – or not. It quantifies the strength and the direction of the relationship which can be identified by the correlation coefficient. A correlation exists when two variables are measured and when there is a change in one, there is a change in another, whether it’s in the same or opposite direction. There are other correlation measurement tools like Spearman’s rank correlation, or Kendall’s rank correlation, but those measure different types of associations and aren’t alternatives to using the Pearson Correlation Coefficient model.
If you were to use the Pearson correlation measurement as an equation it can get pretty complicated. In definition the Pearson Product-Moment Correlation is the covariance of two variables divided by the product of their standard deviations. The equation looks like this:
Instead of doing a bunch of math, we’ll use Excel to measure the coefficient below.
What’s a correlation coefficient?
The correlation coefficient is just a number that represents the strength and direction of the relationship between two variables, typically your independent and dependent variables. This number must fall between -1.0 and +1.0 and can be calculated using data analysis software, we’ll go over that a little later. The correlation coefficient for the Pearson Product-Moment Correlation is typically represented by the letter R. So you might end up with something like r = .19, or r = -.78 after entering your data into a program like Excel to calculate the correlation. The correlation coefficient determines whether the linear relationship between two variables is positive or negative and weak or strong, or non-existent.
A positive correlation will result in an r value of 0 to +1.0. A negative correlation will have an r value of 0 to -1.0. If you’re correlation coefficient is 0, this means there is no relationship between your variables. You can have a weak positive, weak negative, strong positive or strong negative correlation in addition to a perfect positive (+1.0) or a perfect negative (-1.0). See the chart below for how to classify your correlation coefficient.
Value of r
Strength of relationship
-1.0 to -0.5 or 1.0 to 0.5
-0.5 to -0.1 or 0.1 to 0.5
-0.1 to 0.1
None or really weak
Calculating the Correlation Coefficient
So now you know what the Pearson’s Product-Moment Correlation is and how to read your results. Let’s get your data analyzed so we can see how strong of a relationship exists between your variables. If you’re new to Excel, get the basics under your belt in Excel 2013.
Open Excel and create a new spreadsheet.
Add your data to the Excel spreadsheet by putting your independent variable data in column A, and your dependent variable data in column B. You can title these columns anything you want. If you’re using titles, you will enter your data in A2, B2, C2, etc.
Choose a cell where you would like your correlation coefficient to go, I chose C24, because it seemed more organized than choosing any other cell.
Now we’ll enter the correlation function which can be done two different ways. First, I’ll tell you the harder way only because it’s good to know how to search for a function you want when you don’t know the formula. The second way is a simple formula.
To search for a function you want first click on the cell (C24) and then go to the top of your screen and click on Formulas. You should see a ribbon of options where you can select Insert Function.
In the “search for a function” box you want to type CORREL. You can also find this function by going to the Statistical category and then click CORREL. Click OK.
You will be prompted to enter information for array 1 and array 2. Array 1 is going to be the range of cells for your first column. The array 1 input for this would be A2:A23. In the Array 2 box you will type in the range of cells for your other set of data, B2:B23. Click OK.
When you click OK, you should see the correlation coefficient appear in the cell you selected. This is the R value.
An Easier Way
Instead of going through the Insert Function process, you can just type the formula into the cell you want. The formula is:
=CORREL(A2:A23,B2:B23) It’s good to note that all formulas must start with an = sign.
So, now you see that the correlation coefficient for the data on this spreadsheet is R= -0.4. Since this is a negative coefficient and it’s closer to 0 than -1.0, we can safely say that the relationship between these two variables is a weak negative correlation. Formulas like the CORREL one are a dime a dozen. If you’re curious about more possible functions in Excel spend some time understanding statistical Excel functions.
Graphing the correlation coefficient
Now that you know how to get your correlation coefficient, or R value, you may be want to see the data showcased in a more visually appealing format, like a graph. Graphs are one of Excel’s best features. Let’s go over how to do this.
Assuming your data is already in your spreadsheet, highlight the cells that you wish to graph. You can do this by clicking and dragging, or individually selecting cells by holding down CTRL and clicking. If you want your column titles in the graph, make sure to select them as well.
Go to the top of your workbook and click on Insert. You will see a chart section pop up below. Click on recommended charts. Now select the scatter plot chart. A scatterplot is a type of graph that displays data to show you linear relationships, including outliers.
Once you’ve selected the scatter chart, hit OK. Your new chart should pop up on the Excel spreadsheet. It may not look like you want, and that’s okay – we can change that. If you click on the chart options at the top of your screen (under Design) you can change the color, layout, etc. To further edit the chart, click on the + sign next to the chart to change additional features like, labeling the axis’s, changing or getting rid of the title, adding a legend or a regression line, and more.
And, you’re done! Good work. Now you know how to: insert a function in Excel, write an Excel formula, calculate a Pearson’s correlation coefficient and graph your correlation data. You’re basically a pro.
Statistical Modeling students also learn
Empower your team. Lead the industry.
Get a subscription to a library of online courses and digital learning tools for your organization with Udemy for Business.