Correlation is a statistical procedure designed to measure the strength and direction of the linear relationship between two variables. As values on one variable increase, do values on the second variable tend to go up, go down, or do nothing? On this page, we will work on how to plot correlational data. On the next page, we will discuss how to use statistics to assess correlation.
To do this activity, we will learn a new feature of Jamovi: plug-ins you can add, called Modules. Click on Modules and then on Jamovi Library, which appears when Modules is clicked.
In the window that pops up, scroll down through the modules, select SCATR, and press the Install button right below it.
Data for Example: Chivalry Questionnaire
For this exercise, you’ll be working with some questionnaire data from 411 students. You can access the data in the following Excel file:
The questionnaire included five measures of gender role attitudes:
The plot above is called a scatterplot. Each point in that plot refers to two measures of a single observation. In this case, each point refers to the AWS and MVIRT scores of a single person. As you can see, low scores on AWS tend to be found with low scores on MVIRT, while high scores on AWS tend to be found with high scores on MVIRT.
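The idea that "low goes with low and high goes with high" can be put into a number. The sketch below computes the Pearson correlation coefficient r by hand for a handful of made-up score pairs. The values in `aws` and `mvirt` are hypothetical illustrations, not the questionnaire data, and `pearson_r` is a helper written for this example rather than anything Jamovi provides.

```python
from math import sqrt

# Hypothetical scores for six people (NOT the questionnaire data).
# Position i in each list belongs to the same person, so each
# (aws[i], mvirt[i]) pair is one point on a scatterplot.
aws = [10, 14, 18, 22, 26, 30]
mvirt = [3, 5, 4, 7, 6, 9]


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations: positive when values above the mean
    # on x tend to pair with values above the mean on y.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(aws, mvirt)
print(f"r = {r:+.2f}")
```

A positive r means the points tend to rise from left to right, a negative r means they tend to fall, and a value near zero means there is little linear trend.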
To get this scatterplot, select Exploration → Scatter Plot. In the scatter plot dialog, put AWS on the x-axis and MVIRT on the y-axis. You should get this:
In the first scatterplot (with the white background), there is a solid line with a blue area around it going through the data. The center line is the line of best fit, the line that minimizes the sum of the squared vertical distances between the data points and the line itself. The best-fit line is a useful way of representing the linear trend in a scatterplot. It helps capture the pattern indicated by r = +0.5: higher scores on AWS are found with higher scores on MVIRT. The blue area around the center line represents the upper and lower bounds of the 95% confidence interval around the line of best fit. You can have 95% confidence that the line of best fit for the population (the true best-fit line) is within the confidence interval. Note that the confidence interval does not show where 95% of the data points are. It corresponds to where the line of best fit would be drawn, not to where data are likely to appear.
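To make "line of best fit" concrete, here is a minimal sketch of an ordinary least-squares fit, which is the usual way such a line is computed. The data are hypothetical (not the questionnaire scores), and the helper names `fit_line` and `sum_squared_residuals` are invented for this illustration. The sketch also checks that nudging the slope away from the fitted value makes the total squared vertical distance larger.

```python
# Hypothetical (x, y) scores for six people, NOT the questionnaire data.
x = [10, 14, 18, 22, 26, 30]
y = [3, 5, 4, 7, 6, 9]


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line for y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(x, y, slope, intercept):
    """Total squared vertical distance from each point to the line."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


slope, intercept = fit_line(x, y)
best = sum_squared_residuals(x, y, slope, intercept)

# Any other line, such as one with a slightly steeper slope, fits worse.
steeper = sum_squared_residuals(x, y, slope + 0.1, intercept)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"squared error at best fit: {best:.3f}")
print(f"squared error with steeper slope: {steeper:.3f}")
```

The positive slope matches the upward trend in the points. The confidence band that Jamovi draws around the line is a separate calculation and is not reproduced here.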
To add a best-fit line to your plot, go to Regression Line (below the box with the variables) and select Linear and Standard Error, as seen below:
You should now have the graph at the top of the page.