Restriction of Range
Brief description and instructions (DRAFT):
Measuring correlations can be tricky. Several issues can make a correlation look weaker or stronger than it really is. One such issue arises when one variable or the other is sampled over too narrow a range. This restriction of range, as it is called, usually makes the relationship seem weaker than it is. In this applet you will set up a correlation, select a subset of the data, and see how the measured relationship changes.
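The effect can also be reproduced outside the applet. The following is a minimal Python sketch, not the applet's own code: it assumes normally distributed data with a population correlation of 0.8, and the names pearson_r, rho, and half_width are chosen here purely for illustration. It keeps only the points whose X value lies within a shrinking band around zero and recomputes r each time.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)  # fixed seed so the run is repeatable (arbitrary choice)
rho = 0.8               # assumed population correlation
n = 5000                # large sample so the restricted subsets stay sizeable

# Y is built from X plus independent noise, giving population correlation rho.
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for x in xs]

# Keep only points with |X| <= half_width and recompute r on what is left.
results = {}
for half_width in (3.0, 1.0, 0.5, 0.25):
    kept = [(x, y) for x, y in zip(xs, ys) if abs(x) <= half_width]
    kept_x, kept_y = zip(*kept)
    results[half_width] = pearson_r(kept_x, kept_y)
    print(f"|X| <= {half_width}: n = {len(kept)}, r = {results[half_width]:.3f}")
```

Under these assumptions the printed r should fall as the band narrows, even though the underlying relationship between X and Y never changes.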
Using the illustration:
The main part of the screen shows a graph. In this case it is a scatter plot, where each (x,y) pair is plotted without any connecting line. The first slider to the right of the graph (the r slider) lets you set any level of r you wish from -1 to 1; the applet then randomly selects a sample that has that correlation and plots it. Restriction of range is more of a problem for stronger correlations, so it is generally best to choose stronger relationships. The n slider sets the size of the sample. Since you will be selecting a subset, a larger sample helps the illustration work better, because the restricted set of data will still have a good sample size. The checkboxes in the right corner of the screen let you display the correlation and the point that marks the mean of both X and Y.
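How the applet draws a sample with a chosen r is not described here. One common way to do it, shown below as a sketch in Python, is to generate X and a noise variable, remove from the noise any part that is correlated with X, and then mix the two. The function name sample_with_exact_r and its parameters are hypothetical and are not taken from the applet.

```python
import math
import random


def center(values):
    """Subtract the mean so the values sum to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def unit(values):
    """Scale the values so their sum of squares is 1."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]


def sample_with_exact_r(n, r, rng):
    """Return (x, y) whose sample Pearson correlation equals r.

    Hypothetical helper for illustration; needs n >= 3 and -1 <= r <= 1.
    """
    if n < 3:
        raise ValueError("n must be at least 3")
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    x = unit(center([rng.gauss(0, 1) for _ in range(n)]))
    e = center([rng.gauss(0, 1) for _ in range(n)])
    # Remove the component of e that lies along x, so e is uncorrelated with x.
    proj = sum(a * b for a, b in zip(x, e))
    e = unit([b - proj * a for a, b in zip(x, e)])
    # Mixing two centered, unit-length, orthogonal vectors gives correlation r.
    y = [r * a + math.sqrt(1 - r * r) * b for a, b in zip(x, e)]
    return x, y


def pearson_r(xs, ys):
    """Pearson correlation, used here only to check the result."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


x, y = sample_with_exact_r(200, 0.6, random.Random(0))
check = pearson_r(x, y)  # matches 0.6 up to floating-point rounding
```

The returned values are on a small scale because both vectors are normalized. Correlation does not depend on scale, so they can be rescaled and shifted to any plotting range without changing r.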
To select a subrange of data to correlate, place your mouse over the graph, then click and drag across it. The graph highlights the region you have selected with a light gray overlay. Once you have selected a region, the Calculate r for selection button becomes active. Press this button to get the r for the selected region. The r for the overall data set appears as text right below the button, so you can compare the two outcomes.
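The selection step amounts to filtering the plotted points by the dragged region and recomputing r on what remains. The Python sketch below assumes the region is a rectangle given by its X and Y limits; the function r_for_selection and the eight data points are made up for illustration and do not come from the applet.

```python
import math


def pearson_r(points):
    """Pearson correlation of a list of (x, y) pairs, or None if undefined."""
    n = len(points)
    if n < 3:
        return None  # too few points for a meaningful correlation
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    if sxx == 0 or syy == 0:
        return None  # no spread in X or Y inside the selection
    return sxy / math.sqrt(sxx * syy)


def r_for_selection(points, x_min, x_max, y_min, y_max):
    """Correlation of the points that fall inside the selected rectangle."""
    selected = [
        (x, y) for x, y in points
        if x_min <= x <= x_max and y_min <= y <= y_max
    ]
    return pearson_r(selected)


# Small made-up data set with a strong overall relationship.
points = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5), (7, 8), (8, 7)]

r_overall = pearson_r(points)
# Restrict X to the middle of its range; Y limits are left wide open.
r_selected = r_for_selection(points, 3, 6, float("-inf"), float("inf"))
print("overall r:", r_overall)
print("selection r:", r_selected)
```

For this toy data the overall r is about 0.90, while the four points with X between 3 and 6 give r = 0.60, which mirrors the comparison the applet shows below the button.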
Click here to open the applet. It will open a new window that will fill your screen.