Saturday, October 5, 2013

Least Squares Regression Lines and the Correlation Coefficient, r

Bivariate quantitative data are displayed in a scatterplot. A scatterplot shows us the extent to which the variables x and y are related (or not related) to one another. When the pattern looks roughly linear, we typically summarize the relationship with a least squares regression line.

The least squares regression line comes from the equation of a line from Algebra 1. Instead of y = mx + b, where m is the slope and b is the y-intercept, AP Stats likes to use y = b1x + b0, where b1 is the slope and b0 is the y-intercept. This is because when you have more than one explanatory variable, the coefficients are labeled b2, b3, b4, etc. (one for each additional x-variable), so that you can keep track of how many variables are in the regression equation.

b1 is interpreted as: "the predicted change in the (y-variable) is (slope) for each one-unit (whatever x is measured in) increase in the (x-variable)."
b0 is interpreted as: "the predicted amount of (y-variable) is (y-intercept) when the (x-variable) is zero."
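
To make the slope and intercept concrete, here is a minimal sketch in Python that computes b1 and b0 with the standard least squares formulas (b1 is the sum of the products of the x and y deviations divided by the sum of the squared x deviations, and b0 = mean of y minus b1 times mean of x). The function name least_squares_line and the hours/scores data are made up for illustration; they are not from any AP Stats material.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the least squares line y = b1*x + b0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # y-intercept
    return b0, b1

# Hypothetical data: hours studied (x) and test score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 63, 71, 74]

b0, b1 = least_squares_line(hours, scores)
print(f"predicted score = {b1} * hours + {b0}")
```

With this made-up data, the slope would be read as "the predicted change in test score is b1 points for each one-hour increase in hours studied," and the intercept as "the predicted test score is b0 points when hours studied is zero."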

The correlation coefficient r tells us the extent to which x and y are linearly related to one another. r is always between -1 and 1, and its sign matches the sign of the slope. If the absolute value of r is close to 1, there is a strong linear relationship; conversely, if the absolute value of r is close to 0, there is a weak linear relationship.
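
As a rough illustration, the sketch below computes r from its standard definition (the sum of the products of the x and y deviations, divided by the square root of the product of the summed squared deviations). The function name correlation and both data sets are assumptions made up for this example. The second data set shows why the word "linearly" matters: y is completely determined by x, yet r comes out to zero because the pattern is a curve, not a line.

```python
import math

def correlation(xs, ys):
    """Return the correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data with a strong, positive, roughly linear pattern
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 63, 71, 74]
r_linear = correlation(hours, scores)

# Hypothetical data where y = x squared: related, but not linearly
xs = [-2, -1, 0, 1, 2]
ys = [4, 1, 0, 1, 4]
r_curved = correlation(xs, ys)

print(round(r_linear, 3), r_curved)
```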
