CCSS Edition Parent Resource Core-Plus Mathematics

# Course 2, Unit 4 - Regression and Correlation

Overview
In the Regression and Correlation unit, students learn to describe the association between two variables by interpreting a scatterplot, to interpret a correlation coefficient, and to understand the limitations of correlation coefficients. They learn when it is appropriate to make predictions from the regression line, and how outliers and influential points affect both the correlation coefficient and the regression line. Students compute and interpret Pearson's correlation and come to understand that a strong correlation does not imply that one variable causes the other.

Key Ideas from Course 2, Unit 4

• Least squares regression line: a line fitted to a set of data, with slope and intercept chosen to minimize the sum of squared errors (SSE), that is, the sum of the squared residuals. (See pages 280-285.)

• Residual: the difference between an observed value (a value used in calculating the regression equation) and the value predicted by the regression line, computed as observed minus predicted. On a scatterplot, residuals appear as the vertical gaps between the data points and the regression line. (See page 283.)

• Pearson's correlation r: a single number between -1 and 1 that measures the strength and direction of the linear association between two variables. Perfectly linear data will have an r value of 1 (for a positive association) or -1 (for a negative association). A set of data that is quite linear, and for which the y values increase as the x values increase, is said to have a strong positive association. If the points appear to fit a linear trend but the y values decrease as x increases, the association is negative. A set of data that is quite random, not at all linear, will have an r value close to zero. (See pages 258-268, 291-298.)

• Association and causation: Two variables may be strongly correlated without there being any reason to suspect that a change in one variable "causes" a change in the other. The example used in the text to show this idea is data on ice cream consumption and recorded crimes in 12 countries. (See pages 299-303.)
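
For parents who would like to see how the first three key ideas fit together numerically, here is a minimal sketch in Python. It is not taken from the textbook: the function names and the five data points are made up for illustration, and the formulas are the standard ones for the least squares line and Pearson's r, which may be presented differently in the student text.

```python
from math import sqrt

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line that minimizes the SSE."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(xs, ys, slope, intercept):
    """Residual = observed y minus the y predicted by the line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

def pearson_r(xs, ys):
    """Pearson's correlation r, a number between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for this example (not from the textbook).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
res = residuals(xs, ys, slope, intercept)
sse = sum(e ** 2 for e in res)  # sum of squared errors
r = pearson_r(xs, ys)

print(f"regression line: y = {slope:.2f}x + {intercept:.2f}")
print("residuals:", [round(e, 2) for e in res])
print(f"SSE = {sse:.2f}, r = {r:.3f}")
```

For these made-up points, the line has slope 0.6 and intercept 2.2, the SSE is 2.4, and r is about 0.775, a fairly strong positive association. Changing one y value to an extreme number and rerunning the sketch is a quick way to see how a single outlier can shift both the line and r.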