High School: Statistics and Probability
Interpreting Categorical and Quantitative Data S-ID.6c
c. Fit a linear function for a scatter plot that suggests a linear association.
Students should be able to use the Method of Least Squares to find the linear equation that best describes the relationship between two variables. Students should understand that this means making the residuals (the differences between the observed y-values and the values the line predicts) as small as possible, because the smaller the residuals, the closer the line is to the data points. But because residuals can be positive or negative, and would cancel each other out if we simply added them, we square them and minimize the sum of the squares.
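To see why the residuals are squared before they are added, here is a minimal Python sketch. The data points and the two candidate lines are made up for illustration (they are not the height and weight data used in this section), and the helper names are hypothetical.

```python
# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]

def residuals(m, b, xs, ys):
    """Observed y minus predicted y for the line y = m*x + b."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]

def sum_of_squares(m, b, xs, ys):
    """Sum of squared residuals: the quantity least squares minimizes."""
    return sum(r ** 2 for r in residuals(m, b, xs, ys))

# A flat line at the mean of y: it ignores the upward trend entirely.
flat = (0.0, 4.2)
# A line that follows the upward trend in the points.
trend = (0.8, 1.8)

for name, (m, b) in [("flat", flat), ("trend", trend)]:
    plain = sum(residuals(m, b, xs, ys))
    squared = sum_of_squares(m, b, xs, ys)
    print(f"{name}: sum of residuals = {plain:.2f}, sum of squares = {squared:.2f}")
```

For both lines the plain residuals add up to (essentially) zero, so the plain sum cannot tell a good fit from a poor one. The sum of squares can: it is much smaller for the line that follows the trend.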
Since we want a linear relationship, the goal is to find the coefficients m and b in the linear equation y = mx + b. Students should already know that b is the y-intercept and m is the slope of the line. What they might not know is that m and b can be calculated directly from the data using the following equations.
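For reference, the standard least-squares formulas for the slope and intercept are written out below. They use the same column sums (x, y, xy, and x²) as the table described in this section. The symbol n, the number of data points, is notation assumed here rather than taken from the text.

```latex
m = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2},
\qquad
b = \frac{\sum y - m\sum x}{n}
```

Once m is known, the formula for b amounts to saying that the best-fit line passes through the point made up of the mean of x and the mean of y.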
Let's return to the data one last time and see if we can derive the linear equation presented earlier. To do so, we will create a table that contains the following columns: xy and x². We will also add a row that has the sum of each column.
So, to determine the slope, m:
m = 5.29
And to determine the constant, b:
b = –219
The best fit linear relationship between height (x) and weight (y) is y = 5.29x – 219.
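The table-and-sums procedure can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `least_squares_line` is hypothetical, the formulas are the standard closed-form least-squares ones, and the heights and weights below are invented, so the result will not match y = 5.29x – 219.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the best-fit line y = m*x + b using column sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # total of the "xy" column
    sum_x2 = sum(x * x for x in xs)              # total of the "x squared" column

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal, so the slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denom     # slope
    b = (sum_y - m * sum_x) / n                  # y-intercept
    return m, b

# Hypothetical data, invented for illustration (not the data from the text).
heights = [60, 62, 64, 66, 68]
weights = [100, 112, 118, 132, 138]

m, b = least_squares_line(heights, weights)
print(f"y = {m:.2f}x + ({b:.2f})")
```

Students who have built the table by hand can use a sketch like this to check their column sums and their final values of m and b.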
Students should be able to figure out whether a data set should be fit with a linear function and, if so, how to do it. Maybe they can use this as their new party trick! Who wouldn't want to spend their Friday evening calculating functions, plotting residuals, and finding best-fit linear equations?