High School: Statistics and Probability
Interpreting Categorical and Quantitative Data HSS-ID.A.1
1. Represent data with plots on the real number line (dot plots, histograms, and box plots).
Statistics is all about data. Collecting, then analyzing, then making guesses based on previously collected data, then comparing to see if the predictions were accurate. Then collecting more data and analyzing it some more. Evidently, a statistician's work is never done.
Fortunately (or unfortunately) for your students, statisticians have come up with many different ways to represent this data. That way, they don't have to look at and try to make sense of an endless and ever-growing table of numbers.
Students should be comfortable with representing data on the real number line in the forms of dot plots, histograms, and box plots. Quite obviously, that means they should know the difference between them.
A dot plot is a diagram that represents a data set using dots over the number line. A histogram is a diagram that shows a data set as a series of rectangles, each one showing how often data occur within a given interval. A box plot, also called a box and whisker plot, is a diagram that shows a data set as a distribution along the number line, divided into four parts with equal numbers of data points using the median (the middle data value) and the upper and lower quartiles (the medians of the upper and lower halves of the data, respectively).
But why yammer on about these different plots when we can show you exactly what we mean? The following table shows how fast Michael Phelps, one of the world's greatest Olympic swimmers, can swim the 200-meter freestyle event (rounded to the nearest second).
103 | 105 | 103 | 103 | 103 | 105 |
106 | 108 | 106 | 106 | 108 | 107 |
Students should know that to create a dot plot of Michael Phelps's 200-meter freestyle times, they should focus on the portion of the number line that covers the data points. Looking at the data given above, the times run from 103 to 108 seconds, so a number line from 100 to 110 covers them with a little room to spare.
Now, all we need to do is place a dot on the appropriate number for each data point with that number. For instance, since only one of his times was 107 seconds, we place only one dot on the number line at 107. Since 108 seconds occurs twice in our data table, we place two points, one on top of the other, on the number line at 108. Eventually, our dot plot should look something like this.
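For students who like to check their work, the stacking of dots described above can be sketched in a few lines of Python. This is only an illustration: the function name `dot_plot_rows` and the text layout are our own assumptions, and the times are simply copied from the table.

```python
from collections import Counter

# Times in seconds, copied from the table above.
times = [103, 105, 103, 103, 103, 105,
         106, 108, 106, 106, 108, 107]

def dot_plot_rows(data, low, high):
    """Return the text rows of a dot plot (hypothetical helper).

    The top row comes first and the number line comes last.
    """
    counts = Counter(data)
    tallest = max(counts.values())
    rows = []
    for level in range(tallest, 0, -1):
        # A value gets a dot on this row only if it occurs at least `level` times.
        cells = [" o  " if counts[v] >= level else "    "
                 for v in range(low, high + 1)]
        rows.append("".join(cells).rstrip())
    # The number line itself, from `low` to `high`.
    rows.append(" ".join(str(v) for v in range(low, high + 1)))
    return rows

for row in dot_plot_rows(times, 100, 110):
    print(row)
```

Each column of dots sits above its time on the number line, so the column over 103 ends up four dots tall while 107 gets a single dot.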
To create a histogram of the Michael Phelps data, students should create a chart with the time on the x-axis (horizontal axis) and count (or frequency) on the y-axis (vertical axis). Each rectangle is drawn as wide as its interval, with a height equal to the count for that time. For example, drawing the rectangle for 103 seconds yields the following:
Now we can complete the histogram for the rest of the data.
One important feature of a histogram is that the rectangles don't care much for personal space. They're touching because they represent intervals rather than specific numbers. After all, time is continuous, right? For this reason, histograms are particularly useful for large ranges of data.
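The idea that each rectangle covers an interval can be made concrete with a short Python sketch. We are assuming here that, because the times were rounded to the nearest second, a recorded time of 103 stands for the interval from 102.5 up to 103.5; the function name `histogram_bins` is hypothetical.

```python
from collections import Counter

# Times in seconds, copied from the table above.
times = [103, 105, 103, 103, 103, 105,
         106, 108, 106, 106, 108, 107]

def histogram_bins(data):
    """Return (left edge, right edge, count) for one-second-wide bins.

    Assumption: each whole-second value v represents the interval
    from v - 0.5 up to v + 0.5, since the times were rounded.
    """
    counts = Counter(data)
    bins = []
    for value in range(min(data), max(data) + 1):
        # Values that never occur (such as 104) still get a bin with count 0.
        bins.append((value - 0.5, value + 0.5, counts[value]))
    return bins

for left, right, count in histogram_bins(times):
    print(f"{left:5.1f} to {right:5.1f} | {'#' * count}")
```

Because each bin ends exactly where the next begins, the rectangles touch. The empty bin for 104 seconds shows up as a gap in the bars, not as a break in the axis.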
One last way students can visually represent Michael Phelps's 200-meter freestyle times is using a box plot. This type of plot divides the data into four equal parts using quartiles (values that divide the data set into groups with equal numbers of data points). In the case of the given data, 12 data points are provided, so each of the four parts will contain 3 data points. To find the quartiles, it's best to first sort the given data from smallest to largest. In the case of the data we have been working with, this yields:
103 | 103 | 103 | 103 | 105 | 105 | 106 | 106 | 106 | 107 | 108 | 108 |
Now, students should find the values for each of the three quartiles. When there is an even number of data points, the value of the median is calculated as the average of the 2 middlemost numbers. For the above data, those are the 6th and 7th values (105 and 106), which yields 105.5.
To determine the lower quartile, we need to find the value that has 9 values above and 3 values below. In this case, the value will be 103. Similarly, the upper quartile, which has 3 values above and 9 values below, is 106.5. Again, these values are determined by taking the average of the 3rd and 4th values (for the lower quartile) and the 9th and 10th values (for the upper quartile).
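The quartile arithmetic above can be double-checked with a short Python sketch. It uses the "median of each half" approach described earlier; the function name `quartiles` is our own, and leaving the middle value out of both halves when the count is odd is one common convention that we are assuming here. Other textbooks and software use slightly different rules.

```python
from statistics import median

# Times in seconds, copied from the table above.
times = [103, 105, 103, 103, 103, 105,
         106, 108, 106, 106, 108, 107]

def quartiles(data):
    """Return (lower quartile, median, upper quartile).

    Hypothetical helper using the median-of-halves method.
    Assumes at least two data points.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    lower_half = ordered[:half]    # the smallest half of the data
    upper_half = ordered[-half:]   # the largest half of the data
    return median(lower_half), median(ordered), median(upper_half)

q1, q2, q3 = quartiles(times)
print("lower quartile:", q1)
print("median:", q2)
print("upper quartile:", q3)
```

For the 12 times in the table, this gives a lower quartile of 103, a median of 105.5, and an upper quartile of 106.5, matching the values worked out by hand.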
To begin to draw the box and whisker diagram, draw the number line that covers the range of the values and draw a vertical line at the location of each of the three quartiles (the lower quartile, the median, and the upper quartile) as shown:
If we connect these lines, we have our box.
The whisker part of the "box and whisker plot" comes in just after puberty. We're only joking. We can add in two more pieces of information: the minimum and maximum values. Draw one data point at the minimum value and another at the maximum value, and create a whisker from the nearer side of the box out to each of these data points.
Now we have a box (and whisker) plot. Where else can inanimate objects have whiskers, except in statistics?
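Everything a box plot needs boils down to five numbers: the minimum, the three quartiles, and the maximum. Here is a minimal Python sketch that gathers them for the Phelps data, using the same median-of-halves assumption as the hand calculation. The name `five_number_summary` is ours, not a standard library function.

```python
from statistics import median

# Times in seconds, copied from the table above.
times = [103, 105, 103, 103, 103, 105,
         106, 108, 106, 106, 108, 107]

def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Hypothetical helper; assumes at least two data points.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    return (ordered[0],
            median(ordered[:half]),
            median(ordered),
            median(ordered[-half:]),
            ordered[-1])

low, q1, med, q3, high = five_number_summary(times)
print("box runs from", q1, "to", q3, "with the median line at", med)
print("lower whisker length:", q1 - low)
print("upper whisker length:", high - q3)
```

One thing students may notice with this particular data set: the minimum and the lower quartile are both 103, so the lower whisker has no length at all, while the upper whisker stretches 1.5 seconds out to 108.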