Step-By-Step Guide
Stats Task 2.2: Variability and Standard Deviation
For this task, you need to respond to Roger Snow’s request to understand what standard deviation will tell him about the variability of BMI in his sample.
The steps below will help you do your work.
- Before you begin your work, find a peer to work with for this task. Although each of you will be responsible for submitting your own work, you may work with one another to better analyze the question and come up with the important components of your response. You may also partner up with more than one person or as otherwise advised by your mentor.
- Review the email and make sure you understand what question(s) you are being asked to answer in this task.
- Review the Resources available for this task to do additional research on standard deviation and on the other statistical terms you will need to be familiar with to answer the question.
Standard Deviation
Open Household Income for Rose Park Example 2. This is the same working example used in the previous task, but now includes the calculation for standard deviation. You and your peer will use this file to review the key statistical concepts related to this task. This example is presented as three worksheets: the first sheet is the raw data, the second sheet is the bar chart, and the third is provided for workspace (use described below).
- Determine what the standard deviation tells you about the variability of the data in the Household Income for Rose Park Example 2 so you can tell Roger what it means for his study. The purpose of looking at this example is to give you an opportunity to play around with real data in a sample, which will hopefully make it easier for you to explain to Roger what the standard deviation tells him about the variability of his sample. Your goal in the next few steps is to become more familiar with the concept of standard deviation and why it is an important statistic for summarizing the data in a study.
Tip: It may be helpful to save two copies of this file (e.g., one called “Modified Rose Park”) so that, as you change the data to test the different concepts, you can still compare your results against the original version.
- Print a copy of the second sheet (graph). You will use this copy when working with the Example data and will refer to it often. Alternatively, you can copy the graph and paste a picture of it into the third worksheet ("Workspace"): in Excel, right-click, select "Paste Special", and then select the "Picture" option. A picture pasted with "Paste Special" won't change as you change the data in the first worksheet, so as you change the data to see how the mean, standard deviation, and graph respond, you can compare multiple graphs within Excel without printing. You will need to note the mean and standard deviation next to each graph.
- Look at the first sheet ("Data and Statistics"):
- As in the previous task, in the upper right corner of the spreadsheet, you will see the statistical calculations for this sample. Remember these numbers are “live” and will change as you change the data points within each cell.
- Write down the values for the mean and the standard deviation. You will refer back to these two numbers frequently during this task. If you would like more help in understanding the terms standard deviation or mean, refer back to the Resources section.
- Look at the second sheet (graph):
- Remember this graph is a visual representation of the data, which in this case is household income. It shows you how many households have incomes within each income group (grouped in increments of $5,000). This shows you where the data fall (distribution) and how close or far apart the data are (variability).
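The grouping behind a chart like this can be sketched in a few lines of Python. This is only an illustration: the incomes below are made up (they are not the Rose Park data), and the exact group edges used in the spreadsheet are an assumption.

```python
from collections import Counter

# Hypothetical household incomes for illustration (NOT the Rose Park data).
incomes = [22000, 23500, 41000, 44000, 44900, 58000, 61000, 62500, 63000, 97000]

BIN_WIDTH = 5000  # group incomes in increments of $5,000


def bin_start(income, width=BIN_WIDTH):
    """Return the lower edge of the income group that contains `income`."""
    return (income // width) * width


# Count how many households fall in each income group.
counts = Counter(bin_start(x) for x in incomes)

# Print a simple text version of the bar chart, one row per non-empty group.
for start in sorted(counts):
    print(f"${start:,} to ${start + BIN_WIDTH - 1:,}: {'#' * counts[start]}")
```

Tall rows show where the data cluster (distribution), and the distance between the lowest and highest occupied groups gives a first impression of how spread out the data are (variability).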
- Determine what the standard deviation (SD) tells you about your sample. SD is one of the more complex statistical concepts, and it is often easier to understand when looking at the data visually. The purpose of this mini-task is to give you an opportunity to play with the data in the Example so that the numbers become more meaningful and intuitive. For example, you know that the SD is a little more than $26,000 and the mean is about $58,000, but what does that really tell you? Use the graph you printed and a marker to draw on the graph as you go through these steps.
- Find the mean on the graph by looking along the x-axis for $58,000 and with your marker, make a star at the point where the mean falls on the x-axis. This is a very important data point because the standard deviation is always described in relation to the mean. The SD describes how spread out or close together the data are around the mean. Therefore, the SD is simply not meaningful if stated without the mean. Spend a few moments just looking at the data in relation to the mean.
Tip: If you are using pictures of the graphs within Excel, you can mark a graph by inserting shapes and placing them on the picture. If you can't get the shapes to work, it may be easiest to print the graphs and draw on them by hand.
- Where do most of the data points fall?
- Are they fairly spread out or are most pretty close to the mean?
- Next, look at all of the households that fall within one standard deviation of the mean (meaning one standard deviation below the mean and one standard deviation above the mean). To do this:
- First, subtract the standard deviation from the mean ($58,000-$26,000= $32,000). Find this value (32,000) on the x-axis. Draw a red vertical line from the top of the graph down through this value on the x-axis.
- Then, add the standard deviation and the mean ($26,000 + $58,000 = $84,000). Now find this value (84,000) on the x-axis. Draw a vertical line from the top of the graph down through this value on the x-axis.
- Estimate what percentage of households have incomes between these two lines, that is, within one standard deviation above and below the mean. To do this:
- Estimate the number of data points between the two lines you have just drawn on the graph.
- Now calculate your guess as a percentage of the total number of households in the study (100 data points total).
- You have now estimated the percentage of data that fall within one standard deviation of the mean for this sample.
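The same estimate can be checked with a short calculation. The sketch below is an illustration only: the ten incomes are hypothetical (chosen so the mean is $58,000, not taken from the Rose Park file), and it assumes the sample standard deviation (dividing by n - 1), which may or may not be the formula used in the spreadsheet.

```python
import statistics

# Hypothetical household incomes for illustration (NOT the Rose Park data).
incomes = [20000, 35000, 45000, 50000, 55000, 60000, 65000, 70000, 85000, 95000]


def share_within(data, k=1):
    """Return (low, high, percent) for values within k SDs of the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample SD (n - 1); an assumption
    low = mean - k * sd          # where the left-hand line is drawn
    high = mean + k * sd         # where the right-hand line is drawn
    inside = sum(1 for x in data if low <= x <= high)
    return low, high, 100 * inside / len(data)


low, high, pct = share_within(incomes, k=1)
print(f"mean = {statistics.mean(incomes):,.0f}, SD = {statistics.stdev(incomes):,.0f}")
print(f"one SD range: {low:,.0f} to {high:,.0f}")
print(f"{pct:.0f}% of households fall within one SD of the mean")
```

Counting the points between the two lines on the printed graph and dividing by the total is the by-hand version of the last three lines of `share_within`.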
- Go back to the first sheet ("Data and Statistics") and change the income data to try to increase the standard deviation by at least $4,000.
- What did you do to make the SD larger?
- As you make these changes to the data, notice how the statistical calculations in the upper right corner change as a result of your changes.
- Also, look at what happens to the graph as you increase the SD.
- Using your new graph and the revised data, calculate where one standard deviation falls below the mean (mean – SD) and where one standard deviation falls above the mean (mean + SD).
- Mark this on your graph.
- Now estimate how much of your data fall within these two values.
- What happens if you go out two standard deviations from the mean? How much of your data fall within 2 SDs?
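One way to see what this experiment does is to run it on a small made-up sample. In the sketch below, both lists are hypothetical (not the Rose Park data), and the sample standard deviation (n - 1) is assumed. The only change is that the lowest and highest incomes are moved farther from the mean, which leaves the mean unchanged but increases the SD.

```python
import statistics

# Hypothetical incomes for illustration (NOT the Rose Park data).
original = [20000, 35000, 45000, 50000, 55000, 60000, 65000, 70000, 85000, 95000]

# Same sample, but the two extreme households are pushed farther from the mean.
modified = [5000, 35000, 45000, 50000, 55000, 60000, 65000, 70000, 85000, 110000]


def percent_within(data, k):
    """Percent of values within k sample SDs of the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample SD (n - 1); an assumption
    inside = sum(1 for x in data if mean - k * sd <= x <= mean + k * sd)
    return 100 * inside / len(data)


old_sd = statistics.stdev(original)
new_sd = statistics.stdev(modified)
print(f"SD before: {old_sd:,.0f}  SD after: {new_sd:,.0f}")

for k in (1, 2):
    print(f"within {k} SD(s): {percent_within(modified, k):.0f}% of the modified data")
```

Moving points that are already far from the mean changes the SD the most, because each point contributes the square of its distance from the mean.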
- Review the Resources section to find out more about normal distributions and the theory behind the percentage of data that fall within the first, second and third SD of a normal distribution.
- If you know your sample is normally distributed, and you know the mean and standard deviation, you can estimate the percentage of data that will fall within each SD of the mean.
- There are more advanced statistical theories regarding how much data is contained within one, two, and three standard deviations when the distribution is skewed and how to deal with it. For the purposes of this rotation, focus on standard deviations with normal distributions.
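For a perfectly normal distribution, the share of data within k standard deviations of the mean is given by erf(k / √2), which works out to roughly 68%, 95%, and 99.7% for one, two, and three SDs. A minimal sketch using Python's standard library follows; the function name is only illustrative, and real samples will match these figures only approximately.

```python
import math


def normal_share_within(k):
    """Fraction of a normal distribution lying within k SDs of its mean."""
    return math.erf(k / math.sqrt(2))


# Theoretical shares for one, two, and three standard deviations.
for k in (1, 2, 3):
    print(f"within {k} SD(s) of the mean: {100 * normal_share_within(k):.1f}%")
```

Comparing these theoretical percentages with the ones you estimated from your graph is a quick check of how close your sample is to a normal distribution.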
- Think about what it means to increase or decrease the SD of any sample.
- Remember that for any set of data, the smaller the standard deviation is, the closer the data are to the mean.
- And, if you have a large standard deviation, the data are considered to be more variable or spread out.
Drafting the Email Response
- This time without your peer, draft an email response answering Roger’s questions about standard deviation.
Using the mini-tasks you have just completed, draft an email response explaining what the standard deviation tells Roger about the variability of his sample.
- In your response to Roger, use a similar example to illustrate the meaning of standard deviation. You can use ideas from the Household Income for Rose Park Example 2 provided here, or another example you create or find.
- Next, conduct a review of your responses by exchanging responses with your peer.
- Review one another’s email response to determine if the email meets the requirements listed below. If an email response does not meet all of the requirements, help one another identify how it can be modified to be more clear and/or complete.
- Is the email response clear and concise?
- Does it answer the question – “what does the standard deviation mean in terms of my study?”
- Does the response include an example that illustrates the meaning of standard deviation?
- Each of your responses should be unique, reflecting your own work and thinking.
- Submit your individual response to your mentor. Review the checklist located in the Submit Your Work section of this task before submitting your response to your mentor.
Resources
Stats Task 2.2 Resources
The resources below will help you get started on this task. You may decide to do additional research to help clarify concepts, or to gain a deeper understanding of the subject matter. View the General Skills Resources link on the left for more information on research including evaluating web resources.
While Wikipedia is a valuable resource, anyone can contribute to or modify the site, whether or not they are knowledgeable about the topic. As a result, its content changes constantly and may come from questionable sources. Be sure to cross-check information on Wikipedia with other reputable sites to ensure accuracy.
Definition of Variance
A resource from Wikipedia describing variance and different factors that affect variance. Note: Some sections of this resource (e.g., Properties, introduction) are easier to understand than other parts. Use the information you need to complete the task at hand.
Normal Distributions and Standard Deviation
Standard Deviation and Normal Distributions
A blog that describes both standard deviation and normal distribution in non-math language. Be sure to review the example about the elementary school test scores to understand the implications of standard deviation.
Normal Distributions based on Standard Deviation
Examples of normal distribution curves based on different standard deviations. Includes a simple description of how the SD is calculated. Note: the bottom half of the page is related more to the mathematical calculation of the SD.
Normal Distribution Applet
This page has a definition of a normal distribution but also includes an applet to allow you to manipulate a normal distribution by changing the mean and standard deviation.
Definition of Standard Deviation
A resource from Wikipedia on SD. Note: The introduction and "Real-life Examples" section provide descriptions and not mathematical formulas.