Standard Deviation
Consider the following data about the heights of plants in Jonathan's garden:
3cm, 4cm, 5cm, 7cm, 11cm
Now, let's calculate the mean - μ - of these values.
μ = (3 + 4 + 5 + 7 + 11)/5 = 6cm
If we use this value alone to describe the heights of the plants,
we immediately run into a difficulty: it does not show how varied the heights are. Some plants are as
short as 3 cm and some are as tall as 11 cm.
Therefore the mean on its own is, to say the least, a bit misleading. We need another value that helps us understand how the data
is spread out in a given situation.
Now let's see how far each value deviates (moves away) from the
mean:
x | 3 | 4 | 5 | 7 | 11 |
μ | 6 | 6 | 6 | 6 | 6 |
(x - μ) | -3 | -2 | -1 | 1 | 5 |
Let's find the average of these deviations from the mean value:
Σ(x - μ)/5 = (-3 - 2 - 1 + 1 + 5)/5 = 0
The average deviation turned out to be zero, not because there are no deviations, but because the negative and positive
deviations cancel each other out. This always happens: deviations from the mean add up to zero for any set of data.
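The cancellation can be checked with a few lines of Python. This is only a minimal sketch; the variable names are chosen here for illustration.

```python
# Heights of the plants in centimetres (from the example above)
heights = [3, 4, 5, 7, 11]

# Mean: sum of the values divided by how many there are
mean = sum(heights) / len(heights)

# Deviation of each value from the mean
deviations = [x - mean for x in heights]

print(mean)             # 6.0
print(deviations)       # [-3.0, -2.0, -1.0, 1.0, 5.0]
print(sum(deviations))  # 0.0
```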
To deal with this issue, let's square the deviations to remove the negative signs:
x | 3 | 4 | 5 | 7 | 11 |
μ | 6 | 6 | 6 | 6 | 6 |
(x - μ) | -3 | -2 | -1 | 1 | 5 |
(x - μ)² | 9 | 4 | 1 | 1 | 25 |
The squared deviations add up to 40, so their average is 40/5 = 8. Since we squared the deviations just to deal with
the negative values, it's time we reversed the process by taking the square root of that average:
√(Σ(x - μ)²/5) = √(40/5) = √8 ≈ 2.8
This is called the standard deviation of the above set of
data representing the heights of plants in Jonathan's garden. It gives us a clearer picture of data
distribution along with the mean. With the value of the standard deviation, the data can be described
in the following way:
The mean height of the plants in Jonathan's garden is 6 cm and the standard deviation is 2.8 cm. That means the heights of most of the plants fall in the range from (6 - 2.8) = 3.2 cm to (6 + 2.8) = 8.8 cm.
The example shows how important the standard deviation is for getting a clear picture of a set of data. Without it, talking about data is like recalling the
fate of the Titanic without the iceberg!
So, the formula for standard deviation is as follows:
σ = √(Σ(x - μ)²/n)
where n is the number of values (the total frequency).
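The formula translates almost word for word into code. The sketch below assumes Python; the function name population_std is made up for this illustration, while statistics.pstdev is the standard library's population standard deviation and is used only as a cross-check.

```python
import math
import statistics


def population_std(values):
    """Standard deviation: square root of the mean squared deviation."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


heights = [3, 4, 5, 7, 11]
sigma = population_std(heights)

print(round(sigma, 1))  # 2.8

# The standard library should give the same answer
print(math.isclose(sigma, statistics.pstdev(heights)))  # True
```

Note that this divides by n, as in the formula above, not by n - 1.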
Calculator-friendly formula for Standard Deviation
σ = √(Σ(x - μ)²/n)
σ = √(Σ(x² - 2xμ + μ²)/n)
σ = √((Σx² - Σ2xμ + Σμ²)/n)
σ = √((Σx² - 2μΣx + Σμ²)/n)
σ = √((Σx² - 2μ(nμ) + nμ²)/n), since Σx = nμ and Σμ² = nμ²
σ = √((Σx² - 2nμ² + nμ²)/n)
σ = √((Σx² - nμ²)/n)
σ = √(Σx²/n - μ²)
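As a quick check, the two forms can be compared on the plant heights from the first example. This is a small sketch in Python with illustrative variable names; both forms give the same value.

```python
import math

heights = [3, 4, 5, 7, 11]
n = len(heights)
mean = sum(heights) / n

# Definition: root of the mean squared deviation
sigma_definition = math.sqrt(sum((x - mean) ** 2 for x in heights) / n)

# Calculator-friendly form: root of (mean of the squares minus square of the mean)
sigma_shortcut = math.sqrt(sum(x * x for x in heights) / n - mean ** 2)

print(math.isclose(sigma_definition, sigma_shortcut))  # True
```

Here Σx² = 220, so the calculator-friendly form gives √(220/5 - 6²) = √8, the same value as before.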
To find the standard deviation of grouped data, we change the method
slightly: σ = √(Σf(x - μ)²/n), where f is the frequency of each
value or class and n = Σf is the total frequency.
E.g.
The frequency of shoe sizes of students in a certain class is as follows:
shoe-size(x) | frequency(f) |
3 | 3 |
4 | 5 |
5 | 10 |
6 | 8 |
7 | 4 |
μ = Σfx/n = 155/30 ≈ 5.2
σ = √(Σf(x - μ)²/n) = √(Σf(x - 5.2)²/30) = √(40.2/30) ≈ 1.16
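A frequency table like this one can be worked through in code by weighting each value by its frequency. The following is a minimal Python sketch; storing the table as a dictionary is just one convenient choice made here.

```python
import math

# Shoe size -> number of students (from the table above)
frequencies = {3: 3, 4: 5, 5: 10, 6: 8, 7: 4}

# Total frequency n = Σf
n = sum(frequencies.values())

# Mean μ = Σfx/n
mean = sum(f * x for x, f in frequencies.items()) / n

# Standard deviation σ = √(Σf(x - μ)²/n)
sigma = math.sqrt(sum(f * (x - mean) ** 2 for x, f in frequencies.items()) / n)

print(n, round(mean, 1), round(sigma, 2))
```

The code keeps the unrounded mean throughout, so its last digits may differ slightly from a hand calculation that rounds μ first.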
E.g.
The marks obtained by a group of students for maths are as follows:
Marks(x) | frequency(f) |
0 - 20 | 3 |
21 - 40 | 6 |
41 - 60 | 9 |
61 - 80 | 8 |
81 - 100 | 4 |
Here x is taken as the midpoint of each class, rounded to 10, 30, 50, 70 and 90.
μ = Σfx/n = 1580/30 ≈ 52.7
σ = √(Σf(x - μ)²/n) = √(Σf(x - 52.7)²/30) ≈ 23.5
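For class intervals, each class is represented by a single value before the grouped formula is applied. The Python sketch below assumes the representative values 10, 30, 50, 70 and 90 (rounded class midpoints); that choice is an assumption of this illustration, and exact midpoints such as 30.5 would give slightly different results.

```python
import math

# (representative value, frequency) for each class of marks.
# The values 10, 30, 50, 70, 90 are assumed rounded midpoints.
classes = [(10, 3), (30, 6), (50, 9), (70, 8), (90, 4)]

# Total frequency n = Σf
n = sum(f for _, f in classes)

# Mean μ = Σfx/n
mean = sum(f * x for x, f in classes) / n

# Standard deviation σ = √(Σf(x - μ)²/n)
sigma = math.sqrt(sum(f * (x - mean) ** 2 for x, f in classes) / n)

print(n, round(mean, 1), round(sigma, 1))
```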
I am sure you have a good understanding of the concept of standard deviation by now.
Now, to reinforce what you have just learnt, work out the following questions:
- The time taken by 10 engineers to install a satellite dish, in minutes, is as follows:
51, 49, 56, 60, 52, 58, 49, 56, 52, 57
Find the mean and the standard deviation.
- When a die is thrown, the numbers turn out as follows:
Number | frequency (f) |
1 | 3 |
2 | 7 |
3 | 10 |
4 | 14 |
5 | 8 |
6 | 2 |
Find the mean and the standard deviation.
- The weights of some chicks obtained by a farmer are as follows:
weight (x) | frequency (f) |
0 - 20 | 7 |
21 - 40 | 11 |
41 - 60 | 3 |
61 - 80 | 7 |
81 - 100 | 2 |
Find the mean and the standard deviation.
- The standard deviation of a certain set of data is 4.2. What would the new standard deviation be if each data value was increased by 5?
- The standard deviation of a certain set of data is 4.2. What would the new standard deviation be if each data value was
multiplied by 5?